https://heidata.uni-heidelberg.de/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.11588/DATA/MXM0Q2
In the publication [1] we implemented anonymization and synthetization techniques for a structured data set that was collected during the HiGHmed Use Case Cardiology study [2]. We employed the data anonymization tool ARX [3] and the data synthetization framework ASyH [4], individually and in combination. We evaluated the utility and shortcomings of the different approaches through statistical analyses and privacy risk assessments. Data utility was assessed by computing two heart failure risk scores (Barcelona BioHF [5] and MAGGIC [6]) on the protected data sets. We observed only minimal deviations from the scores computed on the original data set. Additionally, we performed a re-identification risk analysis and found only minor residual risks for common types of privacy threats. We could demonstrate that anonymization and synthetization methods protect privacy while retaining data utility for heart failure risk assessment. Both approaches, and a combination thereof, introduce only minimal deviations from the original data set over all features. While data synthesis techniques can produce any number of new records, data anonymization techniques offer more formal privacy guarantees. Consequently, data synthesis on anonymized data further enhances privacy protection with little impact on data utility. We hereby share all generated data sets with the scientific community through a use and access agreement.
[1] Johann TI, Otte K, Prasser F, Dieterich C: Anonymize or synthesize? Privacy-preserving methods for heart failure score analytics. Eur Heart J 2024;. doi:10.1093/ehjdh/ztae083
[2] Sommer KK, Amr A, Bavendiek, Beierle F, Brunecker P, Dathe H et al. Structured, harmonized, and interoperable integration of clinical routine data to compute heart failure risk scores. Life (Basel) 2022;12:749.
[3] Prasser F, Eicher J, Spengler H, Bild R, Kuhn KA. Flexible data anonymization using ARX—current status and challenges ahead. Softw Pract Exper 2020;50:1277–1304.
[4] Johann TI, Wilhelmi H. ASyH—anonymous synthesizer for health data, GitHub, 2023. Available at: https://github.com/dieterich-lab/ASyH.
[5] Lupón J, de Antonio M, Vila J, Peñafiel J, Galán A, Zamora E, et al. Development of a novel heart failure risk tool: the Barcelona bio-heart failure risk calculator (BCN Bio-HF calculator). PLoS One 2014;9:e85466.
[6] Pocock SJ, Ariti CA, McMurray JJV, Maggioni A, Køber L, Squire IB, et al. Predicting survival in heart failure: a risk score based on 39 372 patients from 30 studies. Eur Heart J 2013;34:1404–1413.
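To illustrate the kind of re-identification risk analysis mentioned in the description above, the following minimal Python sketch groups records by their quasi-identifier values and reports the smallest group size (the k in k-anonymity) together with the worst-case risk 1/k. It is not the ARX API and not necessarily the procedure used in [1]; the field names and toy records are hypothetical.

```python
from collections import Counter

def equivalence_class_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)

def k_anonymity(records, quasi_identifiers):
    """Size of the smallest equivalence class; the data set is k-anonymous for this k."""
    return min(equivalence_class_sizes(records, quasi_identifiers).values())

def worst_case_risk(records, quasi_identifiers):
    """Highest per-record re-identification risk when the attacker knows the target is in the data: 1/k."""
    return 1.0 / k_anonymity(records, quasi_identifiers)

# Hypothetical toy records with already generalized ages (bands instead of exact values).
records = [
    {"age_band": "60-69", "sex": "f", "nyha_class": 2},
    {"age_band": "60-69", "sex": "f", "nyha_class": 3},
    {"age_band": "70-79", "sex": "m", "nyha_class": 2},
    {"age_band": "70-79", "sex": "m", "nyha_class": 4},
    {"age_band": "70-79", "sex": "m", "nyha_class": 3},
]
qi = ["age_band", "sex"]  # assumed quasi-identifiers for this sketch

k = k_anonymity(records, qi)
risk = worst_case_risk(records, qi)
print(f"k = {k}, worst-case re-identification risk = {risk:.2f}")
```

Coarser generalization (wider age bands, for example) raises k and lowers this risk, at the cost of the data utility that the risk scores depend on.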
According to our latest research, the global Data De-Identification Platform market size reached USD 714.2 million in 2024, driven by the escalating need for data privacy and regulatory compliance across industries. The market is experiencing robust expansion, registering a CAGR of 18.7% from 2025 to 2033. By 2033, the market is forecasted to attain USD 3,276.9 million, reflecting the surging adoption of advanced data privacy solutions and the increasing volume of sensitive data handled by organizations worldwide. This remarkable growth trajectory is primarily fueled by stricter data protection laws, rising data breach incidents, and the imperative for organizations to leverage data analytics without compromising personal information.
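The relationship between a base-year value, a CAGR, and a forecast value can be checked with a few lines of Python. The sketch below uses the figures quoted above and assumes 2024 as the base year with nine compounding periods up to 2033; the report may use a different convention, so a small gap between the stated and the implied rate is not unexpected.

```python
def project(start_value, cagr, years):
    """Compound a starting value forward: start * (1 + cagr) ** years."""
    return start_value * (1.0 + cagr) ** years

def implied_cagr(start_value, end_value, years):
    """Invert the compounding formula: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

start_2024 = 714.2    # USD million, from the text
end_2033 = 3276.9     # USD million, from the text
stated_cagr = 0.187   # 18.7%, from the text
years = 2033 - 2024   # assumption: 2024 is the base year

projected = project(start_2024, stated_cagr, years)
implied = implied_cagr(start_2024, end_2033, years)

print(f"Projected 2033 value at the stated CAGR: USD {projected:,.1f} million")
print(f"CAGR implied by the two endpoints: {implied:.2%}")
```

The same two functions apply to the other market figures in this listing by swapping in their start value, end value, and rate.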
The primary growth factor for the Data De-Identification Platform market is the intensification of global data privacy regulations such as the General Data Protection Regulation (GDPR), California Consumer Privacy Act (CCPA), and other region-specific mandates. Organizations are increasingly mandated to ensure that personally identifiable information (PII) is adequately protected or anonymized before use in analytics, research, or sharing with third parties. This regulatory landscape compels enterprises to integrate sophisticated de-identification platforms into their data management workflows. Furthermore, as digital transformation accelerates across sectors, the volume and variety of data being collected and processed have grown exponentially, creating new challenges and opportunities for data privacy management. The need to balance data utility with privacy has made automated, scalable de-identification solutions a top priority for businesses aiming to remain compliant and competitive.
Another significant driver is the rising frequency and sophistication of data breaches and cyberattacks, which have heightened organizational awareness regarding the risks associated with storing and processing sensitive information. As enterprises increasingly migrate to cloud environments and adopt big data analytics, the attack surface expands, making robust data de-identification tools essential for mitigating exposure. These platforms enable organizations to anonymize or pseudonymize data, reducing the risk of re-identification even in the event of a breach. The growing adoption of artificial intelligence (AI) and machine learning (ML) further necessitates de-identification, as these technologies often require access to large datasets that must be stripped of personal identifiers to ensure ethical and legal compliance. This confluence of factors is propelling the demand for advanced, user-friendly, and highly configurable de-identification platforms.
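Pseudonymization, as opposed to full anonymization, replaces direct identifiers with tokens that stay consistent across records but cannot be reversed without a secret. One common way to do this is a keyed hash; the sketch below uses HMAC-SHA256 from the Python standard library. The field names, the record, and the key are hypothetical, and this is not how any particular commercial platform is known to work.

```python
import hashlib
import hmac

def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Map an identifier to a stable token using a keyed hash (HMAC-SHA256)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

def pseudonymize_record(record: dict, id_fields: set, secret_key: bytes) -> dict:
    """Return a copy of the record with the listed identifier fields replaced by tokens."""
    return {
        field: pseudonymize(str(value), secret_key) if field in id_fields else value
        for field, value in record.items()
    }

# Hypothetical key; in practice it would be stored separately from the data.
key = b"hypothetical-secret-kept-outside-the-dataset"
record = {"customer_id": "C-1001", "email": "jane@example.com", "basket_total": 42.5}

safe = pseudonymize_record(record, {"customer_id", "email"}, key)
print(safe)
```

Because the same input and key always yield the same token, records can still be joined on the pseudonym. Anyone holding the key can recompute the mapping, which is why pseudonymized data is generally still treated as personal data under GDPR.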
Moreover, the proliferation of data-driven business models in sectors such as healthcare, BFSI, government, retail, and IT & telecom is amplifying the need for secure data sharing and collaboration. In healthcare, for instance, the use of patient data for research, clinical trials, and population health management demands rigorous de-identification to protect patient privacy while enabling valuable insights. Similarly, financial institutions and government agencies are leveraging data to enhance service delivery and operational efficiency, necessitating robust privacy controls. The increasing recognition of data as a strategic asset, coupled with the imperative to safeguard individual privacy, is fostering a culture of proactive data governance and driving investments in de-identification technologies.
The integration of Data De-identification AI is revolutionizing the way organizations handle sensitive information. By leveraging AI technologies, businesses can automate the process of identifying and anonymizing personal data, ensuring compliance with stringent privacy regulations. This approach not only enhances data security but also allows for more efficient data processing and analysis. AI-driven de-identification tools can dynamically adapt to new data patterns, providing organizations with a robust mechanism to protect personal information while still extracting valuable insights. As AI continues to evolve, its role in data de-identification is expected to become even more pivotal, driving innovation and setting new standards in data privacy management.
From a regional perspective, North America currently dominates the Data De-Identification P
https://dataintelo.com/privacy-and-policy
According to our latest research, the global clinical data de-identification pipelines market size reached USD 425.8 million in 2024. The market is growing at a CAGR of 17.9%, driven by the increasing adoption of advanced data privacy solutions across the healthcare sector. By 2033, the market is projected to reach USD 1,541.3 million, underscoring the escalating need for secure data handling and compliance with stringent regulatory frameworks. The primary growth factor for this sector is the rising volume of healthcare data and the need to protect patient privacy while enabling data-driven research and innovation.
The surge in healthcare digitization, coupled with the proliferation of electronic health records (EHRs), has significantly contributed to the growth of the clinical data de-identification pipelines market. Healthcare organizations are increasingly leveraging digital platforms to store, share, and analyze sensitive patient data, which in turn amplifies the risk of data breaches and unauthorized access. This has heightened the demand for robust de-identification solutions that ensure protected health information (PHI) is de-identified before being used for research, analytics, or sharing with third parties. Regulatory mandates such as HIPAA in the United States and GDPR in Europe further reinforce the need for effective data de-identification, driving both innovation and adoption in this market.
Another critical growth driver is the expanding landscape of clinical research and real-world evidence (RWE) generation. Pharmaceutical and biotechnology companies, as well as academic research institutions, rely heavily on access to vast amounts of patient data to accelerate drug development, conduct population health studies, and improve clinical outcomes. However, the sensitive nature of this data necessitates sophisticated de-identification pipelines that can efficiently strip personally identifiable information (PII) while preserving the integrity and utility of the dataset. This balance between data utility and privacy protection is fueling investments in next-generation de-identification software and services, further propelling market expansion.
The integration of artificial intelligence (AI) and machine learning (ML) technologies into de-identification pipelines is also playing a pivotal role in market growth. Advanced algorithms enable more accurate and automated identification and removal of sensitive information from unstructured clinical narratives, images, and structured datasets. This technological evolution not only enhances the scalability and reliability of de-identification processes but also addresses the growing complexity of healthcare data formats. As a result, organizations can more confidently share anonymized datasets for collaborative research, secondary analytics, and public health monitoring, all while maintaining compliance with global privacy standards.
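As a point of comparison for the AI-based approaches described above, the simplest form of text de-identification is rule-based pattern matching. The sketch below is a plain regular-expression stand-in, not an NLP or ML model; the patterns, placeholder labels, and sample note are hypothetical and cover only a few easily recognized identifier formats.

```python
import re

# Illustrative patterns only; real clinical text needs far broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}

def mask_text(text: str) -> str:
    """Replace every match of each pattern with a bracketed placeholder label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text

note = "Seen on 2024-03-05. Contact: j.doe@example.org, 555-123-4567."
masked = mask_text(note)
print(masked)
```

Rules like these miss names, addresses, and free-form dates, which is the gap that learned models for unstructured narratives are meant to close.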
From a regional perspective, North America continues to dominate the clinical data de-identification pipelines market, accounting for the largest share in 2024. The region’s leadership is attributed to a robust healthcare infrastructure, widespread adoption of health IT solutions, and stringent regulatory requirements surrounding data privacy. Europe follows closely, propelled by comprehensive data protection laws and strong investments in healthcare digitalization. Meanwhile, the Asia Pacific region is witnessing the fastest growth, driven by burgeoning healthcare IT adoption, increasing clinical research activities, and rising awareness about patient data privacy. Latin America and the Middle East & Africa are emerging as promising markets, supported by gradual improvements in healthcare technology and regulatory frameworks.
The clinical data de-identification pipelines market by component is segmented into software and services, each playing a distinct yet complementary role in the ecosystem. The software segment encompasses a wide array of solutions designed to automate the identification and removal of sensitive data from clinical records, including structured databases, unstructured clinical notes, and even medical images. These software platforms are increasingly leveraging AI and natural language processing (NLP) to enhance accuracy, adaptability, and speed, making them indispensabl
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains the dataset files and the code used for feature engineering in the paper titled "Open Data, Private Learners: A De-Identified Dataset for Learning Analytics Research", submitted to the journal Scientific Data (Nature).
According to our latest research, the global Data Anonymization Tools market size in 2024 stands at USD 3.2 billion, demonstrating robust growth driven by the escalating need for data privacy and compliance with stringent regulatory frameworks. The market is projected to expand at a CAGR of 17.4% from 2025 to 2033, reaching a forecasted value of USD 13.4 billion by 2033. This growth trajectory is primarily fueled by the increasing volume of sensitive data generated across industries and the urgent requirement for organizations to safeguard personally identifiable information (PII) while enabling data-driven innovation.
A primary growth factor for the Data Anonymization Tools market is the intensifying regulatory landscape governing data privacy and protection worldwide. Legislation such as the General Data Protection Regulation (GDPR) in Europe, the California Consumer Privacy Act (CCPA) in the United States, and similar frameworks in Asia Pacific and Latin America have compelled organizations to adopt advanced data anonymization solutions. These regulations mandate strict controls over the processing, storage, and sharing of personal data, imposing significant penalties for non-compliance. Consequently, enterprises across sectors are increasingly investing in software and services that ensure data remains anonymized and compliant, thereby mitigating risks associated with data breaches and unauthorized disclosures.
Another significant driver is the exponential growth in data volumes generated by digital transformation, cloud migration, and the proliferation of connected devices. As organizations leverage big data analytics, machine learning, and artificial intelligence to gain business insights, the challenge of protecting sensitive information while maintaining data utility becomes paramount. Data anonymization tools enable organizations to securely share and analyze datasets without exposing personal or confidential information. This capability not only supports regulatory compliance but also fosters collaboration and innovation in sectors like healthcare, finance, and retail, where data-driven decision-making is critical to competitive advantage.
Moreover, the rising frequency and sophistication of cyber threats have heightened awareness regarding the vulnerabilities associated with storing and processing unprotected data. High-profile data breaches and the resultant financial and reputational damages have underscored the importance of robust data anonymization solutions. Organizations are increasingly prioritizing the implementation of tools that can de-identify data before it is used for analytics, testing, or sharing with third parties. This trend is further amplified by the growing adoption of cloud-based services, which necessitate additional layers of data protection to address the complexities of distributed environments and cross-border data flows.
In the healthcare sector, the demand for Healthcare Data Anonymization Services is on the rise, driven by the need to protect patient privacy while enabling the use of data for research and innovation. Healthcare organizations are increasingly adopting these services to comply with regulations like HIPAA and GDPR, which mandate stringent data protection measures. By anonymizing patient data, healthcare providers can safely share information for clinical trials, population health studies, and collaborative research without compromising patient confidentiality. This not only enhances the ability to conduct meaningful research but also supports the development of personalized medicine and improved patient outcomes.
Regionally, North America dominates the Data Anonymization Tools market, accounting for the largest share in 2024, followed closely by Europe and Asia Pacific. The United States, in particular, benefits from a highly developed technology infrastructure, a mature regulatory environment, and a strong presence of leading data security vendors. Europe’s market growth is propelled by the stringent enforcement of GDPR and the widespread adoption of privacy-enhancing technologies across industries. Meanwhile, Asia Pacific is experiencing rapid expansion due to increasing digitalization, rising awareness of data privacy, and the introduction of new data protection regulations in countries like India, China,
According to our latest research, the global medical imaging de-identification software market size reached USD 315 million in 2024, driven by the increasing adoption of digital healthcare solutions and stringent regulatory requirements for patient data privacy. The market is expected to grow at a robust CAGR of 13.2% during the forecast period, reaching approximately USD 858 million by 2033. The primary growth factor fueling this expansion is the rising volume of medical imaging data and the escalating need to ensure compliance with data protection laws such as HIPAA, GDPR, and other regional regulations.
The growth trajectory of the medical imaging de-identification software market is underpinned by the exponential increase in digital imaging procedures across healthcare facilities worldwide. As advanced imaging modalities like MRI, CT, and PET scans become standard in diagnostic workflows, the volume of data generated has surged. This data often contains sensitive patient information, making it imperative for healthcare organizations to adopt robust de-identification solutions. The proliferation of health information exchanges and the increasing emphasis on interoperability have further heightened the need for secure and compliant data sharing. These factors collectively foster a conducive environment for the adoption of de-identification software, as organizations seek to balance data utility with stringent privacy requirements.
Another major driver is the evolving regulatory landscape that mandates strict adherence to patient confidentiality and data protection standards. Regulatory frameworks such as the Health Insurance Portability and Accountability Act (HIPAA) in the United States, the General Data Protection Regulation (GDPR) in Europe, and similar regulations in Asia Pacific and other regions are compelling healthcare providers and research institutions to implement advanced de-identification solutions. These regulations impose hefty penalties for non-compliance, further incentivizing investments in software that can automate and streamline the de-identification process. Moreover, the growing trend of collaborative research and data sharing among healthcare entities necessitates reliable de-identification tools to facilitate secure and lawful data exchange.
Technological advancements in artificial intelligence and machine learning are also playing a pivotal role in shaping the medical imaging de-identification software market. Modern solutions leverage AI-driven algorithms to enhance the accuracy and efficiency of de-identification processes, reducing the risk of inadvertent data leaks. These innovations are particularly valuable in large-scale research projects, where massive datasets must be anonymized rapidly and without compromising data integrity. Furthermore, the integration of de-identification software with existing healthcare IT infrastructure, such as PACS and EHR systems, is becoming increasingly seamless, making adoption easier for end-users. This technological evolution is expected to drive further market growth over the next decade.
From a regional perspective, North America currently dominates the medical imaging de-identification software market, accounting for the largest share in 2024. The region’s leadership is attributed to the presence of advanced healthcare infrastructure, high adoption rates of digital health technologies, and stringent regulatory frameworks. Europe follows closely, propelled by GDPR compliance and increasing investments in healthcare IT. The Asia Pacific region is experiencing the fastest growth, fueled by expanding healthcare access, rapid digitalization, and rising awareness of data privacy. Latin America and the Middle East & Africa are also witnessing gradual adoption, supported by ongoing healthcare modernization initiatives and regulatory developments.
In the realm of healthcare technology, Patient Identity Matching Software has emerged as a critical tool for ensuring the accuracy and integrity of patient data across various platforms. This software plays a pivotal role in minimizing errors related to patient identification, which can lead to serious medical mishaps. By utilizing advanced algorithms and data matching techniques, Patient Identity Matching Software
https://dataintelo.com/privacy-and-policy
According to our latest research, the global Real-World Data De-identification AI market size reached USD 1.85 billion in 2024, with a robust compound annual growth rate (CAGR) of 21.6% projected from 2025 to 2033. The market is anticipated to achieve a value of USD 13.95 billion by 2033. This remarkable growth is primarily driven by the escalating need for secure data sharing and compliance with stringent privacy regulations across industries, particularly in healthcare, life sciences, and insurance sectors. As organizations increasingly leverage real-world data (RWD) for advanced analytics, clinical research, and operational efficiency, the demand for sophisticated AI-powered de-identification solutions continues to surge worldwide.
One of the principal growth factors fueling the Real-World Data De-identification AI market is the intensifying focus on data privacy and regulatory compliance. Global regulations such as the General Data Protection Regulation (GDPR) in Europe, the Health Insurance Portability and Accountability Act (HIPAA) in the United States, and other regional data protection laws have necessitated the adoption of robust de-identification technologies. Organizations in healthcare, pharmaceuticals, and insurance are increasingly mandated to anonymize or pseudonymize sensitive data before it can be used for research, analytics, or shared with third parties. AI-driven de-identification solutions offer the scalability, accuracy, and adaptability required to process vast volumes of structured and unstructured data, ensuring compliance while preserving the analytical value of the data. This regulatory landscape, combined with the growing value placed on ethical data stewardship, continues to propel market expansion.
Another significant driver is the exponential growth in healthcare and life sciences data, fueled by the proliferation of electronic health records (EHRs), wearable devices, genomics, and real-world evidence (RWE) initiatives. The integration of AI for de-identification enables organizations to unlock the full potential of these data sources without compromising patient privacy. Pharmaceutical companies, for example, leverage de-identified real-world data for drug development, safety monitoring, and post-market surveillance. Similarly, insurers and government agencies utilize anonymized datasets to enhance risk assessment, optimize healthcare delivery, and inform policy decisions. The ability of AI-powered de-identification tools to rapidly and accurately process diverse data types—including text, images, and audio—further amplifies their adoption across multiple sectors, driving sustained market growth.
Technological advancements in artificial intelligence and machine learning are also instrumental in shaping the Real-World Data De-identification AI market. The evolution of natural language processing (NLP), deep learning, and pattern recognition algorithms has significantly improved the precision and efficiency of de-identification processes. These innovations enable the automation of previously labor-intensive tasks, such as identifying and masking personally identifiable information (PII) in complex datasets. Moreover, AI-based solutions can dynamically adapt to evolving data formats and regulatory requirements, offering future-proof capabilities to organizations. The continuous investment in R&D and strategic collaborations between technology providers and industry stakeholders further stimulate innovation, expanding the scope and effectiveness of de-identification solutions.
From a regional perspective, North America currently dominates the Real-World Data De-identification AI market, accounting for the largest revenue share in 2024. This leadership is attributed to the region’s advanced healthcare infrastructure, high adoption of digital technologies, and proactive regulatory environment. Europe follows closely, driven by stringent data protection laws and significant investments in healthcare digitization. The Asia Pacific region, meanwhile, is witnessing the fastest growth rate, propelled by the rapid expansion of healthcare IT, increasing awareness of data privacy, and supportive government initiatives. Latin America and the Middle East & Africa are also emerging as promising markets, albeit at a comparatively nascent stage, as organizations in these regions begin to recognize the value of AI-driven data de-identification for compliance and innovation.
According to our latest research, the global Imaging Study De-Identification Gateways market size reached USD 612.4 million in 2024, and is expected to grow at a robust CAGR of 16.7% from 2025 to 2033. By the end of the forecast period, the market is projected to reach USD 2,134.7 million. This remarkable growth trajectory is driven by the heightened demand for data privacy compliance and the rapid adoption of digital health technologies worldwide, as regulatory frameworks such as HIPAA and GDPR increasingly mandate strict de-identification of medical imaging data.
The primary growth factor fueling the Imaging Study De-Identification Gateways market is the intensifying focus on patient privacy and data security. With the proliferation of digital health records and the exponential rise in imaging studies, healthcare providers are under mounting pressure to ensure that sensitive patient information is adequately protected. De-identification gateways have become indispensable for organizations aiming to comply with complex regulatory requirements. These solutions systematically remove or obfuscate personally identifiable information (PII) from imaging data, thereby enabling secure data sharing for clinical collaboration, research, and artificial intelligence (AI) model training. The surge in telemedicine and remote diagnostics further amplifies the need for robust de-identification solutions, as data is increasingly exchanged across disparate systems and geographies, exposing it to potential breaches if not adequately protected.
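The core step such a gateway performs on image metadata, removing or blanking identifying header attributes, can be sketched as follows. To stay self-contained, the example models a DICOM header as a plain Python dictionary keyed by standard DICOM attribute keywords rather than using a DICOM library. The keyword list is an illustrative subset, not the full confidentiality profile of the DICOM standard, and the sample header is hypothetical.

```python
# Illustrative subset of identifying DICOM attribute keywords (an assumption of this sketch).
IDENTIFYING_KEYWORDS = {
    "PatientName",
    "PatientID",
    "PatientBirthDate",
    "PatientAddress",
    "ReferringPhysicianName",
    "InstitutionName",
    "AccessionNumber",
}

def deidentify_header(header: dict, replacement: str = "") -> dict:
    """Return a copy of the header with identifying attributes blanked out."""
    return {
        keyword: replacement if keyword in IDENTIFYING_KEYWORDS else value
        for keyword, value in header.items()
    }

# Hypothetical header of a single imaging study.
header = {
    "PatientName": "DOE^JANE",
    "PatientID": "12345",
    "PatientBirthDate": "19600101",
    "Modality": "MR",
    "StudyDescription": "Cardiac T1 mapping",
}

clean = deidentify_header(header)
print(clean)
```

A real gateway would also have to handle private tags, unique identifiers, free-text fields, and identifiers burned into the pixel data, none of which this sketch addresses.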
Another significant driver is the integration of AI and machine learning technologies into medical imaging workflows. As healthcare organizations leverage large, diverse datasets to develop and validate AI algorithms, the necessity for de-identified imaging data becomes paramount. De-identification gateways facilitate the ethical and legal use of patient data for secondary purposes such as research and clinical trials, without compromising patient confidentiality. The growing adoption of cloud-based healthcare solutions is also propelling the market, as cloud environments demand advanced de-identification capabilities to safeguard data during storage, processing, and transmission. Furthermore, the increasing collaboration between hospitals, research institutes, and technology vendors is fostering innovation and accelerating the deployment of sophisticated de-identification solutions.
The market is also benefitting from the global trend toward interoperability and data standardization in healthcare. As health systems strive to integrate disparate imaging modalities and electronic health record (EHR) platforms, de-identification gateways play a crucial role in ensuring that data exchanged across networks adheres to privacy standards. The rise in cross-border research initiatives and international clinical trials is further stimulating demand, as organizations must navigate a complex web of privacy laws and data protection regulations. Additionally, the emergence of precision medicine and personalized healthcare is driving the need for large-scale, anonymized imaging datasets, which can only be achieved through robust de-identification processes. These trends collectively underscore the critical importance of de-identification gateways in the modern healthcare ecosystem.
Regionally, North America dominates the Imaging Study De-Identification Gateways market, accounting for the largest revenue share in 2024, owing to stringent regulatory mandates, advanced healthcare infrastructure, and early adoption of digital health technologies. Europe follows closely, driven by the enforcement of GDPR and a strong emphasis on data privacy across the region. The Asia Pacific region is witnessing the fastest growth, supported by rapid healthcare digitization, expanding diagnostic imaging capabilities, and increasing investments in health IT. Latin America and the Middle East & Africa are also showing promising growth, albeit from a smaller base, as governments and healthcare providers in these regions recognize the value of secure data sharing and compliance with international standards. Overall, the global landscape is characterized by a growing awareness of privacy risks and a collective push toward secure, compliant imaging data management.
https://dataintelo.com/privacy-and-policy
According to our latest research, the global Imaging Study De-Identification Services market size reached USD 412.5 million in 2024, reflecting robust expansion fueled by rising data privacy demands. The market is projected to grow at a CAGR of 16.4% from 2025 to 2033, reaching an estimated USD 1,478.2 million by 2033. The key growth factor underpinning this trajectory is the increasing adoption of digital imaging in healthcare, alongside stringent regulatory frameworks such as HIPAA and GDPR that mandate the protection of patient information.
The primary driver for the Imaging Study De-Identification Services market is the exponential growth in medical imaging data, propelled by technological advancements in imaging modalities and the digital transformation of healthcare systems globally. As hospitals and diagnostic centers transition to electronic health records (EHRs) and Picture Archiving and Communication Systems (PACS), the volume of imaging studies containing sensitive patient information has surged. This growth necessitates efficient de-identification services to safeguard patient privacy and enable compliant data sharing. Additionally, the utilization of artificial intelligence and machine learning in radiology research has escalated the demand for large, anonymized datasets, further amplifying the need for reliable de-identification solutions.
Another significant growth factor is the increasing emphasis on clinical research and collaborative studies across institutions and borders. The ability to share imaging data without compromising patient confidentiality is crucial for multi-center trials, epidemiological studies, and the development of AI-driven diagnostic tools. Regulatory agencies worldwide are enforcing strict data privacy regulations, compelling healthcare organizations to adopt de-identification services. The integration of automated de-identification solutions, which offer scalability and accuracy, is rapidly gaining traction, enhancing the efficiency of data sharing and research processes. This trend is particularly prominent in regions with advanced healthcare infrastructure and a high prevalence of research activities.
The emergence of hybrid de-identification models, which combine the strengths of automated and manual approaches, is also contributing to market expansion. These solutions address the limitations of fully automated systems by incorporating human oversight for complex cases, ensuring both compliance and data integrity. As healthcare providers and research organizations increasingly recognize the value of de-identified imaging data for secondary uses such as AI training, population health management, and regulatory submissions, the demand for tailored de-identification services continues to rise. This shift is further supported by the growing awareness of data breaches and the associated financial and reputational risks.
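One way to picture a hybrid model is as a routing rule: studies the automated step is confident about are released directly, and the rest are queued for human review. The sketch below assumes a hypothetical per-study confidence score and threshold; neither is taken from the report or from any specific product.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    study_id: str
    confidence: float  # hypothetical score in [0, 1] from an automated de-identifier

def route(detections, threshold=0.9):
    """Split studies into automatically released and manually reviewed lists."""
    auto_released, manual_review = [], []
    for d in detections:
        target = auto_released if d.confidence >= threshold else manual_review
        target.append(d.study_id)
    return auto_released, manual_review

# Hypothetical batch of automated results.
batch = [
    Detection("study-001", 0.98),
    Detection("study-002", 0.72),
    Detection("study-003", 0.90),
]

auto_released, manual_review = route(batch)
print("auto:", auto_released)
print("manual:", manual_review)
```

Raising the threshold sends more studies to manual review, trading throughput for oversight on the complex cases.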
From a regional perspective, North America remains the dominant market for Imaging Study De-Identification Services, driven by a mature healthcare ecosystem, stringent regulatory requirements, and early adoption of digital health technologies. Europe follows closely, benefiting from robust data protection laws and active research collaborations. The Asia Pacific region is witnessing the fastest growth, fueled by expanding healthcare infrastructure, rising investments in medical research, and increasing awareness of data privacy. Latin America and the Middle East & Africa are also experiencing gradual adoption, supported by government initiatives and international partnerships aimed at improving healthcare data management and compliance.
The Service Type segment within the Imaging Study De-Identification Services market is categorized into Automated De-Identification, Manual De-Identification, and Hybrid De-Identification. Automated De-Identification services have emerged as the leading segment, owing to their ability to process vast volumes of imaging data efficiently and accurately. These solutions leverage advanced algorithms and artificial intelligence to identify and redact patient identifiers from imaging studies, significantly reducing the risk of human error and ensuring compliance with regulatory standards. The scalability of automated systems makes them particularly attractive for large hospitals, research networks, and organizations handling multi-center studies
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains anonymized DICOM images acquired as part of a cardiac T1 mapping study using a 5T MRI system. All personal identifiers have been removed in compliance with DICOM de-identification standards and institutional ethics approval. The dataset includes pre- and post-contrast MOLLI sequences from healthy volunteers and patients. It is made publicly available for academic and non-commercial research purposes.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This file contains de-identified and anonymized healthcare facility-level raw primary data used in the analysis.
https://dataintelo.com/privacy-and-policy
According to our latest research, the global veterinary image de-identification tools market size reached USD 148.7 million in 2024, reflecting growing adoption of data privacy solutions in veterinary healthcare. The market is expected to expand at a robust CAGR of 13.2% from 2025 to 2033, with the projected market size reaching USD 430.6 million by 2033. This growth is primarily driven by increasing regulatory requirements for data privacy, the proliferation of digital imaging in veterinary diagnostics, and the rising need to facilitate secure data sharing for research and telemedicine applications.
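As a rough check on how a base value, an end value, and a growth rate relate, the compound annual growth rate can be computed directly. This is a minimal sketch: the function names are illustrative, and the choice of nine compounding periods (2024 to 2033) is an assumption, since the report does not state which base year its CAGR uses.

```python
def implied_cagr(start_value: float, end_value: float, periods: int) -> float:
    """Return the compound annual growth rate that links two values."""
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1.0 / periods) - 1.0


def project(start_value: float, rate: float, periods: int) -> float:
    """Compound start_value forward at a constant annual rate."""
    return start_value * (1.0 + rate) ** periods


if __name__ == "__main__":
    # Figures quoted in the text (USD million); the period count is assumed.
    rate = implied_cagr(148.7, 430.6, 9)
    print(f"implied CAGR over 9 periods: {rate:.1%}")
    print(f"148.7 compounded at 13.2% for 9 periods: {project(148.7, 0.132, 9):.1f}")
```

Under a different period count the implied rate changes, so a small gap between the computed and the quoted rate may simply reflect rounding or a different base-year convention.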
The primary growth factor for the veterinary image de-identification tools market is the mounting pressure to comply with data privacy regulations such as the General Data Protection Regulation (GDPR) in Europe and the Health Insurance Portability and Accountability Act (HIPAA) in the United States. Although these regulations are primarily human healthcare-focused, their principles are increasingly being adopted within the veterinary sector, especially as digital health records and imaging become standard practice. Veterinary clinics and hospitals are now required to anonymize or de-identify sensitive image data before sharing it for research, consultation, or educational purposes. This regulatory environment has created a robust demand for advanced de-identification tools that can efficiently strip personally identifiable information (PII) from veterinary images without compromising diagnostic quality.
Another significant driver is the rapid digitization of veterinary healthcare, which has led to a surge in the volume and complexity of veterinary imaging data. Modern diagnostic tools such as digital radiography, computed tomography (CT), and magnetic resonance imaging (MRI) are now commonplace in both small and large animal practices. With the adoption of Picture Archiving and Communication Systems (PACS) and Electronic Medical Records (EMR), the need to manage, store, and share vast amounts of imaging data securely has become paramount. De-identification tools are essential in this context, enabling seamless data interoperability while ensuring that client and patient confidentiality is maintained. Furthermore, these tools are increasingly integrated with cloud-based platforms, facilitating remote consultations and telemedicine, which have seen significant growth post-pandemic.
The market is further propelled by the expanding scope of veterinary research and the globalization of veterinary clinical trials. As collaborations between academic institutions, research organizations, and industry partners intensify, there is a growing need to share large datasets of veterinary images across borders. De-identification tools play a critical role in enabling this data exchange while adhering to diverse regional privacy standards. Additionally, the increasing focus on artificial intelligence (AI) and machine learning in veterinary diagnostics necessitates access to large, anonymized image datasets for algorithm training and validation. This trend is expected to further accelerate the adoption of veterinary image de-identification solutions in the coming years.
From a regional perspective, North America currently dominates the veterinary image de-identification tools market, owing to its advanced veterinary healthcare infrastructure, high adoption of digital technologies, and stringent data privacy regulations. Europe follows closely, driven by proactive regulatory frameworks and a strong focus on veterinary research. The Asia Pacific region is anticipated to witness the fastest growth during the forecast period, fueled by increasing investment in animal healthcare, rapid digitalization, and rising awareness about data security. Latin America and the Middle East & Africa are also expected to experience steady growth, supported by gradually improving veterinary services and growing emphasis on research and development.
The veterinary image de-identification tools market, when analyzed by component, is segmented into software and services. The software segment commands a significant share of the market, as veterinary organizations increasingly rely on automated solutions to anonymize images efficiently and consistently. These software tools are designed to integrate seamlessly with existing imaging modalities and hospital information systems, offering features s
As per our latest research, the veterinary image de-identification tools market size globally stood at USD 245.8 million in 2024. The market is experiencing robust growth, and it is projected to reach USD 783.6 million by 2033, expanding at a remarkable CAGR of 13.5% during the forecast period from 2025 to 2033. The primary growth drivers include the rising adoption of digital imaging technologies in veterinary healthcare, stricter data privacy regulations, and the increasing demand for advanced solutions that ensure compliance and data security in veterinary practices.
The surge in veterinary digital imaging, including X-rays, CT scans, and MRIs, is fundamentally transforming animal healthcare and research. As imaging data becomes more central to diagnostics and research, the need to protect sensitive information—such as client details, animal identifiers, and proprietary research data—has grown exponentially. The implementation of data privacy regulations, such as the General Data Protection Regulation (GDPR) in Europe and similar frameworks in North America and Asia Pacific, is compelling veterinary institutions and research bodies to adopt robust de-identification tools. These tools ensure that personal and sensitive data is removed or anonymized before images are shared for research, collaboration, or educational purposes, thereby mitigating the risk of data breaches and maintaining client trust.
Another significant growth factor for the veterinary image de-identification tools market is the increasing collaboration between veterinary hospitals, research institutes, and diagnostic centers. With a growing trend toward multi-institutional studies and the sharing of large imaging datasets for AI-based diagnostics and clinical research, maintaining data privacy and compliance is paramount. De-identification tools facilitate secure data sharing, enabling organizations to participate in global research initiatives without compromising on privacy or regulatory requirements. This is particularly important as the use of AI and machine learning in veterinary diagnostics accelerates, demanding large volumes of high-quality, de-identified imaging data for algorithm training and validation.
Furthermore, the rapid digitalization of veterinary practices, coupled with the growing awareness of cybersecurity threats, is driving the adoption of advanced de-identification solutions. Veterinary professionals are increasingly recognizing the risks associated with storing and transmitting identifiable image data, especially as telemedicine and remote consultations gain traction. The integration of de-identification tools into Picture Archiving and Communication Systems (PACS) and hospital management software is streamlining workflows, reducing manual errors, and ensuring seamless compliance with evolving regulatory standards. This trend is particularly pronounced in developed markets, where digital infrastructure and regulatory enforcement are more mature, but it is also gaining momentum in emerging economies as they modernize their veterinary healthcare systems.
From a regional perspective, North America continues to dominate the veterinary image de-identification tools market due to its advanced veterinary healthcare infrastructure, high adoption of digital imaging, and stringent data privacy laws. However, Europe is rapidly catching up, driven by strong regulatory frameworks and significant investments in veterinary research and education. The Asia Pacific region, meanwhile, is witnessing the fastest growth, propelled by increasing pet ownership, rising livestock healthcare needs, and the digital transformation of veterinary services. Latin America and the Middle East & Africa are also showing promising potential, although growth in these regions is somewhat tempered by infrastructural and regulatory challenges.
The component segment of the veterinary image
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
The CARMEN-I corpus comprises 2,000 clinical records, encompassing discharge letters, referrals, and radiology reports from Hospital Clínic of Barcelona between March 2020 and March 2022. These reports, primarily in Spanish with some Catalan sections, cover COVID-19 patients with diverse comorbidities like kidney failure, cardiovascular diseases, malignancies, and immunosuppression. The corpus underwent thorough anonymization, validation, and expert annotation, replacing sensitive data with synthetic equivalents. A subset of the corpus features annotations of medical concepts by specialists, encompassing symptoms, diseases, procedures, medications, species, and humans (including family members). CARMEN-I serves as a valuable resource for training and assessing clinical NLP techniques and language models, aiding tasks like de-identification, concept detection, linguistic modifier extraction, document classification, and more. It also facilitates training researchers in clinical NLP and is a collaborative effort involving Barcelona Supercomputing Center's NLP4BIA team, Hospital Clínic, and Universitat de Barcelona's CLiC group.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This updated labeled dataset builds upon the initial systematic review by van de Schoot et al. (2018; DOI: 10.1080/00273171.2017.1412293), which included studies on post-traumatic stress symptom (PTSS) trajectories up to 2016, sourced from the Open Science Framework (OSF). As part of the FORAS project (Framework for PTSS trajectORies: Analysis and Synthesis; funded by the Dutch Research Council, grant no. 406.22.GO.048, and pre-registered at PROSPERO under ID CRD42023494027), we extended this dataset to include publications between 2016 and 2023. In total, the search identified 10,594 de-duplicated records obtained via different search methods, each published with its own search query and result:
Exact replication of the initial search: OSF.IO/QABW3
Comprehensive database search: OSF.IO/D3UV5
Snowballing: OSF.IO/M32TS
Full-text search via Dimensions data: OSF.IO/7EXC5
Semantic search via OpenAlex: OSF.IO/M32TS
Humans (BC, RN) and AI (Bron et al., 2024) screened the records, and disagreements were resolved (MvZ, BG, RvdS). Each record was screened separately for Title, Abstract, and Full-text inclusion and per inclusion criterion. A detailed screening logbook is available at OSF.IO/B9GD3, and the entire process is described in https://doi.org/10.31234/osf.io/p4xm5. A description of all columns/variables and full methodological details is available in the accompanying codebook.
Important Notes
Duplicates: To maintain consistency and transparency, duplicates are left in the dataset and are labeled with the same classification as the original records. A filter is provided to allow users to exclude these duplicates as needed.
Anonymized Data: The dataset "...._anonymous" excludes DOIs, OpenAlex IDs, titles, and abstracts to ensure data anonymization during the review process. The complete dataset, including all identifiers, is uploaded under embargo and will be publicly available on 01-10-2025.
This dataset is a resource both for researchers interested in systematic reviews of PTSS trajectories, for whom it supports reproducibility and transparency in the research process, and for data scientists who want to mimic the screening process using different machine learning and AI models.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains the complete replication package for the manuscript "Digital Nomads as Unintentional Brand Intermediaries". It includes de-identified data, analysis code, study instruments, and supplementary sources needed to reproduce all tables and figures.
Contents
Study 1 (Survey): cleaned, de-identified dataset; codebook; SEM/CFA scripts and output; item wordings and scale anchors.
Study 2 (Experiment): cleaned, de-identified dataset; pre-registration (if applicable); analysis scripts for manipulation checks, ANCOVA, and mediation; instrument and manipulation-check items.
Study 3 (Interviews): anonymized excerpted quotations used in the paper; theme/codebook; audit trail summary (sampling, saturation notes).
Supplementary materials: figure/table source files; robustness checks (controls, multigroup, invariance); README with step-by-step replication instructions and software versions.
Restrictions
Full video stimuli are not redistributed due to third-party rights. We provide transcripts, frame stills, detailed metadata (links, durations, posting dates), and procedures to reconstruct the stimuli set. Researchers may request time-limited access to the files for verification under a non-distribution agreement.
Anonymization & Compliance
All datasets are de-identified and stored per GDPR and institutional ethics approval. Any indirect identifiers were removed or binned.
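Binning an indirect identifier can be sketched as follows. The variable `age`, the 10-year bin width, and the top-coding threshold of 70 are hypothetical choices for illustration; the repository does not state which variables were binned or how.

```python
def bin_age(age: int, width: int = 10, top_code: int = 70) -> str:
    """Map an exact age to a coarse range label, top-coding high values."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age >= top_code:
        # Rare high values are merged into one open-ended category.
        return f"{top_code}+"
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"


# Made-up records, used only to show the transformation.
records = [{"id": 1, "age": 23}, {"id": 2, "age": 47}, {"id": 3, "age": 81}]
binned = [{**record, "age": bin_age(record["age"])} for record in records]

if __name__ == "__main__":
    for record in binned:
        print(record)
```

Coarser bins lower the chance that a rare value singles out one respondent, at the cost of analytical detail.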
Licensing & Citation
Data and materials: CC BY 4.0. Please cite this repository using its DOI (https://doi.org/10.5281/zenodo.17199056) when reusing these materials.
Reproducibility
A replication script (R/Python) reproduces the tables and figures from the raw de-identified data; session info and package versions are provided.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The data used to support the rFSS-II paper was de-identified to protect patient privacy.
To avoid risk of indirect re-identification, data were de-identified by removing any explicitly identifying columns (e.g. contact information, IP address, role/occupation, country, etc.). The dataset was further de-identified by removing all open response questions that might have allowed respondents to include identifying information (e.g. "Other, please specify" open response questions).
Respondents were presented with a short message before starting the survey, which included a “Privacy Policy” section that stated, "The information you submit to the survey will be aggregated and anonymized."
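The column-removal step described above could look roughly like the sketch below. All column names and the `_other_text` suffix for open response fields are hypothetical, since the survey's actual schema is not given here.

```python
import csv
import io

# Hypothetical column names; the real survey schema is not part of the source.
IDENTIFYING_COLUMNS = {"email", "ip_address", "occupation", "country"}
OPEN_RESPONSE_SUFFIX = "_other_text"


def deidentify(rows, drop=IDENTIFYING_COLUMNS, open_suffix=OPEN_RESPONSE_SUFFIX):
    """Yield rows without identifying columns and without open response fields."""
    for row in rows:
        yield {
            key: value
            for key, value in row.items()
            if key not in drop and not key.endswith(open_suffix)
        }


# Made-up example response, held in memory instead of a file.
raw = io.StringIO(
    "email,ip_address,country,q1,q1_other_text,q2\n"
    "a@example.org,203.0.113.5,NZ,Yes,I work at X,3\n"
)
clean = list(deidentify(csv.DictReader(raw)))

if __name__ == "__main__":
    print(clean)
```

Dropping whole columns is the bluntest form of de-identification; it does not by itself address combinations of the remaining answers that could still single out a respondent.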
https://creativecommons.org/publicdomain/zero/1.0/
This CT series comes from the Patient Contributed Image Repository. As per the PCIR website: "medical images and reports have been de-identified and anonymized and contributed to the public domain. There are no restrictions on their use. Anyone may download them. There is no data use agreement required to be signed. No login is necessary."
This particular dataset contains just the sequence of CT scans from series 7 of this file. The intent is to offer a toy dataset to play with sequential DICOM images and plot axial, sagittal, and coronal slices.
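Extracting the three orthogonal views from a stacked series can be sketched without any imaging library. The example assumes the slices have already been read and stacked into a `volume[z][y][x]` array (reading the DICOM files and plotting are not shown), and it uses a small synthetic volume in place of the real scan. Which index corresponds to which anatomical axis depends on the orientation stored in the DICOM headers, so the mapping below is an assumption.

```python
def make_volume(depth, height, width):
    """Build a synthetic volume[z][y][x] whose voxel value encodes its position."""
    return [
        [[100 * z + 10 * y + x for x in range(width)] for y in range(height)]
        for z in range(depth)
    ]


def axial(volume, z):
    """Slice at a fixed z: rows run over y, columns over x."""
    return volume[z]


def coronal(volume, y):
    """Slice at a fixed y: rows run over z, columns over x."""
    return [plane[y] for plane in volume]


def sagittal(volume, x):
    """Slice at a fixed x: rows run over z, columns over y."""
    return [[row[x] for row in plane] for plane in volume]


if __name__ == "__main__":
    vol = make_volume(depth=2, height=3, width=4)
    print("axial:", axial(vol, 1))
    print("coronal:", coronal(vol, 2))
    print("sagittal:", sagittal(vol, 3))
```

With a numerical array library the same three views are plain index expressions on a 3D array; the nested-list version only makes the index order explicit.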
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These three datasets contain de-identified data on testing for pests in imports of horticultural products into Australia in a period within 2021-2023. The creator of this data page is distributing the data with the permission of the data owner (emails 14/6/2024, 25/6/2024, 1/7/2024).
Dataset anonymized_hort_aggdat_01-07-2024.csv
This dataset (anonymized_hort_aggdat_01-07-2024.csv) has one row for each line of fruit or vegetables tested. Consignments of fruits or vegetables are divided into lines (details may depend on the type of fruit or vegetable). 600 units are sampled from each line, where a unit is usually a single fruit or vegetable (rounding may occur, for example if fruit are grouped into punnets). A result is then obtained from each line ("inspection result"). If the result is not Pass, then fumigation or other actions may be taken. The columns of the data are:
| Variable Name | Values | Definition |
| entry | ANONYMIZED_VALUE1, ANONYMIZED_VALUE2, etc | anonymised identifier of the consignment |
| volume | numeric | volume of the line |
| volume_unit | KG – kilograms | units in which volume is measured (almost always kg) |
| arrival_date | date | |
| importer_name | ANONYMIZED_VALUE_1, ANONYMIZED_VALUE2 etc | anonymised identifier of the importer |
| supplier_name | ANONYMIZED_VALUE_1, ANONYMIZED_VALUE2 etc | anonymised identifier of the supplier |
| cargo_type | | the freight type of the consignment (e.g., FCL and FCX are container types via sea and AIR is air freight) |
| port | character valued code | destination port of the consignment/entry |
| country | ANONYMIZED_VALUE_1, ANONYMIZED_VALUE2 etc | anonymised country of origin |
| finalise_type | | whether the line was released as normal, from biosecurity control, disposed of, destroyed or exported |
| document_failure | Pass, Fail | whether a failure was recorded against a line at onshore document verification. Note: A fail followed by a pass, with goods moving to inspection, will display fail. |
| inspection_result | Pass, Fail | whether a failure was recorded against a line at onshore verification inspection. Note: A fail followed by a pass, with goods being released, will display fail. Lines that qualified for the Compliance-Based Intervention Scheme (CBIS) may not have been inspected as a result. See here for more information about CBIS. |
| fumigated | Not fumigated, Fumigated | whether the line was fumigated |
| other_treatment | character | other remedial treatment applied to the line/entry (reconditioning for seeds) |
| cbis_commodity | Fresh CBIS, Other | "Fresh CBIS" means that the line qualified for the Compliance-Based Intervention Scheme (CBIS) and may not have been inspected as a result. "Other" means that the line did not qualify for CBIS. See here for more information about CBIS. |
| actionable | | Where the department's Science Services Group has determined that detected biosecurity risk material requires remedial action to mitigate biosecurity risk. Note: Seeds are only actioned if a high-risk weed seed is detected or where 3 or more species of biosecurity concern are identified. |
| commodity | character | commodity description |
| rcd_nbr | 1, 2, 3 etc | anonymised identifier of line |
Dataset anonymized_hort_pests_01-07-2024.csv
This dataset contains one row for each pest detection. Note that not all pest detections require action. It may be linked to anonymized_hort_aggdat_01-07-2024.csv using rcd_nbr as a key. The columns of the data are:
| Variable Name | Values | Definition |
| rcd_nbr | 1, 2, 3 etc | anonymised identifier of line |
| bottle_number | numeric | identifier for a particular pest for a particular line |
| pest_type | Disease, Invertebrate, Plant, Seed, Vertebrate, Na, blank | type of potential pest |
Dataset anonymized_hort_seeds_incidents_01-07-2024.csv
This dataset contains one row for each seed detection. Note that not all seed detections require action. It may be linked to anonymized_hort_aggdat_01-07-2024.csv using rcd_nbr as a key and to anonymized_hort_pests_01-07-2024.csv using bottle_number as a key. The columns of the data are:
| Variable Name | Values | Definition |
| rcd_nbr | 1, 2, 3 etc | anonymised identifier of line |
| bottle_number | numeric | identifier for a particular pest for a particular line |
| pest_type | Disease, Invertebrate, Plant, Seed, Vertebrate, Na, blank | type of potential pest (always equal to Seed in this spreadsheet) |
| comments | text field | comments |
| other_treatment | Reconditioned, or blank | other treatments applied |
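The linkage between the three files can be sketched with the standard library alone. Only the column names come from the tables above; the rows are made up for illustration. The sketch also assumes that bottle_number may repeat across lines and therefore matches seed incidents on the pair (rcd_nbr, bottle_number), which is stricter than using bottle_number alone.

```python
import csv
import io
from collections import defaultdict

# Tiny made-up extracts standing in for the three CSV files.
AGGDAT = "rcd_nbr,commodity,inspection_result\n1,Citrus,Fail\n2,Berries,Pass\n"
PESTS = "rcd_nbr,bottle_number,pest_type\n1,501,Seed\n1,502,Invertebrate\n"
SEEDS = "rcd_nbr,bottle_number,pest_type,comments\n1,501,Seed,example comment\n"


def read(text):
    """Parse CSV text into a list of dict rows (all values stay strings)."""
    return list(csv.DictReader(io.StringIO(text)))


def link(lines, pests, seeds):
    """Attach pest detections to each line, and seed incidents to each pest."""
    seeds_by_key = {(s["rcd_nbr"], s["bottle_number"]): s for s in seeds}
    pests_by_line = defaultdict(list)
    for pest in pests:
        key = (pest["rcd_nbr"], pest["bottle_number"])
        pests_by_line[pest["rcd_nbr"]].append(
            {**pest, "seed_incident": seeds_by_key.get(key)}
        )
    # Lines without detections get an empty list, like a left join.
    return [{**line, "pests": pests_by_line.get(line["rcd_nbr"], [])} for line in lines]


linked = link(read(AGGDAT), read(PESTS), read(SEEDS))

if __name__ == "__main__":
    for line in linked:
        print(line["rcd_nbr"], line["commodity"], len(line["pests"]), "detection(s)")
```

Keeping lines with no detections in the result preserves the denominator needed for any detection-rate calculation.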