23 datasets found
  1. h

    Anonymize or Synthesize? – Privacy-Preserving Methods for Heart Failure...

    • heidata.uni-heidelberg.de
    pdf, tsv, txt
    Updated Nov 20, 2024
    Cite
    Tim Ingo Johann; Karen Otte; Fabian Prasser; Christoph Dieterich (2024). Anonymize or Synthesize? – Privacy-Preserving Methods for Heart Failure Score Analytics [data] [Dataset]. http://doi.org/10.11588/DATA/MXM0Q2
    Explore at:
    Available download formats: tsv(197975), tsv(190296), tsv(191831), pdf(640128), tsv(107100), txt(3421), tsv(286102), tsv(106632)
    Dataset updated
    Nov 20, 2024
    Dataset provided by
    heiDATA
    Authors
    Tim Ingo Johann; Karen Otte; Fabian Prasser; Christoph Dieterich
    License

    https://heidata.uni-heidelberg.de/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.11588/DATA/MXM0Q2

    Description

    In the publication [1] we implemented anonymization and synthetization techniques for a structured data set, which was collected during the HiGHmed Use Case Cardiology study [2]. We employed the data anonymization tool ARX [3] and the data synthetization framework ASyH [4] individually and in combination. We evaluated the utility and shortcomings of the different approaches by statistical analyses and privacy risk assessments. Data utility was assessed by computing two heart failure risk scores (Barcelona BioHF [5] and MAGGIC [6]) on the protected data sets. We observed only minimal deviations from the scores computed on the original data set. Additionally, we performed a re-identification risk analysis and found only minor residual risks for common types of privacy threats. We could demonstrate that anonymization and synthetization methods protect privacy while retaining data utility for heart failure risk assessment. Both approaches and a combination thereof introduce only minimal deviations from the original data set over all features. While data synthesis techniques produce any number of new records, data anonymization techniques offer more formal privacy guarantees. Consequently, data synthesis on anonymized data further enhances privacy protection with little impact on data utility. We hereby share all generated data sets with the scientific community through a use and access agreement.

    [1] Johann TI, Otte K, Prasser F, Dieterich C. Anonymize or synthesize? Privacy-preserving methods for heart failure score analytics. Eur Heart J 2024. doi:10.1093/ehjdh/ztae083
    [2] Sommer KK, Amr A, Bavendiek, Beierle F, Brunecker P, Dathe H et al. Structured, harmonized, and interoperable integration of clinical routine data to compute heart failure risk scores. Life (Basel) 2022;12:749.
    [3] Prasser F, Eicher J, Spengler H, Bild R, Kuhn KA. Flexible data anonymization using ARX – current status and challenges ahead. Softw Pract Exper 2020;50:1277–1304.
    [4] Johann TI, Wilhelmi H. ASyH – anonymous synthesizer for health data, GitHub, 2023. Available at: https://github.com/dieterich-lab/ASyH.
    [5] Lupón J, de Antonio M, Vila J, Peñafiel J, Galán A, Zamora E, et al. Development of a novel heart failure risk tool: the Barcelona bio-heart failure risk calculator (BCN Bio-HF calculator). PLoS One 2014;9:e85466.
    [6] Pocock SJ, Ariti CA, McMurray JJV, Maggioni A, Køber L, Squire IB, et al. Predicting survival in heart failure: a risk score based on 39 372 patients from 30 studies. Eur Heart J 2013;34:1404–1413.

  2. D

    De-identified Healthcare Data Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). De-identified Healthcare Data Market Research Report 2033 [Dataset]. https://dataintelo.com/report/de-identified-healthcare-data-market
    Explore at:
    Available download formats: csv, pdf, pptx
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    De-identified Healthcare Data Market Outlook




    According to our latest research, the global de-identified healthcare data market size reached USD 3.4 billion in 2024. The market is expanding at a robust CAGR of 15.2% and is forecasted to attain a value of USD 10.9 billion by 2033. This remarkable growth is primarily driven by the increasing demand for privacy-compliant data solutions that enable research, analytics, and innovation without compromising patient confidentiality. The adoption of stringent data privacy regulations and the rapid digitization of healthcare records are further fueling the market’s momentum.
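The relationship between a starting market size, a CAGR, and a forecast value can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and it assumes nine annual compounding periods (2024 to 2033), which the report does not state. Because the base year and compounding convention are not given, the quoted figures need not reconcile exactly under this assumption.

```python
def project_value(start: float, cagr: float, years: int) -> float:
    """Compound `start` forward by `cagr` (a fraction, e.g. 0.152) for `years` periods."""
    return start * (1.0 + cagr) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Return the annual growth rate that takes `start` to `end` in `years` periods."""
    if start <= 0 or end <= 0 or years <= 0:
        raise ValueError("start, end, and years must all be positive")
    return (end / start) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    # Figures quoted in the listing: USD 3.4 billion (2024), USD 10.9 billion (2033).
    # Assumption (not stated in the report): nine compounding periods.
    start, end, years = 3.4, 10.9, 9
    print(f"implied CAGR: {implied_cagr(start, end, years):.1%}")
    print(f"value after {years} years at 15.2%: {project_value(start, 0.152, years):.2f}")
```

Running both directions side by side makes it easy to see how sensitive a forecast is to the assumed number of periods.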




    One of the primary growth factors for the de-identified healthcare data market is the rising emphasis on patient privacy and security. The implementation of regulations such as the Health Insurance Portability and Accountability Act (HIPAA) in the United States and the General Data Protection Regulation (GDPR) in Europe has necessitated robust data de-identification processes. These regulations mandate the removal of personally identifiable information from healthcare datasets, making de-identified data a critical resource for organizations aiming to comply with legal requirements while still leveraging valuable insights for research and analytics. As healthcare organizations increasingly digitize patient records and data sharing becomes more prevalent, the demand for effective de-identification solutions continues to surge, driving market growth.




    Another significant driver is the exponential growth in healthcare data volume, propelled by the widespread adoption of electronic health records (EHRs), wearable devices, and genomics. The sheer scale and diversity of healthcare data present both opportunities and challenges for healthcare stakeholders. De-identified data allows organizations to harness this vast information pool for applications such as clinical research, drug development, population health management, and artificial intelligence (AI) model training. Pharmaceutical and biotechnology companies, in particular, are leveraging de-identified datasets to accelerate drug discovery, optimize clinical trials, and identify patient cohorts, thereby shortening development timelines and reducing costs. This trend is expected to intensify as precision medicine and data-driven healthcare models gain traction globally.




    Technological advancements are also playing a pivotal role in shaping the de-identified healthcare data market. The emergence of sophisticated de-identification software, advanced encryption algorithms, and secure data sharing platforms has enhanced the ability of organizations to anonymize and utilize healthcare data effectively. Artificial intelligence and machine learning tools are being increasingly deployed to automate the de-identification process, improving scalability and accuracy. Furthermore, partnerships between healthcare providers, technology vendors, and research institutions are fostering innovation and facilitating the adoption of best practices in data privacy. As these technologies continue to evolve, they are expected to lower operational barriers and expand the market’s reach across various healthcare segments.




    From a regional perspective, North America holds the largest share of the de-identified healthcare data market, accounting for over 42% of global revenue in 2024. This dominance is attributed to the region’s advanced healthcare infrastructure, strong regulatory framework, and high adoption of digital health technologies. Europe follows closely, driven by stringent data privacy laws and robust investments in healthcare IT. The Asia Pacific region is emerging as a high-growth market, propelled by rapid digital transformation, increasing healthcare expenditure, and growing awareness of data privacy issues. Latin America and the Middle East & Africa are also witnessing steady growth, albeit from a smaller base, as governments and healthcare organizations prioritize data-driven healthcare initiatives.



    Component Analysis




    The de-identified healthcare data market by component is segmented into software, services, and platforms. Software solutions form the backbone of the market, providing automated tools for data masking, anonymization, and encryption. These solutions are in high demand due to their ability to efficiently process vast volumes of healthcare data while ensuring compliance with regulatory standards. ...

  3. D

    De-Identification Software For Healthcare Data Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). De-Identification Software For Healthcare Data Market Research Report 2033 [Dataset]. https://dataintelo.com/report/de-identification-software-for-healthcare-data-market
    Explore at:
    Available download formats: pptx, csv, pdf
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    De-Identification Software for Healthcare Data Market Outlook



    According to our latest research, the global market size for De-Identification Software for Healthcare Data in 2024 stands at USD 468 million, with a robust compound annual growth rate (CAGR) of 20.1% projected from 2025 to 2033. By the end of 2033, the market is forecasted to reach USD 2,633 million, reflecting substantial momentum driven by increasing regulatory demands and the proliferation of digital health records. The primary growth driver for this sector is the intensifying focus on patient privacy and security in healthcare data management, propelled by global data protection regulations and the expanding adoption of electronic health records (EHRs).




    The growth trajectory of the De-Identification Software for Healthcare Data Market is significantly influenced by the evolving regulatory landscape governing patient information privacy. Stringent regulations such as the Health Insurance Portability and Accountability Act (HIPAA) in the United States, the General Data Protection Regulation (GDPR) in Europe, and similar frameworks globally are compelling healthcare organizations to invest in advanced de-identification solutions. These regulations mandate the removal or masking of personally identifiable information (PII) from healthcare datasets before sharing, research, or analytics, to safeguard patient privacy. As healthcare data becomes increasingly digitized, the risk of data breaches and unauthorized access grows, making robust de-identification software not just a compliance tool but a critical component of risk management strategies for healthcare providers, payers, and researchers.




    Another significant growth factor is the rising volume and complexity of healthcare data generated through diverse sources such as EHRs, wearables, genomic sequencing, and telemedicine platforms. The integration of artificial intelligence (AI) and machine learning (ML) technologies into de-identification software has enabled more sophisticated and automated data anonymization processes, reducing manual intervention and improving accuracy. This technological advancement allows for the secure sharing of large-scale clinical and genomic datasets, which is crucial for collaborative research, population health analytics, and the development of personalized medicine. As the demand for interoperability and data exchange across healthcare ecosystems intensifies, scalable and automated de-identification solutions are becoming indispensable.




    The market is further propelled by the expanding use of healthcare data for secondary purposes such as clinical research, public health monitoring, and healthcare analytics. Pharmaceutical companies, research organizations, and health insurers increasingly require access to de-identified datasets to derive insights, improve patient outcomes, and streamline operations without compromising privacy. The growing trend of data monetization and the emergence of health data marketplaces are also fueling the adoption of de-identification software, as organizations seek to unlock the value of their data assets while adhering to ethical and legal standards. These factors collectively create a fertile environment for sustained market growth over the forecast period.




    Regionally, North America continues to dominate the De-Identification Software for Healthcare Data Market, accounting for the largest share in 2024, followed by Europe and Asia Pacific. The high adoption rate of EHRs, advanced healthcare IT infrastructure, and the presence of leading market players in the United States and Canada underpin this leadership. Europe’s market is bolstered by GDPR compliance requirements and growing investments in digital health innovation, while Asia Pacific is witnessing rapid growth due to increasing healthcare digitization and a rising awareness of data privacy. Latin America and the Middle East & Africa are gradually emerging as promising markets, driven by healthcare modernization initiatives and evolving regulatory frameworks.



    Component Analysis



    The Component segment of the De-Identification Software for Healthcare Data Market is broadly categorized into Software and Services. The software segment holds the lion’s share of the market, primarily due to the growing need for automated ...

  4. G

    De-Identification Software for Healthcare Data Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Sep 1, 2025
    Cite
    Growth Market Reports (2025). De-Identification Software for Healthcare Data Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/de-identification-software-for-healthcare-data-market
    Explore at:
    Available download formats: csv, pdf, pptx
    Dataset updated
    Sep 1, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    De-Identification Software for Healthcare Data Market Outlook



    According to our latest research, the global De-Identification Software for Healthcare Data market size reached USD 410 million in 2024, reflecting a robust surge in demand for data privacy and compliance solutions. The market is projected to expand at a CAGR of 17.2% from 2025 to 2033, reaching an estimated USD 1,444 million by 2033. This significant growth is primarily driven by escalating regulatory requirements, increasing incidences of data breaches, and the proliferation of digital health data across healthcare systems worldwide.



    One of the primary growth factors for the De-Identification Software for Healthcare Data market is the tightening of data privacy regulations such as HIPAA in the United States, GDPR in Europe, and similar frameworks in other regions. These legislations mandate stringent procedures for handling personally identifiable information (PII) and protected health information (PHI), compelling healthcare organizations to adopt advanced de-identification solutions. As healthcare providers, payers, and research entities increasingly digitize patient records, the risk of data exposure intensifies, making robust de-identification tools indispensable for compliance and risk mitigation. Furthermore, the growing awareness among healthcare professionals and administrators regarding the consequences of non-compliance, including hefty fines and reputational damage, is accelerating the adoption of these solutions.



    Another critical driver is the exponential growth of healthcare data generated from electronic health records (EHRs), wearable devices, telemedicine platforms, and genomic studies. The sheer volume and complexity of this data necessitate sophisticated de-identification software capable of processing both structured and unstructured information. The demand is further amplified by the surge in collaborative research, clinical trials, and data sharing initiatives, which require the anonymization of patient data to protect privacy while enabling valuable insights. As artificial intelligence and machine learning applications become more prevalent in healthcare, the need for high-quality, de-identified datasets is also rising, fostering further market expansion.



    Additionally, the rise in cyber threats and high-profile data breaches within the healthcare sector have underscored the urgent need for comprehensive data protection strategies. Healthcare organizations are increasingly prioritizing investments in de-identification software to safeguard sensitive patient information from unauthorized access and malicious actors. This trend is supported by the growing involvement of insurance companies and research organizations, which handle vast amounts of patient data and are equally vulnerable to breaches. The convergence of these factors is expected to sustain the momentum of the De-Identification Software for Healthcare Data market over the forecast period.



    From a regional perspective, North America continues to dominate the market, accounting for the largest share in 2024, driven by robust healthcare infrastructure, early adoption of advanced technologies, and strict regulatory frameworks. However, Asia Pacific is emerging as the fastest-growing region, fueled by rapid digitization of healthcare systems, increasing investments in health IT, and rising awareness of data privacy. Europe, with its comprehensive data protection laws, also represents a significant market, while Latin America and the Middle East & Africa are gradually catching up as healthcare modernization accelerates in these regions. The global landscape is thus characterized by both mature and emerging markets, each contributing to the overall growth trajectory.



    Data Loss Prevention in Healthcare is becoming increasingly crucial as the industry continues to digitize and expand its data management capabilities. With the rise of electronic health records, telemedicine, and wearable health devices, the volume of sensitive patient information being handled by healthcare organizations has skyrocketed. This surge in data has made the sector a prime target for cyberattacks, emphasizing the need for robust data loss prevention strategies. Healthcare providers are now investing in advanced technologies and protocols to protect patient data from unauthorized access and ...

  5. G

    Clinical Data De-Identification Pipelines Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Oct 7, 2025
    Cite
    Growth Market Reports (2025). Clinical Data De-Identification Pipelines Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/clinical-data-de-identification-pipelines-market
    Explore at:
    Available download formats: pdf, csv, pptx
    Dataset updated
    Oct 7, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Clinical Data De-Identification Pipelines Market Outlook



    According to our latest research, the global clinical data de-identification pipelines market size reached USD 680 million in 2024, with a robust growth trajectory driven by stringent data privacy regulations and the increasing adoption of digital health records. The market is expected to expand at a CAGR of 15.6% from 2025 to 2033, with the forecasted market size projected to reach USD 2.1 billion by 2033. This growth is primarily attributed to the rising emphasis on patient data security, the proliferation of healthcare data, and the need to facilitate compliant data sharing for research and analytics.




    The rapid digitalization of healthcare systems worldwide has resulted in an unprecedented surge in electronic health records (EHRs), clinical trial data, and patient registries. As healthcare organizations increasingly leverage these vast datasets for research, analytics, and population health management, the risk of data breaches and unauthorized disclosures has escalated. This scenario has intensified the demand for robust clinical data de-identification pipelines, which ensure that personally identifiable information (PII) is systematically removed or masked before data is shared or analyzed. Regulatory frameworks such as HIPAA in the United States, GDPR in Europe, and similar mandates in other regions have made de-identification not just a best practice but a legal requirement, further propelling the adoption of advanced software and services in this market.




    Another significant growth driver for the clinical data de-identification pipelines market is the expanding landscape of clinical research and precision medicine. Pharmaceutical and biotechnology companies, as well as academic and research institutes, are increasingly reliant on large-scale, multi-source datasets to accelerate drug discovery, understand disease mechanisms, and personalize treatment protocols. However, these research initiatives necessitate stringent privacy safeguards to maintain patient confidentiality while enabling meaningful data analysis. The integration of artificial intelligence (AI) and machine learning (ML) technologies into de-identification pipelines has enhanced the accuracy and efficiency of data anonymization processes, thereby supporting the dual objectives of compliance and research innovation.




    Strategic partnerships and collaborations among healthcare providers, technology vendors, and research organizations have also played a pivotal role in shaping the clinical data de-identification pipelines market. Leading technology firms are investing in the development of scalable, interoperable solutions that can seamlessly integrate with existing healthcare IT infrastructure. Moreover, the emergence of cloud-based deployment models has made de-identification solutions more accessible to smaller healthcare entities and research organizations, democratizing access to advanced privacy tools. This trend is particularly pronounced in regions with rapidly evolving healthcare ecosystems, such as Asia Pacific and Latin America, where digital health initiatives are gaining momentum.




    From a regional perspective, North America continues to dominate the clinical data de-identification pipelines market, accounting for the largest revenue share in 2024. This leadership is underpinned by the presence of a mature healthcare IT infrastructure, strong regulatory oversight, and significant investments in clinical research. Europe follows closely, benefiting from stringent data protection laws and a vibrant research community. Meanwhile, Asia Pacific is emerging as the fastest-growing market, fueled by large-scale government initiatives to digitize healthcare, rising awareness about patient privacy, and the increasing participation of regional players in global clinical research networks. Latin America and the Middle East & Africa are also witnessing steady growth, albeit from a smaller base, as healthcare modernization efforts gather pace.





    Component Analysis



  6. G

    K-Anonymity Tools for Public Datasets Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Oct 4, 2025
    Cite
    Growth Market Reports (2025). K-Anonymity Tools for Public Datasets Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/k-anonymity-tools-for-public-datasets-market
    Explore at:
    Available download formats: pdf, csv, pptx
    Dataset updated
    Oct 4, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    K-Anonymity Tools for Public Datasets Market Outlook




    According to our latest research, the global K-Anonymity Tools for Public Datasets market size reached USD 1.14 billion in 2024, reflecting the growing necessity for robust privacy solutions across industries. The market is experiencing a strong expansion, registering a CAGR of 18.7% from 2025 to 2033. By 2033, the market is anticipated to reach a value of USD 6.32 billion, driven by increasing regulatory pressures, growing data volumes, and heightened awareness of data privacy. This growth is underpinned by the widespread adoption of K-anonymity tools in sectors handling sensitive public datasets, where data de-identification and privacy preservation are paramount.




    One of the primary growth factors fueling the K-Anonymity Tools for Public Datasets market is the global surge in data privacy regulations such as GDPR, CCPA, and HIPAA. Organizations are now compelled to implement advanced anonymization techniques to ensure compliance with these stringent policies. K-anonymity, which guarantees that each record is indistinguishable from at least k-1 other records with respect to its quasi-identifying attributes, has emerged as a preferred solution for public dataset anonymization. The proliferation of massive datasets in healthcare, government, and research sectors further amplifies the demand for scalable and efficient anonymization tools. As data breaches and privacy violations continue to make headlines, enterprises are proactively investing in K-anonymity tools to mitigate reputational and financial risks, thereby propelling market growth.
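The k-anonymity property mentioned above can be measured directly: group the records by their quasi-identifier values and take the size of the smallest group. The following is a minimal sketch, not the method of any tool named in this listing. The function name, column names, and toy records are hypothetical and invented purely for illustration.

```python
from collections import Counter
from typing import Iterable, Mapping, Sequence


def k_anonymity(records: Iterable[Mapping[str, object]],
                quasi_identifiers: Sequence[str]) -> int:
    """Return the size of the smallest group of records that share the same
    quasi-identifier values (0 for an empty table)."""
    groups = Counter(
        tuple(record[column] for column in quasi_identifiers)
        for record in records
    )
    return min(groups.values(), default=0)


# Hypothetical, already-generalized toy records (not from any dataset listed here).
TOY_RECORDS = [
    {"age_band": "60-69", "zip3": "691", "diagnosis": "heart failure"},
    {"age_band": "60-69", "zip3": "691", "diagnosis": "hypertension"},
    {"age_band": "70-79", "zip3": "691", "diagnosis": "heart failure"},
    {"age_band": "70-79", "zip3": "691", "diagnosis": "diabetes"},
    {"age_band": "70-79", "zip3": "691", "diagnosis": "heart failure"},
]

if __name__ == "__main__":
    k = k_anonymity(TOY_RECORDS, ["age_band", "zip3"])
    print(f"table is {k}-anonymous on (age_band, zip3)")
```

Which columns count as quasi-identifiers is a modelling decision; treating more columns as quasi-identifiers can only keep k the same or lower it.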




    Technological advancements and the integration of artificial intelligence and machine learning with K-anonymity tools are also significant growth drivers. Modern K-anonymity solutions now offer automated risk assessment, real-time anonymization, and customizable privacy thresholds, making them more adaptable to diverse organizational needs. The rising adoption of cloud-based solutions has further democratized access to sophisticated privacy tools, enabling small and medium enterprises to leverage K-anonymity without substantial capital outlays. Additionally, the growing trend of data sharing for research and analytics—especially in healthcare and academia—necessitates robust anonymization to protect individual identities while preserving data utility. This evolution of capabilities and accessibility is expected to sustain the market's upward trajectory.




    Another crucial factor is the increasing collaboration between public and private sectors in data-driven initiatives. Governments are opening public datasets for research, innovation, and policy-making, but such initiatives come with heightened privacy concerns. K-anonymity tools provide a practical solution for balancing transparency and privacy in open data programs. The market is also witnessing substantial investments from venture capitalists and technology giants, further accelerating innovation and adoption. The convergence of privacy technology with broader digital transformation initiatives ensures that K-anonymity tools remain at the forefront of enterprise data governance strategies. As organizations prioritize ethical data use and responsible AI, the relevance and demand for these tools are set to intensify.




    Regionally, North America leads the K-Anonymity Tools for Public Datasets market, accounting for the largest revenue share in 2024, followed by Europe and Asia Pacific. The dominance of North America can be attributed to robust regulatory frameworks, high technology adoption rates, and the presence of major market players. Europe’s growth is propelled by strict data protection laws and widespread digitalization across sectors. Asia Pacific is rapidly emerging as a high-growth region, driven by expanding IT infrastructure, increasing digital health initiatives, and rising awareness of data privacy. Latin America and Middle East & Africa are also showing promising growth, albeit from a smaller base, as governments and enterprises in these regions gradually adopt data privacy best practices.






  7. f

    DataSheet_1_Segmentation stability of human head and neck cancer medical...

    • frontiersin.figshare.com
    pdf
    Updated Jun 21, 2023
    Cite
    Jaakko Sahlsten; Kareem A. Wahid; Enrico Glerean; Joel Jaskari; Mohamed A. Naser; Renjie He; Benjamin H. Kann; Antti Mäkitie; Clifton D. Fuller; Kimmo Kaski (2023). DataSheet_1_Segmentation stability of human head and neck cancer medical images for radiotherapy applications under de-identification conditions: Benchmarking data sharing and artificial intelligence use-cases.pdf [Dataset]. http://doi.org/10.3389/fonc.2023.1120392.s001
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    Frontiers
    Authors
    Jaakko Sahlsten; Kareem A. Wahid; Enrico Glerean; Joel Jaskari; Mohamed A. Naser; Renjie He; Benjamin H. Kann; Antti Mäkitie; Clifton D. Fuller; Kimmo Kaski
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Demand for head and neck cancer (HNC) radiotherapy data in algorithmic development has prompted increased image dataset sharing. Medical images must comply with data protection requirements so that re-use is enabled without disclosing patient identifiers. Defacing, i.e., the removal of facial features from images, is often considered a reasonable compromise between data protection and re-usability for neuroimaging data. While defacing tools have been developed by the neuroimaging community, their acceptability for radiotherapy applications has not been explored. Therefore, this study systematically investigated the impact of available defacing algorithms on HNC organs at risk (OARs).

    Methods: A publicly available dataset of magnetic resonance imaging scans for 55 HNC patients with eight segmented OARs (bilateral submandibular glands, parotid glands, level II neck lymph nodes, level III neck lymph nodes) was utilized. Eight publicly available defacing algorithms were investigated: afni_refacer, DeepDefacer, defacer, fsl_deface, mask_face, mri_deface, pydeface, and quickshear. Using a subset of scans where defacing succeeded (N=29), a 5-fold cross-validation 3D U-net based OAR auto-segmentation model was utilized to perform two main experiments: (1) comparing original and defaced data for training when evaluated on original data; (2) using original data for training and comparing the model evaluation on original and defaced data. Models were primarily assessed using the Dice similarity coefficient (DSC).

    Results: Most defacing methods were unable to produce any usable images for evaluation, while mask_face, fsl_deface, and pydeface were unable to remove the face for 29%, 18%, and 24% of subjects, respectively. When using the original data for evaluation, the composite OAR DSC was significantly higher (p ≤ 0.05) for the model trained with the original data, with a DSC of 0.760, compared to the mask_face, fsl_deface, and pydeface models with DSCs of 0.742, 0.736, and 0.449, respectively. Moreover, the model trained with original data had decreased performance (p ≤ 0.05) when evaluated on the defaced data, with DSCs of 0.673, 0.693, and 0.406 for mask_face, fsl_deface, and pydeface, respectively.

    Conclusion: Defacing algorithms may have a significant impact on HNC OAR auto-segmentation model training and testing. This work highlights the need for further development of HNC-specific image anonymization methods.
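
For readers unfamiliar with the metric reported above, the Dice similarity coefficient compares a predicted mask with a reference mask as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch only, not the study's code: the function name `dice_coefficient`, the flat 0/1 list representation, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration. A real 3D segmentation would first be flattened into this form.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient for two equal-length binary masks.

    `pred` and `truth` are flat sequences of 0/1 values (hypothetical
    representation; a 3D volume would be flattened beforehand).
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")

    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Total foreground voxels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)

    if total == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 1, 0]
    print(dice_coefficient(predicted, reference))
```

A score of 1 means the two masks overlap exactly and 0 means they share no foreground voxels, which is why the lower DSCs reported for models trained or evaluated on defaced data indicate degraded segmentation.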

  8. De-Identification Solutions For Medical Images Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). De-Identification Solutions For Medical Images Market Research Report 2033 [Dataset]. https://dataintelo.com/report/de-identification-solutions-for-medical-images-market
    Available download formats: csv, pdf, pptx
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    De-Identification Solutions for Medical Images Market Outlook




    According to our latest research, the global De-Identification Solutions for Medical Images market size was valued at USD 425.8 million in 2024, with a robust growth trajectory projected at a CAGR of 13.6% from 2025 to 2033. By the end of 2033, the market is anticipated to reach USD 1,314.7 million. This remarkable expansion is primarily fueled by the increasing adoption of advanced imaging technologies in healthcare, stringent regulatory mandates for patient data privacy, and the rising prevalence of medical imaging data in clinical research and diagnostics. As per our latest research, the market is witnessing a dynamic shift towards cloud-based and AI-powered de-identification solutions, enabling healthcare organizations to meet compliance requirements while fostering innovation in medical imaging analytics.




    One of the foremost growth drivers for the De-Identification Solutions for Medical Images market is the exponential rise in digital healthcare data, particularly from radiology, pathology, and cardiology departments. The proliferation of high-resolution imaging modalities such as MRI, CT, and PET scans has resulted in massive data volumes that require secure handling and anonymization. Healthcare providers and research organizations are increasingly recognizing the importance of de-identification to protect patient privacy, comply with regulations such as HIPAA, GDPR, and local data protection laws, and enable the secondary use of medical images for research, AI training, and collaborative studies. This trend is further amplified by the growing integration of electronic health records (EHRs) with imaging systems, necessitating robust and scalable de-identification solutions to mitigate the risk of data breaches and unauthorized disclosures.




    Another significant factor propelling market growth is the rapid advancement of artificial intelligence and machine learning algorithms in the field of medical imaging. AI-driven de-identification tools are now capable of automating the anonymization process with high accuracy, reducing manual intervention, and ensuring consistent compliance with regulatory standards. These solutions not only streamline workflow efficiency but also enhance data utility for research and innovation. The increasing adoption of cloud-based platforms is further supporting the deployment of scalable de-identification services, enabling healthcare organizations to process and share large datasets seamlessly while maintaining stringent data privacy controls. This technological evolution is also facilitating the participation of smaller healthcare facilities and research institutes in global data-sharing initiatives, thereby broadening the market base.




    The surge in clinical trials, multi-center research collaborations, and the emergence of precision medicine are also contributing to the robust demand for de-identification solutions for medical images. Pharmaceutical companies, contract research organizations (CROs), and academic institutes are increasingly leveraging de-identified imaging datasets to accelerate drug discovery, validate diagnostic algorithms, and conduct population health studies. The emphasis on interoperability and data standardization across healthcare systems is driving the adoption of sophisticated de-identification tools that can support multiple imaging formats and workflows. Furthermore, the COVID-19 pandemic has underscored the importance of secure data sharing for public health research, further catalyzing investments in advanced de-identification technologies.




    From a regional perspective, North America continues to dominate the De-Identification Solutions for Medical Images market, accounting for the largest revenue share in 2024, followed by Europe and Asia Pacific. The presence of a well-established healthcare infrastructure, stringent regulatory oversight, and a high concentration of leading market players are key factors supporting market leadership in North America. Meanwhile, Asia Pacific is witnessing the fastest growth, driven by rapid digitalization of healthcare, increasing investments in medical imaging, and rising awareness of data privacy. Europe remains a significant market owing to robust data protection regulations and a strong focus on research and innovation. Latin America and the Middle East & Africa are gradually emerging as promising markets, supported by healthcare modernization initiatives and growing participation in global health research networks.


  9. Hospital's Dataset for Various Diseases

    • kaggle.com
    zip
    Updated Jan 21, 2024
    Cite
    Ali Hassan (2024). Hospital's Dataset for Various Diseases [Dataset]. https://www.kaggle.com/datasets/deathriderjr/hospitals-dataset-for-various-diseases
    Available download formats: zip (2936 bytes)
    Dataset updated
    Jan 21, 2024
    Authors
    Ali Hassan
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    The Comprehensive Health Monitoring Dataset is a rich and diverse collection of health-related information designed for use in a wide range of research purposes. This dataset encompasses a variety of health indicators, providing valuable insights into the overall well-being of individuals. The dataset is meticulously curated, featuring a set of key columns that cover various aspects of health, including symptoms, vital signs, and medical parameters.

    Key Columns:

    Cough: Binary indicator (0 or 1) representing the presence or absence of cough symptoms.

    Fever: Binary variable indicating the presence or absence of fever symptoms.

    Difficulty Breathing: Binary measure (0 or 1) denoting the occurrence of difficulty in breathing.

    Blood Pressure: Continuous numerical values representing blood pressure readings, capturing both systolic and diastolic measures.

    Use Cases:

    Researchers can leverage this dataset for a variety of research purposes, including but not limited to:

    Epidemiological Studies: Analyzing the prevalence of common symptoms such as cough, fever, and difficulty breathing in different populations.

    Disease Surveillance: Monitoring the spread of diseases by examining the dataset for patterns and trends related to specific symptoms.

    Public Health Interventions: Informing public health strategies by identifying correlations between certain symptoms and health outcomes.

    Ethical Considerations:

    Researchers using this dataset are encouraged to adhere to ethical guidelines and privacy regulations to ensure the responsible and respectful use of health-related data. Proper anonymization and de-identification measures should be employed to protect the privacy of individuals represented in the dataset.

  10. CARMEN-I: A resource of anonymized electronic health records in Spanish and...

    • physionet.org
    • oppositeofnorth.com
    Updated Apr 20, 2024
    Cite
    Eulalia Farre Maduell; Salvador Lima-Lopez; Santiago Andres Frid; Artur Conesa; Elisa Asensio; Antonio Lopez-Rueda; Helena Arino; Elena Calvo; Maria Jesús Bertran; Maria Angeles Marcos; Montserrat Nofre Maiz; Laura Tañá Velasco; Antonia Marti; Ricardo Farreres; Xavier Pastor; Xavier Borrat Frigola; Martin Krallinger (2024). CARMEN-I: A resource of anonymized electronic health records in Spanish and Catalan for training and testing NLP tools [Dataset]. http://doi.org/10.13026/x7ed-9r91
    Dataset updated
    Apr 20, 2024
    Authors
    Eulalia Farre Maduell; Salvador Lima-Lopez; Santiago Andres Frid; Artur Conesa; Elisa Asensio; Antonio Lopez-Rueda; Helena Arino; Elena Calvo; Maria Jesús Bertran; Maria Angeles Marcos; Montserrat Nofre Maiz; Laura Tañá Velasco; Antonia Marti; Ricardo Farreres; Xavier Pastor; Xavier Borrat Frigola; Martin Krallinger
    License

    https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

    Description

    The CARMEN-I corpus comprises 2,000 clinical records, encompassing discharge letters, referrals, and radiology reports from Hospital Clínic of Barcelona between March 2020 and March 2022. These reports, primarily in Spanish with some Catalan sections, cover COVID-19 patients with diverse comorbidities like kidney failure, cardiovascular diseases, malignancies, and immunosuppression. The corpus underwent thorough anonymization, validation, and expert annotation, replacing sensitive data with synthetic equivalents. A subset of the corpus features annotations of medical concepts by specialists, encompassing symptoms, diseases, procedures, medications, species, and humans (including family members). CARMEN-I serves as a valuable resource for training and assessing clinical NLP techniques and language models, aiding tasks like de-identification, concept detection, linguistic modifier extraction, document classification, and more. It also facilitates training researchers in clinical NLP and is a collaborative effort involving Barcelona Supercomputing Center's NLP4BIA team, Hospital Clínic, and Universitat de Barcelona's CLiC group.

  11. Data Anonymization Tools Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 29, 2025
    Cite
    Growth Market Reports (2025). Data Anonymization Tools Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/data-anonymization-tools-market
    Available download formats: pdf, csv, pptx
    Dataset updated
    Aug 29, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data Anonymization Tools Market Outlook



    According to our latest research, the global Data Anonymization Tools market size in 2024 stands at USD 3.2 billion, demonstrating robust growth driven by the escalating need for data privacy and compliance with stringent regulatory frameworks. The market is projected to expand at a CAGR of 17.4% from 2025 to 2033, reaching a forecasted value of USD 13.4 billion by 2033. This growth trajectory is primarily fueled by the increasing volume of sensitive data generated across industries and the urgent requirement for organizations to safeguard personally identifiable information (PII) while enabling data-driven innovation.




    A primary growth factor for the Data Anonymization Tools market is the intensifying regulatory landscape governing data privacy and protection worldwide. Legislation such as the General Data Protection Regulation (GDPR) in Europe, the California Consumer Privacy Act (CCPA) in the United States, and similar frameworks in Asia Pacific and Latin America have compelled organizations to adopt advanced data anonymization solutions. These regulations mandate strict controls over the processing, storage, and sharing of personal data, imposing significant penalties for non-compliance. Consequently, enterprises across sectors are increasingly investing in software and services that ensure data remains anonymized and compliant, thereby mitigating risks associated with data breaches and unauthorized disclosures.




    Another significant driver is the exponential growth in data volumes generated by digital transformation, cloud migration, and the proliferation of connected devices. As organizations leverage big data analytics, machine learning, and artificial intelligence to gain business insights, the challenge of protecting sensitive information while maintaining data utility becomes paramount. Data anonymization tools enable organizations to securely share and analyze datasets without exposing personal or confidential information. This capability not only supports regulatory compliance but also fosters collaboration and innovation in sectors like healthcare, finance, and retail, where data-driven decision-making is critical to competitive advantage.




    Moreover, the rising frequency and sophistication of cyber threats have heightened awareness regarding the vulnerabilities associated with storing and processing unprotected data. High-profile data breaches and the resultant financial and reputational damages have underscored the importance of robust data anonymization solutions. Organizations are increasingly prioritizing the implementation of tools that can de-identify data before it is used for analytics, testing, or sharing with third parties. This trend is further amplified by the growing adoption of cloud-based services, which necessitate additional layers of data protection to address the complexities of distributed environments and cross-border data flows.



    In the healthcare sector, the demand for Healthcare Data Anonymization Services is on the rise, driven by the need to protect patient privacy while enabling the use of data for research and innovation. Healthcare organizations are increasingly adopting these services to comply with regulations like HIPAA and GDPR, which mandate stringent data protection measures. By anonymizing patient data, healthcare providers can safely share information for clinical trials, population health studies, and collaborative research without compromising patient confidentiality. This not only enhances the ability to conduct meaningful research but also supports the development of personalized medicine and improved patient outcomes.




    Regionally, North America dominates the Data Anonymization Tools market, accounting for the largest share in 2024, followed closely by Europe and Asia Pacific. The United States, in particular, benefits from a highly developed technology infrastructure, a mature regulatory environment, and a strong presence of leading data security vendors. Europe's market growth is propelled by the stringent enforcement of GDPR and the widespread adoption of privacy-enhancing technologies across industries. Meanwhile, Asia Pacific is experiencing rapid expansion due to increasing digitalization, rising awareness of data privacy, and the introduction of new data protection regulations in countries like India, China,

  12. Data De-identification AI Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). Data De-identification AI Market Research Report 2033 [Dataset]. https://dataintelo.com/report/data-de-identification-ai-market
    Available download formats: pdf, csv, pptx
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data De-identification AI Market Outlook



    According to our latest research, the global Data De-identification AI market size reached USD 1.42 billion in 2024, reflecting a strong demand for advanced privacy technologies across industries. The market is expected to grow at a robust CAGR of 27.4% from 2025 to 2033, with the forecasted market size anticipated to reach USD 12.38 billion by 2033. This remarkable growth is primarily driven by stringent regulatory compliance requirements and the exponential rise in sensitive data generation worldwide, fueling the adoption of AI-powered de-identification solutions.




    One of the key growth factors propelling the Data De-identification AI market is the intensifying global focus on data privacy and security. Regulatory frameworks such as the GDPR in Europe, CCPA in California, and similar data protection acts across Asia Pacific and Latin America are mandating organizations to implement robust data anonymization and de-identification practices. As the volume of personal and sensitive data continues to surge, especially in sectors like healthcare, BFSI, and government, enterprises are increasingly turning to AI-driven de-identification tools to ensure compliance while maintaining data utility for analytics and innovation. This regulatory pressure, combined with heightened consumer awareness about data privacy, is significantly accelerating market expansion.




    Another major driver is the rapid digital transformation across industries, resulting in massive data collection and exchange. Organizations are leveraging big data analytics, machine learning, and cloud computing to derive actionable insights from vast datasets. However, this also raises the risk of data breaches and misuse of personally identifiable information (PII). AI-powered data de-identification solutions offer advanced capabilities such as automated masking, tokenization, and pseudonymization, enabling organizations to securely share and analyze sensitive information without compromising privacy. This capability is particularly crucial for sectors like healthcare and financial services, where data-driven innovation must be balanced with strict privacy requirements.




    Furthermore, the proliferation of AI and machine learning applications is creating new opportunities and challenges in managing sensitive data. As organizations deploy AI models that require large-scale, real-world datasets, the need to de-identify data before use becomes paramount. AI-based de-identification tools not only expedite this process but also enhance accuracy and scalability, supporting the development of ethical and compliant AI systems. Additionally, the growing adoption of cloud-based solutions and the increasing integration of de-identification technologies into existing data management workflows are further boosting market growth. The convergence of these factors is expected to sustain the upward trajectory of the Data De-identification AI market throughout the forecast period.




    Regionally, North America currently leads the market, accounting for the largest share in 2024, followed closely by Europe and Asia Pacific. The dominance of North America can be attributed to the presence of major technology providers, a mature regulatory environment, and high digital adoption rates. However, Asia Pacific is anticipated to witness the fastest growth over the next decade, fueled by rapid digitalization, expanding healthcare infrastructure, and increasing government initiatives to strengthen data privacy. Europe continues to be a strong market due to its rigorous GDPR compliance landscape, while Latin America and the Middle East & Africa are emerging as promising regions with growing investments in digital transformation and data security.



    Component Analysis



    The Data De-identification AI market by component is segmented into software and services, each playing a pivotal role in the overall ecosystem. The software segment currently dominates the market, driven by the increasing need for automated, scalable, and customizable data de-identification solutions. These software platforms are equipped with advanced features such as AI-based masking, encryption, and pseudonymization, enabling organizations to efficiently process large volumes of sensitive data in real-time. The integration of machine learning algorithms allows for context-aware de-identification, reducing the risk of re-identification while preserving data utility for analytics and machine learning

  13. Updated PTSS dataset for the FORAS project

    • dataverse.nl
    csv, docx, xlsx
    Updated Feb 5, 2025
    Cite
    Bruno Coimbra; Rutger Neeleman; Elizabeth Grandfield; Mirjam van Zuiden; Rens van de Schoot (2025). Updated PTSS dataset for the FORAS project [Dataset]. http://doi.org/10.34894/CRE6ZC
    Available download formats: docx (48426), csv (21840732), xlsx (9398219), xlsx (1199186)
    Dataset updated
    Feb 5, 2025
    Dataset provided by
    DataverseNL
    Authors
    Bruno Coimbra; Rutger Neeleman; Elizabeth Grandfield; Mirjam van Zuiden; Rens van de Schoot
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Dataset funded by
    Dutch Research Council
    Description

    This updated labeled dataset builds upon the initial systematic review by van de Schoot et al. (2018; DOI: 10.1080/00273171.2017.1412293), which included studies on post-traumatic stress symptom (PTSS) trajectories up to 2016, sourced from the Open Science Framework (OSF). As part of the FORAS project - Framework for PTSS trajectORies: Analysis and Synthesis (funded by the Dutch Research Council, grant no. 406.22.GO.048, and pre-registered at PROSPERO under ID CRD42023494027), we extended this dataset to include publications between 2016 and 2023. In total, the search identified 10,594 de-duplicated records obtained via different search methods, each published with its own search query and result:

    • Exact replication of the initial search: OSF.IO/QABW3

    • Comprehensive database search: OSF.IO/D3UV5

    • Snowballing: OSF.IO/M32TS

    • Full-text search via Dimensions data: OSF.IO/7EXC5

    • Semantic search via OpenAlex: OSF.IO/M32TS

    Humans (BC, RN) and AI (Bron et al., 2024) screened the records, and disagreements were resolved (MvZ, BG, RvdS). Each record was screened separately for Title, Abstract, and Full-text inclusion and per inclusion criterion. A detailed screening logbook is available at OSF.IO/B9GD3, and the entire process is described in https://doi.org/10.31234/osf.io/p4xm5. A description of all columns/variables and full methodological details is available in the accompanying codebook.

    Important Notes:

    Duplicates: To maintain consistency and transparency, duplicates are left in the dataset and are labeled with the same classification as the original records. A filter is provided to allow users to exclude these duplicates as needed.

    Anonymized Data: The dataset "...._anonymous" excludes DOIs, OpenAlex IDs, titles, and abstracts to ensure data anonymization during the review process. The complete dataset, including all identifiers, is uploaded under embargo and will be publicly available on 01-10-2025.

    This dataset is a resource for researchers interested in systematic reviews of PTSS trajectories, supporting reproducibility and transparency in the research process, and for data scientists who would like to mimic the screening process using different machine learning and AI models.

  14. Ambient Influence: Digital Nomads as Unintentional Brand Intermediaries —...

    • zenodo.org
    bin, csv +1
    Updated Sep 25, 2025
    Cite
    Simeli Ioanna; Evangelos Christou; Chryssoula Chatzigeorgiou (2025). Ambient Influence: Digital Nomads as Unintentional Brand Intermediaries — Data, Code, and Materials [Dataset]. http://doi.org/10.5281/zenodo.17199056
    Available download formats: bin, csv, text/x-python
    Dataset updated
    Sep 25, 2025
    Dataset provided by
    International Hellenic University
    Authors
    Simeli Ioanna; Evangelos Christou; Chryssoula Chatzigeorgiou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains the complete replication package for the manuscript “Digital Nomads as Unintentional Brand Intermediaries”. It includes de-identified data, analysis code, study instruments, and supplementary sources to reproduce all tables and figures.

    Contents

    • Study 1 (Survey): cleaned, de-identified dataset; codebook; SEM/CFA scripts and output; item wordings and scale anchors.

    • Study 2 (Experiment): cleaned, de-identified dataset; pre-registration (if applicable); analysis scripts for manipulation checks, ANCOVA, and mediation; instrument and manipulation-check items.

    • Study 3 (Interviews): anonymized excerpted quotations used in the paper; theme/codebook; audit trail summary (sampling, saturation notes).

    • Supplementary materials: figure/table source files; robustness checks (controls, multigroup, invariance); README with step-by-step replication instructions and software versions.

    Restrictions
    Full video stimuli are not redistributed due to third-party rights. We provide transcripts, frame stills, detailed metadata (links, durations, posting dates), and procedures to reconstruct the stimuli set. Researchers may request time-limited access to the files for verification under a non-distribution agreement.

    Anonymization & Compliance
    All datasets are de-identified and stored per GDPR and institutional ethics approval. Any indirect identifiers were removed or binned.

    Licensing & Citation
    Data and materials: CC BY 4.0. Please cite this repository using its DOI (https://doi.org/10.5281/zenodo.17199056) when reusing these materials.

    Reproducibility
    A replication script (R/Python) reproduces the tables and figures from the raw de-identified data; session info and package versions are provided.

  15. Veterinary Data De-Identification Services Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). Veterinary Data De-Identification Services Market Research Report 2033 [Dataset]. https://dataintelo.com/report/veterinary-data-de-identification-services-market
    Available download formats: pdf, pptx, csv
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Veterinary Data De-Identification Services Market Outlook



    According to our latest research, the veterinary data de-identification services market size reached USD 145.8 million in 2024, reflecting a growing emphasis on data privacy and regulatory compliance in the veterinary sector. The market is poised for robust expansion, projected to attain USD 393.2 million by 2033, propelled by a CAGR of 11.7% from 2025 to 2033. This growth is primarily fueled by the increasing digitization of veterinary records, rising concerns over data security, and the integration of advanced technologies in veterinary healthcare management.




    The surge in demand for veterinary data de-identification services is largely attributed to the exponential growth of digital data in the veterinary industry. As veterinary practices, research institutes, and pharmaceutical companies increasingly adopt electronic health records and data-driven approaches, the volume of sensitive animal health data has soared. This growth has necessitated robust data protection strategies to safeguard confidential information, especially as regulations similar to human healthcare data privacy, such as GDPR and HIPAA-like standards, are being extended to veterinary data. The need to anonymize and pseudonymize animal health data for research, clinical trials, and collaborative studies without compromising privacy is a significant market driver, pushing organizations to invest in specialized de-identification services.




    Another key growth factor is the rising collaboration between veterinary clinics, research institutions, and pharmaceutical companies. These collaborations often require the sharing of large datasets to advance veterinary science, drug development, and clinical research. However, the sharing of identifiable data poses ethical and legal risks, elevating the importance of de-identification solutions that ensure compliance and foster trust among stakeholders. The increasing prevalence of zoonotic diseases and the global focus on One Health initiatives have further highlighted the need for secure and compliant data sharing, driving the uptake of de-identification services across the veterinary ecosystem.




    Technological advancements are also reshaping the veterinary data de-identification services market. The integration of artificial intelligence, machine learning, and blockchain technologies has enhanced the efficacy and reliability of de-identification processes. These innovations enable more precise anonymization and encryption of veterinary data, reducing the risk of re-identification while maintaining data utility for research and analytics. Additionally, the growing awareness among veterinary professionals about the risks of data breaches and the potential legal consequences has led to increased investments in comprehensive data de-identification and security solutions, further propelling market growth.




    From a regional perspective, North America continues to dominate the veterinary data de-identification services market, accounting for the largest revenue share in 2024. The region’s leadership is supported by stringent data privacy regulations, a high concentration of veterinary research institutions, and rapid adoption of digital health technologies. Europe follows closely, driven by strong regulatory frameworks and increasing investments in veterinary research. Asia Pacific is emerging as a high-growth region, with expanding veterinary healthcare infrastructure, rising pet ownership, and growing awareness of data privacy. Latin America and the Middle East & Africa are also witnessing steady growth, albeit from a smaller base, as digital transformation initiatives gain traction in these regions.



    Service Type Analysis



    The service type segment in the veterinary data de-identification services market encompasses anonymization, pseudonymization, data masking, encryption, and other specialized services. Anonymization remains the most widely adopted service, as it irreversibly removes personally identifiable information from veterinary datasets, ensuring compliance with stringent data privacy regulations. Veterinary clinics and research institutions favor anonymization for sharing data in multi-institutional studies and public health surveillance, as it allows for the safe aggregation and analysis of large datasets without risking the exposure of sensitive information. The growing complexity of veterinary data, including genomic and behavioral da

  16. Z

    MEDDOCAN corpus: gold standard annotations for Medical Document...

    • data-staging.niaid.nih.gov
    • live.european-language-grid.eu
    Updated Nov 4, 2022
    Marimon, Montserrat; Gonzalez-Agirre, Aitor; Intxaurrondo, Ander; Rodríguez, Heidy; Lopez Martin, Jose Antonio; Villegas, Marta; Krallinger, Martin (2022). MEDDOCAN corpus: gold standard annotations for Medical Document Anonymization on Spanish clinical case reports [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_4279322
    Dataset updated
    Nov 4, 2022
    Dataset provided by
    Centro Nacional de Investigaciones Oncológicas
    Hospital 12 de Octubre
    Barcelona Supercomputing Center
    Authors
    Marimon, Montserrat; Gonzalez-Agirre, Aitor; Intxaurrondo, Ander; Rodríguez, Heidy; Lopez Martin, Jose Antonio; Villegas, Marta; Krallinger, Martin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Intro:

    MEDDOCAN shared task dataset, divided into training, development, and test sets with gold standard annotations. The MEDDOCAN background set is also included; its documents carry no annotations.

    Annotation quality

    Inter-annotator agreement: 98%

    For more information, see the paper.

    Format:

    Annotations are distributed in Brat format. See Brat webpage for more information.

    In addition, annotations are also distributed in XML format (based on i2b2 XML format).

    On the MEDDOCAN webpage, there is a script to convert between the MEDDOCAN-Brat, MEDDOCAN-XML, and i2b2 formats.
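    As a rough illustration of what working with Brat annotations can look like, the sketch below reads text-bound annotation lines in the general Brat standoff layout (ID, tab, label and character offsets, tab, covered text) and masks the annotated spans. The sample sentence, the labels `NAME` and `AGE`, and the function names are hypothetical and are not taken from the corpus; this is not the conversion script mentioned above.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Entity:
    ann_id: str                   # e.g. "T1"
    label: str                    # entity type (hypothetical labels in the demo)
    spans: List[Tuple[int, int]]  # character offsets, end-exclusive
    text: str                     # covered text as stored in the .ann file


def parse_brat_entities(ann_text: str) -> List[Entity]:
    """Parse text-bound ("T") lines of a Brat standoff file; skip other lines."""
    entities = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue  # notes, relations, attributes, etc. are ignored here
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # malformed line
        ann_id, type_and_offsets, covered = parts[0], parts[1], parts[2]
        label, offsets = type_and_offsets.split(" ", 1)
        spans = []
        # Discontinuous annotations separate their fragments with ";"
        for chunk in offsets.split(";"):
            start, end = chunk.split()
            spans.append((int(start), int(end)))
        entities.append(Entity(ann_id, label, spans, covered))
    return entities


def mask_entities(text: str, entities: List[Entity]) -> str:
    """Replace every annotated span with a [LABEL] placeholder."""
    replacements = [(s, e, ent.label) for ent in entities for (s, e) in ent.spans]
    # Work right to left so earlier offsets stay valid after each replacement
    for start, end, label in sorted(replacements, reverse=True):
        text = text[:start] + f"[{label}]" + text[end:]
    return text


if __name__ == "__main__":
    document = "Patient: Ana Ruiz, age 45."            # made-up example text
    annotations = "T1\tNAME 9 17\tAna Ruiz\nT2\tAGE 23 25\t45\n"
    print(mask_entities(document, parse_brat_entities(annotations)))
```

    Replacing spans from right to left avoids recomputing offsets after each substitution, which is why the replacements are sorted in reverse.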

    Shared task goal:

    In the three subtasks, the goal will be to predict the annotations given only the plain text files.

    Resources:

    Web

    Citation: Montserrat Marimon et al. “Automatic De-identification of Medical Texts in Spanish: the MEDDOCAN Track, Corpus, Guidelines, Methods and Evaluation of Results.” In: IberLEF@ SEPLN. 2019, pp. 618–638.

    Silver Standard corpus

    Annotation guidelines

    For further information, please visit https://temu.bsc.es/meddocan/ or email us at encargo-pln-life@bsc.es

    Copyright (c) 2019 Secretaría de Estado para el Avance Digital (SEAD)

  17. G

    Data De-Identification for Omics Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Oct 4, 2025
    Growth Market Reports (2025). Data De-Identification for Omics Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/data-de-identification-for-omics-market
    Available download formats: pptx, pdf, csv
    Dataset updated
    Oct 4, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data De-Identification for Omics Market Outlook



    According to the latest research, the global Data De-Identification for Omics market size reached USD 1.28 billion in 2024, supported by a robust demand for privacy-preserving technologies in the biomedical and healthcare sectors. The market is expanding at a CAGR of 17.6% and is projected to attain a value of USD 5.85 billion by 2033. This remarkable growth is primarily driven by the increasing adoption of omics technologies in clinical research and drug discovery, coupled with stringent data privacy regulations across regions.
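    A reader who wants to see how a market size, a growth rate, and a target year relate can use the standard compound-growth formula. The sketch below is only an illustration: it assumes annual compounding and a nine-year horizon (2024 to 2033), neither of which the report states explicitly, so a different base year would give different numbers. The function names are hypothetical.

```python
def project(value: float, rate: float, years: int) -> float:
    """Future value after `years` of annual compounding at `rate`."""
    return value * (1.0 + rate) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate that takes `start` to `end` in `years`."""
    return (end / start) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    size_2024 = 1.28      # USD billion, as quoted in the paragraph above
    quoted_rate = 0.176   # 17.6% CAGR, as quoted
    size_2033 = 5.85      # USD billion, as quoted
    horizon = 9           # assumption: 2024 -> 2033

    print(f"Projected size: {project(size_2024, quoted_rate, horizon):.2f}")
    print(f"Implied CAGR:   {implied_cagr(size_2024, size_2033, horizon):.1%}")
```

    Comparing the projected value with the quoted one is a quick way to see which base year and horizon a report most likely used.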




    One of the major growth factors propelling the Data De-Identification for Omics market is the exponential increase in the generation of omics data, particularly from genomics, proteomics, and metabolomics studies. As next-generation sequencing and high-throughput omics platforms become more affordable and widespread, vast amounts of sensitive biological data are being produced daily. This surge necessitates robust data de-identification solutions to protect patient privacy and comply with global regulatory frameworks such as the General Data Protection Regulation (GDPR), the Health Insurance Portability and Accountability Act (HIPAA), and other data protection laws. The risk of re-identification from omics datasets, given their highly personal and unique nature, further underscores the need for advanced anonymization and pseudonymization tools, thereby fueling market demand.




    Another significant driver is the integration of omics data in personalized medicine and precision healthcare initiatives. As healthcare providers and pharmaceutical companies increasingly leverage genomics and other omics data to tailor treatments and therapies, ensuring the privacy and security of this information becomes paramount. Data de-identification technologies enable organizations to share and analyze omics datasets without compromising individual identities, thereby accelerating collaborative research and innovation. Moreover, the growing trend of cross-border clinical trials and international research collaborations is amplifying the need for standardized, interoperable de-identification solutions that can operate seamlessly across jurisdictions, further catalyzing market expansion.




    Technological advancements in artificial intelligence and machine learning are also transforming the Data De-Identification for Omics market. AI-powered de-identification platforms can automate the detection and masking of personal identifiers in complex omics datasets, significantly reducing manual effort and the risk of human error. These intelligent systems are capable of adapting to evolving data types and regulatory requirements, offering scalability and flexibility to research organizations and healthcare providers. Additionally, the increasing adoption of cloud-based solutions is facilitating secure, scalable, and cost-effective data de-identification workflows, making these technologies accessible to a broader range of end-users, from large pharmaceutical companies to small research institutes.




    Regionally, North America continues to dominate the Data De-Identification for Omics market, accounting for the largest market share in 2024. This leadership is attributed to the presence of leading omics research institutions, robust healthcare infrastructure, and strict regulatory frameworks governing data privacy. Europe follows closely, driven by the implementation of GDPR and the region’s strong focus on biomedical research. The Asia Pacific region is witnessing the fastest growth, propelled by increasing investments in healthcare infrastructure, expanding genomics research, and rising awareness of data privacy. Latin America and the Middle East & Africa are also emerging as promising markets, supported by government initiatives to modernize healthcare systems and encourage biomedical innovation.





    Component Analysis



    The Component segment of the Data De-Identification for Omics market is bifurcated into software and servic

  18. h

    CCVG

    • huggingface.co
    Updated Oct 14, 2025
    ZhitianHou (2025). CCVG [Dataset]. https://huggingface.co/datasets/TIM0927/CCVG
    Dataset updated
    Oct 14, 2025
    Authors
    ZhitianHou
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    CCVG: Chinese Court View Generation Dataset

    For privacy protection reasons, the original data has been removed pending de-identification and anonymization. Only metadata and dataset descriptions are currently available for research reference. CCVG is a curated Chinese dataset designed for Criminal Court View Generation (CVG) and charge prediction tasks. It contains criminal case documents with fact descriptions and court… See the full description on the dataset page: https://huggingface.co/datasets/TIM0927/CCVG.

  19. G

    Pathology Image De-Identification Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Oct 4, 2025
    Growth Market Reports (2025). Pathology Image De-Identification Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/pathology-image-de-identification-market
    Available download formats: pptx, csv, pdf
    Dataset updated
    Oct 4, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Pathology Image De-Identification Market Outlook



    According to our latest research, the global pathology image de-identification market size reached USD 185 million in 2024, driven by the increasing adoption of digital pathology and stringent data privacy regulations. The market is projected to grow at a robust CAGR of 13.6% during the forecast period, reaching USD 546 million by 2033. The primary growth factor is the rising demand for secure and compliant sharing of pathology images for research, diagnostics, and educational purposes, as healthcare organizations worldwide continue to prioritize patient privacy and data security.




    One of the most significant growth factors propelling the pathology image de-identification market is the intensifying global focus on healthcare data privacy and compliance with regulatory frameworks such as HIPAA, GDPR, and other region-specific mandates. As digital pathology becomes increasingly prevalent, the volume of images containing sensitive patient information has surged, necessitating robust de-identification solutions to protect patient identities. Healthcare providers, research institutions, and diagnostic laboratories are compelled to adopt advanced de-identification technologies to ensure that patient confidentiality is maintained while enabling seamless data sharing for collaborative research, clinical trials, and educational initiatives. This regulatory landscape is prompting rapid investments in both software and services that automate and streamline the de-identification process, ensuring compliance and reducing the risk of costly data breaches.




    Another critical driver for the pathology image de-identification market is the exponential growth in the use of artificial intelligence (AI) and machine learning (ML) in pathology research and diagnostics. AI-powered algorithms require large, diverse, and high-quality datasets to train and validate models, but the use of real-world pathology images is often restricted due to privacy concerns. De-identification solutions bridge this gap by enabling the anonymization of pathology images, thereby facilitating the creation of expansive datasets that can be shared across institutions and geographic boundaries without compromising patient privacy. This, in turn, accelerates the development of AI-driven diagnostic tools, enhances research productivity, and fosters innovation in precision medicine, further boosting the demand for sophisticated de-identification technologies.




    The proliferation of digital health platforms and telepathology services is also fueling the growth of the pathology image de-identification market. As healthcare delivery models evolve to include remote consultations, second opinions, and virtual tumor boards, the need for secure transmission and sharing of pathology images has become paramount. De-identification solutions are integral to these workflows, ensuring that images can be exchanged between pathologists, clinicians, and researchers without exposing personally identifiable information. This trend is particularly pronounced in academic and research settings, where the exchange of de-identified images is essential for multi-site studies, collaborative projects, and medical education. The convergence of digital pathology, telemedicine, and regulatory compliance is thus creating a fertile environment for the sustained expansion of the pathology image de-identification market.




    From a regional perspective, North America continues to dominate the pathology image de-identification market, accounting for the largest share in 2024, followed closely by Europe and Asia Pacific. The United States, in particular, benefits from a mature healthcare IT infrastructure, strong regulatory oversight, and a high concentration of leading digital pathology vendors. Europe’s market growth is underpinned by the implementation of GDPR and increasing investments in digital health, while Asia Pacific is emerging as a high-growth region due to rapid digitization of healthcare, expanding research activities, and rising awareness of data privacy. Latin America and the Middle East & Africa are also witnessing steady adoption, albeit at a slower pace, as healthcare systems modernize and embrace digital pathology solutions.



  20. G

    Real-World Data De-identification AI Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 29, 2025
    Growth Market Reports (2025). Real-World Data De-identification AI Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/real-world-data-de-identification-ai-market
    Available download formats: pptx, pdf, csv
    Dataset updated
    Aug 29, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Real-World Data De-identification AI Market Outlook




    As per our latest research, the global Real-World Data De-identification AI market size reached USD 1.45 billion in 2024, demonstrating robust momentum driven by stringent data privacy regulations and the exponential growth of healthcare data. The market is expected to expand at a CAGR of 22.3% from 2025 to 2033, reaching an estimated value of USD 10.53 billion by 2033. The principal growth factor behind this surge is the urgent need for advanced AI-driven de-identification solutions to safeguard sensitive real-world data while enabling its use for analytics, research, and innovation across multiple sectors.




    The accelerating adoption of electronic health records (EHRs), insurance claims, and patient-generated data has dramatically increased the volume and complexity of real-world data (RWD) available for research and operational purposes. However, this surge in data has also heightened concerns regarding patient privacy, data breaches, and regulatory compliance. Organizations are increasingly turning to AI-powered de-identification solutions to automate the anonymization process, ensuring that personally identifiable information (PII) is thoroughly protected without compromising the utility of the data. These AI solutions leverage advanced natural language processing (NLP) and machine learning algorithms to accurately identify and mask sensitive information across structured and unstructured datasets, making them indispensable tools in the modern data governance landscape.




    Another significant growth driver is the evolving regulatory landscape, particularly the enforcement of laws such as the Health Insurance Portability and Accountability Act (HIPAA) in the United States, the General Data Protection Regulation (GDPR) in Europe, and similar frameworks across Asia Pacific and Latin America. These regulations mandate rigorous de-identification and privacy-preserving measures for the use and sharing of real-world data, spurring demand for AI-based de-identification platforms that can demonstrate compliance while streamlining data workflows. Furthermore, the increasing collaboration between healthcare providers, life sciences companies, insurers, and government agencies for data-driven research and innovation has further fueled the need for scalable, automated de-identification technologies.



    The integration of AI Insider-Threat Detection for EHR systems is becoming increasingly crucial as healthcare organizations strive to protect sensitive patient data from internal threats. This technology leverages advanced machine learning algorithms to monitor and analyze user behavior within electronic health records, identifying potential security breaches before they occur. By proactively detecting anomalies and unauthorized access attempts, AI Insider-Threat Detection enhances the overall security posture of healthcare institutions, ensuring compliance with stringent data protection regulations. As the volume of electronic health records continues to grow, the need for sophisticated insider-threat detection solutions becomes more pronounced, safeguarding patient privacy while maintaining the integrity of healthcare data systems.




    The rapid advancements in artificial intelligence, particularly in deep learning and NLP, have significantly enhanced the accuracy, scalability, and efficiency of de-identification tools. Unlike traditional rule-based systems, AI-powered solutions can adapt to diverse data formats, languages, and contexts, reducing manual intervention and minimizing the risk of residual disclosure. This technological evolution, combined with growing investments in digital health infrastructure and the proliferation of cloud computing, is expected to sustain the high growth trajectory of the Real-World Data De-identification AI market over the forecast period. As organizations increasingly recognize the strategic value of de-identified data for research, population health management, drug development, and policy-making, the market is poised for continued expansion.




    Regionally, North America currently leads the market, accounting for the largest share in 2024, followed by Europe and Asia Pacific. The dominance of North America is attributed to the presence of advanced healthcare infrastructure, proactive regulatory enforcement, and a

Tim Ingo Johann; Karen Otte; Fabian Prasser; Christoph Dieterich (2024). Anonymize or Synthesize? – Privacy-Preserving Methods for Heart Failure Score Analytics [data] [Dataset]. http://doi.org/10.11588/DATA/MXM0Q2

Anonymize or Synthesize? – Privacy-Preserving Methods for Heart Failure Score Analytics [data]

Related Article
Available download formats: tsv(197975), tsv(190296), tsv(191831), pdf(640128), tsv(107100), txt(3421), tsv(286102), tsv(106632)
Dataset updated
Nov 20, 2024
Dataset provided by
heiDATA
Authors
Tim Ingo Johann; Karen Otte; Fabian Prasser; Christoph Dieterich
License

https://heidata.uni-heidelberg.de/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.11588/DATA/MXM0Q2

Description

In the publication [1] we implemented anonymization and synthetization techniques for a structured data set collected during the HiGHmed Use Case Cardiology study [2]. We employed the data anonymization tool ARX [3] and the data synthetization framework ASyH [4], individually and in combination. We evaluated the utility and shortcomings of the different approaches through statistical analyses and privacy risk assessments. Data utility was assessed by computing two heart failure risk scores (Barcelona BioHF [5] and MAGGIC [6]) on the protected data sets. We observed only minimal deviations from the scores computed on the original data set. Additionally, we performed a re-identification risk analysis and found only minor residual risks for common types of privacy threats. We could demonstrate that anonymization and synthetization methods protect privacy while retaining data utility for heart failure risk assessment. Both approaches, and a combination thereof, introduce only minimal deviations from the original data set over all features. While data synthesis techniques can produce any number of new records, data anonymization techniques offer more formal privacy guarantees. Consequently, data synthesis on anonymized data further enhances privacy protection with little impact on data utility. We hereby share all generated data sets with the scientific community through a use and access agreement.

[1] Johann TI, Otte K, Prasser F, Dieterich C. Anonymize or synthesize? Privacy-preserving methods for heart failure score analytics. Eur Heart J 2024. doi:10.1093/ehjdh/ztae083
[2] Sommer KK, Amr A, Bavendiek, Beierle F, Brunecker P, Dathe H, et al. Structured, harmonized, and interoperable integration of clinical routine data to compute heart failure risk scores. Life (Basel) 2022;12:749.
[3] Prasser F, Eicher J, Spengler H, Bild R, Kuhn KA. Flexible data anonymization using ARX: current status and challenges ahead. Softw Pract Exper 2020;50:1277–1304.
[4] Johann TI, Wilhelmi H. ASyH: anonymous synthesizer for health data, GitHub, 2023. Available at: https://github.com/dieterich-lab/ASyH.
[5] Lupón J, de Antonio M, Vila J, Peñafiel J, Galán A, Zamora E, et al. Development of a novel heart failure risk tool: the Barcelona bio-heart failure risk calculator (BCN Bio-HF calculator). PLoS One 2014;9:e85466.
[6] Pocock SJ, Ariti CA, McMurray JJV, Maggioni A, Køber L, Squire IB, et al. Predicting survival in heart failure: a risk score based on 39 372 patients from 30 studies. Eur Heart J 2013;34:1404–1413.
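To give a feel for what a re-identification risk check on a structured data set can involve, the sketch below computes k-anonymity, one widely used formal privacy criterion: every combination of quasi-identifier values must be shared by at least k records. This is an assumption made for illustration only; the description above does not say which privacy models or risk measures the authors applied, and the column names and records here are invented.

```python
from collections import Counter
from typing import Dict, List, Sequence


def k_anonymity(records: List[Dict[str, str]], quasi_identifiers: Sequence[str]) -> int:
    """Size of the smallest group of records sharing the same quasi-identifier values."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return min(groups.values()) if groups else 0


def highest_record_risk(records: List[Dict[str, str]], quasi_identifiers: Sequence[str]) -> float:
    """Worst-case chance of singling out a record within its group (1 / k)."""
    k = k_anonymity(records, quasi_identifiers)
    return 1.0 / k if k else 0.0


if __name__ == "__main__":
    # Invented records with already generalized attributes (hypothetical columns)
    data = [
        {"age_band": "60-69", "sex": "F"},
        {"age_band": "60-69", "sex": "F"},
        {"age_band": "70-79", "sex": "M"},
        {"age_band": "70-79", "sex": "M"},
    ]
    print(k_anonymity(data, ["age_band", "sex"]))
    print(highest_record_risk(data, ["age_band", "sex"]))
```

A larger k means each record hides in a bigger group, usually at the cost of coarser attribute values and therefore some loss of data utility.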
