84 datasets found
  1. Most sensitive private information online 2017

    • statista.com
    Updated Apr 3, 2025
    Cite
    Statista (2025). Most sensitive private information online 2017 [Dataset]. https://www.statista.com/statistics/418738/personal-information-sensitivity-hacking/
    Explore at:
    Dataset updated
    Apr 3, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Aug 10, 2017 - Aug 14, 2017
    Area covered
    United States
    Description

    This statistic presents the types of personal information which U.S. adults would be most concerned about online hackers gaining access to. During the August 2017 survey period, 73 percent of respondents stated that they would feel most concerned about hackers gaining access to their personal banking information.

  2. pii-masking-300k

    • huggingface.co
    Updated Apr 4, 2024
    Cite
    Ai4Privacy (2024). pii-masking-300k [Dataset]. http://doi.org/10.57967/hf/1995
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 4, 2024
    Dataset authored and provided by
    Ai4Privacy
    License

    https://choosealicense.com/licenses/other/

    Description

    Purpose and Features

    🌍 World's largest open dataset for privacy masking 🌎 The dataset is useful for training and evaluating models that remove personally identifiable and sensitive information from text, especially in the context of AI assistants and LLMs. Key facts:

    OpenPII-220k text entries have 27 PII classes (types of sensitive data), targeting 749 discussion subjects / use cases split across education, health, and psychology. FinPII contains an additional ~20 types tailored to… See the full description on the dataset page: https://huggingface.co/datasets/ai4privacy/pii-masking-300k.
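    As a minimal, hypothetical illustration of the masking task this dataset supports (a regex sketch of our own devising, not Ai4Privacy's pipeline; the placeholder tags and patterns are assumptions, and real PII masking covers far more classes):

    ```python
    import re

    # Hypothetical, simplified PII masker: real systems (and this dataset's
    # labels) cover many more classes than these three patterns.
    PATTERNS = [
        (re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"), "[EMAIL]"),
        (re.compile(r"\+?\d[\d\s().-]{7,}\d"), "[PHONE]"),
        (re.compile(r"https?://\S+"), "[URL]"),
    ]

    def mask_pii(text: str) -> str:
        """Replace matched PII spans with placeholder tags."""
        for pattern, tag in PATTERNS:
            text = pattern.sub(tag, text)
        return text

    print(mask_pii("Contact jane.doe@example.com or +1 555 123 4567."))
    # → Contact [EMAIL] or [PHONE].
    ```

    Datasets like this one exist precisely because such pattern lists miss context-dependent PII (names, addresses, health details), which is why trained models are used instead.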

  3. Sensitive personal data available online to third parties in Canada 2023, by...

    • statista.com
    Updated Jul 11, 2025
    Cite
    Statista (2025). Sensitive personal data available online to third parties in Canada 2023, by type [Dataset]. https://www.statista.com/statistics/1344572/canada-access-to-users-personal-information-online/
    Explore at:
    Dataset updated
    Jul 11, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 3, 2023 - Jan 5, 2023
    Area covered
    Canada
    Description

    A 2023 survey of Canadians found that almost four out of every 10 respondents think their home address is available online to people who should not have access to it. A further ** percent thought their date of birth was available online, while ** percent of respondents believed their credit card number was accessible to third parties.

  4. AN ANALYSIS OF LAW REGULATING VULNERABILITY OF SENSITIVE PERSONAL DATA TO...

    • figshare.com
    pdf
    Updated Jun 3, 2023
    Cite
    Anand Raut; V N Ghormade (2023). AN ANALYSIS OF LAW REGULATING VULNERABILITY OF SENSITIVE PERSONAL DATA TO PHISHING IN INDIA [Dataset]. http://doi.org/10.6084/m9.figshare.1194489.v1
    Explore at:
    pdfAvailable download formats
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Anand Raut; V N Ghormade
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    India
    Description

    In the era of information technology, data privacy has become sacrosanct to individual privacy. Technology, being a dual-edged sword, can be misused to harm internet privacy. Phishing, an online offence, is similar to and derives its name from real-world fishing: the offender sends mails (the hook) to victims (the bait) who think the mail is genuine and rely on it.

  5. Open Data Privacy Policy (Sensitive Regulated Data: Permitted and Restricted...

    • datasets.ai
    • catalog.data.gov
    • +7more
    21
    Updated Sep 2, 2022
    Cite
    City of Tempe (2022). Open Data Privacy Policy (Sensitive Regulated Data: Permitted and Restricted Uses) [Dataset]. https://datasets.ai/datasets/open-data-privacy-policy-sensitive-regulated-data-permitted-and-restricted-uses-30dc6
    Explore at:
    21Available download formats
    Dataset updated
    Sep 2, 2022
    Dataset authored and provided by
    City of Tempe
    Description
    Sensitive Regulated Data: Permitted and Restricted Uses
    • Purpose
    • Scope and Authority
    • Standard
    • Violation of the Standard - Misuse of Information
    • Definitions
    • References
    • Appendix A: Personally Identifiable Information (PII)
    • Appendix B: Security of Personally Owned Devices that Access or Maintain Sensitive Restricted Data
    • Appendix C: Sensitive Security Information (SSI)

  6. How can e-infrastructures deal with the sensitive data challenge (Working...

    • b2find.eudat.eu
    Updated Nov 3, 2023
    Cite
    (2023). How can e-infrastructures deal with the sensitive data challenge (Working Paper) - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/afd30269-f496-549b-b750-59765fbe9b93
    Explore at:
    Dataset updated
    Nov 3, 2023
    Description

    Sensitive personal data is data “revealing racial or ethnic origin, political opinions, religious beliefs, and (…) data concerning health or sex life”. Data sharing for research purposes must therefore be opened up for human health data to enable cross-discipline research and improve human well-being. The EUDAT Sensitive Data Group was created to address the unsatisfactory state of using sensitive data in data infrastructures such as EUDAT. During the EUDAT User Forum, 26-27 Sept. 2016 in Krakow, Poland, the first meeting of the EUDAT Sensitive Data Working Group took place, addressing different requirements and possible solutions for the processing of sensitive data in e-infrastructures. The meeting went beyond the current solutions of EUDAT to find possibilities for more comprehensive data services and solutions as part of the open data environment of current e-infrastructures. The Working Paper presents these first results.

  7. Data underlying the paper "Dataslip: Into the Present and Future(s) of...

    • data.4tu.nl
    zip
    Updated Feb 23, 2024
    Cite
    Alejandra GĂłmez Ortega (2024). Data underlying the paper "Dataslip: Into the Present and Future(s) of Personal Data Collection." Included in Chapter 6 of the PhD thesis: Sensitive Data Donation [Dataset]. http://doi.org/10.4121/35a8648c-bca2-4566-a664-b415a12e176a.v1
    Explore at:
    zipAvailable download formats
    Dataset updated
    Feb 23, 2024
    Dataset provided by
    4TU.ResearchData
    Authors
    Alejandra GĂłmez Ortega
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This project investigates the challenges and potential solutions around the collection and use of personal data through an interactive installation called "dataslip". It was deployed across various events and used as a conversation starter for identifying challenges, collected via post-it notes, and solutions, collected through a generative workshop. The dataset includes the vector files to build the "dataslip" installation and the challenges and solutions identified.

  8. Privacy-Sensitive Conversations between Care Workers and Care Home Residents...

    • test.researchdata.tuwien.ac.at
    • researchdata.tuwien.ac.at
    • +1more
    bin, text/markdown
    Updated Dec 6, 2024
    Cite
    Reinhard Grabler; Michael Starzinger; Matthias Hirschmanner; Helena Anna Frijns (2024). Privacy-Sensitive Conversations between Care Workers and Care Home Residents in a Residential Care Home [Dataset]. http://doi.org/10.70124/hbtq5-ykv92
    Explore at:
    bin, text/markdownAvailable download formats
    Dataset updated
    Dec 6, 2024
    Dataset provided by
    TU Wien
    Authors
    Reinhard Grabler; Michael Starzinger; Matthias Hirschmanner; Helena Anna Frijns
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Apr 2024 - Aug 2024
    Description

    Dataset Card for "privacy-care-interactions"

    Table of Contents

    Dataset Description

    Purpose and Features

    🔒 Collection of Privacy-Sensitive Conversations between Care Workers and Care Home Residents in a Residential Care Home 🔒

    The dataset is useful to train and evaluate models to identify and classify privacy-sensitive parts of conversations from text, especially in the context of AI assistants and LLMs.

    Dataset Overview

    Language Distribution 🌍

    • English (en): 95

    Locale Distribution 🌎

    • United States (US) 🇺🇸: 95

    Key Facts 🔑

    • This is synthetic data! Generated using proprietary algorithms - no privacy violations!
    • Conversations are classified following the taxonomy for privacy-sensitive robotics by Rueben et al. (2017).
    • The data was manually labeled by an expert.

    Dataset Structure

    Data Instances

    The provided data format is .jsonl, the JSON Lines text format, also called newline-delimited JSON. An example entry looks as follows.

    { "text": "CW: Have you ever been to Italy? CR: Oh, yes... many years ago.", "taxonomy": 0, "category": 0, "affected_speaker": 1, "language": "en", "locale": "US", "data_type": 1, "uid": 16, "split": "train" }

    Data Fields

    The data fields are:

    • text: a string feature. The abbreviations of the speakers refer to the care worker (CW) and the care recipient (CR).
    • taxonomy: a classification label, with possible values including informational (0), invasion (1), collection (2), processing (3), dissemination (4), physical (5), personal-space (6), territoriality (7), intrusion (8), obtrusion (9), contamination (10), modesty (11), psychological (12), interrogation (13), psychological-distance (14), social (15), association (16), crowding-isolation (17), public-gaze (18), solitude (19), intimacy (20), anonymity (21), reserve (22). The taxonomy is derived from Rueben et al. (2017). The classifications were manually labeled by an expert.
    • category: a classification label, with possible values including personal-information (0), family (1), health (2), thoughts (3), values (4), acquaintance (5), appointment (6). The privacy category affected in the conversation. The classifications were manually labeled by an expert.
    • affected_speaker: a classification label, with possible values including care-worker (0), care-recipient (1), other (2), both (3). The speaker whose privacy is impacted during the conversation. The classifications were manually labeled by an expert.
    • language: a string feature. Language code as defined by ISO 639.
    • locale: a string feature. Regional code as defined by ISO 3166-1 alpha-2.
    • data_type: a classification label, with possible values including real (0), synthetic (1).
    • uid: an int64 feature. A unique identifier within the dataset.
    • split: a string feature. Either train, validation or test.
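    The integer labels above can be mapped back to names when reading the JSON Lines file. A small sketch (the helper decode_entry is illustrative; the label lists are transcribed from the field descriptions above):

    ```python
    import json

    # Label names copied from the field descriptions; list index = integer label.
    CATEGORIES = ["personal-information", "family", "health", "thoughts",
                  "values", "acquaintance", "appointment"]
    SPEAKERS = ["care-worker", "care-recipient", "other", "both"]

    def decode_entry(line: str) -> dict:
        """Parse one JSONL line and attach human-readable label names."""
        entry = json.loads(line)
        entry["category_name"] = CATEGORIES[entry["category"]]
        entry["affected_speaker_name"] = SPEAKERS[entry["affected_speaker"]]
        return entry

    # The example entry from the dataset card, as a single JSONL line.
    line = ('{"text": "CW: Have you ever been to Italy? CR: Oh, yes... many '
            'years ago.", "taxonomy": 0, "category": 0, "affected_speaker": 1, '
            '"language": "en", "locale": "US", "data_type": 1, "uid": 16, '
            '"split": "train"}')
    entry = decode_entry(line)
    print(entry["category_name"], entry["affected_speaker_name"])
    # → personal-information care-recipient
    ```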

    Dataset Splits

    The dataset has 2 subsets:

    • split: with a total of 95 examples split into train, validation and test (70%-15%-15%)
    • unsplit: with a total of 95 examples in a single train split
    name    | train | validation | test
    split   | 66    | 14         | 15
    unsplit | 95    | n/a        | n/a

    The files follow the naming convention subset-split-language.jsonl. The following files are contained in the dataset:

    • split-train-en.jsonl
    • split-validation-en.jsonl
    • split-test-en.jsonl
    • unsplit-train-en.jsonl

    Dataset Creation

    Curation Rationale

    Recording audio of care workers and residents during care interactions, which include partial and full body washing, giving of medication, and wound care, is a highly privacy-sensitive use case. Therefore, a dataset was created that includes privacy-sensitive parts of conversations synthesized from real-world data. This dataset serves as a basis for fine-tuning a local LLM to highlight and classify privacy-sensitive sections of transcripts created in care interactions, so that those sections can be masked to protect privacy.

    Source Data

    Initial Data Collection

    The initial data was collected in the project Caring Robots of TU Wien in cooperation with Caritas Wien. One project track aims to use Large Language Models (LLMs) to support the documentation work of care workers, with LLM-generated summaries of audio recordings of interactions between care workers and care home residents. The initial data are the transcriptions of those care interactions.

    Data Processing

    The transcriptions were thoroughly reviewed, and sections containing privacy-sensitive information were identified and marked by two experts using qualitative data analysis software. Subsequently, the accessible portions of the interviews were translated from German to US English using the locally executed LLM icky/translate. In the next step, another locally executed LLM, llama3.1:70b, was used to synthesize the conversation segments. This process involved generating similar, yet distinct and new, conversations that are not linked to the original data. The dataset was split using the train_test_split function from scikit-learn (https://scikit-learn.org/1.5/modules/generated/sklearn.model_selection.train_test_split.html).
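    A two-stage call to scikit-learn's train_test_split reproduces a 70%-15%-15% split of 95 examples (a sketch only; the authors' exact arguments, such as random_state or stratification, are not stated in the description):

    ```python
    from sklearn.model_selection import train_test_split

    examples = list(range(95))  # stand-in for the 95 conversation entries

    # First split off ~70% for training, then halve the remainder into
    # validation and test (~15% each of the full set).
    train, rest = train_test_split(examples, test_size=0.30, random_state=0)
    validation, test = train_test_split(rest, test_size=0.50, random_state=0)

    print(len(train), len(validation), len(test))
    # → 66 14 15
    ```

    Because test_size fractions are rounded up to whole examples, the counts come out as 66/14/15, matching the split table above.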

  9. Sensitive data: legal, ethical and secure storage issues

    • figshare.com
    pdf
    Updated Oct 10, 2016
    Cite
    Australian National Data Service; Kate LeMay (2016). Sensitive data: legal, ethical and secure storage issues [Dataset]. http://doi.org/10.6084/m9.figshare.4003485.v2
    Explore at:
    pdfAvailable download formats
    Dataset updated
    Oct 10, 2016
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Australian National Data Service; Kate LeMay
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Slides from the introduction to a panel session at eResearch Australasia (Melbourne, October 2016). Panellists: Kate LeMay (Australian National Data Service), Gabrielle Hirsch (Walter and Eliza Hall Institute of Medical Research), Gordon McGurk (National Health and Medical Research Council) and Jeff Christiansen (Intersect).

    Short abstract: Human medical, health and personal data are a major category of sensitive data. These data need particular care, both during the management of a research project and when planning to publish them. The Australian National Data Service (ANDS) has developed guides around the management and sharing of sensitive data. ANDS is convening this panel to consider legal, ethical and secure storage issues around sensitive data, in the stages of the research life cycle: research conception and planning, commencement of research, data collection and processing, data analysis storage and management, and dissemination of results and data access.

    The legal framework around privacy in Australia is complex and differs between states. Many Acts regulate the collection, use, disclosure and handling of private data. There are also many ethical considerations around the management and sharing of sensitive data. The National Health and Medical Research Council (NHMRC) has developed the Human Research Ethics Application (HREA) as a replacement for the National Ethics Application Form (NEAF). The aim of the HREA is to be a concise streamlined application to facilitate efficient and effective ethics review for research involving humans. The application will assist researchers to consider the ethical principles of the National Statement of Ethical Conduct in Human Research (2007) in relation to their research.

    National security standard guidelines and health and medical research policy drivers underpin the need for a national fit-for-purpose health and medical research data storage facility to store, access and use health and medical research data. med.data.edu.au is an NCRIS-funded facility that underpins the Australian health and medical research sector by providing secure data storage and compute services that adhere to privacy and confidentiality requirements of data custodians who are responsible for human-derived research datasets.

  10. A set of generated Instagram Data Download Packages (DDPs) to investigate...

    • zenodo.org
    • data.niaid.nih.gov
    html, zip
    Updated Jan 28, 2021
    Cite
    Laura Boeschoten; Ruben van den Goorbergh; Daniel Oberski (2021). A set of generated Instagram Data Download Packages (DDPs) to investigate their structure and content [Dataset]. http://doi.org/10.5281/zenodo.4472606
    Explore at:
    zip, htmlAvailable download formats
    Dataset updated
    Jan 28, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Laura Boeschoten; Ruben van den Goorbergh; Daniel Oberski
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Instagram data-download example dataset

    In this repository you can find a data-set consisting of 11 personal Instagram archives, or Data-Download Packages (DDPs).

    How the data was generated

    These Instagram accounts were all new and were generated by a group of researchers interested in examining in detail the structure, and the variety in structure, of Instagram DDPs. The participants used the Instagram accounts extensively for approximately a week. The participants also communicated intensively with each other, so the data can be used as an example of a network.

    The data was primarily generated to evaluate the performance of de-identification software. Therefore, the text in the DDPs contains many randomly chosen (Dutch) first names, phone numbers, e-mail addresses and URLs. In addition, the images in the DDPs contain many faces and text as well. The DDPs contain faces and text (usernames) of third parties. However, only content of so-called 'professional accounts' is shared, such as accounts of famous individuals or institutions who self-consciously and actively seek publicity, and these sources are easily publicly available. Furthermore, the DDPs do not contain sensitive personal data of these individuals.


    Obtaining your Instagram DDP

    After using the Instagram accounts intensively for approximately a week, the participants requested their personal Instagram DDPs by using the following steps. You can follow these steps yourself if you are interested in your personal Instagram DDP.

    1. Go to www.instagram.com and log in
    2. Click on your profile picture, go to *Settings* and *Privacy and Security*
    3. Scroll to *Data download* and click *Request download*
    4. Enter your email address and click *Next*
    5. Enter your password and click *Request download*

    Instagram then delivered the data in a compressed zip folder with the format **username_YYYYMMDD.zip** (i.e., Instagram handle and date of download) to the participant, and the participants shared these DDPs with us.
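    The username_YYYYMMDD.zip convention makes the handle and download date easy to recover programmatically; a small hypothetical parser (parse_ddp_name is our own illustration, not part of the dataset):

    ```python
    import re
    from datetime import date

    def parse_ddp_name(filename: str):
        """Split an Instagram DDP archive name (username_YYYYMMDD.zip)
        into the account handle and the download date."""
        match = re.fullmatch(r"(?P<user>.+)_(?P<stamp>\d{8})\.zip", filename)
        if match is None:
            raise ValueError(f"not a DDP archive name: {filename!r}")
        stamp = match["stamp"]
        when = date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:8]))
        return match["user"], when

    print(parse_ddp_name("example_user_20210128.zip"))
    # → ('example_user', datetime.date(2021, 1, 28))
    ```

    Note that the greedy `.+` keeps underscores inside the handle intact, since the date stamp is always exactly eight digits.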

    Data cleaning

    To comply with the Instagram user agreement, participants shared their full name, phone number and e-mail address. In addition, Instagram logged the IP addresses the participants used during their active period on Instagram. After collecting the DDPs, we manually replaced such information with random replacements, so that the DDPs shared here do not contain any personal data of the participants.

    How this data-set can be used

    This data-set was generated with the intention to evaluate the performance of the de-identification software. We invite other researchers to use this data-set for example to investigate what type of data can be found in Instagram DDPs or to investigate the structure of Instagram DDPs. The packages can also be used for example data-analyses, although no substantive research questions can be answered using this data as the data does not reflect how research subjects behave 'in the wild'.


    Authors

    The data collection is executed by Laura Boeschoten, Ruben van den Goorbergh and Daniel Oberski of Utrecht University. For questions, please contact l.boeschoten@uu.nl.

    Acknowledgments

    The researchers would like to thank everyone who participated in this data-generation project.

  11. Performance of privacy-preserving inference of our cancer prediction model.

    • plos.figshare.com
    xls
    Updated Jun 8, 2023
    Cite
    Yongha Son; Kyoohyung Han; Yong Seok Lee; Jonghan Yu; Young-Hyuck Im; Soo-Yong Shin (2023). Performance of privacy-preserving inference of our cancer prediction model. [Dataset]. http://doi.org/10.1371/journal.pone.0260681.t002
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Yongha Son; Kyoohyung Han; Yong Seok Lee; Jonghan Yu; Young-Hyuck Im; Soo-Yong Shin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Performance of privacy-preserving inference of our cancer prediction model.

  12. Descriptive statistics, ANOVA results, and pairwise t-test results for...

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 5, 2023
    Cite
    Adrian Hoffmann; Julia Meisters; Jochen Musch (2023). Descriptive statistics, ANOVA results, and pairwise t-test results for perceived sensitivity, perceived confidentiality, and subjective ease of faking. [Dataset]. http://doi.org/10.1371/journal.pone.0258603.t003
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Adrian Hoffmann; Julia Meisters; Jochen Musch
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Descriptive statistics, ANOVA results, and pairwise t-test results for perceived sensitivity, perceived confidentiality, and subjective ease of faking.

  13. Benchmarks for one privacy-preserving GRU cell evaluations.

    • plos.figshare.com
    xls
    Updated Jun 8, 2023
    Cite
    Yongha Son; Kyoohyung Han; Yong Seok Lee; Jonghan Yu; Young-Hyuck Im; Soo-Yong Shin (2023). Benchmarks for one privacy-preserving GRU cell evaluations. [Dataset]. http://doi.org/10.1371/journal.pone.0260681.t001
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 8, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Yongha Son; Kyoohyung Han; Yong Seok Lee; Jonghan Yu; Young-Hyuck Im; Soo-Yong Shin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Benchmarks for one privacy-preserving GRU cell evaluations.

  14. Causes of sensitive information loss in global businesses 2023

    • statista.com
    Updated Jun 23, 2025
    Cite
    Statista (2025). Causes of sensitive information loss in global businesses 2023 [Dataset]. https://www.statista.com/statistics/1387393/loss-sensitive-information-organizations-cause-worldwide/
    Explore at:
    Dataset updated
    Jun 23, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jan 2023 - Sep 2023
    Area covered
    Worldwide
    Description

    According to a 2023 survey of Chief Information Security Officers (CISOs) worldwide, ** percent of sensitive data loss at organizations happens because of careless users. A further **** percent of the respondents said compromised systems caused data loss. Additionally, around ** percent of respondents said a malicious employee or contractor was the cause behind their incidents.

  15. pii-masking-200k

    • huggingface.co
    Updated Apr 22, 2024
    Cite
    Ai4Privacy (2024). pii-masking-200k [Dataset]. http://doi.org/10.57967/hf/1532
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 22, 2024
    Dataset authored and provided by
    Ai4Privacy
    Description

    Ai4Privacy Community

    Join our community at https://discord.gg/FmzWshaaQT to help build open datasets for privacy masking.

      Purpose and Features
    

    Previously the world's largest open dataset for privacy masking; it has since been superseded by pii-masking-300k. The purpose of the dataset is to train models to remove personally identifiable information (PII) from text, especially in the context of AI assistants and LLMs. The example texts have 54 PII classes (types of sensitive data), targeting 229 discussion… See the full description on the dataset page: https://huggingface.co/datasets/ai4privacy/pii-masking-200k.

  16. Inline Sensitive Data Redaction Card Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Jul 4, 2025
    Cite
    Growth Market Reports (2025). Inline Sensitive Data Redaction Card Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/inline-sensitive-data-redaction-card-market
    Explore at:
    csv, pdf, pptxAvailable download formats
    Dataset updated
    Jul 4, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Inline Sensitive Data Redaction Card Market Outlook



    According to our latest research, the global Inline Sensitive Data Redaction Card market size reached USD 1.54 billion in 2024, with a robust year-on-year growth driven by the increasing demand for real-time data privacy and regulatory compliance across industries. The market is anticipated to expand at a CAGR of 14.2% from 2025 to 2033, projecting a forecasted market size of USD 4.21 billion by 2033. This remarkable growth trajectory is primarily attributed to the proliferation of digital transformation initiatives, stringent data protection mandates, and the exponential rise in data breaches globally.




    One of the primary growth factors fueling the expansion of the Inline Sensitive Data Redaction Card market is the intensifying regulatory landscape surrounding data privacy. With the implementation of regulations such as the General Data Protection Regulation (GDPR), California Consumer Privacy Act (CCPA), and similar frameworks worldwide, organizations are under immense pressure to ensure that sensitive data—such as personally identifiable information (PII), payment card information, and health records—is effectively protected throughout its lifecycle. Inline redaction cards, both hardware and software-based, offer automated, real-time data masking and redaction capabilities that help enterprises comply with these regulations, thereby minimizing the risk of hefty fines and reputational damage. As regulatory scrutiny continues to escalate, the demand for robust redaction solutions is expected to remain strong, propelling market growth.




    Another significant driver is the accelerated digital transformation across various sectors, leading to an unprecedented surge in data generation and exchange. Industries such as financial services, healthcare, government, and retail are increasingly reliant on digital platforms to deliver seamless customer experiences, streamline operations, and enable remote work. However, this digital shift also exposes organizations to greater cyber risks, including data leaks and unauthorized access. Inline Sensitive Data Redaction Cards provide a critical layer of security by ensuring that sensitive information is automatically identified and redacted before it is stored, processed, or transmitted. This capability is particularly vital in environments where large volumes of data traverse multiple endpoints and networks, making manual redaction impractical and error-prone. The integration of AI and machine learning into these solutions further enhances their efficiency, accuracy, and adaptability, making them indispensable for modern enterprises.




    The proliferation of cloud computing and hybrid IT environments is also playing a pivotal role in shaping the Inline Sensitive Data Redaction Card market. As organizations migrate their workloads to the cloud and adopt SaaS applications, the need for data-centric security measures that operate seamlessly across on-premises and cloud infrastructures becomes paramount. Inline redaction solutions are evolving to support diverse deployment models, enabling businesses to maintain consistent data protection policies regardless of where their data resides. This flexibility not only supports compliance and risk management objectives but also empowers organizations to innovate without compromising security. Furthermore, the growing awareness of the business value of data privacy—such as enhanced customer trust and competitive differentiation—is encouraging more enterprises to invest in advanced redaction technologies.




    From a regional perspective, North America continues to dominate the Inline Sensitive Data Redaction Card market, accounting for the largest revenue share in 2024. The region’s leadership is underpinned by the presence of major technology vendors, early adoption of advanced cybersecurity solutions, and a highly regulated business environment. Europe follows closely, driven by stringent data privacy laws and a strong focus on digital sovereignty. Meanwhile, the Asia Pacific region is emerging as the fastest-growing market, fueled by rapid digitalization, increasing cyber threats, and evolving regulatory frameworks in countries such as China, India, and Japan. Latin America and the Middle East & Africa are also witnessing steady growth, albeit from a smaller base, as organizations in these regions ramp up their investments in data protection and compliance solutions.





  17. Processing of personal data declared to the CNIL since 25 May 2018

    • gimi9.com
    Updated Dec 16, 2024
    + more versions
    Cite
    (2024). Processing of personal data declared to the CNIL since 25 May 2018 | gimi9.com [Dataset]. https://gimi9.com/dataset/eu_5ef476d329a15f93d8a66bd1
    Explore at:
    Dataset updated
    Dec 16, 2024
    Description

    Since the entry into force of the General Data Protection Regulation (GDPR) on 25 May 2018, only digital processing of the most sensitive personal data must be subject to prior formalities with the CNIL. These formalities may take the form of simplified declarations (declarations of conformity with a reference framework proposed by the CNIL), requests for an opinion (for the sovereign activities of the State), or applications for authorisation (in the field of health). To find out more: cnil.fr. In accordance with the amended Data Protection Act (Article 36), the CNIL keeps the list of these formalities available to the public in an open and easily reusable format, known as the "Article 36 list".

    Warnings:
    1/ The published data are the result of the prior formalities completed since 25 May 2018 by the controllers of personal data processing at the CNIL, via its dedicated teleservices. The CNIL cannot be held responsible for their content.
    2/ Processing carried out on behalf of the State may not appear in the dataset, the formalities having been completed in the form of requests for an opinion on a draft regulatory act (decree or order) not submitted via the aforementioned teleservices. Information on these processing operations is available on Legifrance, the CNIL's opinion being published with the act authorising the processing (to access the deliberations of the CNIL: https://www.legifrance.gouv.fr/initRechExpCnil.do). In addition, some important processing operations are covered by fact sheets on the CNIL website.
    3/ Processing operations exceptionally exempted from publication of the regulatory act authorising them (decree or order) are not included in the published dataset, in accordance with Article 36 of the amended Data Protection Act. The processing operations referred to in Article 30 I and II may be exempted, by decree in the Council of State, from publication of the regulatory act authorising them. These processing operations are listed in Decree No. 2007-914 of 15 May 2007.

  18. Data De-identification Software Market Report | Global Forecast From 2025 To...

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Data De-identification Software Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-data-de-identification-software-market
    Explore at:
    Available download formats: csv, pdf, pptx
    Dataset updated
    Jan 7, 2025
    Dataset provided by
    Authors
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data De-identification Software Market Outlook



    The global data de-identification software market size was valued at approximately USD 500 million in 2023 and is projected to reach around USD 1.5 billion by 2032, growing at a CAGR of 13.5% during the forecast period. The growth in this market is driven by the increasing need for data privacy and compliance with stringent regulatory requirements across various industries.



    The primary growth factor for the data de-identification software market is the rising awareness and concern regarding data privacy and security. With the advent of big data and the proliferation of digital services, organizations are increasingly recognizing the importance of protecting personal and sensitive information. Data breaches and cyber-attacks have led to significant financial and reputational damages, prompting businesses to invest in advanced data de-identification solutions to mitigate risks. Moreover, regulatory frameworks such as GDPR in Europe, CCPA in California, and HIPAA in the United States mandate strict compliance measures for data privacy, further propelling the demand for these software solutions.



    Another significant driver is the growing adoption of cloud-based services and data analytics. As organizations migrate their data to cloud platforms, the need for robust data protection mechanisms becomes paramount. De-identification software enables companies to anonymize sensitive information before storing it in the cloud, ensuring compliance with data protection regulations and reducing the risk of exposure. Additionally, the rise of data analytics for business intelligence and decision-making necessitates the use of de-identified data to maintain privacy while extracting valuable insights.
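    One common way to anonymize identifiers before data reaches the cloud, as described above, is keyed hashing. The following is a minimal sketch under stated assumptions: the `patient_id` field, the key handling, and the token length are all illustrative, not any specific product's method.

```python
import hmac
import hashlib

# Illustrative secret: in practice this would live in a key vault
# and be rotated, never hard-coded.
SECRET_KEY = b"rotate-and-store-me-in-a-vault"

def pseudonymize(value: str) -> str:
    """Keyed HMAC pseudonym: repeatable for joins, but not
    reversible without the secret key."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]  # shortened token for readability

record = {"patient_id": "P-10042", "diagnosis": "J45.909"}
safe = {**record, "patient_id": pseudonymize(record["patient_id"])}

# The same input always yields the same token, so de-identified
# tables can still be joined on patient_id.
assert pseudonymize("P-10042") == safe["patient_id"]
```

    Because the mapping is deterministic per key, analysts can still link records across datasets without ever seeing the raw identifier.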



    The healthcare sector is particularly noteworthy for its substantial contribution to the market growth. The industry deals with large volumes of sensitive patient information that must be protected from unauthorized access. Data de-identification software plays a crucial role in enabling healthcare providers to share and analyze patient data for research and treatment purposes without compromising privacy. The COVID-19 pandemic has further accelerated the adoption of digital health solutions, increasing the demand for data de-identification tools to ensure compliance with privacy regulations and maintain patient trust.



    Data Masking Technology is becoming increasingly vital as organizations strive to protect sensitive information while maintaining data utility. This technology allows businesses to create a realistic but fictional version of their data, ensuring that sensitive information is not exposed during processes such as software testing, development, and analytics. By substituting sensitive data with anonymized values, data masking technology helps organizations comply with data protection regulations without hindering their operational efficiency. As data privacy concerns continue to rise, the adoption of data masking technology is expected to grow, offering a robust solution for safeguarding sensitive information across various sectors.
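    As a rough illustration of the substitution idea above, the sketch below masks a phone number by randomizing its digits while preserving the original formatting. The function name, the seeding, and the test-data scenario are assumptions for this example.

```python
import random

def mask_phone(phone: str, rng: random.Random) -> str:
    """Replace every digit with a random one, keeping formatting
    characters (parentheses, spaces, hyphens) in place, so the
    masked value stays realistic for testing and development."""
    return "".join(
        str(rng.randint(0, 9)) if c.isdigit() else c
        for c in phone
    )

rng = random.Random(42)  # seeded so test fixtures are reproducible
masked = mask_phone("(415) 555-0173", rng)
print(masked)  # same shape as the input, different digits
```

    Shape preservation is the point of masking as opposed to redaction: downstream code that validates phone-number formats keeps working, but no real number is exposed.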



    Regionally, North America holds a significant share of the data de-identification software market, driven by the presence of key market players, stringent regulatory requirements, and a high level of digitalization across industries. The Asia Pacific region is expected to witness the fastest growth during the forecast period, attributed to the rapid adoption of digital technologies, increasing awareness of data privacy, and evolving regulatory landscape in countries like China, Japan, and India. Europe also plays a vital role due to the stringent data protection regulations enforced by the GDPR, which mandates rigorous data de-identification practices.



    Component Analysis



    By component, the data de-identification software market is segmented into software and services. The software segment is anticipated to dominate the market, driven by the increasing demand for advanced de-identification tools that can handle large volumes of data efficiently. Organizations are investing in sophisticated software solutions that offer automated and customizable de-identification processes to meet specific compliance requirements. These software solutions often come with features like encryption, tokenization, and data masking, enhancing their appeal to businesses across different sectors.




  19. Data De-identification & Pseudonymity Software Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jul 17, 2025
    + more versions
    Cite
    Data Insights Market (2025). Data De-identification & Pseudonymity Software Report [Dataset]. https://www.datainsightsmarket.com/reports/data-de-identification-pseudonymity-software-1433473
    Explore at:
    Available download formats: pdf, doc, ppt
    Dataset updated
    Jul 17, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Data De-identification & Pseudonymization Software market is experiencing robust growth, driven by increasing concerns around data privacy regulations like GDPR and CCPA, and the rising need to protect sensitive personal information. The market, estimated at $2 billion in 2025, is projected to expand significantly over the forecast period (2025-2033), fueled by a Compound Annual Growth Rate (CAGR) of approximately 15%. This growth is propelled by several factors, including the adoption of cloud-based solutions, advancements in artificial intelligence (AI) and machine learning (ML) for data anonymization, and the growing demand for data-driven insights while maintaining regulatory compliance.

    Key market segments include healthcare, finance, and government, which are heavily regulated and consequently require robust data anonymization strategies. The competitive landscape is dynamic, with a mix of established players like IBM and Informatica alongside innovative startups like Aircloak and Privitar. The market is witnessing a shift towards more sophisticated techniques like differential privacy and homomorphic encryption, enabling data analysis without compromising individual privacy.

    The adoption of data de-identification and pseudonymization is expected to accelerate in the coming years, particularly within organizations handling large volumes of personal data. This increase will be influenced by stricter enforcement of privacy regulations, coupled with the expanding application of advanced analytics techniques. While challenges remain, such as the complexity of implementing these solutions and the potential for re-identification vulnerabilities, ongoing technological advancements and increasing awareness are mitigating these risks. Further growth will depend on the development of more user-friendly and cost-effective solutions catering to diverse organizational needs, along with better education and training on best practices in data protection. The market's expansion presents significant opportunities for vendors to develop and market innovative solutions, strengthening their competitive positioning within this rapidly evolving landscape.
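    Of the techniques named above, differential privacy is the most formally specified. The toy sketch below shows its Laplace mechanism applied to a count query; the parameter choices and function names are illustrative, not drawn from any product in this report.

```python
import random

def laplace_noise(scale: float, rng: random.Random) -> float:
    # The difference of two i.i.d. exponential draws is
    # Laplace-distributed with the given scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def dp_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    # A counting query has sensitivity 1: adding or removing one
    # individual changes the result by at most 1, so the noise
    # scale is sensitivity / epsilon = 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)

rng = random.Random(7)  # seeded only to make the sketch reproducible
noisy = dp_count(1000, 0.5, rng)  # smaller epsilon => more noise
```

    The appeal for the vendors discussed here is that, unlike masking or pseudonymization, the privacy guarantee is quantified by epsilon rather than depending on what an attacker might already know.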

  20. Number of data compromises and impacted individuals in U.S. 2005-2024

    • statista.com
    Updated Jul 14, 2025
    Cite
    Statista (2025). Number of data compromises and impacted individuals in U.S. 2005-2024 [Dataset]. https://www.statista.com/statistics/273550/data-breaches-recorded-in-the-united-states-by-number-of-breaches-and-records-exposed/
    Explore at:
    Dataset updated
    Jul 14, 2025
    Dataset authored and provided by
    Statistahttp://statista.com/
    Area covered
    United States
    Description

    In 2024, the number of data compromises in the United States stood at 3,158 cases. Meanwhile, over 1.35 billion individuals were affected in the same year by data compromises, including data breaches, leakage, and exposure. While these are three different events, they have one thing in common: in all three, sensitive data is accessed by an unauthorized threat actor.

    Industries most vulnerable to data breaches
    Some industry sectors typically see more significant cases of private data violations than others, determined by the type and volume of personal information that organizations in those sectors store. In 2024, financial services, healthcare, and professional services were the three industry sectors that recorded the most data breaches. Overall, the number of data breaches in some U.S. industry sectors has gradually increased within the past few years, while other sectors saw a decrease.

    Largest data exposures worldwide
    In 2020, an adult streaming website, CAM4, experienced a leakage of nearly 11 billion records, by far the most extensive reported data leakage. This case, though, is unique because cyber security researchers found the vulnerability before the cyber criminals did. The second-largest data breach is the Yahoo data breach, dating back to 2013. The company first reported about one billion exposed records, then later, in 2017, revised the figure of leaked records to three billion. In March 2018, the third-biggest data breach happened, involving India's national identification database Aadhaar. As a result of this incident, over 1.1 billion records were exposed.
