33 datasets found
  1. SDNist v2: Deidentified Data Report Tool

    • nist.gov
    • data.nist.gov
    Updated Mar 13, 2023
    National Institute of Standards and Technology (2023). SDNist v2: Deidentified Data Report Tool [Dataset]. http://doi.org/10.18434/mds2-2943
    Dataset updated
    Mar 13, 2023
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    License

    https://www.nist.gov/open/license

    Description

    SDNist v2 is a Python package that provides benchmark data and evaluation metrics for deidentified data generators. This version of SDNist supports using the NIST Diverse Communities Data Excerpts, a geographically partitioned, limited feature data set. The deidentified data report evaluates utility and privacy of a given deidentified dataset and generates a summary quality report with performance of a deidentified dataset enumerated and illustrated for each utility and privacy metric.

  2. Sandbox Data Generator Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Dataintelo (2025). Sandbox Data Generator Market Research Report 2033 [Dataset]. https://dataintelo.com/report/sandbox-data-generator-market
    Available download formats: pdf, pptx, csv
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Sandbox Data Generator Market Outlook




    According to our latest research, the global Sandbox Data Generator market size reached USD 1.41 billion in 2024 and is projected to grow at a robust CAGR of 11.2% from 2025 to 2033, attaining a value of USD 3.71 billion by the end of the forecast period. This remarkable growth is primarily driven by the increasing demand for secure, reliable, and scalable test data generation solutions across industries such as BFSI, healthcare, and IT and telecommunications, as organizations strive to enhance their data privacy and compliance capabilities in an era of heightened regulatory scrutiny and digital transformation.
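
    The projection above follows the usual compound annual growth rate relation, end = start x (1 + CAGR)^n. The sketch below is one way a reader might sanity-check such figures. It assumes nine compounding periods between the 2024 base year and 2033, which the report does not state explicitly, and the function names are illustrative rather than taken from the report.

```python
def project_value(start: float, cagr: float, years: int) -> float:
    """Compound `start` forward by `cagr` (a fraction, e.g. 0.112) for `years` periods."""
    return start * (1.0 + cagr) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Return the constant annual growth rate that takes `start` to `end` in `years` periods."""
    return (end / start) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    start_2024 = 1.41    # USD billion, as quoted in the description
    stated_cagr = 0.112  # 11.2%, as quoted in the description
    periods = 9          # assumption: 2024 -> 2033 counted as nine periods

    # Value implied by the stated rate, and rate implied by the stated end value.
    print(round(project_value(start_2024, stated_cagr, periods), 2))
    print(round(implied_cagr(start_2024, 3.71, periods), 4))
```

    A different period count (for example eight instead of nine) shifts the result, so small gaps between a computed value and a published one may simply reflect how the publisher counts the forecast window.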




    A major growth factor propelling the Sandbox Data Generator market is the intensifying focus on data privacy and regulatory compliance across global enterprises. With stringent regulations such as GDPR, CCPA, and HIPAA becoming the norm, organizations are under immense pressure to ensure that non-production environments do not expose sensitive information. Sandbox data generators, which enable the creation of realistic yet anonymized or masked data sets for testing and development, are increasingly being adopted to address these compliance challenges. Furthermore, the rise of DevOps and agile methodologies has led to a surge in demand for efficient test data management, as businesses seek to accelerate software development cycles without compromising on data security. The integration of advanced data masking, subsetting, and anonymization features within sandbox data generation platforms is therefore a critical enabler for organizations aiming to achieve both rapid innovation and regulatory adherence.




    Another significant driver for the Sandbox Data Generator market is the exponential growth of digital transformation initiatives across various industry verticals. As enterprises migrate to cloud-based infrastructures and adopt advanced technologies such as AI, machine learning, and big data analytics, the need for high-quality, production-like test data has never been more acute. Sandbox data generators play a pivotal role in supporting these digital initiatives by supplying synthetic yet realistic datasets that facilitate robust testing, model training, and system validation. This, in turn, helps organizations minimize the risks associated with deploying new applications or features, while reducing the time and costs associated with traditional data provisioning methods. The rise of microservices architecture and API-driven development further amplifies the necessity for dynamic, scalable, and automated test data generation solutions.




    Additionally, the proliferation of data breaches and cyber threats has underscored the importance of robust data protection strategies, further fueling the adoption of sandbox data generators. Enterprises are increasingly recognizing that using real production data in test environments can expose them to significant security vulnerabilities and compliance risks. By leveraging sandbox data generators, organizations can create safe, de-identified datasets that maintain the statistical properties of real data, enabling comprehensive testing without jeopardizing sensitive information. This trend is particularly pronounced in sectors such as BFSI and healthcare, where data sensitivity and compliance requirements are paramount. As a result, vendors are investing heavily in enhancing the security, scalability, and automation capabilities of their sandbox data generation solutions to cater to the evolving needs of these high-stakes industries.




    From a regional perspective, North America is anticipated to maintain its dominance in the global Sandbox Data Generator market, driven by the presence of leading technology providers, a mature regulatory landscape, and high digital adoption rates among enterprises. However, the Asia Pacific region is poised for the fastest growth, fueled by rapid digitalization, increasing investments in IT infrastructure, and growing awareness of data privacy and compliance issues. Europe also represents a significant market, supported by stringent data protection regulations and a strong focus on innovation across key industries. As organizations worldwide continue to prioritize data security and agile development, the demand for advanced sandbox data generation solutions is expected to witness sustained growth across all major regions.



    Component Analysis




    The Sandbox Data Genera

  3. Test Data Generation as a Service Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Oct 4, 2025
    Growth Market Reports (2025). Test Data Generation as a Service Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/test-data-generation-as-a-service-market
    Available download formats: pptx, csv, pdf
    Dataset updated
    Oct 4, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Test Data Generation as a Service Market Outlook



    According to our latest research, the global Test Data Generation as a Service market size reached USD 1.36 billion in 2024, reflecting a dynamic surge in demand for efficient and scalable test data solutions. The market is expected to expand at a robust CAGR of 18.1% from 2025 to 2033, reaching a projected value of USD 5.41 billion by the end of the forecast period. This remarkable growth is primarily driven by the accelerated adoption of digital transformation initiatives, increasing complexity in software development, and the critical need for secure and compliant data management practices across industries.




    One of the primary growth factors for the Test Data Generation as a Service market is the rapid digitalization of enterprises across diverse verticals. As organizations intensify their focus on delivering high-quality software products and services, the need for realistic, secure, and diverse test data has become paramount. Modern software development methodologies, such as Agile and DevOps, necessitate continuous testing cycles that depend on readily available and reliable test data. This demand is further amplified by the proliferation of cloud-native applications, microservices architectures, and the integration of artificial intelligence and machine learning in business processes. Consequently, enterprises are increasingly turning to Test Data Generation as a Service solutions to streamline their testing workflows, reduce manual effort, and accelerate time-to-market for their digital offerings.




    Another significant driver propelling the market is the stringent regulatory landscape governing data privacy and security. With regulations such as GDPR, HIPAA, and CCPA becoming more prevalent, organizations face immense pressure to ensure that sensitive information is not exposed during software testing. Test Data Generation as a Service providers offer advanced data masking and anonymization capabilities, enabling enterprises to generate synthetic or de-identified data sets that comply with regulatory requirements. This not only mitigates the risk of data breaches but also fosters a culture of compliance and trust among stakeholders. Furthermore, the increasing frequency of cyber threats and data breaches has heightened the emphasis on robust security testing, further boosting the adoption of these services across sectors like BFSI, healthcare, and government.




    The growing complexity of IT environments and the need for seamless integration across legacy and modern systems also contribute to the expansion of the Test Data Generation as a Service market. Enterprises are grappling with heterogeneous application landscapes, comprising on-premises, cloud, and hybrid deployments. Test Data Generation as a Service solutions offer the flexibility to generate and provision data across these environments, ensuring consistent and reliable testing outcomes. Additionally, the scalability of cloud-based offerings allows organizations to handle large volumes of test data without significant infrastructure investments, making these solutions particularly attractive for small and medium enterprises (SMEs) seeking cost-effective testing alternatives.




    From a regional perspective, North America continues to dominate the Test Data Generation as a Service market, accounting for the largest share in 2024, followed closely by Europe and Asia Pacific. The region's leadership is attributed to the presence of major technology providers, early adoption of advanced software testing practices, and a mature regulatory environment. However, Asia Pacific is poised to exhibit the highest CAGR during the forecast period, driven by the rapid expansion of the IT and telecommunications sector, increasing digital initiatives by governments, and a burgeoning startup ecosystem. Latin America and the Middle East & Africa are also witnessing steady growth, supported by rising investments in digital infrastructure and heightened awareness about data security and compliance.





    Component An

  4. Synthetic Medical Dataset

    • kaggle.com
    zip
    Updated Oct 27, 2025
    Amirhossein Jafarnezhad (2025). Synthetic Medical Dataset [Dataset]. https://www.kaggle.com/datasets/amirjdai/synthetic-medical-dataset
    Available download formats: zip (3699946 bytes)
    Dataset updated
    Oct 27, 2025
    Authors
    Amirhossein Jafarnezhad
    Description

    This dataset contains the core data to be used in projects for the textbook Introduction to Biomedical Data Science edited by Robert Hoyt MD FACP ABPM-CI, and Robert Muenchen MS PSAT (2019).

    Data was generated using Synthea, a synthetic patient generator that models the medical history of synthetic patients. Their mission is to output high-quality synthetic, realistic but not real, patient data and associated health records covering every aspect of healthcare. The resulting data is free from cost, privacy, and security restrictions, enabling research with Health IT data that is otherwise legally or practically unavailable. De-identified real data still presents a challenge in the medical field because there are people who excel at re-identification of these data. For that reason the average medical center, etc. will not share their patient data. Most governmental data is at the hospital level. NHANES data is an exception.

    You can read Synthea's first academic paper here.

    284 scholarly articles cite this dataset (View in Google Scholar)

    Authors: Brenda Griffith

  5. NIST Excerpts Benchmark Data

    • nist.gov
    • data.nist.gov
    Updated Jan 31, 2025
    National Institute of Standards and Technology (2025). NIST Excerpts Benchmark Data [Dataset]. http://doi.org/10.18434/mds2-2895
    Dataset updated
    Jan 31, 2025
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    License

    https://www.nist.gov/open/license

    Description

    The NIST Excerpts Benchmark Data are a set of target data for deidentification algorithms. The data are configured to work with "SDNist: Synthetic Data Report Tool", a package for evaluating synthetic data generators: https://github.com/usnistgov/SDNist. An installation of SDNist will download the data resources automatically.

    Jan 2025 -- Benchmark Excerpts:
    - NIST American Community Survey (ACS) Data Excerpts, 24 demographic features over 40k records
    - NIST Survey of Business Owners (SBO) Data Excerpts, 130 demographic and financial features over 161k records

    The data are curated subsets of U.S. Census Bureau products.

  6. Synthetic Medical Image Data Services Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 4, 2025
    Growth Market Reports (2025). Synthetic Medical Image Data Services Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/synthetic-medical-image-data-services-market
    Available download formats: pdf, pptx, csv
    Dataset updated
    Aug 4, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Synthetic Medical Image Data Services Market Outlook



    According to our latest research, the global synthetic medical image data services market size stood at USD 452 million in 2024, reflecting robust adoption across healthcare and life sciences sectors. The market is expected to grow at a remarkable CAGR of 33.7% from 2025 to 2033, reaching a projected value of USD 5.4 billion by 2033. This exponential growth is primarily driven by the escalating demand for high-quality, diverse, and annotated medical imaging datasets to power artificial intelligence (AI) and machine learning (ML) algorithms for diagnostics, research, and training purposes. As per our comprehensive analysis, the rapid integration of synthetic data solutions is revolutionizing medical imaging workflows, enabling healthcare stakeholders to overcome data scarcity and privacy concerns while accelerating innovation.




    The synthetic medical image data services market is experiencing significant growth due to the increasing need for large, annotated datasets to train and validate AI-driven diagnostic tools. Traditional approaches to medical image acquisition are often hampered by regulatory restrictions, data privacy concerns, and the inherent variability and scarcity of rare disease cases. Synthetic data generation addresses these challenges by creating realistic, customizable, and privacy-compliant datasets that enhance the performance and generalizability of AI models. Furthermore, the adoption of synthetic data accelerates the development cycle for new imaging technologies and supports the validation of medical devices, fostering a more agile and innovative healthcare ecosystem. The growing sophistication of generative adversarial networks (GANs) and other deep learning techniques has further improved the realism and utility of synthetic images, making them increasingly indispensable for modern medical imaging applications.




    Another key growth factor for the synthetic medical image data services market is the rising emphasis on data privacy and compliance with regulations such as HIPAA in the United States and GDPR in Europe. These regulations impose stringent requirements on the use and sharing of patient data, often limiting the availability of real-world medical images for research and commercial purposes. Synthetic data offers a compelling solution by generating de-identified datasets that closely mimic real patient data without exposing sensitive information. This not only facilitates collaborative research and cross-institutional projects but also enables companies to scale their AI development efforts globally without the risk of data breaches or legal repercussions. As the healthcare industry continues to prioritize patient confidentiality, the demand for synthetic data services is expected to surge.




    The market is further propelled by the expanding applications of synthetic medical image data in education, training, and research. Medical professionals, students, and researchers increasingly rely on diverse and complex datasets to hone their diagnostic skills, test new hypotheses, and develop innovative imaging solutions. Synthetic data bridges the gap where real-world datasets are insufficient or unavailable, providing a cost-effective and scalable alternative for simulation-based training and validation. This capability is especially valuable in regions with limited access to advanced imaging resources or rare clinical cases. As academic and research institutions intensify their focus on AI and machine learning in healthcare, synthetic data services are poised to become a cornerstone of medical education and innovation.




    From a regional perspective, North America currently leads the synthetic medical image data services market, accounting for the largest share due to its advanced healthcare infrastructure, strong presence of AI technology providers, and supportive regulatory environment. Europe follows closely, driven by robust investments in digital health and a proactive stance on data privacy. The Asia Pacific region is emerging as a high-growth market, fueled by rapid digital transformation, increasing healthcare expenditure, and a burgeoning ecosystem of AI startups. Latin America and the Middle East & Africa, while still nascent, are expected to witness accelerated adoption as healthcare modernization initiatives gain momentum. Overall, the global market landscape is characterized by dynamic growth opportunities, with both developed and emerging regions contributing to the expansion of synthetic medical image da

  7. Niger Year 2 Deidentified Data (2017-2018)

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Mar 3, 2023
    John Lawrence Aber; Lindsay Brown; Carly Tubbs Dolan; Ha Yeon Kim; Jeannie Annan (2023). Niger Year 2 Deidentified Data (2017-2018) [Dataset]. http://doi.org/10.7910/DVN/MX4JYG
    Dataset updated
    Mar 3, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    John Lawrence Aber; Lindsay Brown; Carly Tubbs Dolan; Ha Yeon Kim; Jeannie Annan
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Time period covered
    2017 - 2018
    Area covered
    Niger
    Dataset funded by
    NYU Abu Dhabi Research Institute
    Economic and Social Research Council (ESRC) - UK Department for International Development (DFID: now UK Foreign & Commonwealth Office)
    Dubai Cares
    Description

    To generate the evidence needed to understand, improve and share what works to help refugee children learn and succeed in school, the International Rescue Committee (IRC) and NYU Global TIES for Children (TIES/NYU) established a strategic partnership, the Evidence for Action: Education in Emergencies (3EA) initiative. 3EA in Niger was designed and delivered to help strengthen the public education system in Niger and to serve refugee, IDP and host community children in the hard-hit Diffa region. It strove to achieve this through a remedial tutoring program infused with climate-targeted social-emotional learning (SEL) principles and practices (Tutoring in a Healing Classrooms - HCT), and by adding skill-targeted SEL interventions (Mindfulness activities, Brain Games). Each year, the program was designed to be implemented with approximately 2000 students in second to fourth grades attending 28 Nigerien public schools across Diffa. Ninety tutors were enlisted per year to serve this group, with each tutoring class averaging about 20 students. A series of cluster randomized control trials was conducted over the course of two years to evaluate the effectiveness of the HCT and skill-targeted SEL programming. This dataset does not contain treatment indicators. Please contact the authors for access if interested in using those variables.

  8. PATRON Primary Care Research Data Repository

    • figshare.unimelb.edu.au
    pdf
    Updated May 30, 2023
    DOUGLAS BOYLE; LENA SANCI; Jon Emery; JANE GUNN; JANE HOCKING; JO-ANNE MANSKI-NANKERVIS; RACHEL CANAWAY (2023). PATRON Primary Care Research Data Repository [Dataset]. http://doi.org/10.26188/5c52934b4aeb0
    Available download formats: pdf
    Dataset updated
    May 30, 2023
    Dataset provided by
    The University of Melbourne
    Authors
    DOUGLAS BOYLE; LENA SANCI; Jon Emery; JANE GUNN; JANE HOCKING; JO-ANNE MANSKI-NANKERVIS; RACHEL CANAWAY
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    PATRON is a human ethics approved program of research incorporating an enduring de-identified repository of Primary Care data facilitating research and knowledge generation. PATRON is a part of the 'Data for Decisions' initiative of the Department of General Practice, University of Melbourne. 'Data for Decisions' is a research initiative in partnership with general practices. It is an exciting undertaking that makes possible primary care research projects to increase knowledge and improve healthcare practices and policy.

    Principal Researcher: Jon Emery
    Data Custodian: Lena Sanci
    Data Steward: Douglas Boyle
    Manager: Rachel Canaway

    More information about Data for Decisions and utilising PATRON data is available from the Data for Decisions website.

  9. Data from: Evaluating culture-free targeted next-generation sequencing for...

    • datadryad.org
    • search.dataone.org
    zip
    Updated Mar 25, 2025
    Rebecca Colman; Rebecca E Colman; Marva Seifert; Andres De la Rossa; Sophia B Georghiou; Christine Hoogland; Swapna Uplekar; Sacha Laurent; Camilla Rodrigues; Priti Kambli; Nestani Tukvadze; Nino Maghradze; Shaheed V Omar; Lavania Joseph; Anita Suresh; Timothy C Rodwell (2025). Evaluating culture-free targeted next-generation sequencing for diagnosing drug-resistant tuberculosis: A multicentre clinical study of two end-to-end commercial workflows [Dataset]. http://doi.org/10.5061/dryad.qjq2bvqs6
    Available download formats: zip
    Dataset updated
    Mar 25, 2025
    Dataset provided by
    Dryad
    Authors
    Rebecca Colman; Rebecca E Colman; Marva Seifert; Andres De la Rossa; Sophia B Georghiou; Christine Hoogland; Swapna Uplekar; Sacha Laurent; Camilla Rodrigues; Priti Kambli; Nestani Tukvadze; Nino Maghradze; Shaheed V Omar; Lavania Joseph; Anita Suresh; Timothy C Rodwell
    Time period covered
    Mar 5, 2025
    Description

    Participants meeting eligibility criteria were asked to provide at least 6 mL sputum either in one or two samples collected on day 1 and day 2. Samples were homogenized, decontaminated, re-suspended in 4mL final volume for all downstream testing. MTB/RIF, acid-fast bacilli (AFB) smear, Hain MTBDRplus and MTBDRsl, Mycobacteria Growth Indicator Tube (MGIT) and Löwenstein–Jensen medium (LJ) culture were performed on the sediment for standard of care testing. MGIT pDST was performed for all culture-positive samples for RIF, INH, FQ (MFX, LFX), PZA, AMK, CAP, KAN, BDQ, LZD, CLF, STR, and EMB at WHO endorsed critical concentrations.

  10. ⚙️ SQL Tutorial Exercise Data

    • kaggle.com
    zip
    Updated Oct 2, 2023
    mexwell (2023). ⚙️ SQL Tutorial Exercise Data [Dataset]. https://www.kaggle.com/datasets/mexwell/sql-tutorial-exercise-data
    Available download formats: zip (3701453 bytes)
    Dataset updated
    Oct 2, 2023
    Authors
    mexwell
    Description

    This dataset was created to be the base of the data.world SQL tutorial exercises. Data was generated using Synthea, a synthetic patient generator that models the medical history of synthetic patients. Their mission is to output high-quality synthetic, realistic but not real, patient data and associated health records covering every aspect of healthcare. The resulting data is free from cost, privacy, and security restrictions, enabling research with Health IT data that is otherwise legally or practically unavailable. De-identified real data still presents a challenge in the medical field because there are people who excel at re-identification of these data. For that reason the average medical center, etc. will not share their patient data. Most governmental data is at the hospital level. NHANES data is an exception.

    You can read Synthea's first academic paper here.

    Original Data

    Acknowledgement

    Photo by Rubaitul Azad on Unsplash

  11. Lebanon Year 1 Deidentified Data (2016-2017)

    • dataverse.harvard.edu
    Updated Mar 3, 2023
    John Lawrence Aber; Lindsay Brown; Carly Tubbs Dolan; Ha Yeon Kim; Jeannie Annan (2023). Lebanon Year 1 Deidentified Data (2016-2017) [Dataset]. http://doi.org/10.7910/DVN/97Q2B8
    Dataset updated
    Mar 3, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    John Lawrence Aber; Lindsay Brown; Carly Tubbs Dolan; Ha Yeon Kim; Jeannie Annan
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Time period covered
    2016 - 2017
    Area covered
    Lebanon
    Dataset funded by
    Spencer Foundation
    Dubai Cares
    NYU Abu Dhabi Research Institute
    Porticus Foundation
    Description

    To generate the evidence needed to understand, improve and share what works to help refugee children learn and succeed in school, the International Rescue Committee (IRC) and NYU Global TIES for Children (TIES/NYU) established a strategic partnership, the Evidence for Action: Education in Emergencies (3EA) initiative. In Lebanon, this program was designed and delivered to complement the Lebanese public education system and enhance learning and retention of Syrian refugee children through remedial tutoring programs infused with climate-targeted social-emotional learning (SEL) principles and practices (Tutoring in a Healing Classrooms - HCT) and skill-targeted SEL interventions (Mindfulness activities, Brain Games, 5-Component SEL Curriculum). An estimated 5000 Syrian refugee children enrolled in Lebanese public schools and teachers working with them participated in the program. These students attended 2.5-hour-long tutoring sessions per day, three times a week, with each session consisting of three lessons (Arabic, French/English, mathematics) and each lesson lasting between 30 and 40 minutes. A series of cluster randomized control trials was conducted over the course of two years to evaluate the effectiveness of the HCT and skill-targeted SEL programming. This dataset does not contain treatment indicators. Please contact the authors for access if interested in using those variables.

  12. Data Sheet 1_Synthetic4Health: generating annotated synthetic clinical...

    • frontiersin.figshare.com
    pdf
    Updated May 30, 2025
    Libo Ren; Samuel Belkadi; Lifeng Han; Warren Del-Pinto; Goran Nenadic (2025). Data Sheet 1_Synthetic4Health: generating annotated synthetic clinical letters.pdf [Dataset]. http://doi.org/10.3389/fdgth.2025.1497130.s001
    Available download formats: pdf
    Dataset updated
    May 30, 2025
    Dataset provided by
    Frontiers
    Authors
    Libo Ren; Samuel Belkadi; Lifeng Han; Warren Del-Pinto; Goran Nenadic
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Clinical letters contain sensitive information, limiting their use in model training, medical research, and education. This study aims to generate reliable, diverse, and de-identified synthetic clinical letters to support these tasks. We investigated multiple pre-trained language models for text masking and generation, focusing on Bio_ClinicalBERT, and applied different masking strategies. Evaluation included qualitative and quantitative assessments, downstream named entity recognition (NER) tasks, and clinically focused evaluations using BioGPT and GPT-3.5-turbo. The experiments show: (1) encoder-only models perform better than encoder–decoder models; (2) models trained on general corpora perform comparably to clinical-domain models if clinical entities are preserved; (3) preserving clinical entities and document structure aligns with the task objectives; (4) Masking strategies have a noticeable impact on the quality of synthetic clinical letters: masking stopwords has a positive impact, while masking nouns or verbs has a negative effect; (5) The BERTScore should be the primary quantitative evaluation metric, with other metrics serving as supplementary references; (6) Contextual information has only a limited effect on the models' understanding, suggesting that synthetic letters can effectively substitute real ones in downstream NER tasks; (7) Although the model occasionally generates hallucinated content, it appears to have little effect on overall clinical performance. Unlike previous research, which primarily focuses on reconstructing original letters by training language models, this paper provides a foundational framework for generating diverse, de-identified clinical letters. It offers a direction for utilizing the model to process real-world clinical letters, thereby helping to expand datasets in the clinical domain. Our codes and trained models are available at https://github.com/HECTA-UoM/Synthetic4Health.

  13. MIMIC-IV-Note: Deidentified free-text clinical notes

    • physionet.org
    • oppositeofnorth.com
    Updated Jan 6, 2023
    Cite
    Alistair Johnson; Tom Pollard; Steven Horng; Leo Anthony Celi; Roger Mark (2023). MIMIC-IV-Note: Deidentified free-text clinical notes [Dataset]. http://doi.org/10.13026/1n74-ne17
    Dataset updated
    Jan 6, 2023
    Authors
    Alistair Johnson; Tom Pollard; Steven Horng; Leo Anthony Celi; Roger Mark
    License

    https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

    Description

    The advent of large, open access text databases has driven advances in state-of-the-art model performance in natural language processing (NLP). The relatively limited amount of clinical data available for NLP has been cited as a significant barrier to the field's progress. Here we describe MIMIC-IV-Note: a collection of deidentified free-text clinical notes for patients included in the MIMIC-IV clinical database. MIMIC-IV-Note contains 331,794 deidentified discharge summaries from 145,915 patients admitted to the hospital and emergency department at the Beth Israel Deaconess Medical Center in Boston, MA, USA. The database also contains 2,321,355 deidentified radiology reports for 237,427 patients. All notes have had protected health information removed in accordance with the Health Insurance Portability and Accountability Act (HIPAA) Safe Harbor provision. All notes are linkable to MIMIC-IV providing important context to the clinical data therein. The database is intended to stimulate research in clinical natural language processing and associated areas.

  14. K18MD019159 - (PI: Goldstein) - Specific Aim 2 - Summarized Data

    • dataverse.harvard.edu
    Updated Nov 24, 2025
    Cite
    Evan Goldstein (2025). K18MD019159 - (PI: Goldstein) - Specific Aim 2 - Summarized Data [Dataset]. http://doi.org/10.7910/DVN/RHMXAM
    Dataset updated
    Nov 24, 2025
    Dataset provided by
    Harvard Dataverse
    Authors
    Evan Goldstein
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Summarized (deidentified) qualitative data for Specific Aim (SA) 2 of K18MD019159 (PI: Goldstein). One accompanying file is enclosed, which describes the methods and interview questions used to generate the summarized data. Please see the following for additional information about the summarized qualitative data provided in this dataset: Goldstein, E.V., Sanger, A. & Hill, J.L. Firearm experiences and safe storage challenges among a sample of Black adults: a rapid qualitative analysis. Inj. Epidemiol. 12, 79 (2025). https://doi.org/10.1186/s40621-025-00634-5

  15. Loneliness and well-being in Finnish immigrants: A multimodal dataset from...

    • dataone.org
    • datadryad.org
    Updated Jun 12, 2025
    Cite
    Yuning Wang; Jennifer Auxier; Mark Amayag; Parisa Farzanehkari; Anna Axelin; Iman Azimi; Amir M. Rahmani; Pasi Liljeberg (2025). Loneliness and well-being in Finnish immigrants: A multimodal dataset from wearables and passive data collection [Dataset]. http://doi.org/10.5061/dryad.qz612jmrn
    Dataset updated
    Jun 12, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Yuning Wang; Jennifer Auxier; Mark Amayag; Parisa Farzanehkari; Anna Axelin; Iman Azimi; Amir M. Rahmani; Pasi Liljeberg
    Description

    This dataset was collected from first-generation immigrants between 2022 and 2023. Over a 28-day period, 39 participants aged 18 to 65, fluent in English and experiencing loneliness (UCLA Loneliness Scale score ≥ 28) contributed to the study. Data collection utilized Samsung Watch Active 2, Oura Ring, AWARE, and Centralive smartphone application. This dataset contains raw data from photoplethysmogram (PPG), inertial measurement unit (IMU) readings, air pressure, and processed data on heart rate, heart rate variability, sleep metrics (bedtime, stages, quality), physical activity (steps, active calories, activity types), and smartphone usage patterns (screen time, notifications, call and message logs). Participants also completed ecological momentary assessments (EMA) and weekly surveys, including instruments like the Beck Depression Inventory (BDI), Patient Health Questionnaire-9 (PHQ-9), Perceived Stress Scale, Sense of Coherence Scale, Social Connectedness Scale, Twente Engagement with...

    Design and set up

    This study was designed to create a longitudinal dataset capturing physiological, behavioral, and psychological data from first-generation immigrants living in Finland. The dataset aims to support research on the relationship between mental health and daily lifestyle factors, providing a foundation for further detection algorithm development. To achieve this, the study collected multimodal data over a 28-day period from every participant. Objective data were gathered from wearable devices, which recorded sleep patterns, physical activity, and cardiovascular health metrics and raw PPG signals. Passive smartphone data, such as screen usage, notifications, calls, and messages, were also collected to capture digital behavior patterns. Subjective data were collected through EMAs delivered via push notifications and weekly self-report surveys. These assessments measured daily emotional states—loneliness, stress, depression, and social connectedness.

    By integrating multiple d...

    Loneliness and well-being in Finnish immigrants: A multimodal dataset from wearables and passive data collection

    Overview

    The dataset consists of longitudinal physiological, behavioral, and self-reported data collected from first-generation immigrants in Finland during 2022 and 2023. The study included 39 participants aged 18–65, all fluent in English and experiencing loneliness (UCLA Loneliness Scale score ≥28). Data were collected over a 28-day period using multimodal sources, including the Samsung Watch Active 2, Oura Ring, and the AWARE smartphone application.

    The dataset includes raw and processed data on cardiovascular health, sleep patterns, physical activity, smartphone usage, and mental health assessments. Daily and weekly ecological momentary assessments (EMA) captured momentary emotional states, while structured surveys administered through Centralive provided insights into participants' mental health and well-being.

    Data and File Structure

    At the root of the dat...

    All participants provided written informed consent to share their de-identified data for public research purposes at the time of enrollment.

    To protect participant privacy and minimize the risk of re-identification, we applied the following de-identification procedures:

    1. All direct identifiers (e.g., names, contact information, device IDs) were removed.
    2. Each participant was assigned a pseudonymous identifier in the format Participant_#.
    3. Timestamp fields were randomly shifted to obscure precise timing while preserving temporal patterns essential for analysis.
    4. App identifiers were generalized into broader categories (e.g., “social media app”).
    5. GPS location data were excluded.
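
The five procedures listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the field names (`name`, `email`, `device_id`, `gps_lat`, `gps_lon`, `timestamp`, `app`), the app-to-category mapping, and the use of one constant random shift per participant are all hypothetical choices, since the listing does not describe the actual schema or shifting scheme.

```python
import random
from datetime import datetime, timedelta

# Hypothetical field names; the real dataset schema is not given in the listing.
DIRECT_IDENTIFIERS = {"name", "email", "device_id"}
LOCATION_FIELDS = {"gps_lat", "gps_lon"}
# Illustrative mapping from app identifier to a broader category.
APP_CATEGORIES = {"com.example.chat": "social media app"}


def deidentify(records, seed=0):
    """Return de-identified copies of `records` (a list of dicts)."""
    rng = random.Random(seed)
    pseudonyms = {}  # original participant key -> "Participant_#"
    offsets = {}     # original participant key -> constant time shift
    dropped = DIRECT_IDENTIFIERS | LOCATION_FIELDS
    out = []
    for rec in records:
        key = rec["name"]  # assumption: the name identifies the participant
        if key not in pseudonyms:
            pseudonyms[key] = f"Participant_{len(pseudonyms) + 1}"
            # One shift per participant keeps the intervals between
            # that participant's events unchanged.
            offsets[key] = timedelta(days=rng.randint(1, 365))
        # Steps 1 and 5: drop direct identifiers and GPS fields.
        clean = {k: v for k, v in rec.items() if k not in dropped}
        # Step 2: pseudonymous identifier.
        clean["participant"] = pseudonyms[key]
        # Step 3: shifted timestamp.
        clean["timestamp"] = rec["timestamp"] + offsets[key]
        # Step 4: generalized app category.
        clean["app"] = APP_CATEGORIES.get(rec["app"], "other app")
        out.append(clean)
    return out


example = [{
    "name": "A. Example", "email": "a@example.org", "device_id": "dev-1",
    "gps_lat": 0.0, "gps_lon": 0.0,
    "timestamp": datetime(2023, 1, 1, 8, 0), "app": "com.example.chat",
}]
print(deidentify(example))
```

Shifting every timestamp of a participant by the same offset is one simple way to hide exact dates while keeping within-participant temporal patterns; other schemes are possible.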
  16. Pseudonymized Sandboxes For Data Science Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Oct 1, 2025
    Cite
    Dataintelo (2025). Pseudonymized Sandboxes For Data Science Market Research Report 2033 [Dataset]. https://dataintelo.com/report/pseudonymized-sandboxes-for-data-science-market
    Available download formats: pptx, csv, pdf
    Dataset updated
    Oct 1, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Pseudonymized Sandboxes for Data Science Market Outlook



    According to our latest research, the global market size for Pseudonymized Sandboxes for Data Science reached USD 1.14 billion in 2024, reflecting a robust demand for secure and privacy-compliant data environments. The market is growing at a CAGR of 17.2% and is projected to reach USD 5.03 billion by 2033. This remarkable growth is primarily driven by increasing regulatory requirements for data privacy, the proliferation of sensitive data across industries, and the rising adoption of advanced analytics and artificial intelligence in business operations.
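
Projections like the one above rest on the standard compound-growth relation, end = start × (1 + CAGR)^n. The sketch below shows how such figures can be checked. The function names are illustrative, and the choice of nine compounding periods (2024 to 2033) is an assumption, since the listing does not state which base year the report uses.

```python
def project_value(base, rate, periods):
    """Value after compounding `base` at `rate` per period for `periods` periods."""
    return base * (1 + rate) ** periods


def implied_cagr(start, end, periods):
    """Constant per-period growth rate that takes `start` to `end`."""
    return (end / start) ** (1 / periods) - 1


# Figures quoted in the listing (USD billion); periods=9 is an assumption.
projected = project_value(1.14, 0.172, periods=9)
rate = implied_cagr(1.14, 5.03, periods=9)
print(f"projected value: {projected:.2f}, implied CAGR: {rate:.1%}")
```

The result depends on the assumed base year and period count, so a small gap between the computed and quoted values does not by itself indicate an error in the report.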




    The surge in data privacy regulations such as GDPR, HIPAA, and CCPA has become a significant growth driver for the Pseudonymized Sandboxes for Data Science market. Enterprises are under immense pressure to ensure that their data science and AI initiatives do not compromise personal or sensitive information. Pseudonymized sandboxes provide a secure, controlled environment where data scientists can work with de-identified data, minimizing the risk of data breaches and unauthorized access. This approach enables organizations to maintain compliance while accelerating analytics-driven innovation, making these sandboxes indispensable in regulated sectors such as healthcare, finance, and government. The demand is further amplified by the increasing frequency of cyber threats and the need for robust data governance frameworks.




    Another key factor fueling the market’s expansion is the exponential growth of big data and the adoption of cloud-based analytics solutions. As businesses generate and collect vast amounts of data, the need to analyze this information without exposing sensitive details has become paramount. Pseudonymized sandboxes offer a pragmatic solution, allowing organizations to leverage data for advanced analytics, machine learning, and AI model training while safeguarding privacy. The flexibility to deploy these sandboxes either on-premises or in the cloud caters to diverse enterprise needs, supporting scalability and cost-efficiency. This capability is especially attractive to industries like retail and IT & telecom, where rapid innovation and customer-centricity are critical.




    The market is also benefiting from the increasing collaboration between data science teams and business units. As organizations strive to become more data-driven, cross-functional teams require access to data without violating privacy norms. Pseudonymized sandboxes enable secure data sharing and experimentation, fostering a culture of innovation. Additionally, advances in pseudonymization technologies, such as tokenization and differential privacy, are enhancing the effectiveness and reliability of these sandboxes. The integration of automation and AI-driven data masking further streamlines the process, reducing manual intervention and operational risk. These trends collectively contribute to the sustained growth and adoption of pseudonymized sandboxes across various sectors.




    Regionally, North America dominates the Pseudonymized Sandboxes for Data Science market, accounting for the largest revenue share in 2024, followed by Europe and Asia Pacific. The presence of stringent regulatory frameworks, mature data science ecosystems, and a high concentration of technology-driven enterprises are key factors underpinning North America’s leadership. Meanwhile, Asia Pacific is witnessing the fastest growth, driven by rapid digitalization, increasing awareness of data privacy, and government initiatives to enhance cybersecurity. Europe’s growth is anchored in its robust regulatory landscape and strong emphasis on data protection, while Latin America and the Middle East & Africa are gradually embracing pseudonymized sandboxes as digital transformation accelerates in these regions.



    Component Analysis



    The Pseudonymized Sandboxes for Data Science market is segmented by component into software and services. The software segment comprises the core platforms and tools that enable pseudonymization, data masking, tokenization, and sandboxing functionalities. This segment is witnessing significant growth as organizations increasingly invest in advanced software solutions to automate and streamline their data privacy processes. Modern pseudonymization software leverages artificial intelligence and machine learning to enhance data security, ensure regulatory compliance, and facilitate seamless integration with existing analytics infrastructure. The ab

  17. Ministry of Justice Synthetic Data First Datasets, 2011-2023

    • datacatalogue.ukdataservice.ac.uk
    Updated Jun 18, 2025
    Cite
    Ministry of Justice (2025). Ministry of Justice Synthetic Data First Datasets, 2011-2023 [Dataset]. http://doi.org/10.5255/UKDA-SN-9398-3
    Dataset updated
    Jun 18, 2025
    Dataset provided by
    UK Data Service (https://ukdataservice.ac.uk/)
    Authors
    Ministry of Justice
    Time period covered
    Jan 1, 2014 - Mar 30, 2022
    Area covered
    Wales, England
    Description

    The Ministry of Justice (MoJ) Data First Synthetic Data Project aims to improve engagement with Data First datasets by making synthetic versions of content available to enable more rapid development of research proposals and to thereby enhance the potential for linked administrative data to improve understanding and outcomes across justice systems. The project has led the development of two components: a dataset generation platform and an initial release of low-fidelity synthetic data tables.

    This study includes a synthetically-generated version of the Ministry of Justice Data First Probation datasets. Synthetic versions of all 43 tables in the MoJ Data First data ecosystem have been created. These versions can be used / joined in the same way as the real datasets. As well as underpinning training, synthetic datasets should enable researchers to explore research questions and to design research proposals prior to submitting these for approval. The code created during this exploration and design process should then enable initial results to be obtained as soon as data access is granted.

    The Ministry of Justice Data First probation dataset provides data on people under the supervision of the probation service in England and Wales from 2014. This is a statutory criminal justice service that supervises high-risk offenders released into the community. The data has been extracted from the management information system national Delius (nDelius), used by His Majesty's Prisons and Probation Service (HMPPS) to manage people on probation.

    Information is included on service users' characteristics and offence, and on their pre-sentence reports, sentence requirements, licence conditions and post-sentence supervision; for example, age, gender, ethnicity, offence category, key dates relating to sentence and recalls, activities and programmes required as part of rehabilitation (e.g. drug and alcohol treatment, skills training) and limitations set on their activities (e.g. curfew, location monitoring, drugs testing).

    Each record in the dataset gives information about a single person and probation journey. As part of Data First, records have been deidentified and deduplicated, using our probabilistic record linkage package, Splink, so that a unique identifier is assigned to all records believed to relate to the same person, allowing for longitudinal analysis and investigation of repeat interactions with probation. This aims to improve on links already made within probation services. This opens up the potential to better understand probation service users and address questions on, for example, what works to reduce reoffending.

    The Ministry of Justice Data First linking dataset can be used in combination with this and other Data First datasets to join up administrative records about people from across justice services (courts, prisons and probation) to increase understanding around users' interactions, pathways and outcomes.

  18. mimic-iii-clinical-database-demo-1.4

    • kaggle.com
    zip
    Updated Apr 1, 2025
    Cite
    Montassar bellah (2025). mimic-iii-clinical-database-demo-1.4 [Dataset]. https://www.kaggle.com/datasets/montassarba/mimic-iii-clinical-database-demo-1-4
    Available download formats: zip (11100065 bytes)
    Dataset updated
    Apr 1, 2025
    Authors
    Montassar bellah
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Abstract MIMIC-III is a large, freely-available database comprising deidentified health-related data associated with over 40,000 patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012 [1]. The MIMIC-III Clinical Database is available on PhysioNet (doi: 10.13026/C2XW26). Though deidentified, MIMIC-III contains detailed information regarding the care of real patients, and as such requires credentialing before access. To allow researchers to ascertain whether the database is suitable for their work, we have manually curated a demo subset, which contains information for 100 patients also present in the MIMIC-III Clinical Database. Notably, the demo dataset does not include free-text notes.

    Background In recent years there has been a concerted move towards the adoption of digital health record systems in hospitals. Despite this advance, interoperability of digital systems remains an open issue, leading to challenges in data integration. As a result, the potential that hospital data offers in terms of understanding and improving care is yet to be fully realized.

    MIMIC-III integrates deidentified, comprehensive clinical data of patients admitted to the Beth Israel Deaconess Medical Center in Boston, Massachusetts, and makes it widely accessible to researchers internationally under a data use agreement. The open nature of the data allows clinical studies to be reproduced and improved in ways that would not otherwise be possible.

    The MIMIC-III database was populated with data that had been acquired during routine hospital care, so there was no associated burden on caregivers and no interference with their workflow. For more information on the collection of the data, see the MIMIC-III Clinical Database page.

    Methods The demo dataset contains all intensive care unit (ICU) stays for 100 patients. These patients were selected randomly from the subset of patients in the dataset who eventually die. Consequently, all patients will have a date of death (DOD). However, patients do not necessarily die during an individual hospital admission or ICU stay.

    This project was approved by the Institutional Review Boards of Beth Israel Deaconess Medical Center (Boston, MA) and the Massachusetts Institute of Technology (Cambridge, MA). Requirement for individual patient consent was waived because the project did not impact clinical care and all protected health information was deidentified.

    Data Description MIMIC-III is a relational database consisting of 26 tables. For a detailed description of the database structure, see the MIMIC-III Clinical Database page. The demo shares an identical schema, except all rows in the NOTEEVENTS table have been removed.

    The data files are distributed in comma separated value (CSV) format following the RFC 4180 standard. Notably, string fields which contain commas, newlines, and/or double quotes are encapsulated by double quotes ("). Actual double quotes in the data are escaped using an additional double quote. For example, the string she said "the patient was notified at 6pm" would be stored in the CSV as "she said ""the patient was notified at 6pm""". More detail is provided on the RFC 4180 description page: https://tools.ietf.org/html/rfc4180
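
The quoting rule described above can be reproduced with Python's standard `csv` module, whose default dialect quotes only the fields that need it and doubles embedded double quotes. This is a small sketch using the example string from the description; the second field, `plain`, is made up to show an unquoted value.

```python
import csv
import io

original = 'she said "the patient was notified at 6pm"'

# Write one row with the default dialect (minimal quoting, doubled quotes).
buffer = io.StringIO()
csv.writer(buffer).writerow([original, "plain"])
encoded = buffer.getvalue()
print(encoded)

# Reading the line back recovers the original string unchanged.
decoded = next(csv.reader(io.StringIO(encoded)))
print(decoded[0] == original)
```

Using a CSV parser rather than splitting on commas matters here, because quoted fields may themselves contain commas and newlines.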

    Usage Notes The MIMIC-III demo provides researchers with an opportunity to review the structure and content of MIMIC-III before deciding whether or not to carry out an analysis on the full dataset.

    CSV files can be opened natively using any text editor or spreadsheet program. However, some tables are large, and it may be preferable to navigate the data stored in a relational database. One alternative is to create an SQLite database using the CSV files. SQLite is a lightweight database format which stores all constituent tables in a single file, and SQLite databases interoperate well with a number of software tools.

    DB Browser for SQLite is a high quality, visual, open source tool to create, design, and edit database files compatible with SQLite. We have found this tool to be useful for navigating SQLite files. Information regarding installation of the software and creation of the database can be found online: https://sqlitebrowser.org/
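
One way to build such an SQLite database from CSV files is Python's built-in `sqlite3` and `csv` modules. The sketch below is an illustration, not the procedure the dataset authors document: the helper name `load_csv`, the table name `patients_sample`, and the three sample rows are invented for the example, and every column is stored without a declared type.

```python
import csv
import io
import sqlite3


def load_csv(conn, table, csv_file):
    """Create `table` from the CSV header row and insert all remaining rows."""
    reader = csv.reader(csv_file)
    header = next(reader)
    columns = ", ".join(f'"{name}"' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()


# Invented sample rows standing in for one of the demo CSV files.
sample = io.StringIO("subject_id,gender\r\n1,F\r\n2,F\r\n3,M\r\n")
conn = sqlite3.connect(":memory:")  # pass a file path to keep the database on disk
load_csv(conn, "patients_sample", sample)
count = conn.execute('SELECT COUNT(*) FROM "patients_sample"').fetchone()[0]
print(count)
```

For real files, open each CSV with `open(path, newline="")` and call the helper once per table; the resulting single-file database can then be browsed with a tool such as DB Browser for SQLite.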

    Release Notes Release notes for the demo follow the release notes for the MIMIC-III database.

    Acknowledgements This research and development was supported by grants NIH-R01-EB017205, NIH-R01-EB001659, and NIH-R01-GM104987 from the National Institutes of Health. The authors would also like to thank Philips Healthcare and staff at the Beth Israel Deaconess Medical Center, Boston, for supporting database development, and Ken Pierce for providing ongoing support for the MIMIC research community.

    Conflicts of Interest The authors declare no competing financial interests.

    References Johnson, A. E. W., Pollard, T. J., Shen, L., Lehman, L. H., Feng, M., Ghassemi, M., Mo...

  19. Data from: Economic Outcomes Among Microfinance Group Members Receiving...

    • dataverse.harvard.edu
    • search.dataone.org
    Updated May 23, 2024
    Cite
    Marta Wilson-Barthes; Jon Steingrimsson; Youjin Lee; Dan N Tran; Juddy Wachira; Catherine Kafu; Sonak D Pastakia; Rajesh Vedanthan; Jamil Alarik Said; Becky L Genberg; Omar Galarraga (2024). Economic Outcomes Among Microfinance Group Members Receiving Community-based Chronic Disease Care: Cluster Randomized Trial Evidence From Kenya [Dataset]. http://doi.org/10.7910/DVN/SLUA08
    Dataset updated
    May 23, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    Marta Wilson-Barthes; Jon Steingrimsson; Youjin Lee; Dan N Tran; Juddy Wachira; Catherine Kafu; Sonak D Pastakia; Rajesh Vedanthan; Jamil Alarik Said; Becky L Genberg; Omar Galarraga
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Area covered
    Kenya
    Dataset funded by
    National Institute of Mental Health
    Description

    Analytic code and deidentified data set used to generate the findings presented in the manuscript "Economic Outcomes Among Microfinance Group Members Receiving Integrated HIV Care: Cluster Randomized Trial Evidence From Kenya"

  20. Deidentified data.

    • plos.figshare.com
    csv
    Updated Apr 23, 2025
    Cite
    Niels Brinkman; Teun Teunis; Seung Choi; David Ring; W. Michael Brode (2025). Deidentified data. [Dataset]. http://doi.org/10.1371/journal.pone.0319874.s004
    Available download formats: csv
    Dataset updated
    Apr 23, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Niels Brinkman; Teun Teunis; Seung Choi; David Ring; W. Michael Brode
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    All data used to generate the findings of this study. (CSV)

Cite
National Institute of Standards and Technology (2023). SDNist v2: Deidentified Data Report Tool [Dataset]. http://doi.org/10.18434/mds2-2943

SDNist v2: Deidentified Data Report Tool

6 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Mar 13, 2023
Dataset provided by
National Institute of Standards and Technology (http://www.nist.gov/)
License

https://www.nist.gov/open/license

Description

SDNist v2 is a Python package that provides benchmark data and evaluation metrics for deidentified data generators. This version of SDNist supports using the NIST Diverse Communities Data Excerpts, a geographically partitioned, limited feature data set. The deidentified data report evaluates utility and privacy of a given deidentified dataset and generates a summary quality report with performance of a deidentified dataset enumerated and illustrated for each utility and privacy metric.
