53 datasets found
  1. Data De-identification and Pseudonymity Software Market Report | Global...

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Data De-identification and Pseudonymity Software Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-data-de-identification-and-pseudonymity-software-market
    Available download formats
    pptx, pdf, csv
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data De-identification and Pseudonymity Software Market Outlook



    The global data de-identification and pseudonymity software market is projected to grow significantly, reaching approximately USD 4.2 billion by 2032, driven primarily by increasing data privacy concerns and stringent regulatory requirements worldwide.



    The primary growth factor in the data de-identification and pseudonymity software market is the surge in data breaches and cyber-attacks. With the exponential increase in data generation, organizations are more vulnerable to data breaches and unauthorized access. These security concerns have prompted businesses and governments to invest heavily in robust data protection solutions. Data de-identification and pseudonymity software provide a secure way to anonymize sensitive information, making it less susceptible to malicious activities. As data protection laws become more rigorous, the demand for such technologies will continue to rise, further propelling market growth.



    Another significant factor contributing to market growth is the growing awareness and emphasis on data privacy among consumers. In recent years, consumers have become increasingly aware of how their data is being used and the potential risks associated with data misuse. This heightened awareness has put pressure on organizations to adopt comprehensive data protection measures. Data de-identification and pseudonymity software offer a means to protect personal information while still allowing organizations to utilize data for analytics and decision-making. This dual benefit is a key driver for the adoption of these technologies across various sectors.



    Moreover, regulatory compliance is a crucial driver for the market. Regulations such as the General Data Protection Regulation (GDPR) in Europe, the Health Insurance Portability and Accountability Act (HIPAA) in the United States, and various other data protection laws worldwide mandate stringent measures for data protection. Non-compliance can result in hefty fines and legal repercussions. Therefore, organizations are increasingly adopting data de-identification and pseudonymity software to ensure compliance with these regulations. The need for regulatory compliance is expected to sustain market growth in the foreseeable future.



    Regionally, North America currently dominates the global data de-identification and pseudonymity software market, accounting for the largest market share. This is attributed to the presence of major technology players, stringent data protection regulations, and high adoption rates of advanced technologies in the region. Europe follows closely, with significant market contributions from countries such as Germany, France, and the UK, driven by robust regulatory frameworks like GDPR. The Asia Pacific region is also expected to witness substantial growth, fueled by rapid digitalization, increasing cybersecurity threats, and growing awareness about data privacy in countries like China, India, and Japan.



    Data Masking Tools play a pivotal role in enhancing the security framework of organizations by providing an additional layer of protection for sensitive information. These tools are designed to obscure specific data within a dataset, ensuring that unauthorized users cannot access or decipher the original information. As businesses increasingly rely on data-driven insights, the need for robust data masking solutions becomes more critical. By employing data masking tools, organizations can safely share data across departments or with third-party vendors without compromising privacy. This capability is especially beneficial in industries such as healthcare and finance, where data privacy is paramount. The integration of data masking tools with existing data protection strategies can significantly reduce the risk of data breaches and ensure compliance with regulatory standards.



    Component Analysis



    The data de-identification and pseudonymity software market can be segmented by component into software and services. The software segment is anticipated to hold the lion's share due to the increasing adoption of data protection solutions across various industries. Software solutions provide automated tools for anonymizing and pseudonymizing data, ensuring compliance with regulatory standards. These solutions are essential for organizations aiming to mitigate the risks associated with data breaches and unauthorized access. As cyber threats continue to evolve, the demand for advanced software solutions is exp

  2. Technologies used to de-identify consumer data in the U.S. 2025

    • statista.com
    Updated May 16, 2025
    Cite
    Statista (2025). Technologies used to de-identify consumer data in the U.S. 2025 [Dataset]. https://www.statista.com/statistics/1613344/technologies-de-identify-consumer-data-usa/
    Dataset updated
    May 16, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    United States
    Description

    During a 2025 survey carried out among organizations that collected consumer data in the United States, ** percent stated that they did not use and were not planning to use any data de-identifying technologies. Among those that did or were planning to do so, differential privacy was the most commonly named technology. Data de-identification can be defined as stripping a dataset of information that can be used to determine someone's identity.
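
    The definition above (stripping a dataset of identifying information) can be illustrated with a minimal sketch. The column names and the sample record are hypothetical, and the choice of which columns count as identifiers is an assumption made only for this example.

```python
# Hypothetical column names; a real dataset would define its own identifier list.
DIRECT_IDENTIFIERS = {"name", "email", "phone"}


def strip_identifiers(rows, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the rows with the identifier columns removed."""
    return [
        {key: value for key, value in row.items() if key not in identifiers}
        for row in rows
    ]


# Made-up sample record for illustration only.
customers = [
    {"name": "Jane Example", "email": "jane@example.com", "phone": "555-0100",
     "state": "NY", "purchases": 12},
]
deidentified = strip_identifiers(customers)
print(deidentified)
```

    Dropping direct identifiers does not by itself rule out re-identification through the remaining columns, which is one reason techniques such as differential privacy are used.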

  3. Data De-identification & Pseudonymity Software Market Report | Global...

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Data De-identification & Pseudonymity Software Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-data-de-identification-pseudonymity-software-market
    Available download formats
    pdf, csv, pptx
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Data De-identification & Pseudonymity Software Market Outlook




    The global Data De-identification & Pseudonymity Software Market is projected to reach USD 3.5 billion by 2032, growing at a CAGR of 15.2% from 2024 to 2032. The rise in data privacy regulations and the increasing need for securing sensitive information are key factors driving this growth.
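
    As a rough check on these figures, the compound annual growth rate relation, final = base × (1 + CAGR)^years, can be inverted to estimate the starting market size the projection implies. The sketch below assumes eight compounding periods (2024 to 2032); the report excerpt does not state the base-year value, so the result is only an implied estimate.

```python
def implied_base_value(final_value: float, cagr: float, years: int) -> float:
    """Invert final = base * (1 + cagr) ** years to recover the base value."""
    return final_value / (1.0 + cagr) ** years


# Figures from the description: USD 3.5 billion by 2032 at a 15.2% CAGR.
# The number of compounding periods (2032 - 2024 = 8) is an assumption.
base_2024 = implied_base_value(3.5, 0.152, 2032 - 2024)
print(f"Implied 2024 market size: USD {base_2024:.2f} billion")
```

    Under these assumptions the implied 2024 market size is a little over USD 1.1 billion.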




    The accelerating pace of digital transformation across various industries has led to an unprecedented surge in data generation. This voluminous data often contains sensitive information that needs robust protection. The growing awareness regarding data privacy and stringent regulations like GDPR in Europe, CCPA in California, and other data protection laws worldwide are compelling organizations to adopt advanced data de-identification and pseudonymity software. These solutions ensure that sensitive data is anonymized or pseudonymized, thus mitigating the risk of data breaches and ensuring compliance with regulations. Consequently, the adoption of data de-identification and pseudonymity software is rapidly increasing.




    Another significant growth factor is the increased focus on data security by industries such as healthcare, finance, and government. In healthcare, the protection of patient data is paramount, making the industry a significant consumer of de-identification software. Similarly, in the finance sector, protecting customer information is crucial to maintain trust and comply with regulatory requirements. Government agencies dealing with citizen data are also increasingly investing in these technologies to prevent unauthorized access and misuse of sensitive information. The demand for data de-identification and pseudonymity software is thus witnessing a steady rise across these critical sectors.




    Technological advancements and innovation in data security solutions are further propelling market growth. The integration of artificial intelligence and machine learning into de-identification and pseudonymity software has enhanced their effectiveness and efficiency. These advanced technologies enable more accurate and faster processing of large datasets, thereby offering robust data protection. Additionally, the rise of cloud computing and the increasing adoption of cloud-based solutions provide scalable and cost-effective options for organizations, further driving the market.



    In this context, the role of Identity Information Protection Service becomes increasingly crucial. As organizations strive to safeguard sensitive data, these services provide an essential layer of security by ensuring that identity-related information is protected from unauthorized access and misuse. Identity Information Protection Service helps organizations comply with data privacy regulations by offering robust solutions that secure personal identifiers, thus reducing the risk of identity theft and data breaches. By integrating these services, companies can enhance their data protection strategies, ensuring that identity information remains confidential and secure across various platforms and applications.




    Regionally, North America holds the largest market share, driven by stringent data protection regulations and high adoption rates of advanced technologies. Europe follows, with significant contributions from countries like Germany, the UK, and France, driven by GDPR compliance requirements. The Asia Pacific region is expected to witness the highest growth rate due to the rapid digitalization of economies like China and India, coupled with increasing awareness about data privacy. Latin America and the Middle East & Africa regions are also showing promising growth, albeit from a smaller base.



    Component Analysis




    The Data De-identification & Pseudonymity Software Market by component is segmented into software and services. The software segment includes standalone software solutions designed to de-identify or pseudonymize data. This segment is witnessing substantial growth due to the increasing demand for automated and scalable data protection solutions. The software solutions are enhanced with advanced algorithms and AI capabilities, providing accurate de-identification and pseudonymization of large datasets, which is crucial for organizations dealing with massive amounts of sensitive data.




  4. Envestnet | Yodlee's De-Identified Payroll Research Panel | USA Employee...

    • datarade.ai
    .sql, .txt
    Updated Mar 1, 2022
    + more versions
    Cite
    Envestnet | Yodlee (2022). Envestnet | Yodlee's De-Identified Payroll Research Panel | USA Employee Payroll Data covering 4800+ employers | Cohort Analysis [Dataset]. https://datarade.ai/data-products/envestnet-yodlee-s-payroll-panel-usa-employee-payroll-dat-envestnet-yodlee
    Available download formats
    .sql, .txt
    Dataset updated
    Mar 1, 2022
    Dataset provided by
    Envestnet (http://envestnet.com/)
    Yodlee
    Authors
    Envestnet | Yodlee
    Area covered
    United States of America
    Description

    Envestnet | Yodlee's Payroll Data Panel captures de-identified payroll information to deliver valuable employment insights, such as a company's wage costs, seasonal performance, headcount, hiring, layoffs, and more.

    De-identified payroll data analytics for major employers gives decision makers insight into employment trends across many industries. The payroll product includes 1000+ employers and data can be used for company specific or macro purposes.
    - 4800+ employers tagged
    - Frequency of payroll identified (i.e. weekly, bi-weekly)
    - Data at user and account level to allow for cohort analysis (e.g. Macys likely to lose 10% of revenue due to unemployment within their cohort)

    New Features - Mapping to Category codes and Employer Dependency Scoring

    Use Cases Categories (Our data provides an innumerable amount of use cases, and we look forward to working with new ones):

    1. Market Research: Company Analysis, Company Valuation, Competitive Intelligence, Competitor Analysis, Competitor Analytics, Competitor Insights, Customer Data Enrichment, Customer Data Insights, Customer Data Intelligence, Demand Forecasting, Ecommerce Intelligence, Employee Pay Strategy, Employment Analytics, Job Income Analysis, Job Market Pricing, Marketing, Marketing Data Enrichment, Marketing Intelligence, Marketing Strategy, Payment History Analytics, Price Analysis, Pricing Analytics, Retail, Retail Analytics, Retail Intelligence, Retail POS Data Analysis, and Salary Benchmarking

    2. Investment Research: Financial Services, Hedge Funds, Investing, Mergers & Acquisitions (M&A), Stock Picking, Venture Capital (VC)

    3. Consumer Analysis: Consumer Data Enrichment, Consumer Intelligence

    4. Market Data: Analytics, B2C Data Enrichment, Bank Data Enrichment, Behavioral Analytics, Benchmarking, Customer Insights, Customer Intelligence, Data Enhancement, Data Enrichment, Data Intelligence, Data Modeling, Ecommerce Analysis, Ecommerce Data Enrichment, Economic Analysis, Financial Data Enrichment, Financial Intelligence, Local Economic Forecasting, Location-based Analytics, Market Analysis, Market Analytics, Market Intelligence, Market Potential Analysis, Market Research, Market Share Analysis, Sales, Sales Data Enrichment, Sales Enablement, Sales Insights, Sales Intelligence, Spending Analytics, Stock Market Predictions, and Trend Analysis

  5. Community Expert Interviews on Priority Healthcare Needs Amongst People...

    • data.qdr.syr.edu
    pdf, txt
    Updated Nov 10, 2023
    Cite
    Carolyn Ingram (2023). Community Expert Interviews on Priority Healthcare Needs Amongst People Experiencing Homelessness in Dublin, Ireland: 2022-2023 [Dataset]. http://doi.org/10.5064/F6HFOEC5
    Available download formats
    pdf(599798), txt(6566), pdf(474790), pdf(138736), pdf(530060), pdf(612983), pdf(453939), pdf(729114), pdf(538538), pdf(396835), pdf(593906), pdf(656401), pdf(643059), pdf(506008), pdf(451086), pdf(550588), pdf(670927), pdf(180547), pdf(189571), pdf(367380)
    Dataset updated
    Nov 10, 2023
    Dataset provided by
    Qualitative Data Repository
    Authors
    Carolyn Ingram
    License

    https://qdr.syr.edu/policies/qdr-standard-access-conditions

    Time period covered
    Sep 1, 2022 - Mar 31, 2023
    Area covered
    Dublin, Ireland
    Description

    Project Overview

    This study used a community-based participatory approach to identify and investigate the needs of people experiencing homelessness in Dublin, Ireland. The project had several stages: (1) a systematic review on health disparities amongst people experiencing homelessness in the Republic of Ireland; (2) observation and interviews with homeless attendees of a community health clinic; and (3) interviews with community experts (CEs) conducted from September 2022 to March 2023 on ongoing work and gaps in the research/health service response. This data deposit stems from stage 3, the community expert interview aspect of this project. Stage 1 of the project has been published (Ingram et al., 2023) and associated data are available here. De-identified field note data from stage 2 of the project are planned for sharing upon completion of analysis, in January 2024.

    Data and Data Collection Overview

    A purposive, criterion-i sampling strategy (Palinkas et al., 2015) – where selected interviewees meet a predetermined criterion of importance – was used to identify professionals working in homeless health and/or addiction services in Dublin, stratified by occupation type. Potential CEs were identified through an internet search of homeless health and addiction services in Dublin. Interviewed CEs were invited to recommend colleagues they felt would have relevant perspectives on community health needs, expanding the sample via snowball strategy. Interview questions were based on World Health Organization Community Health Needs Assessment guidelines (Rowe et al., 2001). Semi-structured interviews were conducted between September 2022 and March 2023 utilising ZOOM™, the phone, or in person according to participant preference. Carolyn Ingram, who has formal qualitative research training, served as the interviewer.

    CEs were presented with an information sheet and gave audio recorded, informed oral consent – considered appropriate for remote research conducted with non-vulnerable adult participants – in the full knowledge that interviews would be audio recorded, transcribed, and de-identified, as approved by the researchers’ institutional Human Research Ethics Committee (LS-E-125-Ingram-Perrotta-Exemption). Interviewees also gave permission for de-identified transcripts to be shared in a qualitative data archive.

    Shared Data Organization

    16 de-identified transcripts from the CE interviews are being published. Three participants from the total sample (N=19) did not consent to data archival. The transcript from each interviewee is named based on the type of work the interviewee performs, with individuals in the same type of work being differentiated by numbers. The full set of professional categories is as follows:
    - Addiction Services
    - Government
    - Homeless Health Services
    - Hospital
    - Psychotherapist
    - Researcher
    - Social Care

    Any changes or removal of words or phrases for de-identification purposes are flagged by including [brackets] and italics. The documentation files included in this data project are the consent form and the interview guide used for the study, this data narrative and an administrative README file.

    References

    Ingram C, Buggy C, Elabbasy D, Perrotta C. (2023) “Homelessness and health-related outcomes in the Republic of Ireland: a systematic review, meta-analysis and evidence map.” Journal of Public Health (Berl). https://doi.org/10.1007/s10389-023-01934-0

    Palinkas LA, Horwitz SM, Green CA, Wisdom JP, Duan N, Hoagwood K. (2015) “Purposeful sampling for qualitative data collection and analysis in mixed method implementation research.” Administration and Policy in Mental Health. Sep;42(5):533–44. https://doi.org/10.1007/s10488-013-0528-y

    Rowe A, McClelland A, Billingham K, Carey L. (2001) “Community health needs assessment: an introductory guide for the family health nurse in Europe” [Internet]. World Health Organization. Regional Office for Europe. Available at: https://apps.who.int/iris/handle/10665/108440

  6. Data from: Generalizable EHR-R-REDCap pipeline for a national...

    • data.niaid.nih.gov
    • zenodo.org
    • +1more
    zip
    Updated Jan 9, 2022
    Cite
    Sophia Shalhout; Farees Saqlain; Kayla Wright; Oladayo Akinyemi; David Miller (2022). Generalizable EHR-R-REDCap pipeline for a national multi-institutional rare tumor patient registry [Dataset]. http://doi.org/10.5061/dryad.rjdfn2zcm
    Available download formats
    zip
    Dataset updated
    Jan 9, 2022
    Dataset provided by
    Massachusetts General Hospital
    Harvard Medical School
    Authors
    Sophia Shalhout; Farees Saqlain; Kayla Wright; Oladayo Akinyemi; David Miller
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Objective: To develop a clinical informatics pipeline designed to capture large-scale structured EHR data for a national patient registry.

    Materials and Methods: The EHR-R-REDCap pipeline is implemented using R-statistical software to remap and import structured EHR data into the REDCap-based multi-institutional Merkel Cell Carcinoma (MCC) Patient Registry using an adaptable data dictionary.

    Results: Clinical laboratory data were extracted from EPIC Clarity across several participating institutions. Labs were transformed, remapped and imported into the MCC registry using the EHR labs abstraction (eLAB) pipeline. Forty-nine clinical tests encompassing 482,450 results were imported into the registry for 1,109 enrolled MCC patients. Data-quality assessment revealed highly accurate, valid labs. Univariate modeling was performed for labs at baseline on overall survival (N=176) using this clinical informatics pipeline.

    Conclusion: We demonstrate feasibility of the facile eLAB workflow. EHR data is successfully transformed, and bulk-loaded/imported into a REDCap-based national registry to execute real-world data analysis and interoperability.

    Methods

    eLAB Development and Source Code (R statistical software):

    eLAB is written in R (version 4.0.3), and utilizes the following packages for processing: DescTools, REDCapR, reshape2, splitstackshape, readxl, survival, survminer, and tidyverse. Source code for eLAB can be downloaded directly (https://github.com/TheMillerLab/eLAB).

    eLAB reformats EHR data abstracted for an identified population of patients (e.g. medical record numbers (MRN)/name list) under an Institutional Review Board (IRB)-approved protocol. The MCCPR does not host MRNs/names and eLAB converts these to MCCPR assigned record identification numbers (record_id) before import for de-identification.
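
    The MRN-to-record_id conversion described above can be sketched as follows. This is an illustration in Python, not the eLAB source (which is written in R); the sequential numbering scheme, the function names, and the field names `mrn` and `name` are assumptions made for the example.

```python
def assign_record_ids(mrns, start=1):
    """Map each distinct MRN to a sequential registry record_id (assumed scheme)."""
    mapping = {}
    next_id = start
    for mrn in mrns:
        if mrn not in mapping:
            mapping[mrn] = next_id
            next_id += 1
    return mapping


def deidentify(rows, mapping):
    """Drop the MRN and name fields and attach the registry record_id instead."""
    cleaned = []
    for row in rows:
        kept = {k: v for k, v in row.items() if k not in ("mrn", "name")}
        kept["record_id"] = mapping[row["mrn"]]
        cleaned.append(kept)
    return cleaned


# Made-up rows; real input would come from an IRB-approved EHR data pull.
labs = [
    {"mrn": "A100", "name": "Patient One", "lab": "potassium", "value": 4.1},
    {"mrn": "B200", "name": "Patient Two", "lab": "potassium", "value": 3.9},
    {"mrn": "A100", "name": "Patient One", "lab": "sodium", "value": 140},
]
id_map = assign_record_ids(row["mrn"] for row in labs)
registry_rows = deidentify(labs, id_map)
```

    The mapping itself remains identifying and would need to stay outside the registry, in line with the statement that the MCCPR does not host MRNs or names.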

    Functions were written to remap EHR bulk lab data pulls/queries from several sources including Clarity/Crystal reports or institutional EDW including Research Patient Data Registry (RPDR) at MGB. The input, a csv/delimited file of labs for user-defined patients, may vary. Thus, users may need to adapt the initial data wrangling script based on the data input format. However, the downstream transformation, code-lab lookup tables, outcomes analysis, and LOINC remapping are standard for use with the provided REDCap Data Dictionary, DataDictionary_eLAB.csv. The available R-markdown (https://github.com/TheMillerLab/eLAB) provides suggestions and instructions on where or when upfront script modifications may be necessary to accommodate input variability.

    The eLAB pipeline takes several inputs. For example, the input for use with the ‘ehr_format(dt)’ single-line command is non-tabular data assigned as R object ‘dt’ with 4 columns: 1) Patient Name (MRN), 2) Collection Date, 3) Collection Time, and 4) Lab Results wherein several lab panels are in one data frame cell. A mock dataset in this ‘untidy-format’ is provided for demonstration purposes (https://github.com/TheMillerLab/eLAB).

    Bulk lab data pulls often result in subtypes of the same lab. For example, potassium labs are reported as “Potassium,” “Potassium-External,” “Potassium(POC),” “Potassium,whole-bld,” “Potassium-Level-External,” “Potassium,venous,” and “Potassium-whole-bld/plasma.” eLAB utilizes a key-value lookup table with ~300 lab subtypes for remapping labs to the Data Dictionary (DD) code. eLAB reformats/accepts only those lab units pre-defined by the registry DD. The lab lookup table is provided for direct use or may be re-configured/updated to meet end-user specifications. eLAB is designed to remap, transform, and filter/adjust value units of semi-structured/structured bulk laboratory values data pulls from the EHR to align with the pre-defined code of the DD.
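
    A key-value lookup of the kind described above might look like the following sketch. The potassium subtype names come from the description; the target code `potassium` and the function name are hypothetical stand-ins, and the actual eLAB lookup table (about 300 entries, in R) is not reproduced here.

```python
# Subtype names are taken from the description; the target code "potassium"
# is a hypothetical stand-in for the registry's Data Dictionary field name.
LAB_LOOKUP = {
    "Potassium": "potassium",
    "Potassium-External": "potassium",
    "Potassium(POC)": "potassium",
    "Potassium,whole-bld": "potassium",
    "Potassium-Level-External": "potassium",
    "Potassium,venous": "potassium",
    "Potassium-whole-bld/plasma": "potassium",
}


def remap_labs(results, lookup=LAB_LOOKUP):
    """Keep only labs present in the lookup table, renamed to their DD code."""
    remapped = []
    for lab_name, value in results:
        code = lookup.get(lab_name.strip())
        if code is not None:  # labs without a lookup entry are filtered out
            remapped.append((code, value))
    return remapped


raw = [("Potassium(POC)", 4.1), ("Potassium,venous", 3.9), ("Unmapped-Lab", 1.0)]
remapped = remap_labs(raw)
```

    Filtering by the lookup table means a new subtype is silently dropped until someone adds it, so in practice the unmatched names would be worth logging.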

    Data Dictionary (DD)

    EHR clinical laboratory data is captured in REDCap using the ‘Labs’ repeating instrument (Supplemental Figures 1-2). The DD is provided for use by researchers at REDCap-participating institutions and is optimized to accommodate the same lab-type captured more than once on the same day for the same patient. The instrument captures 35 clinical lab types. The DD serves several major purposes in the eLAB pipeline. First, it defines every lab type of interest and associated lab unit of interest with a set field/variable name. It also restricts/defines the type of data allowed for entry for each data field, such as a string or numerics. The DD is uploaded into REDCap by every participating site/collaborator and ensures each site collects and codes the data the same way. Automation pipelines, such as eLAB, are designed to remap/clean and reformat data/units utilizing key-value look-up tables that filter and select only the labs/units of interest. eLAB ensures the data pulled from the EHR contains the correct unit and format pre-configured by the DD. The use of the same DD at every participating site ensures that the data field code, format, and relationships in the database are uniform across each site to allow for the simple aggregation of the multi-site data. For example, since every site in the MCCPR uses the same DD, aggregation is efficient and different site csv files are simply combined.
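
    Because every site exports against the same Data Dictionary, combining site files reduces to concatenating rows after confirming the headers match. A minimal sketch, assuming CSV exports with identical column order; the column names and sample rows are invented for illustration.

```python
import csv
import io


def combine_site_csvs(site_texts):
    """Concatenate rows from several site CSV exports that share one header."""
    header = None
    combined = []
    for text in site_texts:
        reader = csv.DictReader(io.StringIO(text))
        if header is None:
            header = reader.fieldnames
        elif reader.fieldnames != header:
            raise ValueError("site export does not match the shared data dictionary")
        combined.extend(reader)
    return header, combined


# Made-up exports from two sites using the same (hypothetical) columns.
site_a = "record_id,lab,value\n1,potassium,4.1\n"
site_b = "record_id,lab,value\n7,potassium,3.9\n"
header, rows = combine_site_csvs([site_a, site_b])
```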

    Study Cohort

    This study was approved by the MGB IRB. Search of the EHR was performed to identify patients diagnosed with MCC between 1975 and 2021 (N=1,109) for inclusion in the MCCPR. Subjects diagnosed with primary cutaneous MCC between 2016 and 2019 (N=176) were included in the test cohort for exploratory studies of lab result associations with overall survival (OS) using eLAB.

    Statistical Analysis

    OS is defined as the time from date of MCC diagnosis to date of death. Data was censored at the date of the last follow-up visit if no death event occurred. Univariable Cox proportional hazard modeling was performed among all lab predictors. Due to the hypothesis-generating nature of the work, p-values were exploratory and Bonferroni corrections were not applied.
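
    The OS definition and censoring rule above translate into a (duration, event) pair per patient, which is the usual input to a Cox model. The sketch below is in Python rather than the study's R code; the function name and the sample dates are assumptions for illustration.

```python
from datetime import date


def overall_survival(diagnosis, death, last_follow_up):
    """Return (days, event): event is 1 for death, 0 for censoring at last follow-up."""
    if death is not None:
        return (death - diagnosis).days, 1
    return (last_follow_up - diagnosis).days, 0


# Made-up dates: one patient with a death event, one censored at last follow-up.
died = overall_survival(date(2016, 1, 1), date(2017, 1, 1), date(2017, 1, 1))
censored = overall_survival(date(2018, 3, 1), None, date(2019, 3, 1))
```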

  7. Hospital Inpatient Discharges (SPARCS De-Identified): Adult Prevention...

    • health.data.ny.gov
    • healthdata.gov
    application/rdfxml +5
    Updated Nov 18, 2024
    + more versions
    Cite
    New York State Department of Health (2024). Hospital Inpatient Discharges (SPARCS De-Identified): Adult Prevention Quality Indicators (PQI) by County: Beginning 2009 [Dataset]. https://health.data.ny.gov/w/iqp6-vdi4/fbc6-cypp?cur=cA84In-PrXD
    Available download formats
    json, tsv, application/rdfxml, xml, csv, application/rssxml
    Dataset updated
    Nov 18, 2024
    Dataset authored and provided by
    New York State Department of Health
    Description

    This is one of two datasets that contain observed and expected rates for Agency for Healthcare Research and Quality Prevention Quality Indicators – Adult (AHRQ PQI) beginning in 2009. This dataset is at the county level. The Agency for Healthcare Research and Quality (AHRQ) Prevention Quality Indicators (PQIs) are a set of population based measures that can be used with hospital inpatient discharge data to identify ambulatory care sensitive conditions. These are conditions where 1) the need for hospitalization is potentially preventable with appropriate outpatient care, or 2) conditions that could be less severe if treated early and appropriately. All PQIs apply only to adult populations (over the age of 18 years). The rates were calculated using Statewide Planning and Research Cooperative System (SPARCS) inpatient data and Claritas population information.

    The observed and expected rates for each AHRQ PQI are presented by either resident county (including a statewide total) or resident zip code (including a statewide total).

  8. Data dictionary for the ACTORDS 20-year follow-up study - Dataset -...

    • catalogue.data.govt.nz
    Updated Apr 4, 2025
    Cite
    (2025). Data dictionary for the ACTORDS 20-year follow-up study - Dataset - data.govt.nz - discover and use data [Dataset]. https://catalogue.data.govt.nz/dataset/oai-figshare-com-article-28732205
    Dataset updated
    Apr 4, 2025
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Metadata (data dictionary) and statistical analysis plan (including outcomes definitions for data dictionary) for the ACTORDS 20-year follow-up study. The DOI for the primary study publication will be added when available. Data and associated documentation for participants who have consented to future re-use of their data are available to other users under the data sharing arrangements provided by the University of Auckland’s Human Health Research Services (HHRS) platform (https://research-hub.auckland.ac.nz/subhub/human-health-research-services-platform). The data dictionary and metadata are published on the University of Auckland’s data repository Figshare, which allocates a DOI and thus makes these details searchable and available indefinitely. Researchers are able to use this information and the provided contact address (dataservices@auckland.ac.nz) to request a de-identified dataset through the HHRS Data Access Committee. Data will be shared with researchers who provide a methodologically sound proposal and have appropriate ethical approval, where necessary, to achieve the research aims in the approved proposal. Data requestors are required to sign a Data Access Agreement that includes commitments to use the data only for the specified proposal, not to attempt to identify any individual participant, to store and use the data securely, and to destroy or return the data after completion of the project. The HHRS platform reserves the right to charge a fee to cover the costs of making data available, if needed, for data requests that require additional work to prepare.

  9. Hospital Inpatient Discharges (SPARCS De-Identified): Pediatric Prevention...

    • health.data.ny.gov
    • healthdata.gov
    application/rdfxml +5
    Updated Nov 18, 2024
    + more versions
    Cite
    New York State Department of Health (2024). Hospital Inpatient Discharges (SPARCS De-Identified): Pediatric Prevention Quality Indicators (PDI) by Patient County: Beginning 2009 [Dataset]. https://health.data.ny.gov/d/vh2s-8wb2
    Available download formats
    json, xml, csv, application/rssxml, application/rdfxml, tsv
    Dataset updated
    Nov 18, 2024
    Dataset authored and provided by
    New York State Department of Health
    Description

    The dataset contains observed, expected, and risk-adjusted rates for the Agency for Healthcare Research and Quality Pediatric Quality Indicators – Pediatric (AHRQ PDI) beginning in 2009. The AHRQ PDIs are a set of population based measures that can be used with hospital inpatient discharge data to identify ambulatory care sensitive conditions. These are conditions where 1) the need for hospitalization is potentially preventable with appropriate outpatient care, or 2) conditions that could be less severe if treated early and appropriately. Both the Urinary Tract Infection and Gastroenteritis PDIs include admissions for patients aged 3 months through 17 years. The asthma PDI includes admissions for patients aged 2 through 17 years. Eligible admissions for the Diabetes Short-term Complications PDI include admissions for patients aged 6 through 17 years.

    The rates were calculated using Statewide Planning and Research Cooperative System (SPARCS) inpatient data and Claritas population information.

    The observed, expected, risk-adjusted rates, and difference in rates, for each AHRQ PDI are presented by resident county (including a statewide total).

  10. Data from: Medical data formatting to improve physician interpretation speed...

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1more
    zip
    Updated Jun 13, 2022
    Cite
    Jacob Peterson (2022). Medical data formatting to improve physician interpretation speed in the military healthcare system [Dataset]. http://doi.org/10.5061/dryad.mkkwh712w
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 13, 2022
    Dataset provided by
    United States Department of the Navy
    Authors
    Jacob Peterson
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Objective: The purpose of this project was to improve the ease and speed of physician comprehension when interpreting daily laboratory data for patients admitted within the Military Healthcare System (MHS).

    Materials and Methods: A JavaScript program was created to convert the laboratory data obtained via the outpatient electronic medical record (EMR) into a “fishbone diagram” format that is familiar to most physicians. Using a balanced crossover design, 35 internal medicine trainees and staff were asked to complete timed comprehension tests for laboratory data sets formatted in the outpatient EMR’s format and in fishbone diagram format. The number of responses per second and error rate per response were measured for each format. Participants were asked to rate relative ease of use for each format and indicate which format they preferred.

    Results: Comprehension speed increased 37% (6.28 seconds per interpretation) with the fishbone diagram format with no observed increase in errors. Using a Likert scale of 1 to 5 (1 being hard, 5 easy), participants indicated the new format was easier to use (4.14 for fishbone vs 2.14 for table) with 89% expressing a preference for the new format.

    Discussion: The publicly available web application that converts tabular lab data to fishbone diagram format is currently used 10,000-12,000 times per month across the MHS, delivering significant benefit to the enterprise in terms of time saved and improved physician experience.

    Conclusions: This study supports the use of fishbone diagram formatting for laboratory data for inpatients within the MHS.

    Methods

    Study Design: De-identified chemistry and hematology results were presented to participants using the two data formats (tabular and fishbone diagram) along with questionnaires requesting the identification of individual values and trends. Participants completed the two questionnaires in a balanced crossover experiment. After completing both questionnaires, participants were asked to complete a 3-question survey rating perceived ease of use and indicating an overall preference for one of the data formats.

    Participants: A total of 35 participants were recruited at a daily internal medicine residency didactic session. Participants were asked to abstain if they were unfamiliar with either data format.

    Patient Cases: Each laboratory data format was applied to a pair of basic metabolic panels (BMP) and a pair of complete blood counts (CBC) labeled as being from sequential days (one CBC and BMP for each day). The laboratory data were identical in quantity and type of information, but the individual result values used for each data format differed.

    Procedure: Before the study, every participant was informed about the project and confirmed familiarity with both data formats. Participants were each given both questionnaires (one for each data format) and a survey with the lab data hidden by a cover sheet. Participants were informed they would have 60 seconds to answer as many questions as possible about the data set provided and then would answer a set of questions about a set of data. The questions were designed so that each questionnaire requested identical cognitive tasks in the same order. For example, question three asked to identify a trend on both questionnaires, but one questionnaire asked about anemia and the other about renal dysfunction. The study materials were distributed randomly but were prepared such that 50% of participants had the questionnaire with data formatted using a table as the first questionnaire. The remaining 50% started with the questionnaire with data formatted using fishbone diagrams. Participants completed the two questionnaires in the assigned order and then completed a three-question survey.

    Outcome Measures: Responses were graded manually, with incorrect or partially correct answers both counted as erroneous interpretations. Omitted questions, which were rare, were not considered to have undergone interpretation and were counted neither towards total interpretations nor as erroneous. For each questionnaire, the number of questions answered and the number of errors committed were recorded. For the survey results, the ratings for ease of use (1-5 on a Likert scale with 5 being easy) were recorded for each data format. The data format preference of each participant was also recorded.
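
    The outcome measures described above can be illustrated with a minimal Python sketch. It is not the study's own scoring code: the grade labels, the `QuestionnaireScore` type, and the `score_questionnaire` function are hypothetical names chosen for illustration. The only rules taken from the description are that partially correct answers count as errors, that omitted questions are excluded from both totals, and that each questionnaire was timed at 60 seconds.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Hypothetical grade labels; the dataset description does not name them.
CORRECT, PARTIAL, INCORRECT = "correct", "partial", "incorrect"


@dataclass
class QuestionnaireScore:
    interpretations: int         # answered questions only
    errors: int                  # incorrect or partially correct answers
    responses_per_second: float  # interpretations / time allowed
    error_rate: float            # errors per interpretation


def score_questionnaire(
    grades: Sequence[Optional[str]], seconds: float = 60.0
) -> QuestionnaireScore:
    """Score one timed questionnaire; None marks an omitted question."""
    # Omitted questions count neither as interpretations nor as errors.
    answered = [g for g in grades if g is not None]
    # Partially correct answers are treated the same as incorrect ones.
    errors = sum(1 for g in answered if g in (PARTIAL, INCORRECT))
    n = len(answered)
    return QuestionnaireScore(
        interpretations=n,
        errors=errors,
        responses_per_second=n / seconds,
        error_rate=errors / n if n else 0.0,
    )


if __name__ == "__main__":
    score = score_questionnaire([CORRECT, PARTIAL, None, INCORRECT, CORRECT])
    print(score)
```

    Scoring each participant once per data format and comparing the paired results would follow the crossover design, though the statistical comparison itself is not specified in the description.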

  11. Data from: Dataset for the study: Qualitative Assessment of the Florida...

    • scholarship.miami.edu
    zip
    Updated Nov 8, 2024
    Cite
    Anicca Liu; Rachel Nicole Waldman; Jose Szapocznik (2024). Dataset for the study: Qualitative Assessment of the Florida COVID-19 Response [Dataset]. https://scholarship.miami.edu/esploro/outputs/dataset/Dataset-for-the-study-Qualitative-Assessment/991032454018802976
    Explore at:
    Available download formats: zip (186024 bytes)
    Dataset updated
    Nov 8, 2024
    Dataset provided by
    University of Miami Libraries
    Authors
    Anicca Liu; Rachel Nicole Waldman; Jose Szapocznik
    Time period covered
    2024
    Area covered
    Florida
    Description

    Data from this collection include de-identified notes and transcripts from semi-structured interviews with Florida stakeholders. Between 25 January 2021 and 7 December 2022, twenty-five interviews were conducted with twenty-seven former and current leaders from government (Florida Legislature, Florida Department of Health, and Florida Division of Emergency Management), academia, and the private sector (disaster management and hospitality industry). Data include reflections from participants on the challenges encountered during COVID-19 and considerations for what should be done for future pandemics in Florida. To protect participant confidentiality, a comprehensive de-identification process was implemented. This included the provision of IDs to conceal participant identity. Personal identifiers, such as names, locations, and organizational affiliations, were redacted from the data. Other potentially identifying details (e.g., size of staff and participants’ role) were also omitted from interview notes and transcripts. Redacted sections are indicated by brackets (e.g., “[redacted]”, “[redacted number]” or “[…]”) to identify where information has been omitted. Full access to the transcripts is restricted to excerpts to uphold participant privacy.

  12. A Gold Standard Corpus for Activity Information (GoSCAI)

    • zenodo.org
    Updated May 30, 2025
    Cite
    Zenodo (2025). A Gold Standard Corpus for Activity Information (GoSCAI) [Dataset]. http://doi.org/10.5281/zenodo.15528545
    Explore at:
    Dataset updated
    May 30, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Description

    A Gold Standard Corpus for Activity Information

    Dataset Title: A Gold Standard Corpus for Activity Information (GoSCAI)

    Dataset Curators: The Epidemiology & Biostatistics Section of the NIH Clinical Center Rehabilitation Medicine Department

    Dataset Version: 1.0 (May 16, 2025)

    Dataset Citation and DOI: NIH CC RMD Epidemiology & Biostatistics Section. (2025). A Gold Standard Corpus for Activity Information (GoSCAI) [Data set]. Zenodo. doi: 10.5281/zenodo.15528545

    EXECUTIVE SUMMARY

    This data statement is for a gold standard corpus of de-identified clinical notes that have been annotated for human functioning information based on the framework of the WHO's International Classification of Functioning, Disability and Health (ICF). The corpus includes 484 notes from a single institution within the United States written in English in a clinical setting. This dataset was curated for the purpose of training natural language processing models to automatically identify, extract, and classify information on human functioning at the whole-person, or activity, level.

    CURATION RATIONALE

    This dataset is curated to be a publicly available resource for the development and evaluation of methods for the automatic extraction and classification of activity-level functioning information as defined in the ICF. The goals of data curation are to 1) create a corpus of a size that can be manually deidentified and annotated, 2) maximize the density and diversity of functioning information of interest, and 3) allow public dissemination of the data.

    LANGUAGE VARIETIES

    Language Region: en-US

    Prose Description: English as written by native and bilingual English speakers in a clinical setting

    LANGUAGE USER DEMOGRAPHIC

    The language users represented in this dataset are medical and clinical professionals who work in a research hospital setting. These individuals hold professional degrees corresponding to their respective specialties. Specific demographic characteristics of the language users such as age, gender, or race/ethnicity were not collected.

    ANNOTATOR DEMOGRAPHIC

    The annotator group consisted of five people, 33 to 76 years old, including four females and one male. Socioeconomically, they came from the middle and upper-middle income classes. Regarding first language, three annotators had English as their first language, one had Chinese, and one had Spanish. Proficiency in English, the language of the data being annotated, was native for three of the annotators and bilingual for the other two. The annotation team included clinical rehabilitation domain experts with backgrounds in occupational therapy, physical therapy, and individuals with public health and data science expertise. Prior to annotation, all annotators were trained on the specific annotation process using established guidelines for the given domain, and annotators were required to achieve a specified proficiency level prior to annotating notes in this corpus.

    LINGUISTIC SITUATION AND TEXT CHARACTERISTICS

    The notes in the dataset were written as part of clinical care within a U.S. research hospital between May 2008 and November 2019. These notes were written by health professionals asynchronously following the patient encounter to document the interaction and support continuity of care. The intended audience of these notes was clinicians involved in the patients' care. The included notes come from nine disciplines: neuropsychology, occupational therapy, physical medicine (physiatry), physical therapy, psychiatry, recreational therapy, social work, speech language pathology, and vocational rehabilitation. The notes were curated to support research on natural language processing for functioning information between 2018 and 2024.

    PREPROCESSING AND DATA FORMATTING

    The final corpus was derived from a set of clinical notes extracted from the hospital electronic medical record (EMR) for the purpose of clinical research. The original data are character-based digital content. We work in ASCII 8 or Unicode encoding, and therefore part of our preprocessing includes running encoding detection and transformation from encodings such as Windows-1252 or ISO-8859 to our preferred format.
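
    The encoding step can be sketched in Python as follows. This is an illustration only, not the curators' pipeline: the function names `decode_note` and `to_utf8_bytes` are hypothetical, UTF-8 is assumed as the Unicode target, and "detection" is approximated by trying strict decoders in order, since the description does not say which detection tool was used.

```python
from typing import Sequence


def decode_note(
    raw: bytes, fallbacks: Sequence[str] = ("cp1252", "latin-1")
) -> str:
    """Decode raw note bytes, trying UTF-8 first and legacy encodings next.

    cp1252 is Python's name for Windows-1252 and latin-1 for ISO-8859-1.
    The fallback order is an assumption made for this sketch.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass
    for encoding in fallbacks:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    raise ValueError("could not decode note with the configured encodings")


def to_utf8_bytes(raw: bytes) -> bytes:
    """Re-encode a note as UTF-8 regardless of its source encoding."""
    return decode_note(raw).encode("utf-8")


if __name__ == "__main__":
    legacy = "patient\u2019s gait".encode("cp1252")  # curly apostrophe, byte 0x92
    print(decode_note(legacy))
    print(to_utf8_bytes(legacy))
```

    Trying decoders in order is a heuristic: a legacy byte sequence can occasionally be valid UTF-8 by coincidence, so a statistical detector may be preferable on a large corpus.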

    On the larger corpus, we applied sampling to match our curation rationale. Given the resource constraints of manual annotation, we set out to create a dataset of 500 clinical notes, which would exclude notes over 10,000 characters in length.

    To promote density and diversity, we used five note characteristics as sampling criteria. We used the text length as expressed in number of characters. Next, we considered the discipline group, which is derived from note type metadata and describes which discipline a note originated from: occupational and vocational therapy (OT/VOC), physical therapy (PT), recreation therapy (RT), speech and language pathology (SLP), social work (SW), or miscellaneous (MISC, including psychiatry, neurology and physiatry). These disciplines were selected for collecting the larger corpus because their notes are likely to include functioning information. Existing information extraction tools were used to obtain annotation counts in four areas of functioning and provided a note’s annotation count, annotation density (annotation count divided by text length), and domain count (number of domains with at least 1 annotation).

    We used stratified sampling across the 6 discipline groups to ensure discipline diversity in the corpus. Because of low availability, 50 notes were sampled from SLP with relaxed criteria, and 90 notes each from the 5 other discipline groups with stricter criteria. Sampled SLP notes were those with the highest annotation density that had an annotation count of at least 5 and a domain count of at least 2. Other notes were sampled by highest annotation count and lowest text length, with a minimum annotation count of 15 and minimum domain count of 3.
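
    The sampling rules above can be expressed as a short Python sketch. It is a reading of the description under stated assumptions, not the curators' code: the `Note` type, the `QUOTAS` table, and `sample_group` are hypothetical names, and the way "highest annotation count and lowest text length" are combined (count first, length as a tie-breaker) is an assumption, since the description does not specify it.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List

MAX_CHARS = 10_000  # notes over 10,000 characters are excluded

# Per-group quotas from the description: 50 SLP notes, 90 for each other group.
QUOTAS: Dict[str, int] = {
    "SLP": 50, "OT/VOC": 90, "PT": 90, "RT": 90, "SW": 90, "MISC": 90,
}


@dataclass(frozen=True)
class Note:
    note_id: str
    discipline_group: str
    text_length: int       # number of characters, assumed to be positive
    annotation_count: int  # from automatic extraction tools
    domain_count: int      # domains with at least one annotation

    @property
    def annotation_density(self) -> float:
        return self.annotation_count / self.text_length


def sample_group(notes: Iterable[Note], group: str, n: int) -> List[Note]:
    """Pick up to n notes from one discipline group using the stated criteria."""
    pool = [
        x for x in notes
        if x.discipline_group == group and x.text_length <= MAX_CHARS
    ]
    if group == "SLP":
        # Relaxed criteria: count >= 5, domains >= 2, ranked by density.
        eligible = [
            x for x in pool if x.annotation_count >= 5 and x.domain_count >= 2
        ]
        eligible.sort(key=lambda x: -x.annotation_density)
    else:
        # Stricter criteria: count >= 15, domains >= 3.
        eligible = [
            x for x in pool if x.annotation_count >= 15 and x.domain_count >= 3
        ]
        # Assumed ordering: highest count first, shorter text breaks ties.
        eligible.sort(key=lambda x: (-x.annotation_count, x.text_length))
    return eligible[:n]


if __name__ == "__main__":
    demo = [
        Note("a", "PT", 1000, 20, 3),
        Note("b", "PT", 800, 20, 3),
        Note("e", "SLP", 1000, 5, 2),
        Note("f", "SLP", 500, 6, 2),
    ]
    for group in ("PT", "SLP"):
        print(group, [x.note_id for x in sample_group(demo, group, QUOTAS[group])])
```

    The quotas sum to the 500-note target stated in the description; the released corpus contains 484 notes.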

    The notes in the resulting sample included certain types of PHI and PII. To prepare for public dissemination, all sensitive or potentially identifying information was manually annotated in the notes and replaced with substituted content, preserving readability and enough context for machine learning without exposing any sensitive information. This de-identification effort was manually reviewed to ensure no PII or PHI exposure and to correct any resulting readability issues. Notes about pediatric patients were excluded. No attempt was made to sample multiple notes from the same patient. No metadata is provided to group notes other than by note type, discipline, or discipline group. The dataset is not organized beyond the provided metadata, but publications about models trained on this dataset should include information on the train/test splits used.

    All notes were sentence-segmented and tokenized using the spaCy en_core_web_lg model with additional rules for sentence segmentation customized to the dataset. Notes are stored in an XML format readable by the GATE annotation software (https://gate.ac.uk/family/developer.html), which stores annotations separately in annotation sets.

    CAPTURE QUALITY

    As the clinical notes were extracted directly from the EMR in text format, the capture quality was determined to be high. The clinical notes did not have to be converted from other data formats, which means this dataset is free from noise introduced by conversion processes such as optical character recognition.

    LIMITATIONS

    Because of the effort required to manually deidentify and annotate notes, this corpus is limited in terms of size and representation. The curation decisions skewed note selection towards specific disciplines and note types to increase the likelihood of encountering information on functioning. Some subtypes of functioning occur infrequently in the data, or not at all. The deidentification of notes was done in a manner to preserve natural language as it would occur in the notes, but some information is lost, e.g. on rare diseases.

    METADATA

    Information on the manual annotation process is provided in the annotation guidelines for each of the four domains:

    - Communication & Cognition (https://zenodo.org/records/13910167)

    - Mobility (https://zenodo.org/records/11074838)

    - Self-Care & Domestic Life (SCDL) (https://zenodo.org/records/11210183)

    - Interpersonal Interactions & Relationships (IPIR) (https://zenodo.org/records/13774684)

    Inter-annotator agreement was established on development datasets described in the annotation guidelines prior to the annotation of this gold standard corpus.

    The gold standard corpus consists of 484 documents, which include 35,147 sentences in total. The distribution of annotated information is provided in the table below.

    Domain | Number of Annotated Sentences | % of All Sentences | Mean Number of Annotated Sentences per Document
    Communication & Cognition | 6033 | 17.2% | …

  13. Data Security Governance Platform Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Feb 14, 2025
    Cite
    Data Insights Market (2025). Data Security Governance Platform Report [Dataset]. https://www.datainsightsmarket.com/reports/data-security-governance-platform-1935261
    Explore at:
    Available download formats: ppt, pdf, doc
    Dataset updated
    Feb 14, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global market for Data Security Governance Platforms is projected to reach USD XXX million by 2033, exhibiting a CAGR of XX% during the forecast period 2023-2033. The increasing need for data protection and regulatory compliance, together with privacy concerns, is driving market growth. Organizations face challenges in managing and securing vast amounts of sensitive data, which has led to the adoption of data security governance platforms. These platforms provide a centralized approach to data management, enabling organizations to classify, discover, and de-identify sensitive data to ensure its protection. The market is segmented based on application into data classification, data discovery, data de-identification, and others. Data classification and data discovery account for the largest share of the market due to their importance in identifying and protecting sensitive data. Based on type, the market is segmented into cloud-based and local deployment. Cloud-based solutions are gaining popularity due to their scalability, flexibility, and cost-effectiveness. The major players in the market include ASG Technologies, Ataccama, Collibra, erwin by Quest Software, IBM, Informatica, Io-Tahoe, OvalEdge, SAP, Hillstone Networks, Beijing Esensoft, Venustech Group, Beijing HaiTai FangYuan, AsiaInfo, and Quanzhi. The market is highly concentrated, with a few key players among those listed above accounting for a significant portion of the market share. Innovation in the data security governance platform market is shaped by the increased adoption of cloud computing and big data, which has led to a growing need for these platforms. They help organizations manage and protect their data assets, comply with regulations, and improve their overall security posture. The growing threat of cyberattacks is also driving demand: these platforms can help organizations detect and respond to cyber threats and minimize the impact of data breaches.

  14. Data from: On the Challenges of Developing a Concise Questionnaire to...

    • tudatalib.ulb.tu-darmstadt.de
    Updated 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Biselli, Tom; Steinbrink, Enno; Herbert, Franziska; Schmidbauer-Wolf, Gina-Maria; Reuter, Christian (2021). On the Challenges of Developing a Concise Questionnaire to Identify Privacy Personas [Data Set] [Dataset]. https://tudatalib.ulb.tu-darmstadt.de/handle/tudatalib/3490
    Explore at:
    Dataset updated
    2021
    Authors
    Biselli, Tom; Steinbrink, Enno; Herbert, Franziska; Schmidbauer-Wolf, Gina-Maria; Reuter, Christian
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    This repository contains the survey data set and the script for the statistical analysis aiming at identifying privacy personas. Please cite the original paper when using this data: "Biselli, T., Steinbrink, E., Herbert, F., Schmidbauer-Wolf, G. M., & Reuter, C. (2022). On the Challenges of Developing a Concise Questionnaire to Identify Privacy Personas. Proceedings on Privacy Enhancing Technologies, 4(2022), 645-669."

  15. Table_1_Effects of sex, age, and body mass index on serum bicarbonate.DOCX

    • frontiersin.figshare.com
    docx
    Updated Jul 20, 2023
    + more versions
    Cite
    Daisy Duan; Jamie Perin; Adam Osman; Francis Sgambati; Lenise J. Kim; Luu V. Pham; Vsevolod Y. Polotsky; Jonathan C. Jun (2023). Table_1_Effects of sex, age, and body mass index on serum bicarbonate.DOCX [Dataset]. http://doi.org/10.3389/frsle.2023.1195823.s002
    Explore at:
    Available download formats: docx
    Dataset updated
    Jul 20, 2023
    Dataset provided by
    Frontiers
    Authors
    Daisy Duan; Jamie Perin; Adam Osman; Francis Sgambati; Lenise J. Kim; Luu V. Pham; Vsevolod Y. Polotsky; Jonathan C. Jun
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Rationale: Obesity hypoventilation syndrome (OHS) is often underdiagnosed, with significant morbidity and mortality. Bicarbonate, as a surrogate of arterial carbon dioxide, has been proposed as a screening tool for OHS. Understanding the predictors of serum bicarbonate could provide insights into risk factors for OHS. We hypothesized that the bicarbonate levels would increase with an increase in body mass index (BMI), since the prevalence of OHS increases with obesity.

    Methods: We used the TriNetX Research Network, an electronic health record database with de-identified clinical data from participating healthcare organizations across the United States, to identify 93,320 adults without pulmonary or advanced renal diseases who had serum bicarbonate and BMI measurements within 6 months of each other between 2017 and 2022. We used linear regression analysis to examine the associations between bicarbonate and BMI, age, and their interactions for the entire cohort and stratified by sex. We also applied a non-linear machine learning algorithm (XGBoost) to examine the relative importance of age, BMI, sex, race/ethnicity, and obstructive sleep apnea (OSA) status on bicarbonate.

    Results: The cohort was 56% women, 72% white, and 80% non-Hispanic, with an average (SD) age of 49.4 (17.9) years and a BMI of 29.1 (6.1) kg/m². The mean bicarbonate was 24.8 (2.8) mmol/L, with higher levels in men (mean 25.2 mmol/L) than in women (mean 24.4 mmol/L). We found a small negative association between bicarbonate and BMI, with an expected change of −0.03 mmol/L in bicarbonate for each 1 kg/m² increase in BMI (p < 0.001), in the entire cohort and both sexes. We found sex differences in the bicarbonate trajectory with age, with women exhibiting lower bicarbonate values than men until age 50, after which the bicarbonate levels were modestly higher. The non-linear machine learning algorithm similarly revealed that age and sex played larger roles in determining bicarbonate levels than the BMI or OSA status.

    Conclusion: Contrary to our hypothesis, BMI is not associated with elevated bicarbonate levels, and age modifies the impact of sex on bicarbonate.

  16. Comprehensive Epidemiologic Data Resource

    • cmr.earthdata.nasa.gov
    Updated Apr 21, 2017
    + more versions
    Cite
    (2017). Comprehensive Epidemiologic Data Resource [Dataset]. https://cmr.earthdata.nasa.gov/search/concepts/C1214604829-SCIOPS
    Explore at:
    Dataset updated
    Apr 21, 2017
    Time period covered
    Jan 1, 1970 - Present
    Description

    The Comprehensive Epidemiologic Data Resource (CEDR) is the U.S. Department of Energy’s (DOE) electronic database comprised of health studies of DOE contract workers and environmental studies of areas surrounding DOE facilities. DOE recognizes the benefits of data sharing and supports the public’s right to know about worker and community health risks. CEDR provides independent researchers and the public with access to de-identified data collected since the Department’s early production years. CEDR’s holdings include more than 80 studies of more than one million workers. CEDR is a national user facility, with a large audience for data that are not available elsewhere.

    Most of CEDR’s holdings are derived from epidemiologic studies of DOE workers at many large nuclear weapons plants, such as Hanford, Los Alamos, Oak Ridge, Savannah River Site, and Rocky Flats. These studies primarily use death certificate information to identify excess deaths and patterns of disease among workers to determine what factors contribute to the risk of developing cancer and other illnesses. In addition, many of these studies have radiation exposure measurements on individual workers. Other CEDR collections include historical dose reconstruction studies of past offsite radiologic and chemical exposures around the nuclear weapons facilities. Now a mature system in routine operational use, CEDR’s modern, Internet-based systems respond to thousands of requests to its Web server daily.

    CEDR’s library of information, reports, journal articles, and data includes nearly 10,000 citations/documents. CEDR’s bibliographic search feature allows the user to select citations or publications associated with the studies found in the CEDR library.

    CEDR’s data collection -- There are two types of data derived from epidemiologic studies:

    1) Analytic data files: contain the data that a researcher directly used in conducting the analyses and result in reported findings or publication in a peer-reviewed journal. CEDR’s holdings include more than 200 analytic files.

    2) Working data files: files that contain the raw or unedited data from which a researcher selected variables to form an initial analytic data file set. The data in the working data files may contain errors; as such, it is recommended that they be analyzed and results interpreted with caution. There are more than 100 working data files in CEDR’s holdings.

  17. Hospital Inpatient Discharges (SPARCS De-Identified): In-Hospital/30-Day...

    • healthdata.gov
    • data.ny.gov
    application/rdfxml +5
    Updated Apr 8, 2025
    Cite
    health.data.ny.gov (2025). Hospital Inpatient Discharges (SPARCS De-Identified): In-Hospital/30-Day Acute Stroke Mortality Rates by Hospital: Beginning 2013 [Dataset]. https://healthdata.gov/State/Hospital-Inpatient-Discharges-SPARCS-De-Identified/fk3s-8n8z
    Explore at:
    Available download formats: xml, application/rdfxml, csv, application/rssxml, json, tsv
    Dataset updated
    Apr 8, 2025
    Dataset provided by
    health.data.ny.gov
    Description

    The dataset contains hospital stroke designation and Coverdell registry participation status, acute stroke discharge counts (numerators, denominators), and observed, expected, and risk-adjusted acute stroke in-hospital/30-day post-admission mortality rates with corresponding 95% confidence intervals. Risk adjustment of the mortality rates was based on the methodology developed by the New York State Department of Health.

    The purpose of this data set is reporting of hospital-specific risk adjusted acute stroke mortality rates (RAMR) to inform hospitals, to aid initiatives to improve hospital quality performance and measurement, and to identify performance outliers for public reporting.

  18. Collaboratory Data on Community Engagement & Public Service in Higher...

    • openicpsr.org
    Updated Mar 30, 2021
    + more versions
    Cite
    Kristin D. Medlin; Matthew Seto (2021). Collaboratory Data on Community Engagement & Public Service in Higher Education [Dataset]. http://doi.org/10.3886/E136322V1
    Explore at:
    Dataset updated
    Mar 30, 2021
    Dataset provided by
    Collaboratory
    Authors
    Kristin D. Medlin; Matthew Seto
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    Collaboratory is a software product developed and maintained by HandsOn Connect Cloud Solutions. It is intended to help higher education institutions accurately and comprehensively track their relationships with the community through engagement and service activities. Institutions that use Collaboratory are given the option to opt in to a data sharing initiative at the time of onboarding, which grants us permission to de-identify their data and make it publicly available for research purposes. HandsOn Connect is committed to making Collaboratory data accessible to scholars for research, toward the goal of advancing the field of community engagement and social impact.

    Collaboratory is not a survey, but is instead a dynamic software tool designed to facilitate comprehensive, longitudinal data collection on community engagement and public service activities conducted by faculty, staff, and students in higher education. We provide a standard questionnaire that was developed by Collaboratory’s co-founders (Janke, Medlin, and Holland) in the Institute for Community and Economic Engagement at UNC Greensboro, which continues to be closely monitored and adapted by staff at HandsOn Connect and academic colleagues. It includes descriptive characteristics (what, where, when, with whom, to what end) of activities and invites participants to periodically update their information in accordance with activity progress over time. Examples of individual questions include the focus areas addressed, populations served, on- and off-campus collaborators, connections to teaching and research, and location information, among others.

    The Collaboratory dataset contains data from 37 institutions beginning in March 2016 and continues to grow as more institutions adopt Collaboratory and continue to expand its use. The data represent over 3,600 published activities (and additional associated content) across our user base.

    Please cite this data as: Medlin, Kristin and Seto, Matthew. Dataset on Higher Education Community Engagement and Public Service Activities, 2016-2021. Collaboratory [producer], 2021. Ann Arbor, MI: Inter-university Consortium for Political and Social Research [distributor], 2021-11-01. doi: _v1. When you cite this data, please also include: ORIGINS PAPER CITATION

  19. Prescription Monitoring Program (PMP) Public Use Data

    • catalog.data.gov
    • data.wa.gov
    • +2 more
    Updated May 24, 2025
    Cite
    data.wa.gov (2025). Prescription Monitoring Program (PMP) Public Use Data [Dataset]. https://catalog.data.gov/dataset/prescription-monitoring-program-pmp-public-use-data
    Dataset updated
    May 24, 2025
    Dataset provided by
    data.wa.gov
    Description

    Washington’s PMP was created (RCW 70.225 (2007)) to improve patient care and to stop prescription drug misuse by collecting dispensing records for Schedule II, III, IV and V drugs, and by making the information available to medical providers and pharmacists as a patient care tool. Program rules, WAC 246-470, took effect August 27, 2011. The program started data collection from all dispensers October 7, 2011. Under RCW 70.225.040(5)(a), the department is authorized to publish public data after removing information that could be used directly or indirectly to identify individual patients, requestors, dispensers, prescribers, and persons who received prescriptions from dispensers. The data available here are de-identified, and exclude patient, prescriber, and dispenser related information in alignment with program rules WAC 246-470-080. No requestor information is available here. Prescriptions excluded from PMP include those dispensed outside of WA State, those prescribed for less than or equal to 24 hours, those administered or given to a patient in the hospital, and those dispensed from a Department of Corrections pharmacy (unless an offender is released with a prescription), an Opioid Treatment Program, and some federally operated pharmacies (Indian Health Services and Veterans Affairs report voluntarily since 2015). Further information on collection and management of PMP data at DOH can be found at www.doh.wa.gov/pmp/data.

  20. United States SB: DE: Outlook: FN: Identify & Hire Employees

    • ceicdata.com
    + more versions
    Cite
    CEICdata.com, United States SB: DE: Outlook: FN: Identify & Hire Employees [Dataset]. https://www.ceicdata.com/en/united-states/small-business-pulse-survey-by-state-south-region/sb-de-outlook-fn-identify--hire-employees
    Dataset provided by
    CEIC Data
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 27, 2021 - Apr 11, 2022
    Area covered
    United States
    Description

    United States SB: DE: Outlook: FN: Identify & Hire Employees data was reported at 37.000 % on 11 Apr 2022, a decrease from the previous value of 45.600 % on 04 Apr 2022. The series is updated weekly, with a median of 37.100 % from Nov 2021 to 11 Apr 2022 across 18 observations. It reached an all-time high of 48.900 % on 14 Mar 2022 and a record low of 28.100 % on 28 Feb 2022. The series remains in active status in CEIC and is reported by the U.S. Census Bureau. It is categorized under Global Database’s United States – Table US.S051: Small Business Pulse Survey: by State: South Region: Weekly, Beg Monday (Discontinued).

Cite
Dataintelo (2025). Data De-identification and Pseudonymity Software Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-data-de-identification-and-pseudonymity-software-market

Data De-identification and Pseudonymity Software Market Report | Global Forecast From 2025 To 2033

Explore at:
Available download formats: pptx, pdf, csv
Dataset updated
Jan 7, 2025
Dataset authored and provided by
Dataintelo
License

https://dataintelo.com/privacy-and-policy

Time period covered
2024 - 2032
Area covered
Global
Description

Data De-identification and Pseudonymity Software Market Outlook



The global data de-identification and pseudonymity software market is projected to grow significantly, reaching approximately USD 4.2 billion by 2032, driven primarily by increasing data privacy concerns and stringent regulatory requirements worldwide.



The primary growth factor in the data de-identification and pseudonymity software market is the surge in data breaches and cyber-attacks. With the exponential increase in data generation, organizations are more vulnerable to data breaches and unauthorized access. These security concerns have prompted businesses and governments to invest heavily in robust data protection solutions. Data de-identification and pseudonymity software provides a secure way to anonymize sensitive information, making it less susceptible to malicious activities. As data protection laws become more rigorous, the demand for such technologies will continue to rise, further propelling market growth.



Another significant factor contributing to market growth is the growing awareness of and emphasis on data privacy among consumers. In recent years, consumers have become increasingly aware of how their data is being used and the potential risks associated with data misuse. This heightened awareness has put pressure on organizations to adopt comprehensive data protection measures. Data de-identification and pseudonymity software offers a means to protect personal information while still allowing organizations to utilize data for analytics and decision-making. This dual benefit is a key driver for the adoption of these technologies across various sectors.



Moreover, regulatory compliance is a crucial driver for the market. Regulations such as the General Data Protection Regulation (GDPR) in Europe, the Health Insurance Portability and Accountability Act (HIPAA) in the United States, and various other data protection laws worldwide mandate stringent measures for data protection. Non-compliance can result in hefty fines and legal repercussions. Therefore, organizations are increasingly adopting data de-identification and pseudonymity software to ensure compliance with these regulations. The need for regulatory compliance is expected to sustain market growth in the foreseeable future.



Regionally, North America currently dominates the global data de-identification and pseudonymity software market, accounting for the largest market share. This is attributed to the presence of major technology players, stringent data protection regulations, and high adoption rates of advanced technologies in the region. Europe follows closely, with significant market contributions from countries such as Germany, France, and the UK, driven by robust regulatory frameworks like GDPR. The Asia Pacific region is also expected to witness substantial growth, fueled by rapid digitalization, increasing cybersecurity threats, and growing awareness about data privacy in countries like China, India, and Japan.



Data Masking Tools play a pivotal role in enhancing the security framework of organizations by providing an additional layer of protection for sensitive information. These tools are designed to obscure specific data within a dataset, ensuring that unauthorized users cannot access or decipher the original information. As businesses increasingly rely on data-driven insights, the need for robust data masking solutions becomes more critical. By employing data masking tools, organizations can safely share data across departments or with third-party vendors without compromising privacy. This capability is especially beneficial in industries such as healthcare and finance, where data privacy is paramount. The integration of data masking tools with existing data protection strategies can significantly reduce the risk of data breaches and ensure compliance with regulatory standards.



Component Analysis



The data de-identification and pseudonymity software market can be segmented by component into software and services. The software segment is anticipated to hold the lion's share due to the increasing adoption of data protection solutions across various industries. Software solutions provide automated tools for anonymizing and pseudonymizing data, ensuring compliance with regulatory standards. These solutions are essential for organizations aiming to mitigate the risks associated with data breaches and unauthorized access. As cyber threats continue to evolve, the demand for advanced software solutions is exp
