28 datasets found
  1. Data in Support of the MIDI-B Challenge (MIDI-B-Synthetic-Validation,...

    • cancerimagingarchive.net
    csv, dicom, n/a +1
    Updated May 2, 2025
    Cite
    The Cancer Imaging Archive (2025). Data in Support of the MIDI-B Challenge (MIDI-B-Synthetic-Validation, MIDI-B-Curated-Validation, MIDI-B-Synthetic-Test, MIDI-B-Curated-Test) [Dataset]. http://doi.org/10.7937/cf2p-aw56
    Explore at:
    Available download formats: sqlite and zip, dicom, csv, n/a
    Dataset updated
    May 2, 2025
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    May 2, 2025
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    Abstract

    These resources comprise a large and diverse collection of multi-site, multi-modality, and multi-cancer clinical DICOM images from 538 subjects infused with synthetic PHI/PII in areas encountered by TCIA curation teams. Also provided is a TCIA-curated version of the synthetic dataset, along with mapping files for mapping identifiers between the two.

    This new MIDI data resource includes DICOM datasets used in the Medical Image De-Identification Benchmark (MIDI-B) challenge at MICCAI 2024. They are accompanied by ground truth answer keys and a validation script for evaluating the effectiveness of medical image de-identification workflows. The validation script systematically assesses de-identified data against an answer key outlining appropriate actions and values for proper de-identification of medical images, promoting safer and more consistent medical image sharing.

    Introduction

    Medical imaging research increasingly relies on large-scale data sharing. However, reliable de-identification of DICOM images still presents significant challenges due to the wide variety of DICOM header elements and pixel data where identifiable information may be embedded. To address this, we have developed an openly accessible synthetic dataset containing artificially generated protected health information (PHI) and personally identifiable information (PII).

    These resources complement our earlier work (Pseudo-PHI-DICOM-data) hosted on The Cancer Imaging Archive. As an example of its use, we also provide a version curated by The Cancer Imaging Archive (TCIA) curation team. This resource builds upon best practices emphasized by the MIDI Task Group, which underscores the importance of transparency, documentation, and reproducibility in de-identification workflows, themes also featured at recent conferences (Synapse:syn53065760) and workshops (2024 MIDI-B Challenge Workshop).

    This framework enables objective benchmarking of de-identification performance, promotes transparency in compliance with regulatory standards, and supports the establishment of consistent best practices for sharing clinical imaging data. We encourage the research community to use these resources to enhance and standardize their medical image de-identification workflows.

    Methods

    Subject Inclusion and Exclusion Criteria

    The source data were selected from imaging already hosted in de-identified form on TCIA. Imaging containing faces was excluded, and no new human studies were performed for this project.

    Data Acquisition

    To build the synthetic dataset, image series were selected from TCIA’s curated datasets to represent a broad range of imaging modalities (CR, CT, DX, MG, MR, PT, SR, US), manufacturers (GE, Siemens, Varian, Confirma, Agfa, Eigen, Elekta, Hologic, KONICA MINOLTA, and others), scan parameters, and regions of the body. These were then processed to inject the synthetic PHI/PII as described below.

    Data Analysis

    Synthetic pools of PHI, such as subject and scanning-institution information, were generated using the Python package Faker (https://pypi.org/project/Faker/8.10.3/). These values were inserted into the DICOM metadata of selected imaging files using a system of inheritable, rule-based templates that specify functions for data insertion and logging for answer-key creation. Text was also burned into the pixel data of a subset of images. By systematically embedding realistic synthetic PHI into image headers and pixel data, accompanied by a detailed ground-truth answer key, the framework gives users transparency, documentation, and reproducibility in de-identification practices, aligned with the HIPAA Safe Harbor method, the DICOM PS3.15 Confidentiality Profiles, and TCIA best practices.
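    The exact injection templates are part of the dataset's internal tooling, but the general pattern can be sketched with Faker and pydicom. The tag list, helper name, and CSV log layout below are illustrative assumptions, not the MIDI-B pipeline itself.

    ```python
    # Hedged sketch: inject synthetic PHI into a DICOM header with Faker and
    # pydicom, logging each substitution so an answer key can be built later.
    # Tag selection and log format are assumptions for illustration only.
    import csv

    import pydicom
    from faker import Faker

    fake = Faker()
    Faker.seed(42)  # reproducible synthetic identities

    def inject_synthetic_phi(in_path: str, out_path: str, log_path: str) -> None:
        ds = pydicom.dcmread(in_path)
        replacements = {
            "PatientName": fake.name(),
            "PatientID": fake.bothify(text="??######"),
            "PatientBirthDate": fake.date_of_birth().strftime("%Y%m%d"),
            "InstitutionName": fake.company(),
        }
        with open(log_path, "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["keyword", "original", "synthetic"])
            for keyword, value in replacements.items():
                writer.writerow([keyword, getattr(ds, keyword, ""), value])
                setattr(ds, keyword, value)  # overwrite header element with synthetic value
        ds.save_as(out_path)

    # inject_synthetic_phi("original.dcm", "synthetic.dcm", "answer_key_log.csv")
    ```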

    Usage Notes

    This DICOM collection is split into two datasets, synthetic and curated. The synthetic dataset is the PHI/PII-infused DICOM collection, accompanied by a validation script and answer keys for testing, refining, and benchmarking medical image de-identification pipelines. The curated dataset is a version of the synthetic dataset curated and de-identified by members of The Cancer Imaging Archive curation team; it can be used as a guide and an example of medical image curation best practices. For the purposes of the de-identification challenge at MICCAI 2024, the synthetic and curated datasets each contain two subsets: one for Validation and one for Testing.

    To link a curated dataset to the original synthetic dataset and answer keys, a mapping between the unique identifiers (UIDs) and patient IDs must be provided to the evaluation software in CSV format. We include the mapping files associated with the TCIA-curated set as an example. Lastly, for both the Validation and Testing datasets, an answer key in sqlite.db format is provided. These components are for use with the Python validation script linked below (4). By combining these components, a user developing or evaluating de-identification methods can verify that they meet a specification for successfully de-identifying medical image data.
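    The answer-key schema and the official checks live in the linked validation script; the snippet below only sketches the plumbing implied above, i.e., loading the UID/patient-ID mapping from CSV and looking up expected de-identification actions in the SQLite answer key. The table and column names here are assumptions, not the published schema.

    ```python
    # Hedged sketch of combining the mapping CSV with the sqlite answer key.
    # The table name "answer_key" and all column names are assumed; the
    # authoritative logic is in the Python validation script referenced above.
    import csv
    import sqlite3

    def load_mapping(mapping_csv: str) -> dict:
        """Map curated (de-identified) patient IDs back to synthetic patient IDs."""
        with open(mapping_csv, newline="") as f:
            return {row["curated_id"]: row["synthetic_id"] for row in csv.DictReader(f)}

    def expected_actions(answer_key_db: str, synthetic_id: str) -> list:
        """Fetch the expected de-identification actions for one synthetic subject."""
        con = sqlite3.connect(answer_key_db)
        try:
            return con.execute(
                "SELECT tag, action, expected_value FROM answer_key WHERE patient_id = ?",
                (synthetic_id,),
            ).fetchall()
        finally:
            con.close()

    # mapping = load_mapping("uid_patient_mapping.csv")
    # print(expected_actions("answer_key.sqlite", mapping["CURATED-0001"]))
    ```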

  2. Data Quality Management Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Jun 16, 2025
    + more versions
    Cite
    Archive Market Research (2025). Data Quality Management Report [Dataset]. https://www.archivemarketresearch.com/reports/data-quality-management-558466
    Explore at:
    Available download formats: ppt, pdf, doc
    Dataset updated
    Jun 16, 2025
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Data Quality Management (DQM) market is experiencing robust growth, driven by the increasing volume and velocity of data generated across various industries. Businesses are increasingly recognizing the critical need for accurate, reliable, and consistent data to support critical decision-making, improve operational efficiency, and comply with stringent data regulations. The market is estimated to be valued at $15 billion in 2025, exhibiting a Compound Annual Growth Rate (CAGR) of 12% from 2025 to 2033. This growth is fueled by several key factors, including the rising adoption of cloud-based DQM solutions, the expanding use of advanced analytics and AI in data quality processes, and the growing demand for data governance and compliance solutions. The market is segmented by deployment (cloud, on-premises), organization size (small, medium, large enterprises), and industry vertical (BFSI, healthcare, retail, etc.), with the cloud segment exhibiting the fastest growth. Major players in the DQM market include Informatica, Talend, IBM, Microsoft, Oracle, SAP, SAS Institute, Pitney Bowes, Syncsort, and Experian, each offering a range of solutions catering to diverse business needs. These companies are constantly innovating to provide more sophisticated and integrated DQM solutions incorporating machine learning, automation, and self-service capabilities. However, the market also faces some challenges, including the complexity of implementing DQM solutions, the lack of skilled professionals, and the high cost associated with some advanced technologies. Despite these restraints, the long-term outlook for the DQM market remains positive, with continued expansion driven by the expanding digital transformation initiatives across industries and the growing awareness of the significant return on investment associated with improved data quality.

  3. Email Validation API Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Apr 14, 2025
    + more versions
    Cite
    Data Insights Market (2025). Email Validation API Report [Dataset]. https://www.datainsightsmarket.com/reports/email-validation-api-1390624
    Explore at:
    Available download formats: ppt, pdf, doc
    Dataset updated
    Apr 14, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Email Validation API market is experiencing robust growth, driven by the increasing need for businesses to maintain clean and accurate email lists for marketing and transactional communications. The market's expansion is fueled by several key factors: the rising adoption of email marketing strategies across various industries, a growing emphasis on data hygiene and compliance with regulations like GDPR and CCPA, and the increasing sophistication of email validation technologies. Segmentation reveals a significant portion of the market is dominated by large enterprises leveraging these APIs for bulk email validation and enhanced deliverability. However, the small and medium-sized enterprise (SME) segments are also demonstrating considerable growth, indicating a widespread adoption of email validation best practices across businesses of all sizes. The preferred formats show a diverse landscape with CSV, JSON, and TXT formats being commonly used, reflecting the flexibility required to integrate email validation seamlessly into existing workflows. The competitive landscape is dynamic, with numerous established and emerging players offering a range of features and pricing models. This makes it crucial for businesses to carefully evaluate different providers to find the best solution for their specific needs and budget. The projected Compound Annual Growth Rate (CAGR) suggests a consistently expanding market, implying continuous investment in email validation technologies. Geographic distribution shows a strong presence in North America and Europe, regions known for their advanced digital infrastructure and stringent data regulations. However, the Asia-Pacific region is expected to witness significant growth in the coming years, propelled by increasing internet penetration and rising adoption of digital marketing techniques. The challenges to market growth include the evolving nature of email service providers’ strategies, increasing concerns over data privacy, and the constant need for APIs to adapt to changing email landscape. Despite these challenges, the long-term outlook for the Email Validation API market remains positive, with significant opportunities for innovation and expansion across various geographic locations and business segments. The continued focus on improving email deliverability and maintaining data quality will fuel demand for these essential services for years to come.

  4. Data from: Demonstrating the reliability of in vivo metabolomics-based...

    • datasets.ai
    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    • +1more
    0
    Updated Mar 24, 2024
    Cite
    U.S. Environmental Protection Agency (2024). Demonstrating the reliability of in vivo metabolomics-based chemical grouping: Towards best practice [Dataset]. https://datasets.ai/datasets/demonstrating-the-reliability-of-in-vivo-metabolomics-based-chemical-grouping-towards-best
    Explore at:
    Available download formats: 0
    Dataset updated
    Mar 24, 2024
    Dataset authored and provided by
    U.S. Environmental Protection Agency
    Description

    The experimental metabolomics data generated during the current study are available in the MetaboLights repository under the identifier MTBLS8274. Portions of this dataset are inaccessible because: L:\Priv\Metabolomics. They can be accessed through the following means: L:\Priv\Metabolomics. Format: L:\Priv\Metabolomics.

    This dataset is associated with the following publication: Viant, M., E. Amstalden, T. Athersuch, M. Bouhifd, S. Camuzeaux, D. Crizer, P. Driemert, T. Ebbels, D. Ekman, B. Flick, V. Giri, M. Gómez-Romero, V. Haake, M. Herold, A. Kende, F. Lai, P. Leonards, P. Lim, G. Lloyd, J. Mosley, C. Namini, J. Rice, S. Romano, C. Sands, M. Smith, T. Sobansky, A. Southam, L. Swindale, B. van Ravenzwaay, T. Walk, R. Weber, F. Zickgraf, and H. Kamp. Demonstrating the reliability of in vivo metabolomics based chemical grouping: towards best practice. Archives of Toxicology. Springer, New York, NY, USA, 98: 1111-1123, (2024).

  5. Table 1_Development and validation a methodology model for traditional...

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    docx
    Updated Jan 31, 2025
    + more versions
    Cite
    Su Li; Luan Zhang; Yangyang Wang; Runsheng Xie; Wenjia Chen; Myeong Soo Lee; Yasser Sami Amer; Amin Sharifan; Heba Hussein; Hui Li (2025). Table 1_Development and validation a methodology model for traditional Chinese medicine good practice recommendation: an exploratory sequential mixed methods study.docx [Dataset]. http://doi.org/10.3389/fphar.2025.1501634.s001
    Explore at:
    Available download formats: docx
    Dataset updated
    Jan 31, 2025
    Dataset provided by
    Frontiers
    Authors
    Su Li; Luan Zhang; Yangyang Wang; Runsheng Xie; Wenjia Chen; Myeong Soo Lee; Yasser Sami Amer; Amin Sharifan; Heba Hussein; Hui Li
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: To develop a rational and standardized traditional Chinese medicine (TCM) good practice recommendation (GPR) methodology model that guides the formulation of recommendations grounded in clinical experience.
    Methods: We adopted an exploratory sequential mixed-methods design to develop a methodology model by coding systematically collected literature on methodology and TCM guidelines related to TCM GPRs using a best-fit framework synthesis. Then, based on real-world data (published TCM guidelines), saturation tests, structural rationality validation, and discriminability tests were conducted to validate the methodology model.
    Results: A total of 35 methodological publications and 190 TCM guidelines were included. A TCM GPR methodology model was developed, comprising 3 themes, 10 sub-themes, and the relationships between themes and sub-themes. The information in the TCM GPR methodology model achieved data saturation. The fit indices were within the acceptable range and were able to distinguish the overall differences between guidelines from different literature sources, development organizations, guideline types, discipline categories, and funding categories.
    Conclusion: The study developed a TCM GPR methodology model that describes what a TCM GPR is, how to formulate it, and how to report it. The methodology model demonstrates good fit, discriminability, and data saturation. It can standardize the specific formulation of TCM GPRs, facilitate their scientific and rational formation, and provide theoretical and methodological guidance for the formation of TCM GPRs.

  6. Best Practice Manual for Midwest Regional Carbon Sequestration Partnership...

    • data.amerigeoss.org
    pdf
    Updated Aug 9, 2019
    Cite
    Energy Data Exchange (2019). Best Practice Manual for Midwest Regional Carbon Sequestration Partnership Phase II Geologic Sequestration Field Validation Tests [Dataset]. https://data.amerigeoss.org/ja/dataset/final-best-practice-geologic-sequestration-manual-final
    Explore at:
    Available download formats: pdf (3610881)
    Dataset updated
    Aug 9, 2019
    Dataset provided by
    Energy Data Exchange
    License

    http://www.opendefinition.org/licenses/cc-by-sa

    Description

    Overview of best practices from MRCSP Phase II geologic sequestration field validation tests, addressing public acceptance, evaluation of qualified sites, initial characterization, reservoir simulations, permitting, CO2 supply and handling, well design and installation, and monitoring of injection operations.

  7. Global Thermal Validation System Market Industry Best Practices 2025-2032

    • statsndata.org
    excel, pdf
    Updated Jul 2025
    Cite
    Stats N Data (2025). Global Thermal Validation System Market Industry Best Practices 2025-2032 [Dataset]. https://www.statsndata.org/report/thermal-validation-system-market-315598
    Explore at:
    Available download formats: pdf, excel
    Dataset updated
    Jul 2025
    Dataset authored and provided by
    Stats N Data
    License

    https://www.statsndata.org/how-to-order

    Area covered
    Global
    Description

    The Thermal Validation System market plays a crucial role in ensuring the integrity and compliance of temperature-sensitive products across various industries, including pharmaceuticals, biotechnology, and food safety. These systems are essential for validating temperature-controlled environments, providing robust s

  8. Bulk Email Verification and Validation Service Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated May 18, 2025
    + more versions
    Cite
    Data Insights Market (2025). Bulk Email Verification and Validation Service Report [Dataset]. https://www.datainsightsmarket.com/reports/bulk-email-verification-and-validation-service-1447710
    Explore at:
    Available download formats: doc, ppt, pdf
    Dataset updated
    May 18, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global bulk email verification and validation service market is experiencing robust growth, driven by the increasing reliance on email marketing as a primary communication channel for businesses of all sizes. The market's expansion is fueled by a rising need to maintain high email deliverability rates, improve sender reputation, and ultimately enhance marketing ROI. Businesses are increasingly adopting sophisticated email verification solutions to combat issues like hard bounces, spam traps, and invalid email addresses, which negatively impact campaign effectiveness and brand credibility. The transition towards cloud-based solutions is a prominent trend, offering scalability, cost-effectiveness, and ease of integration with existing marketing automation platforms. While the on-premise segment still holds a significant share, particularly among large enterprises with stringent data security requirements, the cloud segment is projected to dominate market growth over the forecast period. This growth is further segmented by business size, with both large enterprises and SMEs actively investing in email verification solutions, though the adoption rate might be higher among larger organizations with more extensive email lists and marketing budgets. Geographical growth is expected to be diverse, with North America and Europe leading in market adoption due to mature digital marketing ecosystems and stringent data privacy regulations, while the Asia-Pacific region demonstrates significant potential for future growth based on increasing internet and smartphone penetration. However, factors like the increasing sophistication of spam filters and the emergence of new email privacy regulations could pose challenges to market expansion. The market size in 2025 is estimated at $2.5 billion, growing at a compound annual growth rate (CAGR) of 15% from 2025 to 2033. This growth trajectory reflects the sustained demand for email verification services across various industries and regions. Competitive dynamics are intense, with established players and emerging startups constantly innovating to improve accuracy, speed, and integration capabilities of their offerings. The market is characterized by a diverse range of solutions, catering to different customer needs and budgets. The future outlook remains positive, with continued technological advancements and increasing awareness of the importance of email deliverability driving further market expansion. Key factors influencing future growth will include the evolution of email marketing best practices, regulatory changes concerning data privacy and email marketing, and the ongoing adoption of advanced analytics within email marketing campaigns. Ultimately, the effectiveness of email marketing remains significantly linked to the quality of the email list used, making bulk email verification and validation an essential investment for businesses of all sizes.

  9. Global Carbon Credit Validation Verification and Certification Market...

    • statsndata.org
    excel, pdf
    Updated Aug 2025
    Cite
    Stats N Data (2025). Global Carbon Credit Validation Verification and Certification Market Industry Best Practices 2025-2032 [Dataset]. https://www.statsndata.org/report/carbon-credit-validation-verification-and-certification-market-291026
    Explore at:
    Available download formats: excel, pdf
    Dataset updated
    Aug 2025
    Dataset authored and provided by
    Stats N Data
    License

    https://www.statsndata.org/how-to-order

    Area covered
    Global
    Description

    The Carbon Credit Validation, Verification, and Certification market plays a pivotal role in the global effort to combat climate change, offering a structured approach for businesses and organizations to measure, manage, and offset their carbon emissions. With growing environmental concerns and regulatory pressures,

  10. Data from: Proxies in practice: calibration and validation of multiple...

    • search.dataone.org
    • data.niaid.nih.gov
    • +1more
    Updated Jul 1, 2025
    Cite
    Matthew R. Falcy; Joshua L. McCormick; Shelly A. Miller (2025). Proxies in practice: calibration and validation of multiple indices of animal abundance [Dataset]. http://doi.org/10.5061/dryad.nk513
    Explore at:
    Dataset updated
    Jul 1, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Matthew R. Falcy; Joshua L. McCormick; Shelly A. Miller
    Time period covered
    Jan 1, 2016
    Description

    The abundance of individuals in a population is a fundamental metric in basic and applied ecology, but sampling protocols yielding precise and unbiased estimates of abundance are often cost prohibitive. Proxies of abundance are therefore common, but require calibration and validation. There are many ways to calibrate a proxy, and it is not obvious which will perform best. We use data from eight populations of Chinook salmon (Oncorhynchus tshawytscha) on the Oregon coast where multiple proxies of abundance were obtained contemporaneously with independent mark-recapture estimates. We combined multiple proxy values associated with a single level of abundance into a univariate index and then calibrated that index to mark-recapture estimates using several different techniques. We tested our calibration methods using leave-one-out cross validation and simulation. Our cross-validation analysis did not definitively identify a single best calibration technique for all populations, but we could i...
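    As a purely illustrative companion to this abstract, the sketch below shows one simple calibration technique evaluated by leave-one-out cross-validation: a linear fit of mark-recapture abundance on a proxy index. The numbers are fabricated and the model is far simpler than the techniques compared in the study.

    ```python
    # Hedged sketch: calibrate a proxy index against mark-recapture abundance
    # with a linear fit, scored by leave-one-out cross-validation (fabricated data).
    import numpy as np

    proxy = np.array([12.0, 18.5, 7.2, 25.1, 14.3, 9.8])                    # proxy index values
    mark_recapture = np.array([480.0, 760.0, 300.0, 1010.0, 590.0, 410.0])  # abundance estimates

    errors = []
    for i in range(len(proxy)):
        train = np.delete(np.arange(len(proxy)), i)                 # hold out observation i
        slope, intercept = np.polyfit(proxy[train], mark_recapture[train], 1)
        errors.append(slope * proxy[i] + intercept - mark_recapture[i])

    rmse = float(np.sqrt(np.mean(np.square(errors))))
    print(f"Leave-one-out RMSE: {rmse:.1f}")
    ```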

  11. Data from: Natural Fracture Diagnostics Validation

    • data.wu.ac.at
    pdf
    Updated Sep 29, 2016
    Cite
    (2016). Natural Fracture Diagnostics Validation [Dataset]. https://data.wu.ac.at/schema/edx_netl_doe_gov/NWRmYTZhZjEtZWI2OC00N2Q2LThkNmMtMTVjMTNiYjEzYzQx
    Explore at:
    Available download formats: pdf (198782.0)
    Dataset updated
    Sep 29, 2016
    Description

    The goal of this project is to evaluate the ability of modern seismic methods to detect, map and analyze naturally fractured gas reservoirs, and assess future research needs in this area. During the last ten years there has been considerable research in developing and evaluating various seismic techniques for fracture characterization for petroleum and gas applications as well as for mining, geothermal and nuclear waste disposal. Current methods rely on gross definition of fracture properties using attributes such as P-wave anisotropy, AVO (Amplitude versus Offset) or AVA (Amplitude versus Angle). While useful for gross fracture detection, these approaches have not been able to define the specific fracture sets that control permeability. This effort investigated high-resolution seismic methods (including vertical seismic profiling [VSP] and single well seismic) for their ability to provide useful information on fracture properties. A key focus of this effort was to combine the surface seismic information with current borehole seismic methods. Field studies in the San Juan Basin in New Mexico were conducted to validate the most promising seismic characterization methods, and a validation well was drilled. The final stages of the work were to synthesize project data and information to produce a handbook of the best methods for fracture characterization using seismic and well data in the San Juan Basin.

  12. Bioprocess Validation Market Analysis North America, Europe, Asia, Rest of...

    • technavio.com
    pdf
    Updated Jul 3, 2024
    Cite
    Technavio (2024). Bioprocess Validation Market Analysis North America, Europe, Asia, Rest of World (ROW) - US, Germany, China, Canada, Japan - Size and Forecast 2024-2028 [Dataset]. https://www.technavio.com/report/bioprocess-validation-market-analysis
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jul 3, 2024
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2024 - 2028
    Area covered
    Germany, Japan, Europe, United States, China, Canada
    Description


    Bioprocess Validation Market Size 2024-2028

    The bioprocess validation market size is forecast to increase by USD 364 billion at a CAGR of 12.88% between 2023 and 2028.

    The market is witnessing significant growth due to the increasing demand for biopharmaceuticals and the adoption of single-use technologies. Biopharmaceuticals are gaining popularity in the healthcare industry due to their ability to treat complex diseases, leading to a surge in demand for their production. Single-use technologies, which offer advantages such as reduced costs, improved product quality, and increased efficiency, are increasingly being adopted for bioprocess validation. However, the high costs associated with bioprocess validation remain a challenge for market growth. Bioprocess validation is a critical step in ensuring the safety and efficacy of biopharmaceuticals, making it essential for market players to invest in advanced technologies and techniques to streamline the validation process and reduce costs. The market is expected to continue its growth trajectory in the coming years, driven by these trends and the increasing focus on developing innovative biopharmaceutical products.
    

    What will be the Size of the Bioprocess Validation Market During the Forecast Period?


    The market encompasses the technologies and services employed to ensure the production of high-quality biopharmaceuticals, including impurities testing for vaccines, drug products, monoclonal antibodies, recombinant proteins, and biosimilars. With the ongoing development of precision medicines and vaccines for chronic diseases, such as the SARS-CoV-2 virus, the market's significance continues to grow. The market consists of various segments, including in-house and outsourcing services, with leading biopharmaceutical companies increasingly relying on outsourcing to manage bioproduction activities. The biopharmaceutical manufacturing sector's expansion is driven by socioeconomic factors, increasing demand for biologic generic drugs, and the need for compatibility, microbiological, physiochemical, and integrity testing services.
    Key components of bioprocess validation include filter elements, mixing systems, and other critical equipment used throughout the bioproduction process. The market's trends include the increasing use of advanced technologies for bioprocess validation, such as automation, artificial intelligence, and machine learning, to improve efficiency and accuracy.
    

    How is this Bioprocess Validation Industry segmented and which is the largest segment?

    The bioprocess validation industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.

    End-user
      Pharmaceutical companies
      Contract development and manufacturing organizations
      Others

    Type
      In-house
      Outsourced

    Geography
      North America
        Canada
        US
      Europe
        Germany
      Asia
        China
        Japan
      Rest of World (ROW)

    By End-user Insights

    The pharmaceutical companies segment is estimated to witness significant growth during the forecast period.
    

    The market encompasses the validation of biopharmaceutical manufacturing processes for various pharmaceutical companies, including large enterprises and SMEs. Large companies such as Pfizer, Johnson & Johnson, and Novartis contribute significantly to the market due to their extensive resources, expertise, and adherence to industry best practices. They invest heavily in research and development (R&D) expenditure for the production of complex biologics, including vaccines for SARS-CoV-2, monoclonal antibodies, recombinant proteins, and biosimilars. The market includes several segments, including impurities testing, vaccines, drug products, and biosimilars. Validation procedures involve analytical testing methods, cleaning procedures, and compliance with regulatory standards for drug safety. The market also includes services for precision medicines, cell therapy, and gene therapy.

    Contract service providers offer digital tools, continuous process monitoring, real-time release testing, advanced analytics, and modelling techniques. The biopharmaceutical manufacturing sector is driven by the increasing demand for biologic drugs and bioproduction volumes. Automation technologies, including robotics and single-use systems, are also transforming the industry. Socioeconomic factors, such as chronic diseases and aging populations, further fuel market growth. The market includes services for extractable testing, microbiological testing, physiochemical testing, and compatibility testing, as well as bioprocess instruments, such as bioreactors, chromatography systems, and filtration elements.


    The Pharmaceutic

  13. Data from: Development and Validation of the Informal Supporter Readiness...

    • researchdata.edu.au
    Updated Dec 5, 2023
    + more versions
    Cite
    Rock Adam; Rice Kylie; Davies Ryan; Ryan Laurence Davies; Kylie Rice; Davies Ryan; Davies Ryan; Adam Rock (2023). Development and Validation of the Informal Supporter Readiness Inventory (ISRI) - Dataset [Dataset]. http://doi.org/10.25952/0YRZ-F309
    Explore at:
    Dataset updated
    Dec 5, 2023
    Dataset provided by
    University of New England, Australia
    University of New England
    Authors
    Rock Adam; Rice Kylie; Davies Ryan; Ryan Laurence Davies; Kylie Rice; Davies Ryan; Davies Ryan; Adam Rock
    Description

    Objective: This article outlines the development and validation of the Informal Supporter Readiness Inventory (ISRI), based on the model developed by the present authors in Davies et al. (2023). This scale assesses the readiness of informal supporters to intervene or provide support in situations of intimate partner violence (IPV).
    Methods: The research followed a three-phased procedure of item development, scale development, and scale evaluation; adhering to best practice guidelines for psychometric development and validation. This process provided empirical substantiation for the domains of the Model of Informal Supporter Readiness (Davies et al., 2023).
    Results: The 57-item ISRI incorporates four primary factors: normative, individual, goodman-emotional, and situational-assessment. These factors demonstrated robust internal consistency and factor structures. Additionally, the ISRI evidenced strong test-retest reliability, and both convergent and divergent validity. Although aligning closely with the Model of Informal Supporter Readiness, the scale revealed a nuanced bifurcation of situational factors into situational-emotional and situational-assessment.
    Discussion: The ISRI offers an important advancement in IPV research by highlighting the multifaceted nature of informal supporter intervention. The findings have several implications, from tailoring individualised supportive interventions to strengthening support networks and empowering survivors. The present study’s findings underscore the potential of adopting a social network-oriented approach to interventions in IPV scenarios. Applications for research and practice are discussed.

  14. Data_Sheet_1_Good Practices for Species Distribution Modeling of Deep-Sea...

    • frontiersin.figshare.com
    pdf
    Updated Jun 3, 2023
    Cite
    Arliss J. Winship; James T. Thorson; M. Elizabeth Clarke; Heather M. Coleman; Bryan Costa; Samuel E. Georgian; David Gillett; Arnaud Grüss; Mark J. Henderson; Thomas F. Hourigan; David D. Huff; Nissa Kreidler; Jodi L. Pirtle; John V. Olson; Matthew Poti; Christopher N. Rooper; Michael F. Sigler; Shay Viehman; Curt E. Whitmire (2023). Data_Sheet_1_Good Practices for Species Distribution Modeling of Deep-Sea Corals and Sponges for Resource Management: Data Collection, Analysis, Validation, and Communication.PDF [Dataset]. http://doi.org/10.3389/fmars.2020.00303.s001
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Frontiers
    Authors
    Arliss J. Winship; James T. Thorson; M. Elizabeth Clarke; Heather M. Coleman; Bryan Costa; Samuel E. Georgian; David Gillett; Arnaud Grüss; Mark J. Henderson; Thomas F. Hourigan; David D. Huff; Nissa Kreidler; Jodi L. Pirtle; John V. Olson; Matthew Poti; Christopher N. Rooper; Michael F. Sigler; Shay Viehman; Curt E. Whitmire
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Resource managers in the United States and worldwide are tasked with identifying and mitigating trade-offs between human activities in the deep sea (e.g., fishing, energy development, and mining) and their impacts on habitat-forming invertebrates, including deep-sea corals, and sponges (DSCS). Related management decisions require information about where DSCS occur and in what densities. Species distribution modeling (SDM) provides a cost-effective means of identifying potential DSCS habitat over large areas to inform these management decisions and data collection. Here we describe good practices for DSCS SDM, especially in the context of data collection and management applications. Managers typically need information regarding DSCS encounter probabilities, densities, and sizes, defined at sub-regional to basin-wide scales and validated using subsequent, targeted data collections. To realistically achieve these goals, analysts should integrate available data sources in SDMs including fine-scale visual sampling and broad-scale resource surveys (e.g., fisheries trawl surveys), include environmental predictor variables representing multiple spatial scales, model residual spatial autocorrelation, and quantify prediction uncertainty. When possible, models fitted to presence-absence and density data are preferred over models fitted only to presence data, which are difficult to validate and can confound estimated probability of occurrence or density with sampling effort. Ensembles of models can provide robust predictions, while multi-species models leverage information across taxa, and facilitate community inference. To facilitate the use of models by managers, predictions should be expressed in units that are widely understood and validated at an appropriate spatial scale using a sampling design that provides strong statistical inference. We present three case studies for the Pacific Ocean that illustrate good practices with respect to data collection, modeling, and validation; these case studies demonstrate it is possible to implement our good practices in real-world settings.
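    For readers who want a concrete, minimal starting point, the sketch below fits a presence-absence model with plain logistic regression on two synthetic predictors and scores it on held-out data. It deliberately omits the spatial random effects, multi-scale covariates, and ensembling the authors recommend; the data and predictor names are fabricated.

    ```python
    # Hedged sketch: a bare-bones presence-absence species distribution model
    # (logistic regression) fit to fabricated environmental predictors.
    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n = 500
    depth = rng.uniform(200, 2000, n)            # metres (fabricated)
    slope = rng.uniform(0, 30, n)                # degrees (fabricated)
    logit = -4 + 0.002 * depth + 0.08 * slope    # invented response surface
    presence = rng.random(n) < 1 / (1 + np.exp(-logit))

    X = np.column_stack([depth, slope])
    X_train, X_test, y_train, y_test = train_test_split(X, presence, random_state=0)

    model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
    print(f"Held-out AUC: {auc:.2f}")
    ```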

  15. Regulatory Company Data | Verified Profiles for Legal & Compliance...

    • datarade.ai
    Updated Oct 27, 2021
    Cite
    Success.ai (2021). Regulatory Company Data | Verified Profiles for Legal & Compliance Professionals | Best Price Guaranteed [Dataset]. https://datarade.ai/data-products/regulatory-company-data-verified-profiles-for-legal-compl-success-ai
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Oct 27, 2021
    Dataset provided by
    Area covered
    Finland, French Polynesia, Ghana, Burundi, Rwanda, Togo, Equatorial Guinea, Kyrgyzstan, Czech Republic, Saint Kitts and Nevis
    Description

    Success.ai’s Regulatory Company Data provides organizations with access to verified profiles and contact details for legal and compliance professionals worldwide. Drawing from over 170 million verified professional profiles, this dataset includes work emails, direct phone numbers, and LinkedIn profiles of compliance officers, regulatory managers, attorneys, and other key decision-makers in corporate governance and regulatory affairs. Whether you’re addressing global compliance challenges, navigating complex legal frameworks, or offering specialized legal services, Success.ai ensures that your outreach is guided by accurate, up-to-date, and continuously validated contact data.

    Why Choose Success.ai’s Regulatory Company Data?

    1. Comprehensive Contact Information

      • Access verified work emails, phone numbers, and professional profiles of compliance officers, regulatory managers, and legal professionals across various industries and regions.
      • AI-driven validation ensures 99% accuracy, giving you confidence in the reliability and precision of the data.
    2. Global Reach in Legal and Compliance Roles

      • Includes profiles of legal counsels, compliance directors, risk management officers, and corporate governance advisors in corporations, financial institutions, and regulatory agencies.
      • Covers North America, Europe, Asia-Pacific, South America, and the Middle East, enabling effective engagement with professionals in established and emerging markets.
    3. Continuously Updated Datasets

      • Real-time updates help you stay current with evolving roles, titles, and responsibilities, keeping your outreach aligned with ongoing regulatory and legal shifts.
    4. Ethical and Compliant

      • Adheres to GDPR, CCPA, and other global data privacy regulations, ensuring that your approach to connecting with legal and compliance professionals is always ethical and lawful.

    Data Highlights:

    • 170M+ Verified Professional Profiles: Includes legal and compliance professionals and decision-makers globally.
    • 50M Work Emails: AI-validated for precise communication and minimized bounce rates.
    • 30M Company Profiles: Gain organizational insights to understand corporate structures, helping you tailor outreach effectively.
    • 700M Global Professional Profiles: Enriched datasets supporting a broad range of business development and market analysis initiatives.

    Key Features of the Dataset

    1. Decision-Maker Profiles in Compliance and Legal Domains

      • Identify and connect with GCs, CLOs, compliance officers, and regulatory managers influencing corporate policies, risk assessments, and regulatory adherence.
    2. Advanced Filters for Precision Targeting

      • Refine outreach by industry, company size, location, or specific legal/compliance roles, ensuring your message reaches the right audience at the right time.
    3. AI-Driven Enrichment

      • Profiles are enriched with actionable data, providing insights into areas of specialization, jurisdictional expertise, and regulatory focus, allowing for more personalized engagement.

    Strategic Use Cases:

    1. Risk Management and Compliance Solutions

      • Present compliance software, monitoring tools, or audit services directly to professionals managing regulatory risk and adherence within organizations.
      • Build relationships with decision-makers overseeing compliance training, enforcement, and remediation.
    2. Legal Services and Advisory Campaigns

      • Offer legal services, regulatory consulting, or advisory support to GCs and legal teams seeking assistance with complex compliance mandates.
      • Target professionals involved in contract management, dispute resolution, and corporate governance.
    3. Policy Advocacy and Regulatory Outreach

      • Engage with key influencers in corporate compliance to share policy insights, industry best practices, or research findings that inform regulations and guidelines.
      • Build networks that facilitate dialogue on legislative changes and compliance frameworks.
    4. Technology and Automation Integration

      • Present solutions such as AI-driven compliance analytics, e-discovery tools, or contract automation software to legal and regulatory decision-makers.
      • Support digital transformation initiatives that streamline regulatory processes and reduce risk.

    Why Choose Success.ai?

    1. Best Price Guarantee

      • Access premium-quality verified data at competitive prices, ensuring your investments in outreach deliver maximum ROI.
    2. Seamless Integration

      • Integrate verified contact data directly into your CRM or marketing platforms via APIs or downloadable formats, simplifying data management.
    3. Data Accuracy with AI Validation

      • Trust in 99% accuracy to underpin data-driven decisions, improve targeting, and enhance the effectiveness of compliance and legal outreach campaigns.
    4. Customizable and Scalable Solutions

      • Tailor datasets to f...
  16. Temperature Mapping and Validation Service Report

    • archivemarketresearch.com
    doc, pdf, ppt
    Updated Mar 6, 2025
    + more versions
    Cite
    Archive Market Research (2025). Temperature Mapping and Validation Service Report [Dataset]. https://www.archivemarketresearch.com/reports/temperature-mapping-and-validation-service-52341
    Explore at:
    Available download formats: ppt, pdf, doc
    Dataset updated
    Mar 6, 2025
    Dataset authored and provided by
    Archive Market Research
    License

    https://www.archivemarketresearch.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The global temperature mapping and validation service market is experiencing robust growth, driven by stringent regulatory compliance requirements across various industries, particularly pharmaceuticals, food and beverage, and cold chain logistics. The increasing need for ensuring product safety and quality, coupled with advancements in technology offering more precise and efficient mapping solutions, are key factors fueling market expansion. This market is estimated to be valued at $2.5 billion in 2025, exhibiting a Compound Annual Growth Rate (CAGR) of 7% from 2025 to 2033. This growth trajectory is projected to continue, reaching approximately $4.2 billion by 2033. Several factors are contributing to this positive outlook. The pharmaceutical industry's emphasis on Good Manufacturing Practices (GMP) and Good Distribution Practices (GDP) compliance necessitates rigorous temperature mapping and validation, driving substantial demand. Similarly, the food and beverage sector's focus on maintaining product integrity and preventing spoilage necessitates sophisticated temperature monitoring and validation systems. The expansion of e-commerce and the growing demand for cold chain logistics further contribute to this market's growth. While potential restraints such as high initial investment costs for sophisticated systems and the need for specialized expertise in validation protocols exist, the overall market trend points towards consistent growth fueled by regulatory pressures and technological advancements. The segment breakdown showcases the pharmaceutical and food & beverage sectors as major contributors, highlighting the crucial role of temperature control in these industries.

  17. Data from: Good environmental practices check list for food services:...

    • datasetcatalog.nlm.nih.gov
    Updated Feb 28, 2018
    Cite
    Colares, Luciléia Granhen Tavares; Ferreira, Aline Alves; de Oliveira Figueiredo, Verônica; de Oliveira, Aline Gomes de Mello (2018). Good environmental practices check list for food services: elaboration, content validation and inter-rater reliability [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000605799
    Explore at:
    Dataset updated
    Feb 28, 2018
    Authors
    Colares, Luciléia Granhen Tavares; Ferreira, Aline Alves; de Oliveira Figueiredo, Verônica; de Oliveira, Aline Gomes de Mello
    Description

    Abstract With the objective of elaborating, validating the content of a good environmental practices checklist (LVBPA-SA) for food services, and verifying the inter-rater reliability, an instrument was constructed based on a literature survey and the guidelines established by Brazilian Legislation (National Solid Waste Policy, National Water Resources Policy, National Policy on Conservation and the Rational Use of Energy). The LVBPA-SA was evaluated by a specialist panel to validate the contents according to the Delphi technique. To verify the concordance level between the specialists, the content validity index was used, and the instrument was considered validated when the content validity index was ≥ 80%. The form of presentation, semantic clarity, and the ease of understanding, filling in and using the instrument were evaluated, and the specialists could suggest alterations to the instrument. After validation, the instrument was applied by seven nutritionists in the same food service to evaluate the good environmental practices and verify the inter-rater reliability using the intra-class correlation coefficient (ICC) and Cronbach’s alpha at a significance level of 5%. The Kruskal-Wallis test was applied to compare the variance between responses. The validated LVBPA-SA contained five blocks and 68 evaluation items and 65% of the good environmental practices measurements were adopted by the food services. There was no statistically significant difference between the evaluations made by the nutritionists obtaining an ICC > 0.75 for 75% of the blocks. For the Cronbach's alpha, 100% of the blocks presented a coefficient ≥ 0.70, indicating excellent inter-rater agreement. Thus the contents of the LVBPA-SA were validated and showed internal consistency. In addition, it complied with the guidelines established by the Policies and lead to the adoption of good environmental practices, being an important instrument to be used in food services.
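    The agreement statistics cited above (content validity index, ICC, Cronbach's alpha) are standard; as a small illustration, the function below computes Cronbach's alpha for a matrix of ratings, here invented 0/1 ratings of six checklist items by seven raters, with raters treated as the "items" of the scale. This is not the authors' analysis code.

    ```python
    # Hedged sketch: Cronbach's alpha for inter-rater agreement on invented data
    # (rows = checklist items, columns = raters).
    import numpy as np

    def cronbach_alpha(matrix) -> float:
        """Standard alpha: k/(k-1) * (1 - sum of column variances / variance of row totals)."""
        m = np.asarray(matrix, dtype=float)
        k = m.shape[1]
        column_var = m.var(axis=0, ddof=1).sum()
        total_var = m.sum(axis=1).var(ddof=1)
        return (k / (k - 1)) * (1 - column_var / total_var)

    ratings = np.array([
        [1, 1, 1, 1, 1, 1, 1],
        [0, 0, 0, 0, 0, 1, 0],
        [1, 1, 1, 1, 1, 1, 1],
        [0, 0, 0, 0, 0, 0, 0],
        [1, 1, 0, 1, 1, 1, 1],
        [1, 1, 1, 1, 0, 1, 1],
    ])
    print(f"Cronbach's alpha: {cronbach_alpha(ratings):.2f}")  # ~0.95 for this toy matrix
    ```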

  18. Additional file 1 of On the difficulty of validating molecular generative...

    • springernature.figshare.com
    xlsx
    Updated Aug 14, 2024
    + more versions
    Cite
    Koichi Handa; Morgan C. Thomas; Michiharu Kageyama; Takeshi Iijima; Andreas Bender (2024). Additional file 1 of On the difficulty of validating molecular generative models realistically: a case study on public and proprietary data [Dataset]. http://doi.org/10.6084/m9.figshare.26643346.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Aug 14, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Koichi Handa; Morgan C. Thomas; Michiharu Kageyama; Takeshi Iijima; Andreas Bender
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 1. Supporting_Dataset_1_DRD2.xlsx.

  19. Data from: Validation of Quality-of-Life assessment tool for Ethiopian old...

    • data.niaid.nih.gov
    • search.dataone.org
    • +1more
    zip
    Updated Jan 31, 2023
    Cite
    Ahmed Muhye (2023). Validation of Quality-of-Life assessment tool for Ethiopian old age people [Dataset]. http://doi.org/10.5061/dryad.zkh1893dq
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 31, 2023
    Dataset provided by
    Bahir Dar University
    Authors
    Ahmed Muhye
    License

    https://spdx.org/licenses/CC0-1.0.html

    Area covered
    Ethiopia
    Description

    Background: Reliable quality of life assessment is critical for identifying health issues, evaluating health interventions, and establishing the best health policies and care packages. The World Health Organization Quality of Life-Old Module is a tool for assessing the subjective quality of life of old age people. It has been validated and is available in more than 20 languages, but not Amharic. Hence, this study was intended to translate it into the Amharic language and validate it among old age people in Ethiopia. Methods: A cross-sectional study was conducted among 180 community-dwelling old age people in Ethiopia, from January 16 to March 13, 2021. Psychometric validation was achieved through Cronbach’s alpha for internal consistency reliability and construct validity from confirmatory factor analysis. Results: The study participants ranged from 60 to 90 years of age, with a mean age of 69.44. Females made up 61.7% of the population, and 40% of participants could not read and write. The results showed a relatively low level of quality of life, with a total transformed score of 58.58 ± 23.15. The Amharic version of the World Health Organization Quality of Life-Old Module showed a Cronbach’s alpha value of 0.96 and corrected item-total correlations of more than 0.74. Confirmatory factor analysis confirmed the six-factor model with a chi-square (X2) of 341.98 and a p-value less than 0.001. The comparative fit index (CFI) was 0.98, the Tucker-Lewis index (TCL) was 0.97, and the root mean square error of approximation (RMSEA) was 0.046. Conclusion: The Amharic version of the World Health Organization Quality of Life-Old Module indicated good internal consistency reliability and construct validity. The tool can be utilized to provide care to Ethiopian community-dwelling old age people.
    Methods
    Study setting: This study was conducted in Ethiopia.
    Study design and period: A cross-sectional study was conducted from January 16 to March 13, 2021.
    Study population, sample size, and sampling procedures: This study utilized two groups of the population. The first group was health care experts used for content validation, and the second group was community-dwelling old age people for psychometric validation. The detailed study methods for the study population, sample size, and sampling procedures were described in the previous study.
    Validation process: This tool validation study was conducted in three stepwise phases. The first phase was to review existing QoL assessment tools for old age people. In the second phase, selection, translation, and review of the tool by experts were conducted. In the last phase, psychometric validation among community-dwelling old age people was performed. The novel form contains a total of 24 items assembled into six domains, each with four items: autonomy (AUT); past, present, and future activities (PPF); sensory abilities (SAB); social participation (SOP); death and dying (DAD); and intimacy (INT). The module mostly evaluates the two-week period before testing, in self-report form. Although each item is rated on a Likert scale of 1 to 5, items differ in their anchors. Each domain provides an individual score ranging from 4 to 20. The domain values can also be converted to a scale of 0 to 100. Furthermore, summing the individual item values yields total scores from 24 to 120, with higher scores indicating better QoL.
    Data collection: Data were collected from two groups, healthcare experts and community-dwelling old age people, using exploratory mixed qualitative and quantitative methods. Each expert evaluated the content validity of the tool through face-to-face contact. The experts’ and old age people’s comments were used to assess the wording, grammar, clarity, appropriate scoring, and applicability of items. After incorporating the experts’ comments, psychometric validation was conducted among community-dwelling old age people. Six urban health extension workers and six BSc nurses collected the data after two days of training. The principal investigator and a master’s-degree-trained nutritionist supervised the data collection process. The data were collected through face-to-face interviews using the standardized Amharic version of the questionnaires. Assistance from family members or caregivers was also used.
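    The scoring description above (four items per domain, each rated 1 to 5, domain scores 4 to 20, optionally rescaled to 0 to 100) implies a simple linear transformation; the helper below sketches that arithmetic, assuming the standard linear rescaling rather than any official scoring syntax.

    ```python
    # Hedged sketch of WHOQOL-OLD-style domain scoring as described above:
    # raw domain score 4-20 from four 1-5 Likert items, rescaled to 0-100.
    # The exact rescaling formula is an assumption based on the stated ranges.
    def domain_scores(item_responses):
        assert len(item_responses) == 4 and all(1 <= r <= 5 for r in item_responses)
        raw = sum(item_responses)              # 4-20
        transformed = (raw - 4) / 16 * 100     # 0-100
        return raw, transformed

    print(domain_scores([4, 3, 5, 4]))  # (16, 75.0)
    ```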

  20. Table_1_Sanger Validation of High-Throughput Sequencing in Genetic...

    • frontiersin.figshare.com
    docx
    Updated Jun 1, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Rosina De Cario; Ada Kura; Samuele Suraci; Alberto Magi; Andrea Volta; Rossella Marcucci; Anna Maria Gori; Guglielmina Pepe; Betti Giusti; Elena Sticchi (2023). Table_1_Sanger Validation of High-Throughput Sequencing in Genetic Diagnosis: Still the Best Practice?.DOCX [Dataset]. http://doi.org/10.3389/fgene.2020.592588.s001
    Explore at:
    docxAvailable download formats
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Frontiers
    Authors
    Rosina De Cario; Ada Kura; Samuele Suraci; Alberto Magi; Andrea Volta; Rossella Marcucci; Anna Maria Gori; Guglielmina Pepe; Betti Giusti; Elena Sticchi
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The crucial role of next-generation sequencing (NGS) in supporting genetic diagnosis and personalized medicine led the European Society of Human Genetics to define Guidelines for Diagnostic NGS. Various factors that produce false-positive or false-negative NGS data, together with the paucity of internationally accepted guidelines specifying NGS quality metrics for diagnostic use, still make Sanger validation of NGS variants mandatory. We report the analysis of three cases of discrepancy between NGS and Sanger sequencing in a cohort of 218 patients. NGS was performed on the Illumina MiSeq® with Haloplex/SureSelect protocols targeting 97-, 57-, or 10-gene panels commonly applied for diagnostics. Variants called following the guidelines suggested by the Broad Institute and identified according to MAF 0.2 were Sanger validated. Three out of 945 validated variants showed a discrepancy between NGS and Sanger. In all three cases, in-depth evaluation of the discrepant variant results and of the methodological approach confirmed the NGS call. Allelic dropout (ADO) during the polymerase chain reaction or sequencing reaction was observed, mainly resulting in incorrect variant zygosity. Our study extends literature reporting that almost 100% of "high-quality" NGS variants are confirmed by Sanger sequencing; moreover, it demonstrates that when a high-quality NGS variant and its Sanger validation disagree, the NGS call should not be assumed a priori to be the source of the error. The limitations of so-called gold-standard direct sequencing (e.g., ADO and the unpredictable presence of private variants in primer-binding regions) should be considered, especially in light of constantly improving, accurate high-throughput technologies. Our data, together with the literature, support establishing a standardized quality threshold in international guidelines for clinical NGS, so that Sanger confirmation can be limited to variants with borderline quality parameters and to verification of correct variant call/patient coupling on a different blood sample aliquot.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
The Cancer Imaging Archive (2025). Data in Support of the MIDI-B Challenge (MIDI-B-Synthetic-Validation, MIDI-B-Curated-Validation, MIDI-B-Synthetic-Test, MIDI-B-Curated-Test) [Dataset]. http://doi.org/10.7937/cf2p-aw56

Data in Support of the MIDI-B Challenge (MIDI-B-Synthetic-Validation, MIDI-B-Curated-Validation, MIDI-B-Synthetic-Test, MIDI-B-Curated-Test)


Data Acquisition

To build the synthetic dataset, image series were selected from TCIA's curated datasets to represent a broad range of imaging modalities (CR, CT, DX, MG, MR, PT, SR, US), manufacturers (GE, Siemens, Varian, Confirma, Agfa, Eigen, Elekta, Hologic, KONICA MINOLTA, and others), scan parameters, and regions of the body. These series were then processed to inject the synthetic PHI/PII as described under Data Analysis below.

Data Analysis

Synthetic pools of PHI, such as subject and scanning institution information, were generated using the Python package Faker (https://pypi.org/project/Faker/8.10.3/). These values were inserted into the DICOM metadata of the selected imaging files using a system of inheritable, rule-based templates that define re-identification functions for data insertion and log each insertion for answer-key creation. Text was also burned into the pixel data of a subset of images. By systematically embedding realistic synthetic PHI into image headers and pixel data, accompanied by a detailed ground-truth answer key, the framework supports transparency, documentation, and reproducibility in de-identification practices, in alignment with the HIPAA Safe Harbor method, the DICOM PS3.15 Confidentiality Profiles, and TCIA best practices.
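For orientation, here is a minimal sketch of how synthetic values might be generated with Faker and written into a DICOM header with pydicom. The tag selection, value formats, and the inject_synthetic_phi helper are illustrative assumptions; the actual collection was produced with the rule-based template system described above, not this code.

```python
# Minimal sketch only -- assumes the 'faker' and 'pydicom' packages.
# Tag choices, value formats, and the helper name are hypothetical.
from faker import Faker
import pydicom

fake = Faker()

def inject_synthetic_phi(input_path: str, output_path: str) -> dict:
    """Write synthetic PHI into a DICOM header and return a log record."""
    ds = pydicom.dcmread(input_path)

    # Generate a synthetic identity with Faker.
    synthetic = {
        "PatientName": fake.name(),
        "PatientID": fake.bothify(text="??######"),
        "PatientBirthDate": fake.date_of_birth(minimum_age=40, maximum_age=90).strftime("%Y%m%d"),
        "InstitutionName": fake.company(),
        "ReferringPhysicianName": fake.name(),
    }

    # Insert the synthetic values into the corresponding DICOM elements.
    for keyword, value in synthetic.items():
        setattr(ds, keyword, value)

    ds.save_as(output_path)

    # Return what was inserted so an answer key can be assembled later.
    return {"file": output_path, **synthetic}
```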

Usage Notes

This DICOM collection is split into two datasets, synthetic and curated. The synthetic dataset is the PHI/PII-infused DICOM collection, accompanied by a validation script and answer keys for testing, refining, and benchmarking medical image de-identification pipelines. The curated dataset is a version of the synthetic dataset that was curated and de-identified by members of The Cancer Imaging Archive curation team; it can be used as a guide and an example of medical image curation best practices. For the purposes of the de-identification challenge at MICCAI 2024, the synthetic and curated datasets each contain two subsets: one for Validation and one for Testing.

To link a curated dataset to the original synthetic dataset and answer keys, a mapping between the unique identifiers (UIDs) and patient IDs must be provided to the evaluation software in CSV format. We include the mapping files associated with the TCIA-curated set as an example. Lastly, for both the Validation and Testing datasets, an answer key in sqlite.db format is provided. These components are for use with the Python validation script linked below (4). By combining these components, a user developing or evaluating de-identification methods can verify that their outputs meet the specification for successfully de-identified medical image data.
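As a rough illustration of how the mapping files and answer key fit together, the sketch below loads a UID mapping CSV and queries an sqlite answer key. The column, table, and field names here are hypothetical placeholders; the real schema is defined by the answer keys and validation script distributed with the collection.

```python
# Illustrative sketch only; CSV column names and the answer-key table/column
# names are hypothetical, not the actual schema shipped with the collection.
import csv
import sqlite3

def load_uid_mapping(mapping_csv: str) -> dict:
    """Map curated UIDs back to the original synthetic UIDs."""
    mapping = {}
    with open(mapping_csv, newline="") as f:
        for row in csv.DictReader(f):
            mapping[row["curated_uid"]] = row["original_uid"]
    return mapping

def expected_actions(answer_key_db: str, original_uid: str) -> list:
    """Fetch the expected de-identification actions for one instance."""
    with sqlite3.connect(answer_key_db) as conn:
        rows = conn.execute(
            "SELECT tag, action, expected_value FROM answer_key WHERE uid = ?",
            (original_uid,),
        ).fetchall()
    return rows
```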
