100+ datasets found

Cancer-Net PCa-Data
kaggle.com
zip
Updated May 16, 2023
Cite
Hayden Gunraj (2023). Cancer-Net PCa-Data [Dataset]. https://www.kaggle.com/datasets/hgunraj/cancer-net-pca-data
Explore at:
Available download formats
zip (105639561 bytes)
Dataset updated
May 16, 2023
Authors
Hayden Gunraj
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Cancer-Net Open Source Initiative - Cancer-Net PCa-Data

Cancer-Net PCa-Data is an open-access benchmark dataset of volumetric synthetic correlated diffusion imaging (CDIs) acquisitions from prostate cancer patients. It is part of the Cancer-Net open source initiative, which is dedicated to advancing machine learning and imaging research to aid clinicians in the global fight against cancer.

The volumetric CDIs acquisitions in Cancer-Net PCa-Data were generated from a cohort of 200 patient cases acquired at Radboud University Medical Centre (Radboudumc) in the Prostate MRI Reference Center in Nijmegen, The Netherlands, and made available as part of the SPIE-AAPM-NCI PROSTATEx Challenges. Masks derived from the PROSTATEx_masks repository are also provided; they label regions of healthy prostate tissue, clinically significant prostate cancer (csPCa), and clinically insignificant prostate cancer (insPCa).

This dataset was used to investigate the relationship between PCa presence and CDIs hyperintensity.

Cancer-Net PCa-Data is released under a CC BY 4.0 license.

Example T2-weighted images of prostates with CDIs overlaid are shown below.

[Figure: Grid of T2-weighted MRI images of the prostate with CDIs images overlaid. https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F4364336%2Fc312a93e80813c9f4e5e418f1220d4e4%2FPROSTATEx-grid-top100.png?generation=1684256503310308&alt=media]

If you find our work useful for your research, please cite:

@article{Wong2022,
  author={Alexander Wong and Hayden Gunraj and Vignesh Sivan and Masoom A. Haider},
  title={Synthetic correlated diffusion imaging hyperintensity delineates clinically significant prostate cancer},
  journal={Scientific Reports},
  volume={12},
  year={2022},
  number={3376},
  doi={10.1038/s41598-022-06872-7}
}

and

@article{Gunraj2023,
  author={Hayden Gunraj and Chi-en Amy Tai and Alexander Wong},
  title={Cancer-Net PCa-Data: An Open-Source Benchmark Dataset for Prostate Cancer Clinical Decision Support using Synthetic Correlated Diffusion Imaging Data},
  journal={NeurIPS Workshops},
  year={2023}
}

Additionally, SPIE-AAPM-NCI PROSTATEx Challenges, PROSTATEx_masks, and The Cancer Imaging Archive (TCIA) should also be cited:

@misc{Litjens2017,
  author={Geert Litjens and Oscar Debats and Jelle Barentsz and Nico Karssemeijer and Henkjan Huisman},
  title={ProstateX Challenge data [data set]},
  journal={The Cancer Imaging Archive},
  year={2017},
  doi={10.7937/K9TCIA.2017.MURS5CL}
}

@article{Litjens2014,
  author={Geert Litjens and Oscar Debats and Jelle Barentsz and Nico Karssemeijer and Henkjan Huisman},
  title={Computer-Aided Detection of Prostate Cancer in MRI},
  journal={IEEE Transactions on Medical Imaging},
  year={2014},
  volume={33},
  number={5},
  pages={1083-1092},
  doi={10.1109/TMI.2014.2303821}
}

@article{Cuocolo2021,
  author={Renato Cuocolo and Arnaldo Stanzione and Anna Castaldo and Davide Raffaele {De Lucia} and Massimo Imbriaco},
  title={Quality control and whole-gland, zonal and lesion annotations for the PROSTATEx challenge public dataset},
  journal={European Journal of Radiology},
  volume={138},
  pages={109647},
  year={2021},
  doi={10.1016/j.ejrad.2021.109647}
}

@article{Clark2013,
  author={Kenneth Clark and Bruce Vendt and Kirk Smith and John Freymann and Justin Kirby and Paul Koppel and Stephen Moore and Stanley Phillips and David Maffitt and Michael Pringle and Lawrence Tarbox and Fred Prior},
  title={The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository},
  journal={Journal of Digital Imaging},
  year={2013},
  volume={26},
  number={6},
  pages={1045-1057}
}

Core Cancer-Net Team

DarwinAI Corp., Canada and Vision and Image Processing Lab, University of Waterloo, Canada
- Alexander Wong

Vision and Image Processing Lab, University of Waterloo, Canada
- Amy Tai
- Hayden Gunraj
Prescription Cost Analysis
digital.nhs.uk
csv, pdf, xlsx
Updated Mar 15, 2018
+ more versions
Cite
(2018). Prescription Cost Analysis [Dataset]. https://digital.nhs.uk/data-and-information/publications/statistical/prescription-cost-analysis
Explore at:
Available download formats
pdf (109.2 kB), csv (46.8 kB), pdf (162.2 kB), csv (4.1 MB), pdf (53.3 kB), xlsx (3.8 MB), xlsx (166.7 kB), xlsx (116.2 kB), xlsx (5.3 MB), xlsx (6.3 MB), pdf (309.7 kB)
Dataset updated
Mar 15, 2018
License
https://digital.nhs.uk/about-nhs-digital/terms-and-conditions
Time period covered
Jan 1, 2007 - Dec 31, 2017
Area covered
England
Description
Prescription Cost Analysis (PCA) provides details of the number of items and the net ingredient cost (NIC) of all prescriptions dispensed in the community in England. The drugs dispensed are listed by British National Formulary (BNF) therapeutic class. This publication includes data sheets for items dispensed and NIC from 2007 to 2017 at individual presentation level. The Prescribing by Dentists report is no longer published separately in April; however, its data is included in this publication and is also provided as a separate csv file containing only the dental data. Please note that the two csv files (data and Dental data) should not be added together, as the PCA data already includes the Dental data.
Prescription Cost Analysis 1998-2016 - drugs matched to BNF via fuzzy lookup...
figshare.com
xlsx
Updated Sep 28, 2017
Cite
Helen Curtis (2017). Prescription Cost Analysis 1998-2016 - drugs matched to BNF via fuzzy lookup [Dataset]. http://doi.org/10.6084/m9.figshare.5450323.v1
Explore at:
Available download formats
xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.5450323.v1
Dataset updated
Sep 28, 2017
Dataset provided by
Figshare, http://figshare.com/
Authors
Helen Curtis
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Some drug names found in the PCA data do not exactly match any drug names in the current British National Formulary (BNF), e.g. formulation variants no longer available. A similar BNF presentation name could sometimes be found by using the “fuzzy” lookup add-on for Excel. These were validated manually by a pharmacist.
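The matching step described above can be sketched roughly as follows. This is not the Excel "fuzzy" lookup add-on used for the dataset; it swaps in Python's standard-library difflib as a stand-in. The function name fuzzy_bnf_lookup, the 0.8 cutoff, and the drug names are all assumptions made for illustration.

```python
from difflib import get_close_matches


def fuzzy_bnf_lookup(pca_name, bnf_names, cutoff=0.8):
    """Return the closest BNF presentation name, or None if nothing is similar enough."""
    # Compare case-insensitively, but return the name as spelled in the BNF list.
    by_lower = {name.lower(): name for name in bnf_names}
    hits = get_close_matches(pca_name.lower(), list(by_lower), n=1, cutoff=cutoff)
    return by_lower[hits[0]] if hits else None


# Invented names for illustration only; not taken from the PCA or BNF data.
bnf_names = [
    "Paracetamol 500mg tablets",
    "Ibuprofen 200mg tablets",
    "Amoxicillin 250mg capsules",
]

match = fuzzy_bnf_lookup("Paracetamol 500 mg tablet", bnf_names)
no_match = fuzzy_bnf_lookup("Unknownium 1g sachets", bnf_names)
print(match, no_match)
```

As in the dataset, any candidate returned by a similarity lookup like this would still need manual validation, since a high similarity score does not guarantee the same drug or formulation.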
PCA Data Samples
kaggle.com
zip
Updated Mar 2, 2021
Cite
Alex Wolski (2021). PCA Data Samples [Dataset]. https://www.kaggle.com/datasets/alexwolski/pca-data-samples
Explore at:
Available download formats
zip (1526 bytes)
Dataset updated
Mar 2, 2021
Authors
Alex Wolski
Description
Dataset

This dataset was created by Alex Wolski

Edmunds et al./FACS data for PCA - Datasets - data.bris
data.bris.ac.uk
Updated Sep 15, 2021
+ more versions
Cite
(2021). Edmunds et al./FACS data for PCA - Datasets - data.bris [Dataset]. https://data.bris.ac.uk/data/dataset/a4f998421899685d22bf9c7d0139f855
Explore at:
Dataset updated
Sep 15, 2021
Description
Edmunds et al./FACS data for PCA
Principal component analysis (PCA) of behavioural data across the life...
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated May 24, 2017
Cite
Frýdlová, Petra; Frynta, Daniel; Šimková, Olga; Žampachová, Barbora; Landová, Eva (2017). Principal component analysis (PCA) of behavioural data across the life stages. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001770055
Explore at:
Dataset updated
May 24, 2017
Authors
Frýdlová, Petra; Frynta, Daniel; Šimková, Olga; Žampachová, Barbora; Landová, Eva
Description
Principal component analysis (PCA) of behavioural data across the life stages.
Environmental data used for PCA, and site age
datasetcatalog.nlm.nih.gov
figshare.com
Updated Oct 15, 2021
Cite
Fiechter, Lena; von der Lippe, Moritz; Whitehead, James; Hiller, Anne (2021). Environmental data used for PCA, and site age [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000914062
Explore at:
Dataset updated
Oct 15, 2021
Authors
Fiechter, Lena; von der Lippe, Moritz; Whitehead, James; Hiller, Anne
Description
A dataset established in 2017 containing environmental data from the CityScapeLabs research platform in Berlin, Germany. This dataset is a subset of the main CityScapeLabs dataset which was used in the paper titled "Soil physico-chemical properties change across an urbanity gradient in Berlin", currently in review. Data was collected by Lena Fiechter, Moritz von der Lippe and Anne Hiller, with one additional parameter added by James Whitehead. For more details on the CityScapeLab research platform please see: von der Lippe, M.; Buchholz, S.; Hiller, A.; Seitz, B.; Kowarik, I. CityScapeLab Berlin: A Research Platform for Untangling Urbanization Effects on Biodiversity. Sustainability 2020, 12, 2565. https://doi.org/10.3390/su12062565
Principal Components Analysis (PCA) Image used to characterize the...
catalog.data.gov
s.cnmilf.com
+1 more
Updated Mar 22, 2025
Cite
Department of Commerce (DOC), National Oceanic and Atmospheric Administration (NOAA), National Ocean Service (NOS), Center for Coastal Monitoring and Assessment (CCMA), Biogeography Branch (Point of Contact) (2025). Principal Components Analysis (PCA) Image used to characterize the complexity of the seafloor around St. John, USVI [Dataset]. https://catalog.data.gov/dataset/principal-components-analysis-pca-image-used-to-characterize-the-complexity-of-the-seafloor-aro4
Explore at:
Dataset updated
Mar 22, 2025
Dataset provided by
National Oceanic and Atmospheric Administration, http://www.noaa.gov/
United States Department of Commerce, http://commerce.gov/
Area covered
Saint John, U.S. Virgin Islands
Description
Eight complexity surfaces (mean depth, standard deviation of depth, curvature, plan curvature, profile curvature, rugosity, slope, and slope of slope) were stacked and exported to create one image with several different bands (each band representing a specific metric). This image was transformed into its first three principal components using the "Principal Components Analysis" (PCA) function in ENVI 4.6. The transformation reduced the dimensionality of the dataset by removing information that was redundant among the different bands. The resulting 2 x 2 m resolution, three-band PCA image contains only information that uniquely describes the complexity and structure of the seafloor. Coral reef habitat types were delineated and classified from this PCA image.

Acoustic imagery was acquired for the VICRNM on two separate missions onboard the NOAA ship Nancy Foster. The first mission took place from 2/18/04 to 3/5/04. The second mission took place from 2/1/05 to 2/12/05. On both missions, seafloor depths between 14 and 55 m were mapped using a RESON SeaBat 8101 ER (240 kHz) MBES sensor. This pole-mounted system measured water depths across a 150 degree swath consisting of 101 individual 1.5 degree x 1.5 degree beams. The beams to the port and starboard of nadir (i.e., directly underneath the ship) overlapped adjacent survey lines by approximately 10 m. The vessel survey speed was between 5 and 8 kn.

In 2004, the ship's location was determined by a Trimble DSM 132 DGPS system, which provided a RTCM differential data stream from the U.S. Coast Guard Continually Operating Reference Station (CORS) at Port Isabel, Puerto Rico. Gyro, heave, pitch and roll correctors were acquired using an Ixsea Octans gyrocompass. In 2005, the ship's positioning and orientation were determined by the Applanix POS/MV 320 V4, which is a GPS aided Inertial Motion Unit (IMU) providing measurements of roll, pitch and heading. The POS/MV obtained its positions from two dual frequency Trimble Zephyr GPS antennae. An auxiliary Trimble DSM 132 DGPS system provided a RTCM differential data stream from the U.S. Coast Guard CORS at Port Isabel, Puerto Rico. For both years, CTD (conductivity, temperature and depth) measurements were taken approximately every 4 hours using a Seabird Electronics SBE-19 to correct for the changing sound velocities in the water column.

In 2004, raw data were logged in .xtf (extended triton format) using Triton ISIS software 6.2. In 2005, raw data were logged in .gsf (generic sensor format) using SAIC ISS 2000 software. Data from 2004 were referenced to the WGS84 UTM 20 N horizontal coordinate system, and data from 2005 were referenced to the NAD83 UTM 20 N horizontal coordinate system. Data from both projects were referenced to the Mean Lower Low Water (MLLW) vertical tidal coordinate system.

The 2004 and 2005 MBES bathymetric data were both corrected for sensor offsets, latency, roll, pitch, yaw, static draft, the changing speed of sound in the water column and the influence of tides in CARIS Hips & Sips 5.3 and 5.4, respectively. The 2004 data were then binned to create a 1 x 1 m raster surface, and the 2005 data were binned to create a 2 x 2 m raster surface. After these final surfaces were created, the datum for the 2004 bathymetric surfaces was transformed from WGS84 to NAD83 using the "Project Raster" function in ArcGIS 9.1, so that the 2004 surface would have the same datum as the 2005 surface. The 2004 bathymetric surface was then downsampled from 1 x 1 m to 2 x 2 m using the "Resample" function in ArcGIS 9.1, so that it would have the same spatial resolution as the 2005 surface. Having the same coordinate systems and spatial resolutions, the final 2004 and 2005 bathymetry rasters were then merged using the Raster Calculator function "Merge" in ArcGIS's Spatial Analyst Extension to create a seamless bathymetry surface for the entire VICRNM area south of St. John. For a complete description of the data acquisition and processing parameters, please see the data acquisition and processing reports (DAPRs) for projects NF-04-06-VI and NF-05-05-VI (Monaco & Rooney, 2004; Battista & Lazar, 2005).
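The band-stacking and PCA step described above can be illustrated with a small NumPy sketch. It is not the ENVI 4.6 implementation: it assumes a plain covariance-based PCA, the function name pca_bands is hypothetical, and the eight "bands" are synthetic random surfaces rather than the actual complexity metrics.

```python
import numpy as np


def pca_bands(stack, n_components=3):
    """Project a (bands, rows, cols) stack onto its leading principal components."""
    bands, rows, cols = stack.shape
    # One row per pixel, one column per band.
    pixels = stack.reshape(bands, -1).T.astype(float)
    centered = pixels - pixels.mean(axis=0)
    # Band-by-band covariance matrix (assumption: covariance, not correlation, PCA).
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues come back in ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    # Pixel scores on the leading components, reshaped back into image bands.
    scores = centered @ eigvecs[:, :n_components]
    pc_image = scores.T.reshape(n_components, rows, cols)
    explained = eigvals[:n_components] / eigvals.sum()
    return pc_image, explained


# Synthetic stand-in for eight redundant surfaces: mixtures of three latent
# surfaces plus a little noise (purely illustrative data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(3, 20, 30))
mixing = rng.normal(size=(8, 3))
stack = np.tensordot(mixing, latent, axes=1) + 0.01 * rng.normal(size=(8, 20, 30))

pc_image, explained = pca_bands(stack)
print(pc_image.shape, explained)
```

Because the synthetic bands are built from only three underlying surfaces, nearly all of the variance ends up in the first three components, which mirrors the idea of discarding information that is redundant among bands.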
Global import data of Pca
volza.com
csv
Updated Nov 17, 2025
+ more versions
Cite
Volza FZ LLC (2025). Global import data of Pca [Dataset]. https://www.volza.com/imports-india/india-import-data-of-pca-from-mexico
Explore at:
Available download formats
csv
Dataset updated
Nov 17, 2025
Dataset authored and provided by
Volza FZ LLC
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Count of importers, Sum of import value, 2014-01-01/2021-09-30, Count of import shipments
Description
413 global import shipment records of Pca, with prices, volumes, and current buyer-supplier relationships, based on an actual global export trade database.
PCA data
dataverse.harvard.edu
search.dataone.org
Updated Jan 23, 2017
Cite
Sheng-Chin Kao (2017). PCA data [Dataset]. http://doi.org/10.7910/DVN/1Y8YAI
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.7910/DVN/1Y8YAI
Dataset updated
Jan 23, 2017
Dataset provided by
Harvard Dataverse
Authors
Sheng-Chin Kao
License
CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Data analyzed in the manuscript "The Association between Frequent Alcohol Drinking and Opioid Consumption after Abdominal Surgery: A Retrospective Analysis"
PCA Locations Data for United States
poidata.io
csv, json
Updated Oct 31, 2025
Cite
Business Data Provider (2025). PCA Locations Data for United States [Dataset]. https://poidata.io/brand-report/pca/united-states
Explore at:
Available download formats
csv, json
Dataset updated
Oct 31, 2025
Dataset authored and provided by
Business Data Provider
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2025
Area covered
United States
Variables measured
Website URL, Phone Number, Review Count, Business Name, Email Address, Business Hours, Customer Rating, Business Address, Brand Affiliation, Geographic Coordinates
Description
Comprehensive dataset containing 28 verified PCA locations in the United States, with complete contact information, ratings, reviews, and location data.
Replication Data for: Uncertainty-Aware Principal Component Analysis
darus.uni-stuttgart.de
Updated Dec 7, 2022
Cite
Jochen Görtler; Thilo Spinner; Daniel Weiskopf; Oliver Deussen (2022). Replication Data for: Uncertainty-Aware Principal Component Analysis [Dataset]. http://doi.org/10.18419/DARUS-2321
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.18419/DARUS-2321
Dataset updated
Dec 7, 2022
Dataset provided by
DaRUS
Authors
Jochen Görtler; Thilo Spinner; Daniel Weiskopf; Oliver Deussen
License
https://darus.uni-stuttgart.de/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.18419/DARUS-2321
Dataset funded by
DFG
Description
This dataset contains the source code for uncertainty-aware principal component analysis (UA-PCA) and a series of images that show dimensionality reduction plots created with UA-PCA. The software is a JavaScript library for performing principal component analysis and dimensionality reduction on datasets consisting of multivariate probability distributions. Each plot of the image series used UA-PCA to project a dataset consisting of multivariate normal distributions. The covariance matrices of the dataset instances were scaled with different factors resulting in different UA-PCA projections. The projected probability distributions are displayed using isolines of their probability density functions. As the scaling value increases, the projection changes, showing the sensitivity of UA-PCA to changes in variance.
Cancer Moonshot Biobank - Prostate Cancer Collection
cancerimagingarchive.net
dicom, n/a +1
Updated Dec 17, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
The Cancer Imaging Archive (2024). Cancer Moonshot Biobank - Prostate Cancer Collection [Dataset]. http://doi.org/10.7937/25T7-6Y12
Explore at:
Available download formats
n/a, dicom, svs and json
Unique identifier
https://doi.org/10.7937/25T7-6Y12
Dataset updated
Dec 17, 2024
Dataset authored and provided by
The Cancer Imaging Archive
License
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Time period covered
Oct 10, 2025
Dataset funded by
National Cancer Institute, http://www.cancer.gov/
Description
The Cancer Moonshot Biobank is a National Cancer Institute initiative to support current and future investigations into drug resistance and sensitivity and other NCI-sponsored cancer research initiatives, with an aim of improving researchers' understanding of cancer and how to intervene in cancer initiation and progression. During the course of this study, biospecimens (blood and tissue removed during medical procedures) and associated data will be collected longitudinally from at least 1000 patients across at least 10 cancer types, who are receiving standard of care cancer treatment at multiple NCI Community Oncology Research Program (NCORP) sites.
This collection contains de-identified radiology and histopathology imaging procured from subjects in NCI’s Cancer Moonshot Biobank - Prostate Cancer (CMB-PCA) cohort. Associated genomic, phenotypic and clinical data will be hosted by The Database of Genotypes and Phenotypes (dbGaP) and other NCI databases. A summary of Cancer Moonshot Biobank imaging efforts can be found on the Cancer Moonshot Biobank Imaging page.
Table4_Dynamic Meta-data Network Sparse PCA for Cancer Subtype Biomarker...
datasetcatalog.nlm.nih.gov
frontiersin.figshare.com
Updated May 31, 2022
+ more versions
Cite
Xie, Sheng-Li; Cai, Jie; Dong, Xin; Li, Shao; Lo, Sio-Long; Yang, Kuo; Mei, Xin-Yue; Dang, Qi; Liang, Yong; Miao, Rui; Liu, Xiao-Ying (2022). Table4_Dynamic Meta-data Network Sparse PCA for Cancer Subtype Biomarker Screening.XLSX [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000308074
Explore at:
Dataset updated
May 31, 2022
Authors
Xie, Sheng-Li; Cai, Jie; Dong, Xin; Li, Shao; Lo, Sio-Long; Yang, Kuo; Mei, Xin-Yue; Dang, Qi; Liang, Yong; Miao, Rui; Liu, Xiao-Ying
Description
Previous research shows that each type of cancer can be divided into multiple subtypes, which is one of the key reasons cancer is difficult to cure. Finding new target genes for cancer subtypes is therefore of great significance for developing new anti-cancer drugs and personalized treatment. Because cancer gene expression data sets are usually high-dimensional and noisy and carry information on multiple potential subtypes, many sparse principal component analysis (sparse PCA) methods have been used to identify cancer subtype biomarkers and subtype clusters. However, existing sparse PCA methods do not use known cancer subtype information as prior knowledge, and their results are strongly affected by sample quality. Therefore, we propose the Dynamic Metadata Edge-group Sparse PCA (DM-ESPCA) model, which draws on the idea of meta-learning to address the sample-quality problem and uses known cancer subtype information as prior knowledge to capture gene modules with better biological interpretations. Experimental results on three biological data sets showed that the DM-ESPCA model can find potential target gene probes with richer biological information about the cancer subtypes. Moreover, clustering and machine learning classification models based on the target genes screened by the DM-ESPCA model improved accuracy by up to 22–23% compared with existing sparse PCA methods. We also proved that the results of the DM-ESPCA model are better than those of four classic supervised machine learning models in the task of classifying cancer subtypes.
INF-PCA-Data
huggingface.co
Updated Aug 30, 2023
Cite
LiuChong (2023). INF-PCA-Data [Dataset]. https://huggingface.co/datasets/LiuChong/INF-PCA-Data
Explore at:
Dataset updated
Aug 30, 2023
Authors
LiuChong
Description
LiuChong/INF-PCA-Data dataset hosted on Hugging Face and contributed by the HF Datasets community
Global exporters importers-export import data of Sodium pca
volza.com
csv
Updated May 11, 2025
Cite
Volza FZ LLC (2025). Global exporters importers-export import data of Sodium pca [Dataset]. https://www.volza.com/p/sodium-pca/
Explore at:
Available download formats
csv
Dataset updated
May 11, 2025
Dataset authored and provided by
Volza FZ LLC
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Variables measured
Count of exporters, Count of importers, Count of shipments, Sum of export import value
Description
108 global export and import shipment records of Sodium pca, with prices, volumes, and current buyer-supplier relationships, based on an actual global export trade database.
Data from: Permutation-validated principal components analysis of microarray...
catalog.data.gov
healthdata.gov
+1 more
Updated Sep 7, 2025
Cite
National Institutes of Health (2025). Permutation-validated principal components analysis of microarray data [Dataset]. https://catalog.data.gov/dataset/permutation-validated-principal-components-analysis-of-microarray-data
Explore at:
Dataset updated
Sep 7, 2025
Dataset provided by
National Institutes of Health
Description
Background: In microarray data analysis, the comparison of gene-expression profiles with respect to different conditions and the selection of biologically interesting genes are crucial tasks. Multivariate statistical methods have been applied to analyze these large datasets. Less work has been published concerning the assessment of the reliability of gene-selection procedures. Here we describe a method to assess reliability in multivariate microarray data analysis using permutation-validated principal components analysis (PCA). The approach is designed for microarray data with a group structure.

Results: We used PCA to detect the major sources of variance underlying the hybridization conditions followed by gene selection based on PCA-derived and permutation-based test statistics. We validated our method by applying it to well characterized yeast cell-cycle data and to two datasets from our laboratory. We could describe the major sources of variance, select informative genes and visualize the relationship of genes and arrays. We observed differences in the level of the explained variance and the interpretability of the selected genes.

Conclusions: Combining data visualization and permutation-based gene selection, permutation-validated PCA enables one to illustrate gene-expression variance between several conditions and to select genes by taking into account the relationship of between-group to within-group variance of genes. The method can be used to extract the leading sources of variance from microarray data, to visualize relationships between genes and hybridizations and to select informative genes in a statistically reliable manner. This selection accounts for the level of reproducibility of replicates or group structure as well as gene-specific scatter. Visualization of the data can support a straightforward biological interpretation.
Data from: A General Treatment of Solubility. 3. Principal Component...
acs.figshare.com
application/cdfv2
Updated May 31, 2023
Cite
Alan R. Katritzky; Indrek Tulp; Dan C. Fara; Antonino Lauria; Uko Maran; William E. Acree (2023). A General Treatment of Solubility. 3. Principal Component Analysis (PCA) of the Solubilities of Diverse Solutes in Diverse Solvents [Dataset]. http://doi.org/10.1021/ci0496189.s001
Explore at:
Available download formats
application/cdfv2
Unique identifier
https://doi.org/10.1021/ci0496189.s001
Dataset updated
May 31, 2023
Dataset provided by
ACS Publications
Authors
Alan R. Katritzky; Indrek Tulp; Dan C. Fara; Antonino Lauria; Uko Maran; William E. Acree
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
A phenomenological study of solubility has been conducted using a combination of quantitative structure−property relationship (QSPR) and principal component analysis (PCA). A solubility database of 4540 experimental data points was used that utilized available experimental data into a matrix of 154 solvents times 397 solutes. Methodology in which QSPR and PCA are combined was developed to predict the missing values and to fill the data matrix. PCA on the resulting filled matrix, where solutes are observations and solvents are variables, shows 92.55% of coverage with three principal components. The corresponding transposed matrix, in which solvents are observations and solutes are variables, showed 62.96% of coverage with four principal components.
PCA-DATA
kaggle.com
zip
Updated Oct 7, 2022
+ more versions
Cite
inaho (2022). PCA-DATA [Dataset]. https://www.kaggle.com/datasets/ylic204/pca-data
Explore at:
Available download formats
zip (26535 bytes)
Dataset updated
Oct 7, 2022
Authors
inaho
Description
Dataset

This dataset was created by inaho

India Pca Export | List of Pca Exporters & Suppliers
seair.co.in
+ more versions
Cite
Seair Exim Solutions, India Pca Export | List of Pca Exporters & Suppliers [Dataset]. https://www.seair.co.in/pca-export-data.aspx
Explore at:
Available download formats
.text/.csv/.xml/.xls/.bin
Dataset authored and provided by
Seair Exim Solutions
Area covered
India
Description
Explore Indian Pca export data with HS codes, pricing, ports, and a verified list of Pca exporters and suppliers from India with complete shipment insights.

Cancer-Net PCa-Data

Correlated diffusion imaging data of prostate cancer cases.

Explore at:

4 scholarly articles cite this dataset (View in Google Scholar)

