Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This folder contains R scripts and data sets used to generate the clustering dendrogram and heatmaps shown in Fig. 3.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We have developed ProjecTILs, a computational approach to project new data sets into a reference map of T cells, enabling their direct comparison in a stable, annotated system of coordinates. Because new cells are embedded in the same space of the reference, ProjecTILs enables the classification of query cells into annotated, discrete states, but also over a continuous space of intermediate states. By comparing multiple samples over the same map, and across alternative embeddings, the method allows exploring the effect of cellular perturbations (e.g. as the result of therapy or genetic engineering) and identifying genetic programs significantly altered in the query compared to a control set or to the reference map. We illustrate the projection of several data sets from recent publications over two cross-study murine T cell reference atlases: the first describing tumor-infiltrating T lymphocytes (TILs), the second characterizing acute and chronic viral infection.
To construct the reference TIL atlas, we obtained single-cell gene expression matrices from the following GEO entries: GSE124691, GSE116390, GSE121478, GSE86028; and entry E-MTAB-7919 from ArrayExpress. Data from GSE124691 contained samples from tumor and from tumor-draining lymph nodes, and were therefore treated as two separate datasets. For the TIL projection examples (OVA Tet+, miR-155 KO and Regnase-KO), we obtained the gene expression counts from entries GSE122713, GSE121478 and GSE137015, respectively.
Prior to dataset integration, single-cell data from individual studies were filtered using TILPRED-1.0 (https://github.com/carmonalab/TILPRED), which removes cells not enriched in T cell markers (e.g. Cd2, Cd3d, Cd3e, Cd3g, Cd4, Cd8a, Cd8b1) and cells enriched in non-T cell genes (e.g. Spi1, Fcer1g, Csf1r, Cd19). Dataset integration was performed using STACAS (https://github.com/carmonalab/STACAS), a batch-correction algorithm based on Seurat 3.
For the TIL reference map, we specified 600 variable genes per dataset, excluding cell cycling genes, mitochondrial, ribosomal and non-coding genes, as well as genes expressed in less than 0.1% or more than 90% of the cells of a given dataset. For integration, a total of 800 variable genes were derived as the intersection of the 600 variable genes of individual datasets, prioritizing genes found in multiple datasets and, in case of ties, those derived from the largest datasets. We determined pairwise dataset anchors using STACAS with default parameters, and filtered anchors using an anchor score threshold of 0.8. Integration was performed using the IntegrateData function in Seurat 3, providing the anchor set determined by STACAS, and a custom integration tree to initiate alignment from the largest and most heterogeneous datasets.
Next, we performed unsupervised clustering of the integrated cell embeddings using the Shared Nearest Neighbor (SNN) clustering method implemented in Seurat 3 with parameters {resolution=0.6, reduction="umap", k.param=20}. We then manually annotated individual clusters (merging clusters when necessary) based on several criteria: i) average expression of key marker genes in individual clusters; ii) gradients of gene expression over the UMAP representation of the reference map; iii) gene-set enrichment analysis to determine over- and under-expressed genes per cluster using MAST. In order to have access to predictive methods for UMAP, we recomputed PCA and UMAP embeddings independently of Seurat 3 using, respectively, the prcomp function from the base R package “stats” and the “umap” R package (https://github.com/tkonopka/umap).
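The gene-prioritization rule described above (prefer genes that are variable in more datasets, and break ties in favor of the largest dataset) can be sketched as follows. This is only an illustrative sketch in Python, not the STACAS implementation; the function name, the toy gene lists and the dataset sizes are hypothetical.

```python
from collections import Counter

def select_integration_genes(variable_genes, dataset_sizes, n_genes):
    """Rank genes by how many datasets list them as variable.

    variable_genes: dict mapping dataset name -> list of variable genes
    dataset_sizes:  dict mapping dataset name -> number of cells
    Ties are broken by the size of the largest dataset containing the gene,
    then alphabetically so the result is deterministic.
    """
    counts = Counter()
    largest = {}
    for name, genes in variable_genes.items():
        for gene in set(genes):
            counts[gene] += 1
            largest[gene] = max(largest.get(gene, 0), dataset_sizes[name])
    ranked = sorted(counts, key=lambda g: (-counts[g], -largest[g], g))
    return ranked[:n_genes]

# Hypothetical toy input (three datasets, a handful of genes).
toy_genes = {
    "A": ["Cd8a", "Pdcd1", "Tox"],
    "B": ["Pdcd1", "Gzmb"],
    "C": ["Tox", "Pdcd1", "Il7r"],
}
toy_sizes = {"A": 5000, "B": 1200, "C": 300}
selected = select_integration_genes(toy_genes, toy_sizes, n_genes=4)
print(selected)
```

In the toy input, Pdcd1 appears in all three datasets and Tox in two, so they rank first; the remaining slots go to genes from the larger datasets.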
CalCENv1 co-expression network was projected to two dimensions using UMAP and 18 clusters were identified and annotated through gene set enrichment analysis.
This dataset contains files reconstructing single-cell data presented in 'Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing' by Herrera-Uribe & Wiarda et al. 2021. Samples of peripheral blood mononuclear cells (PBMCs) were collected from seven pigs and processed for single-cell RNA sequencing (scRNA-seq) in order to provide a reference annotation of porcine immune cell transcriptomics at enhanced, single-cell resolution. Analysis of single-cell data allowed identification of 36 cell clusters that were further classified into 13 cell types, including monocytes, dendritic cells, B cells, antibody-secreting cells, numerous populations of T cells, NK cells, and erythrocytes. Files may be used to reconstruct the data as presented in the manuscript, allowing for individual query by other users. Scripts for original data analysis are available at https://github.com/USDA-FSEPRU/PorcinePBMCs_bulkRNAseq_scRNAseq. Raw data are available at https://www.ebi.ac.uk/ena/browser/view/PRJEB43826. Funding for this dataset was also provided by NRSP8: National Animal Genome Research Program (https://www.nimss.org/projects/view/mrp/outline/18464).
Resources in this dataset:
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells 10X Format.
File Name: PBMC7_AllCells.zip
Resource Description: Zipped folder containing PBMC counts matrix, gene names, and cell IDs. Files are as follows: matrix of gene counts* (matrix.mtx.gz), gene names (features.tsv.gz), cell IDs (barcodes.tsv.gz). *The ‘raw’ count matrix is actually gene counts obtained following ambient RNA removal. During ambient RNA removal, we specified to calculate non-integer count estimations, so most gene counts are actually non-integer values in this matrix but should still be treated as raw/unnormalized data that requires further normalization/transformation. Data can be read into R using the function Read10X().
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells Metadata.
File Name: PBMC7_AllCells_meta.csv
Resource Description: .csv file containing metadata for cells included in the final dataset. Metadata columns include:
nCount_RNA = the number of transcripts detected in a cell
nFeature_RNA = the number of genes detected in a cell
Loupe = cell barcodes; correspond to the cell IDs found in the .h5Seurat and 10X formatted objects for all cells
prcntMito = percent mitochondrial reads in a cell
Scrublet = doublet probability score assigned to a cell
seurat_clusters = cluster ID assigned to a cell
PaperIDs = sample ID for a cell
celltypes = cell type ID assigned to a cell
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells PCA Coordinates.
File Name: PBMC7_AllCells_PCAcoord.csv
Resource Description: .csv file containing first 100 PCA coordinates for cells.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells t-SNE Coordinates.
File Name: PBMC7_AllCells_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for all cells.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells UMAP Coordinates.
File Name: PBMC7_AllCells_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for all cells.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells t-SNE Coordinates.
File Name: PBMC7_CD4only_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells UMAP Coordinates.
File Name: PBMC7_CD4only_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells UMAP Coordinates.
File Name: PBMC7_GDonly_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells t-SNE Coordinates.
File Name: PBMC7_GDonly_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gene Annotation Information.
File Name: UnfilteredGeneInfo.txt
Resource Description: .txt file containing gene nomenclature information used to assign gene names in the dataset. 'Name' column corresponds to the name assigned to a feature in the dataset.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells H5Seurat.
File Name: PBMC7.tar
Resource Description: .h5Seurat object of all cells in PBMC dataset. File needs to be untarred, then read into R using function LoadH5Seurat().
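As a rough illustration of how a cell subset could be re-created from the metadata file, the sketch below selects the barcodes of CD4 T cells (clusters 0, 3, 4, 28) using the Loupe and seurat_clusters columns described above. It is written in Python with the standard csv module rather than R, and it assumes the .csv has a plain header row with those column names; the demo rows are made up.

```python
import csv
import io

# Cluster IDs listed in the resource description for CD4 T cells.
CD4_CLUSTERS = {"0", "3", "4", "28"}

def barcodes_in_clusters(csv_file, clusters):
    """Return cell barcodes (Loupe column) whose seurat_clusters value is in `clusters`."""
    reader = csv.DictReader(csv_file)
    return [row["Loupe"] for row in reader if row["seurat_clusters"] in clusters]

# Made-up stand-in for PBMC7_AllCells_meta.csv (real file layout is an assumption).
demo_meta = io.StringIO(
    "Loupe,seurat_clusters,celltypes\n"
    "AAAC-1,0,CD4 T cells\n"
    "AAAG-1,7,B cells\n"
    "AAAT-1,28,CD4 T cells\n"
)
cd4_barcodes = barcodes_in_clusters(demo_meta, CD4_CLUSTERS)
print(cd4_barcodes)
```

With the real file, the same function could be called on an open file handle, and the resulting barcodes matched against the rows of the CD4-only coordinate files.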
The top locations where reported collisions occurred between pedestrians and motor vehicles have been identified. The crash cluster analysis methodology for the top pedestrian clusters uses a fixed search distance of 100 meters (328 ft.) to merge crash clusters together. Located crashes between motor vehicles and pedestrians were identified by using the non-motorist type code as well as first harmful events and most harmful events within the CDS database. Furthermore, the methodology uses the Equivalent Property Damage Only (EPDO) weighting to rank the clusters. Under EPDO, any type of injury crash (including fatal, incapacitating, non-incapacitating and possible) has a weighting of 21, compared to a weighting of 1 for a property-damage-only crash. However, because of the relatively small number of reported pedestrian crashes in the crash data file, the clustering analysis used crashes from the ten-year period from 2010-2019. Additionally, due to the larger geographic area encompassed by the pedestrian crash clusters, it was difficult to name them, so they were left unnamed but can be viewed spatially.
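The EPDO ranking described above reduces to a weighted count per cluster. A minimal sketch in Python, using the weights stated in the text (21 for any injury crash, 1 for property damage only); the cluster names and crash counts are hypothetical.

```python
INJURY_WEIGHT = 21  # any injury crash, including fatal
PDO_WEIGHT = 1      # property-damage-only crash

def epdo_score(injury_crashes, pdo_crashes):
    """Equivalent Property Damage Only score for one crash cluster."""
    return INJURY_WEIGHT * injury_crashes + PDO_WEIGHT * pdo_crashes

# Hypothetical clusters: (number of injury crashes, number of PDO crashes).
crash_clusters = {
    "cluster_a": (3, 10),
    "cluster_b": (1, 40),
    "cluster_c": (4, 0),
}
ranked_clusters = sorted(
    crash_clusters, key=lambda name: epdo_score(*crash_clusters[name]), reverse=True
)
print(ranked_clusters)
```

Note that cluster_c ranks first despite having the fewest crashes, because every one of its crashes involved an injury.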
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains R Seurat objects associated with our study titled "A single-cell atlas characterizes dysregulation of the bone marrow immune microenvironment associated with outcomes in multiple myeloma".
Single cell data contained within this object comes from MMRF Immune Atlas Consortium work.
The .rds files each contain a Seurat object saved with version 4.3. They can be loaded in R with the readRDS command.
Two .RDS files are included in this version of the release.
--
The discovery object contains two assays:
Currently, the validation object only includes the uncorrected RNA assay.
--
The object contains two umaps in the reduction slot:
--
Each sample has three different identifiers:
Each cell has the following annotation information:
--
Each sample has the following information indicating shipment batches, for batch correction
--
Each public_id has limited demographic information based on publicly available information in the MMRF CoMMpass study.
d_specimen_visit_id contains two data points providing limited information about the visit
All the single-cell raw data, along with outcome and cytogenetic information, is available at MMRF’s VLAB shared resource. Requests to access these data will be reviewed by data access committee at MMRF and any data shared will be released under a data transfer agreement that will protect the identities of patients involved in the study. Other information from the CoMMpass trial can also generally be
The top locations where reported collisions occurred at intersections have been identified. The crash cluster analysis methodology for the top intersection clusters uses a fixed search distance of 25 meters (82 ft.) to merge crash clusters together. This analysis was based on crashes where a police officer specified one of the following junction types: four-way intersection, T-intersection, Y-intersection, five-point or more. Furthermore, the methodology uses the Equivalent Property Damage Only (EPDO) weighting to rank the clusters. Under EPDO, any type of injury crash (including fatal, incapacitating, non-incapacitating and possible) has a weighting of 21, compared to a weighting of 1 for a property-damage-only crash. The clustering analysis used crashes from the three-year period from 2019-2021. The crash cluster may cover a larger area than just the intersection, so it is critical to view these clusters spatially.
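One plausible reading of the fixed search distance is that crashes closer together than the threshold end up in the same cluster, with chains of nearby crashes merging transitively. The sketch below illustrates that reading only; the actual merging procedure is not specified in the description. It assumes crash locations are already in a projected coordinate system measured in meters, and the coordinates are invented.

```python
import math

def merge_within_distance(points, search_distance):
    """Group point indices so that points within `search_distance` share a cluster.

    Uses a simple union-find; merging is transitive (single-linkage style).
    """
    parent = list(range(len(points)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= search_distance:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Invented crash coordinates in meters; 25 m is the intersection search distance.
crash_points = [(0.0, 0.0), (20.0, 0.0), (40.0, 0.0), (200.0, 0.0)]
crash_groups = merge_within_distance(crash_points, 25.0)
print(crash_groups)
```

The first three crashes chain into one cluster even though the outer two are 40 m apart, which is one way a cluster can extend beyond the intersection itself.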
This study subdivides Potter Cove, King George Island, Antarctica, into seafloor regions using multivariate statistical methods. These regions are categories used for comparing, contrasting and quantifying biogeochemical processes and biodiversity between ocean regions geographically but also regions under development within the scope of global change. The division obtained is characterized by the dominating components and interpreted in terms of ruling environmental conditions. The analysis includes in total 42 different environmental variables, interpolated based on samples taken during the austral summer seasons 2010/2011 and 2011/2012. The statistical errors of several interpolation methods (e.g. IDW, Indicator, Ordinary and Co-Kriging) with changing settings have been compared and the most reasonable method has been applied. The multivariate mathematical procedures used are regionalized classification via k-means cluster analysis, canonical-correlation analysis and multidimensional scaling. Canonical-correlation analysis identifies the influencing factors in the different parts of the cove. Several methods for the identification of the optimum number of clusters have been tested, and 4, 7, 10 as well as 12 were identified as reasonable numbers for clustering Potter Cove. In particular, the results for 10 and 12 clusters identify marine-influenced regions which can be clearly separated from those determined by the geological catchment area and the ones dominated by river discharge.
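For readers unfamiliar with k-means, the sketch below shows the basic assign-and-update loop (Lloyd's algorithm) on a few invented two-dimensional points. It is a generic illustration in Python, not the software or settings used in the study, and the initial centers are supplied explicitly so the result is deterministic.

```python
import math

def kmeans(points, centers, n_iter=20):
    """Plain Lloyd's algorithm: assign each point to its nearest center, then update centers."""
    labels = []
    for _ in range(n_iter):
        labels = [
            min(range(len(centers)), key=lambda k: math.dist(p, centers[k]))
            for p in points
        ]
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centers.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centers.append(centers[k])  # keep an empty cluster's center unchanged
        if new_centers == centers:
            break  # converged
        centers = new_centers
    return labels, centers

# Invented sample points standing in for grid cells described by two variables.
sample_points = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
km_labels, km_centers = kmeans(sample_points, centers=[(0.0, 0.0), (10.0, 10.0)])
print(km_labels, km_centers)
```

In practice the number of clusters is chosen by comparing solutions for several values of k, as the description notes for 4, 7, 10 and 12 clusters.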
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction
Adolescents’ physical activity (PA) behavior can be driven by several psychosocial determinants at the same time. Most analyses use a variable-based approach that examines relations between PA-related determinants and PA behavior on the between-person level. Using this approach, possible coexistences of different psychosocial determinants within one person cannot be examined. Therefore, by applying a person-oriented approach, this study examined (a) which profiles regarding PA-related psychosocial variables typically occur in female sixth-graders, (b) if these profiles deliver a self-consistent picture according to theoretical assumptions, and (c) if the profiles contribute to the explanation of PA.
Materials and Methods
The sample comprised 475 female sixth-graders. Seventeen PA-related variables were assessed: support for autonomy, competence and relatedness in PE as well as their satisfaction in PE and leisure-time; behavioral regulation of exercise (five subscales); self-efficacy and social support from friends and family (two subscales). Moderate-to-vigorous PA was measured using accelerometers. Data were analyzed using the self-organizing maps (SOM) analysis, a cluster analysis including an unsupervised algorithm for non-linear models.
Results
According to the respective level of psychosocial resources, a positive, a medium and a negative cluster were identified. This superordinate cluster solution represented a self-consistent picture that was in line with theoretical assumptions. The three-cluster solution contributed to the explanation of PA behavior, with the positive cluster accumulating an average of 6 min more moderate-to-vigorous PA per day than the medium cluster and 10 min more than the negative cluster. Additionally, SOM detected a subgroup within the positive cluster that benefited from a specific combination of intrinsic and external regulations with regard to PA.
Discussion
The results underline the relevance of the assessed psychosocial determinants of PA behavior in female sixth-graders. The results further indicate that the different psychosocial resources within a given person do not develop independently of one another, which supports the use of a person-oriented approach. In addition, the SOM analysis identified subgroups with specific characteristics, which would have remained undetected using variable-based approaches. Thus, this approach offers the possibility to reduce data complexity without overlooking subgroups with special demands that go beyond the superordinate cluster solution.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains the set of data shown in the paper "Relevant, hidden, and frustrated information in high-dimensional analyses of complex dynamical systems with internal noise", published on arXiv (DOI: 10.48550/arXiv.2412.09412).
The scripts contained herein are:
To reproduce the data of this work, start from SOAP-Component-Analysis.py to calculate the SOAP descriptor and select the components of interest. You can then calculate the PCA with PCA-Analysis.py and apply the clustering that suits your needs (OnionClustering-1d.py, OnionClustering-2d.py, Hierarchical-Clustering.py). Further modifications of the Onion plot can be made with the script OnionClustering-plot.py. UMAP can be calculated with UMAP.py.
Additional data contained herein are:
The data related to the Quincke rollers can be found here: https://zenodo.org/records/10638736
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We have developed ProjecTILs, a computational approach to project new data sets into a reference map of T cells, enabling their direct comparison in a stable, annotated system of coordinates. Because new cells are embedded in the same space of the reference, ProjecTILs enables the classification of query cells into annotated, discrete states, but also over a continuous space of intermediate states. By comparing multiple samples over the same map, and across alternative embeddings, the method allows exploring the effect of cellular perturbations (e.g. as the result of therapy or genetic engineering) and identifying genetic programs significantly altered in the query compared to a control set or to the reference map. We illustrate the projection of several data sets from recent publications over two cross-study murine T cell reference atlases: the first describing tumor-infiltrating T lymphocytes (TILs), the second characterizing acute and chronic viral infection. Single-cell data to build the virus-specific CD8 T cell reference map were downloaded from GEO under the following entries: GSE131535, GSE134139 and GSE119943, selecting only samples in wild type conditions. Data for the Ptpn2-KO, Tox-KO and CD4-depletion projections were obtained from entries GSE134139, GSE119943, and GSE137007 and were not included in the construction of the reference map. To construct the LCMV reference map, we split the dataset into five batches that displayed strong batch effects, and applied STACAS (https://github.com/carmonalab/STACAS) to mitigate their confounding effects. We computed 800 variable genes per batch, excluding cell cycling genes, ribosomal and mitochondrial genes, and computed pairwise anchors using 200 integration genes, and otherwise default STACAS parameters. Anchors were filtered at the default threshold 0.8 percentile, and integration was performed with the Seurat 3 IntegrateData function with the guide tree suggested by STACAS.
Next, we performed unsupervised clustering of the integrated cell embeddings using the Shared Nearest Neighbor (SNN) clustering method implemented in Seurat 3 with parameters {resolution=0.4, reduction="pca", k.param=20}. We then manually annotated individual clusters (merging clusters when necessary) based on several criteria: i) average expression of key marker genes in individual clusters; ii) gradients of gene expression over the UMAP representation of the reference map; iii) gene-set enrichment analysis to determine over- and under-expressed genes per cluster using MAST. In order to have access to predictive methods for UMAP, we recomputed PCA and UMAP embeddings independently of Seurat 3 using, respectively, the prcomp function from the base R package “stats” and the “umap” R package (https://github.com/tkonopka/umap).
https://spdx.org/licenses/CC0-1.0.html
The data set presented here contains the MAU% data for the selected hyena-made and leopard-made faunal assemblages with which the Misiam assemblage is compared. Misiam is a recently discovered modern faunal accumulation found at Olduvai Gorge (Tanzania) interpreted as a palimpsest resulting from the action of leopards (main transporting agents) and hyenas (secondary scavengers). It is the first reported open-air leopard-made faunal accumulation. Defining the anatomical and taphonomic characteristics of such an assemblage is important for the interpretation of prehistoric faunal assemblages created by carnivores. It is also relevant for modern ecological studies. In this particular case, the bulk of the assemblage is composed of wildebeests. These are not usually the target of leopards; however, their seasonal abundance during the wildebeest migration on the plains adjacent to Olduvai Gorge prompts this rather exceptional, highly specialized behavior by usually eclectic leopards. In the present work, a thorough taphonomic analysis is carried out and the main taxonomic, anatomical and taphonomic characteristics of this felid-hyenid modified assemblage are described. The analytical approach adopted uses the data presented here.
Methods
The Misiam data were collected in the field. The bone assemblage lay on the surface of a densely vegetated ravine. Bones were collected from the surface, and in one particular area an excavation was made to retrieve subsurface bones. In order to compare skeletal profiles in felid and hyenid assemblages, we will use some of the most representative assemblages in the literature. For spotted hyena dens, we will use data from the Koobi Fora Hyena Den 1 (KFHD1), the Amboseli den, the Maasai Mara den, and the Syokimau den, all of them in Kenya, and the Eyasi (Kisima Ngeda) Hyena Den 2 (KND2) (Tanzania). We used these assemblages also because they are either dominated by size 3 carcasses or these make up a significant part of the assemblage.
When comparing long bone shaft breakage patterns, we also used additional hyena-made assemblages: Dumali, Heraide, Yangula Ari, Oboley (spotted hyenas), Datagabou (striped hyena, Djibouti), and Uniab (brown hyena, Namibia). These assemblages are almost completely dominated by very small fauna (Capra hircus), and several of them constitute significantly smaller sample sizes than the hyena dens mentioned above.
The leopard lairs used for comparison are: Portsmut and Hakos River (Namibia), and WU/BA-001 (South Africa). Portsmut and Hakos River show a low density of remains, probably also modified by porcupines or other agents. The remains belonging to larger animals show an interesting contrast with those documented in hyena dens: the presence of axial and compact bones is high. These latter bones are also well represented in smaller carcasses. This characteristic is more marked in WU/BA-001; the least altered leopard lair documented to date. This lair was monitored for 7 years.
All the comparative assemblages were transformed into %MAU to account for differential inter-assemblage quantitative representation. First, they were analyzed using Generalized Low Rank Models (GLRM) as an exploratory method. Then, we used Uniform Manifold Approximation and Projection (UMAP) to classify leopards' and hyenas' bone assemblages, especially according to each feature. Lastly, we used a cluster analysis with a variance-dependent phylogenetic tree to show the actual distances among all the assemblages compared.
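The %MAU transformation mentioned above is conventionally computed by scaling each element's minimum animal units (MAU) to the highest MAU in the assemblage. The sketch below assumes that conventional definition, which the description does not spell out, and uses invented MAU values.

```python
def percent_mau(mau_by_element):
    """Scale MAU values so the best-represented element equals 100."""
    top = max(mau_by_element.values())
    if top <= 0:
        raise ValueError("at least one element must have a positive MAU")
    return {element: 100.0 * value / top for element, value in mau_by_element.items()}

# Invented MAU values for a single assemblage.
demo_mau = {"femur": 4.0, "tibia": 2.0, "rib": 1.0}
demo_percent = percent_mau(demo_mau)
print(demo_percent)
```

Because every assemblage is rescaled to its own maximum, assemblages of very different sizes become comparable element by element.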
GLRM are a series of methods for dimensionality reduction that use several loss function types and can implement regularization functions. Whereas principal component analysis (PCA) is based on orthogonal projections of linear relationships, in cases where relationships are non-linear, the PCA underperforms compared to other more flexible methods. GLRM decomposes a table into two distinctive matrices X and Y. X contains the same number of rows as the original table, but all variables are condensed into k factors. Y has k rows and the same number of columns as features (i.e., variables) in the original table. Each of the rows is an archetypal feature derived from the columns (i.e., variables) of the original table. Each row of X corresponds to a row of the original table projected into this reduced dimension feature space. Data are compressed by the low-rank representation derived from k feature reduction. An advantage of GLRM over PCA is that it can handle mixed datasets containing numeric, categorical and Boolean data. GLRM admits several types of loss functions: Huber, Poisson, quadratic, periodic or hinge. It also allows the use of regularization functions, including: Lasso, Ridge, OneSparse, Simplex, UnitOneSparse, and quadratic. Loss functions are used to select the optimal archetypal values. Regularization is used to limit X and Y archetypal values. This impacts the effect of negative data, multicollinearity and overfitting. In the present analysis, GLRM was performed with the “h2o” R library (www.r-project.org).
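The decomposition described above is usually written as a single optimization problem. In the sketch below, the symbols are introduced here for illustration and are not taken from the dataset description: A is the original n × m table, x_i is row i of X, y_j is column j of Y, L_j is the loss chosen for column j, and r and r̃ are the regularizers on X and Y.

```latex
\min_{X,\,Y} \;
\sum_{i=1}^{n} \sum_{j=1}^{m} L_j\!\left(x_i\, y_j,\; A_{ij}\right)
\;+\; \sum_{i=1}^{n} r(x_i)
\;+\; \sum_{j=1}^{m} \tilde{r}(y_j)
```

With a quadratic loss and no regularization, this reduces to a PCA-like rank-k approximation of A; other loss and regularizer choices are what allow mixed numeric, categorical and Boolean columns.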
UMAP is a non-linear dimension-reduction method based on finding inter-case distances in a low-dimensional feature space. The key advantage of UMAP over other non-linear dimension-reduction methods, like t-distributed stochastic neighbor embedding (t-SNE), is that distances are generated along a “manifold”. A manifold is an n-dimensional geometric shape constituted of the path(s) among the points. Every point is referenced according to a small two-dimensional neighborhood around it. The UMAP algorithm searches for a multi-dimensional space delimited by the location of points. UMAP uses a nearest-neighbor approach, eventually connecting all the points along its search regions. This forces a uniform distribution of points. The distances of points along this manifold are then derived through Euclidean distances. Several optimization methods can be used to reproduce inter-point distances. For the latter process, the UMAP approach that we will use is based on a cross-entropy loss function. For the UMAP analysis, we have used the “umap” R library (www.r-project.org). We have also used a search grid combining ranges of values for number of neighbors, minimal distance between neighbors, distance metric, and number of epochs (i.e., iterations of the optimization process).
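A search grid of the kind described above can be enumerated as the Cartesian product of the candidate values. The sketch below, in Python, only builds the list of configurations; it does not run UMAP. The parameter names and all candidate values are illustrative assumptions, since the description does not report the ranges used.

```python
from itertools import product

# Hypothetical candidate values for each tuned setting.
search_grid = {
    "n_neighbors": [5, 15, 30],
    "min_dist": [0.01, 0.1, 0.5],
    "metric": ["euclidean", "manhattan"],
    "n_epochs": [200, 500],
}

# One dict per combination, in the order the keys were declared.
configs = [dict(zip(search_grid, values)) for values in product(*search_grid.values())]
print(len(configs), configs[0])
```

Each configuration would then be passed to the embedding step and the resulting layouts compared, for example by how cleanly leopard and hyena assemblages separate.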
Finally, a hierarchical cluster analysis, using a Euclidean distance matrix on the %MAU dataset, was carried out. The method used was “average” linkage, in which the distance between two clusters is the average distance between their points. The combination of the three methods was used to study agent-specific variability in inter-assemblage element representation.
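Average linkage on Euclidean distances can be illustrated with a small, naive implementation. The sketch below is in Python rather than R and is not the script used for the analysis; the three one-dimensional points stand in for assemblages described by %MAU vectors and are invented.

```python
import math

def average_linkage(points):
    """Agglomerative clustering with average linkage.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where clusters are tuples of point indices.
    """
    clusters = [(i,) for i in range(len(points))]

    def cluster_distance(a, b):
        # Mean Euclidean distance over all cross-cluster pairs of points.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    merges = []
    while len(clusters) > 1:
        candidates = [
            (cluster_distance(a, b), a, b)
            for idx, a in enumerate(clusters)
            for b in clusters[idx + 1:]
        ]
        distance, a, b = min(candidates)
        clusters = [c for c in clusters if c not in (a, b)] + [tuple(sorted(a + b))]
        merges.append((a, b, distance))
    return merges

# Invented one-dimensional example.
demo_points = [(0.0,), (1.0,), (5.0,)]
demo_merges = average_linkage(demo_points)
print(demo_merges)
```

The second merge happens at 4.5, the mean of the distances 5 and 4 from the third point to the two members of the first cluster, which is what distinguishes average linkage from single or complete linkage.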
R script for clustering specimens based on measurement data
Script for performing UPGMA hierarchical cluster analysis on the R platform.
R script for clustering.pdf
R script for Principal Component Analysis (PCA): specimens on a morphometric ordination space
Script for performing Principal Component Analysis (PCA) on the R platform.
R script for PCA.pdf
Script for mapping the distribution of Stigmatomma species in Madagascar and Seychelles
R code for making species distribution maps. Note: Our maps use the ecoregion outlines of Madagascar, which were based on the vector data disclosed by the Terrestrial Ecoregions of the World (available at the WWF website). However, the original outlines were slightly mismatching the relief of Madagascar. To solve this, we combined the original ecoregion data with data from the Remaining Primary Vegetation of Madagascar (available at the Kew Royal Botanic Gardens website), which has more natural outlines.
Linear morphometry of Stigmatomma species in the Malaga...
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
These data are a set of raster maps of community-level predictions of deep-sea coral and sponge taxa distributions off the continental U.S. west coast, spanning depths from 50 to 1200 m. The raster files come in two versions: one where predicted suitability ranges from 0 to 1 and one where predicted suitability is classified into five classes: very low (0–0.2), low (0.21–0.40), moderate (0.41–0.60), high (0.61–0.80) and very high (0.81–1.00). These raster maps were derived from 40 genus- and species-level habitat suitability models (HSMs) produced by Poti et al. (2020). A cluster analysis of the original individually modeled taxa identified 10 groups whose member HSMs were stacked and averaged to produce a stacked species distribution model (S-SDM). Further details about the generation of the S-SDMs and their interpretation can be found in Shantharam et al. (2025).
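The stacking and classification steps described above can be sketched on tiny grids. This Python example is illustrative only: it treats each raster as a list of rows, averages cell by cell, and assigns the five classes using the upper bounds listed in the text as inclusive cut-offs, which is an assumption about how values between, say, 0.20 and 0.21 are handled. The grid values are invented.

```python
def stack_average(models):
    """Cell-wise mean of equally sized suitability grids (lists of rows)."""
    n = len(models)
    return [[sum(cells) / n for cells in zip(*rows)] for rows in zip(*models)]

def classify_suitability(value):
    """Map a 0-1 suitability value to one of the five classes (upper bounds inclusive)."""
    if value <= 0.2:
        return "very low"
    if value <= 0.4:
        return "low"
    if value <= 0.6:
        return "moderate"
    if value <= 0.8:
        return "high"
    return "very high"

# Two invented 1 x 2 suitability grids belonging to the same cluster of taxa.
demo_models = [[[0.25, 1.0]], [[0.75, 0.5]]]
demo_stack = stack_average(demo_models)
demo_classes = [[classify_suitability(v) for v in row] for row in demo_stack]
print(demo_stack, demo_classes)
```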
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Statistical separation of UMAP clusters between each real dataset and its GAN simulated equivalent.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Statistical separation of UMAP clusters between real datasets.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Statistical separation of UMAP clusters between GAN generated datasets.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
With the increasing focus on patient-centred care, this study sought to understand the priorities that patients and healthcare providers draw from their experience with head and neck cancer treatment, and to compare patients’ priorities with those of healthcare providers. Group concept mapping was used to actively identify priorities from participants (patients and healthcare providers) in two phases. In phase one, participants brainstormed statements reflecting considerations related to their experience with head and neck cancer treatment. In phase two, statements were sorted based on their similarity in theme and rated in terms of their priority. Multidimensional scaling and cluster analysis were performed to produce multidimensional maps to visualize the findings. Two hundred fifty statements were generated by participants in the brainstorming phase and finalized to 94 statements that were included in phase two. From the sorting activity, a two-dimensional map with a stress value of 0.2213 was generated, and eight clusters were created to encompass all statements. Timely care, education, and person-centred care were the highest rated priorities for patients and healthcare providers. Overall, there was a strong correlation between patients’ and healthcare providers’ ratings (r = 0.80). Our findings support the complexity of the treatment planning process in head and neck cancer, evident in the complex maps and highly interconnected statements related to the experience of treatment. Implications for improving the quality of care delivered and the care experience of head and neck cancer patients are discussed.
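The agreement statistic reported above is a Pearson correlation between the two groups' ratings. A minimal sketch of that computation in Python follows; the ratings are invented and do not reproduce the study's value of r = 0.80.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance / (spread_x * spread_y)

# Invented mean priority ratings for four statements.
patient_ratings = [4.5, 3.0, 4.0, 2.5]
provider_ratings = [4.0, 3.5, 4.5, 2.0]
rating_r = pearson_r(patient_ratings, provider_ratings)
print(round(rating_r, 2))
```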
High-throughput metabolic phenotyping is a challenge, but it provides an alternative and comprehensive access to the rapid and accurate characterization of plants. In addition to the technical issues of obtaining quantitative data of plenty of metabolic traits from numerous samples, a suitable data processing and statistical evaluation strategy must be developed. We present a simple, robust and highly scalable strategy for the comparison of multiple chemical profiles from coffee and tea leaf extracts, based on direct-injection electrospray mass spectrometry (DIESI-MS) and hierarchical cluster analysis (HCA). More than 3500 individual Coffea canephora and Coffea arabica trees from experimental fields in Mexico were sampled and processed using this method. Our strategy permits the classification of trees according to their metabolic fingerprints and the screening for families with desired characteristics, such as extraordinarily high or low caffeine content in their leaves.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Characteristics of the ecocentric vs. social-ecological clusters identified in respondents’ individual cognitive maps (ICMs).