18 datasets found

Cluster tendency assessment in neuronal spike data
plos.figshare.com
pdf
Updated Jun 5, 2023
Cite
Sara Mahallati; James C. Bezdek; Milos R. Popovic; Taufik A. Valiante (2023). Cluster tendency assessment in neuronal spike data [Dataset]. http://doi.org/10.1371/journal.pone.0224547
Unique identifier
https://doi.org/10.1371/journal.pone.0224547
Dataset updated
Jun 5, 2023
Dataset provided by
PLOS ONE
Authors
Sara Mahallati; James C. Bezdek; Milos R. Popovic; Taufik A. Valiante
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Sorting spikes from extracellular recording into clusters associated with distinct single units (putative neurons) is a fundamental step in analyzing neuronal populations. Such spike sorting is intrinsically unsupervised, as the number of neurons is not known a priori. Therefore, any spike sorting is an unsupervised learning problem that requires one of two approaches: specification of a fixed value k for the number of clusters to seek, or generation of candidate partitions for several possible values of c, followed by selection of a best candidate based on various post-clustering validation criteria. In this paper, we investigate the first approach and evaluate the utility of several methods for providing lower dimensional visualization of the cluster structure and their effect on subsequent spike clustering. We also introduce a visualization technique called improved visual assessment of cluster tendency (iVAT) to estimate possible cluster structures in data without the need for dimensionality reduction. Experiments are conducted on two datasets with ground truth labels. In data with a relatively small number of clusters, iVAT is beneficial in estimating the number of clusters to inform the initialization of clustering algorithms. With larger numbers of clusters, iVAT gives a useful estimate of the coarse cluster structure but sometimes fails to indicate the presumptive number of clusters. We show that noise associated with recording extracellular neuronal potentials can disrupt computational clustering schemes, highlighting the benefit of probabilistic clustering models. Our results show that t-Distributed Stochastic Neighbor Embedding (t-SNE) provides representations of the data that yield more accurate visualization of potential cluster structure to inform the clustering stage. Moreover, the clusters obtained using t-SNE features were more reliable than the clusters obtained using the other methods, which indicates that t-SNE can potentially be used for both visualization and to extract features to be used by any clustering algorithm.
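The idea behind visual assessment of cluster tendency can be illustrated with a small sketch. The code below implements only the basic VAT reordering step (a Prim-like ordering of a dissimilarity matrix so that tight groups end up as dark diagonal blocks); it does not include the additional path-based distance transform that iVAT applies, and it is not the authors' code. The function names `vat_order` and `reorder` and the toy values are assumptions made for illustration.

```python
def vat_order(dist):
    """Return a VAT display order for a square dissimilarity matrix."""
    n = len(dist)
    # Start from a point that participates in the largest dissimilarity.
    first = max(range(n), key=lambda i: max(dist[i]))
    order = [first]
    remaining = set(range(n)) - {first}
    # nearest[j]: smallest dissimilarity between j and any ordered point.
    nearest = {j: dist[first][j] for j in remaining}
    while remaining:
        # Append the unordered point closest to the ordered set.
        nxt = min(remaining, key=lambda j: (nearest[j], j))
        order.append(nxt)
        remaining.remove(nxt)
        for j in remaining:
            nearest[j] = min(nearest[j], dist[nxt][j])
    return order


def reorder(dist, order):
    """Permute rows and columns so that clusters appear as diagonal blocks."""
    return [[dist[p][q] for q in order] for p in order]


# Toy example: two groups of 1-D values, interleaved in the input.
values = [0, 10, 1, 11, 2, 12]
dist = [[abs(a - b) for b in values] for a in values]
order = vat_order(dist)
reordered = reorder(dist, order)
```

Rendering `reordered` as a grayscale image would show two dark blocks along the diagonal, which is the visual cue used to guess the number of clusters.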

DrCyZ: Techniques for analyzing and extracting useful information from CyZ.

data.niaid.nih.gov

Updated Jan 19, 2022

Cite

de Zarzà, I. (2022). DrCyZ: Techniques for analyzing and extracting useful information from CyZ. [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5816857

Dataset updated

Jan 19, 2022

Dataset provided by

de Curtò, J.
de Zarzà, I.

License

Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically

Description

DrCyZ: Techniques for analyzing and extracting useful information from CyZ.

Samples from NASA Perseverance and set of GAN generated synthetic images from Neural Mars.

Repository: https://github.com/decurtoidiaz/drcyz

Subset of samples from (includes tools to visualize and analyse the dataset):

CyZ: MARS Space Exploration Dataset. [https://doi.org/10.5281/zenodo.5655473]

Images from NASA missions of the celestial body.

Repository: https://github.com/decurtoidiaz/cyz

Authors:

J. de Curtò c@decurto.be

I. de Zarzà z@dezarza.be

File Information from DrCyZ-1.1

• Subset of samples from Perseverance (drcyz/c).
  ∙ png (drcyz/c/png).
    PNG files (5025) selected from NASA Perseverance (CyZ-1.1) after t-SNE and K-means Clustering. 
  ∙ csv (drcyz/c/csv).
    CSV file.


• Resized samples from Perseverance (drcyz/c+).
  ∙ png 64x64; 128x128; 256x256; 512x512; 1024x1024 (drcyz/c+/drcyz_64-1024).
    PNG files resized at the corresponding size. 
  ∙ TFRecords 64x64; 128x128; 256x256; 512x512; 1024x1024 (drcyz/c+/tfr_drcyz_64-1024).
    TFRecord files resized to the corresponding size for import into TensorFlow.


• Synthetic images from Neural Mars generated using Stylegan2-ada (drcyz/drcyz+).
  ∙ png 100; 1000; 10000 (drcyz/drcyz+/drcyz_256_100-10000)
    PNG files subset of 100, 1000 and 10000 at size 256x256.


• Network Checkpoint from Stylegan2-ada trained at size 256x256 (drcyz/model_drcyz).
  ∙ network-snapshot-000798-drcyz.pkl


• Notebooks in Python to analyse the original dataset and reproduce the experiments: K-means Clustering, t-SNE, PCA, synthetic generation using Stylegan2-ada and instance segmentation using Deeplab (https://github.com/decurtoidiaz/drcyz/tree/main/dr_cyz+).
  ∙ clustering_curiosity_de_curto_and_de_zarza.ipynb
    K-means Clustering and PCA(2) with images from Curiosity.
  ∙ clustering_perseverance_de_curto_and_de_zarza.ipynb
    K-means Clustering and PCA(2) with images from Perseverance.
  ∙ tsne_curiosity_de_curto_and_de_zarza.ipynb
    t-SNE and PCA (components selected to explain 99% of variance) with images from Curiosity.
  ∙ tsne_perseverance_de_curto_and_de_zarza.ipynb
    t-SNE and PCA (components selected to explain 99% of variance) with images from Perseverance.
  ∙ Stylegan2-ada_de_curto_and_de_zarza.ipynb
    Stylegan2-ada trained on a subset of images from NASA Perseverance (DrCyZ).
  ∙ statistics_perseverance_de_curto_and_de_zarza.ipynb
    Compute statistics from synthetic samples generated by Stylegan2-ada (DrCyZ) and images from NASA Perseverance (CyZ).
  ∙ DeepLab_TFLite_ADE20k_de_curto_and_de_zarza.ipynb
    Example of instance segmentation using Deeplab with a sample from NASA Perseverance (DrCyZ).
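
As a rough illustration of the K-means step mentioned in the notebook list above, the following is a minimal standard-library sketch of Lloyd's algorithm on small feature vectors. It is not taken from the DrCyZ notebooks; the function names, the seed, and the toy points are assumptions, and a real pipeline would typically cluster image features after PCA or t-SNE with a dedicated library.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=100, seed=0):
    """Cluster points into k groups with Lloyd's algorithm."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, new_labels) if l == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
    return labels, centers


# Toy 2-D features standing in for reduced image embeddings.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centers = kmeans(points, k=2)
```

The two well-separated groups end up with different labels; which group receives label 0 depends on the random initialization.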

Supplementary Table 2: All Genes TSNE-clusters
figshare.com
xlsx
Updated Feb 18, 2019
Cite
Amarjit Saini; Linda Björkhem-Bergman; Johan Boström; Mats Lilja; Michael Melin; lena ekström; Peter Bergman; Mikael Altun; Eric Rullman; Thomas Gustafsson (2019). Supplementary Table 2: All Genes TSNE-clusters [Dataset]. http://doi.org/10.6084/m9.figshare.7732187.v1
Unique identifier
https://doi.org/10.6084/m9.figshare.7732187.v1
Dataset updated
Feb 18, 2019
Dataset provided by
figshare
Authors
Amarjit Saini; Linda Björkhem-Bergman; Johan Boström; Mats Lilja; Michael Melin; lena ekström; Peter Bergman; Mikael Altun; Eric Rullman; Thomas Gustafsson
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Supplementary Table 2: All Genes TSNE-clusters RNA-seq data
Replication Data for the "Keratoconus severity identification using...
search.dataone.org
Updated Nov 22, 2023
Cite
Yousefi, Siamak (2023). Replication Data for the "Keratoconus severity identification using unsupervised machine learning", Siamak Yousefi, Ebrahim Yousefi, Hidenori Takahashi, Takahiko Hayashi, Hironobu Tampo, Satoru Inoda, Yusuke Arai, and Penny Asbell, PLOS One 2018 [Dataset]. http://doi.org/10.7910/DVN/G2CRMO
Unique identifier
https://doi.org/10.7910/DVN/G2CRMO
Dataset updated
Nov 22, 2023
Dataset provided by
Harvard Dataverse
Authors
Yousefi, Siamak
Description
Dataset and labels for the article "Keratoconus severity identification using unsupervised machine learning" by Siamak Yousefi.
Additional file 5 of GECO: gene expression clustering optimization app for...
springernature.figshare.com
txt
Updated May 31, 2023
Cite
A. N. Habowski; T. J. Habowski; M. L. Waterman (2023). Additional file 5 of GECO: gene expression clustering optimization app for non-linear data visualization of patterns [Dataset]. http://doi.org/10.6084/m9.figshare.13642382.v1
Unique identifier
https://doi.org/10.6084/m9.figshare.13642382.v1
Dataset updated
May 31, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
A. N. Habowski; T. J. Habowski; M. L. Waterman
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Additional file 5: CSV file of bulk RNA-seq data of F. nucleatum infection time course used for GECO UMAP generation.
Scripts for Analysis
figshare.com
txt
Updated Jul 18, 2018
Cite
Sneddon Lab UCSF (2018). Scripts for Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.6783569.v2
Unique identifier
https://doi.org/10.6084/m9.figshare.6783569.v2
Dataset updated
Jul 18, 2018
Dataset provided by
figshare
Authors
Sneddon Lab UCSF
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Scripts used for analysis of V1 and V2 Datasets.
seurat_v1.R - initialize seurat object from 10X Genomics cellranger outputs. Includes filtering, normalization, regression, variable gene identification, PCA analysis, clustering, tSNE visualization. Used for v1 datasets.
merge_seurat.R - merge two or more seurat objects into one seurat object. Perform linear regression to remove batch effects from separate objects. Used for v1 datasets.
subcluster_seurat_v1.R - subcluster clusters of interest from Seurat object. Determine variable genes, perform regression and PCA. Used for v1 datasets.
seurat_v2.R - initialize seurat object from 10X Genomics cellranger outputs. Includes filtering, normalization, regression, variable gene identification, and PCA analysis. Used for v2 datasets.
clustering_markers_v2.R - clustering and tSNE visualization for v2 datasets.
subcluster_seurat_v2.R - subcluster clusters of interest from Seurat object. Determine variable genes, perform regression and PCA analysis. Used for v2 datasets.
seurat_object_analysis_v1_and_v2.R - downstream analysis and plotting functions for seurat object created by seurat_v1.R or seurat_v2.R.
merge_clusters.R - merge clusters that do not meet gene threshold. Used for both v1 and v2 datasets.
prepare_for_monocle_v1.R - subcluster cells of interest and perform linear regression, but not scaling, in order to input normalized, regressed values into monocle with monocle_seurat_input_v1.R.
monocle_seurat_input_v1.R - monocle script using seurat batch corrected values as input for v1 merged timecourse datasets.
monocle_lineage_trace.R - monocle script using nUMI as input for v2 lineage traced dataset.
monocle_object_analysis.R - downstream analysis for monocle object - BEAM and plotting.
CCA_merging_v2.R - script for merging v2 endocrine datasets with canonical correlation analysis and determining the number of CCs to include in downstream analysis.
CCA_alignment_v2.R - script for downstream alignment, clustering, tSNE visualization, and differential gene expression analysis.
Data from: Reference transcriptomics of porcine peripheral immune cells...
catalog.data.gov
agdatacommons.nal.usda.gov
Updated Jun 5, 2025
Cite
Agricultural Research Service (2025). Data from: Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing [Dataset]. https://catalog.data.gov/dataset/data-from-reference-transcriptomics-of-porcine-peripheral-immune-cells-created-through-bul-e667c
Dataset updated
Jun 5, 2025
Dataset provided by
Agricultural Research Service
Description
This dataset contains files reconstructing single-cell data presented in 'Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing' by Herrera-Uribe & Wiarda et al. 2021. Samples of peripheral blood mononuclear cells (PBMCs) were collected from seven pigs and processed for single-cell RNA sequencing (scRNA-seq) in order to provide a reference annotation of porcine immune cell transcriptomics at enhanced, single-cell resolution. Analysis of single-cell data allowed identification of 36 cell clusters that were further classified into 13 cell types, including monocytes, dendritic cells, B cells, antibody-secreting cells, numerous populations of T cells, NK cells, and erythrocytes. Files may be used to reconstruct the data as presented in the manuscript, allowing for individual query by other users. Scripts for original data analysis are available at https://github.com/USDA-FSEPRU/PorcinePBMCs_bulkRNAseq_scRNAseq. Raw data are available at https://www.ebi.ac.uk/ena/browser/view/PRJEB43826. Funding for this dataset was also provided by NRSP8: National Animal Genome Research Program (https://www.nimss.org/projects/view/mrp/outline/18464).

Resources in this dataset:

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells 10X Format.
File Name: PBMC7_AllCells.zip
Resource Description: Zipped folder containing PBMC counts matrix, gene names, and cell IDs. Files are as follows:
matrix of gene counts* (matrix.mtx.gx)
gene names (features.tsv.gz)
cell IDs (barcodes.tsv.gz)
*The ‘raw’ count matrix is actually gene counts obtained following ambient RNA removal. During ambient RNA removal, we specified to calculate non-integer count estimations, so most gene counts are actually non-integer values in this matrix but should still be treated as raw/unnormalized data that requires further normalization/transformation. Data can be read into R using the function Read10X().

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells Metadata.
File Name: PBMC7_AllCells_meta.csv
Resource Description: .csv file containing metadata for cells included in the final dataset. Metadata columns include:
nCount_RNA = the number of transcripts detected in a cell
nFeature_RNA = the number of genes detected in a cell
Loupe = cell barcodes; correspond to the cell IDs found in the .h5Seurat and 10X formatted objects for all cells
prcntMito = percent mitochondrial reads in a cell
Scrublet = doublet probability score assigned to a cell
seurat_clusters = cluster ID assigned to a cell
PaperIDs = sample ID for a cell
celltypes = cell type ID assigned to a cell

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells PCA Coordinates.
File Name: PBMC7_AllCells_PCAcoord.csv
Resource Description: .csv file containing first 100 PCA coordinates for cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells t-SNE Coordinates.
File Name: PBMC7_AllCells_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for all cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells UMAP Coordinates.
File Name: PBMC7_AllCells_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for all cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells t-SNE Coordinates.
File Name: PBMC7_CD4only_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells UMAP Coordinates.
File Name: PBMC7_CD4only_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells UMAP Coordinates.
File Name: PBMC7_GDonly_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells t-SNE Coordinates.
File Name: PBMC7_GDonly_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gene Annotation Information.
File Name: UnfilteredGeneInfo.txt
Resource Description: .txt file containing gene nomenclature information used to assign gene names in the dataset. 'Name' column corresponds to the name assigned to a feature in the dataset.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells H5Seurat.
File Name: PBMC7.tar
Resource Description: .h5Seurat object of all cells in PBMC dataset. File needs to be untarred, then read into R using function LoadH5Seurat().
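For readers who only need a quick summary of the metadata file described above, a hedged standard-library sketch follows. It assumes the .csv has a header row containing the `seurat_clusters` and `celltypes` columns named in the description; the function name and the sample rows are invented for illustration and do not come from the actual file.

```python
import csv
import io
from collections import Counter


def count_cells(handle, column="celltypes"):
    """Count cells per value of one metadata column in an open CSV file."""
    reader = csv.DictReader(handle)
    return Counter(row[column] for row in reader)


# Hypothetical excerpt standing in for PBMC7_AllCells_meta.csv.
sample = io.StringIO(
    "Loupe,seurat_clusters,celltypes\n"
    "AAAC-1,0,CD4 T cells\n"
    "AAAG-1,3,CD4 T cells\n"
    "AACT-1,6,Gamma delta T cells\n"
)
per_type = count_cells(sample)

sample.seek(0)  # rewind to count by cluster ID instead
per_cluster = count_cells(sample, column="seurat_clusters")
```

With the real file, the same function could be called as `count_cells(open("PBMC7_AllCells_meta.csv", newline=""))`, provided the column names match.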
Additional file 4 of GECO: gene expression clustering optimization app for...
springernature.figshare.com
txt
Updated May 31, 2023
Cite
A. N. Habowski; T. J. Habowski; M. L. Waterman (2023). Additional file 4 of GECO: gene expression clustering optimization app for non-linear data visualization of patterns [Dataset]. http://doi.org/10.6084/m9.figshare.13642379.v1
Unique identifier
https://doi.org/10.6084/m9.figshare.13642379.v1
Dataset updated
May 31, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
A. N. Habowski; T. J. Habowski; M. L. Waterman
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Additional file 4: CSV file of colon crypt bulk RNA-seq data used for GECO UMAP generation.
Table_2_A Novel Computational Framework for Precision Diagnosis and Subtype...
frontiersin.figshare.com
xlsx
Updated Jun 15, 2023
Cite
Fei Xia; Xiaojun Xie; Zongqin Wang; Shichao Jin; Ke Yan; Zhiwei Ji (2023). Table_2_A Novel Computational Framework for Precision Diagnosis and Subtype Discovery of Plant With Lesion.XLSX [Dataset]. http://doi.org/10.3389/fpls.2021.789630.s003
Unique identifier
https://doi.org/10.3389/fpls.2021.789630.s003
Dataset updated
Jun 15, 2023
Dataset provided by
Frontiers
Authors
Fei Xia; Xiaojun Xie; Zongqin Wang; Shichao Jin; Ke Yan; Zhiwei Ji
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Plants are often attacked by various pathogens during their growth, which may cause environmental pollution, food shortages, or economic losses in a certain area. Integration of high throughput phenomics data and computer vision (CV) provides a great opportunity to realize plant disease diagnosis in the early stage and uncover the subtype or stage patterns in the disease progression. In this study, we proposed a novel computational framework for plant disease identification and subtype discovery through a deep-embedding image-clustering strategy, Weighted Distance Metric and the t-distributed stochastic neighbor embedding algorithm (WDM-tSNE). To verify its effectiveness, we applied our method to four public datasets of images. The results demonstrated that the newly developed tool is capable of identifying the plant disease and further uncovering the underlying subtypes associated with pathogenic resistance. In summary, the current framework provides great clustering performance for the root or leaf images of diseased plants with pronounced disease spots or symptoms.
GERDA datasets including NGS and SGA data
data.mendeley.com
Updated Apr 26, 2023
Cite
Fabian Otte (2023). GERDA datasets including NGS and SGA data [Dataset]. http://doi.org/10.17632/8c4zbxfvwk.3
Unique identifier
https://doi.org/10.17632/8c4zbxfvwk.3
Dataset updated
Apr 26, 2023
Authors
Fabian Otte
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Datasets linked to the publication "Revealing viral and cellular dynamics of HIV-1 at the single-cell level during early treatment periods", Otte et al. 2023, published in Cell Reports Methods. Pre-ART (antiretroviral therapy) cryo-conserved and whole blood specimens were sampled for HIV-1 virus reservoir determination in HIV-1 positive individuals from the Swiss HIV Study Cohort. Patients were monitored for proviral (DNA), poly-A transcripts (RNA), late protein translation (Gag and Envelope reactivation co-detection assay, GERDA) and intact viruses (gold standard: viral outgrowth assay, VOA). In this dataset we deposited the pipeline for the multidimensional data analysis of our newly established GERDA method, using DBSCAN and tSNE. For further comprehension, NGS and Sanger sequencing data were attached as processed and raw data (GenBank).

Resubmitted to Cell Reports Methods (Jan-2023), accepted in principle (Mar-2023)

GERDA is a new detection method to decipher the HIV-1 cellular reservoir in blood (tissue or any other specimen). It integrates HIV-1 Gag and Env co-detection along with cellular surface markers to reveal 1) which cells still contain HIV-1 translation-competent virus and 2) which markers the respective infected cells express. The phenotypic marker repertoire of the cells allows predictions on potential homing and an assessment of the HIV-1 (tissue) reservoir. All FACS data were acquired on an LSRFortessa BD FACS machine (markers: CCR7, CD45RA, CD28, CD4, CD25, PD1, IntegrinB7, CLA, HIV-1 Env, HIV-1 Gag). Raw FACS data (pre-gated CD4CD3+ T-cells) were arcsin transformed and dimensionally reduced using optsne. Data were further clustered using DBSCAN, and either individual clusters were further analyzed for individual marker expression or expression profiles of all relevant clusters were analyzed by heatmaps. Sequences before/after therapy initiation and during viral outgrowth cultures were monitored for individuals P01-46 and P04-56 by next-generation sequencing (NGS of HIV-1 Envelope V3 loop only) and by Sanger (single genome amplification, SGA).

data normalization code (by Julian Spagnuolo)
FACS normalized data as CSV (XXX_arcsin.csv)
OMIQ conText file (_OMIQ-context_XXX)
arcsin normalized FACS data after optsne dimension reduction with OMIQ.ai as CSV file (XXXarcsin.csv.csv)
R pipeline with codes (XXX_commented.R)
P01_46-NGS and Sanger sequences
P04_56-NGS and Sanger sequences
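
The transform step mentioned above can be sketched as follows, under a stated assumption: in flow cytometry, "arcsin" normalization usually refers to the inverse hyperbolic sine (arcsinh) applied to each intensity divided by a cofactor. Whether the deposited normalization code does exactly this is not confirmed by the description, and the cofactor of 150 below is a hypothetical placeholder, not a value taken from the dataset.

```python
import math


def arcsinh_transform(values, cofactor=150.0):
    """Apply an arcsinh transform to raw fluorescence intensities.

    The cofactor (hypothetical default) sets where the scale changes
    from roughly linear near zero to roughly logarithmic for large values.
    """
    return [math.asinh(v / cofactor) for v in values]


# Toy intensities, including a negative value as produced by compensation.
raw = [-50.0, 0.0, 150.0, 15000.0]
transformed = arcsinh_transform(raw)
```

Unlike a plain logarithm, this transform is defined for zero and negative intensities, which is one reason it is commonly used before dimensionality reduction and clustering of cytometry data.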
clustering and annotation metadata
figshare.com
zip
Updated Apr 7, 2020
Cite
Geoff Stanley (2020). clustering and annotation metadata [Dataset]. http://doi.org/10.6084/m9.figshare.12093471.v1
Unique identifier
https://doi.org/10.6084/m9.figshare.12093471.v1
Dataset updated
Apr 7, 2020
Dataset provided by
figshare
Authors
Geoff Stanley
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
There are 3 files in seurat_results.zip: one containing the principal component values used for dimensionality reduction and clustering of all MSN, one containing the computed tSNE values, and one containing the Louvain clusters. The metadata_final.csv file contains the annotated major cell types and subtypes.
Dataset name, reference, dimensions and cell type composition.
plos.figshare.com
xls
Updated Dec 13, 2024
Cite
Yuta Hozumi; Guo-Wei Wei (2024). Dataset name, reference, dimensions and cell type composition. [Dataset]. http://doi.org/10.1371/journal.pone.0311791.t001
Unique identifier
https://doi.org/10.1371/journal.pone.0311791.t001
Dataset updated
Dec 13, 2024
Dataset provided by
PLOS ONE
Authors
Yuta Hozumi; Guo-Wei Wei
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset name, reference, dimensions and cell type composition.
Cell and gene data for testicular single-cell RNA-Seq
figshare.com
xlsx
Updated Sep 18, 2019
Cite
Soeren Lukassen; Elisabeth Bosch; Arif B. Ekici; Andreas Winterpacht (2019). Cell and gene data for testicular single-cell RNA-Seq [Dataset]. http://doi.org/10.6084/m9.figshare.6139469.v2
Unique identifier
https://doi.org/10.6084/m9.figshare.6139469.v2
Dataset updated
Sep 18, 2019
Dataset provided by
Figshare (http://figshare.com/)
Authors
Soeren Lukassen; Elisabeth Bosch; Arif B. Ekici; Andreas Winterpacht
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Tables containing additional information on genes and cells obtained from single-cell RNA-Seq analysis of mouse testis data. SuppTableCells.xlsx contains the 10X Barcode as identifier, the replicate ID, position in the t-SNE plot, UMI and gene count per cell, the proportion of mitochondrial transcripts, cluster ID obtained by k-Means clustering (k=9), inferred cell type, and pseudotime information obtained using monocle and Scrat. SuppTableGenes contains the average expression value, fold-change compared to other cell types, and p-value relative to other cell types for each gene that was expressed in the dataset. Throughout the data, the following abbreviations for cell types are used: Spg=spermatogonia, SC=spermatocytes, RS=round spermatids, ES=elongating spermatids, CS=condensed/condensing spermatids. In cases where several clusters were identified per cell type, the earlier cluster was designated as 1.
Top 10 GO biological processes for BRCA1 coexpressed genes from an EnrichR...
figshare.com
xls
Updated Oct 4, 2024
Cite
Miguel-Angel Cortes-Guzman; Víctor Treviño (2024). Top 10 GO biological processes for BRCA1 coexpressed genes from an EnrichR analysis from top 100 genes coexpressed at the tissue-level or at the system-level. [Dataset]. http://doi.org/10.1371/journal.pone.0309961.t002
Unique identifier
https://doi.org/10.1371/journal.pone.0309961.t002
Dataset updated
Oct 4, 2024
Dataset provided by
PLOS ONE
Authors
Miguel-Angel Cortes-Guzman; Víctor Treviño
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Top 10 GO biological processes for BRCA1 coexpressed genes from an EnrichR analysis from top 100 genes coexpressed at the tissue-level or at the system-level.
Cell Atlas of the Xenopus Laevis at Single-Cell Resolution
figshare.com
zip
Updated Jul 23, 2022
Cite
Guoji Guo; Xiaoping Han (2022). Cell Atlas of the Xenopus Laevis at Single-Cell Resolution [Dataset]. http://doi.org/10.6084/m9.figshare.19152839.v2
Unique identifier
https://doi.org/10.6084/m9.figshare.19152839.v2
Dataset updated
Jul 23, 2022
Dataset provided by
figshare
Authors
Guoji Guo; Xiaoping Han
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
dge_raw_data.zip: contains separate raw count matrices for 17 adult Xenopus tissues (bladder, bone marrow, brain, eye, heart, intestine, kidney, liver, lung, muscle, ovary, oviduct, pancreas, skin, spleen, stomach and testis) and 4 larva stages (St48, St54, St59 and St66).
dge_cell_info.zip: contains the cell annotations corresponding to dge_raw_data.zip, including tSNE coordinates, clusters, stages and cell-type annotations.
Xenopus_Figure1.h5ad: contains both raw count and normalized datasets (contains only highly variable genes) for 501,358 cells in XCL, which can be loaded directly into a Python environment with SCANPY.
Xenopus_Figure1_cell_info.csv: contains the cell annotations for 501,358 cells in XCL, including tSNE coordinates, clusters, tissue origins, stages and cell-type annotations.
Larva_Figure4.h5ad: contains both raw count and normalized datasets (contains only highly variable genes) for 188,020 larva cells in XCL, which can be loaded directly into a Python environment with SCANPY.
Larva_Figure4_cell_info.csv: contains the cell annotations for 188,020 larva cells in XCL, including tSNE coordinates, clusters, stages and cell-type annotations.
Details of overall clustering results.
plos.figshare.com
xls
Updated Dec 19, 2024
Cite
Shahneela Pitafi; Toni Anwar; I Dewa Made Widia; Zubair Sharif; Boonsit Yimwadsana (2024). Details of overall clustering results. [Dataset]. http://doi.org/10.1371/journal.pone.0313890.t005
Unique identifier
https://doi.org/10.1371/journal.pone.0313890.t005
Dataset updated
Dec 19, 2024
Dataset provided by
PLOS ONE
Authors
Shahneela Pitafi; Toni Anwar; I Dewa Made Widia; Zubair Sharif; Boonsit Yimwadsana
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Perimeter Intrusion Detection Systems (PIDS) are crucial for protecting physical locations by detecting and responding to intrusions around their perimeters. Despite the availability of several PIDS, challenges remain in detection accuracy and precise activity classification. To address these challenges, a new machine learning model is developed. This model utilizes the pre-trained InceptionV3 for feature extraction on a PID intrusion image dataset, followed by t-SNE for dimensionality reduction and subsequent clustering. When handling high-dimensional data, the existing Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm faces efficiency issues due to its complexity and varying densities. To overcome these limitations, this research enhances the traditional DBSCAN algorithm. In the enhanced DBSCAN, distances between minimal points are determined using an estimation for the epsilon values with the Manhattan distance formula. The effectiveness of the proposed model is evaluated by comparing it to state-of-the-art techniques found in the literature. The analysis reveals that the proposed model achieved a silhouette score of 0.86, while comparative techniques failed to produce similar results. This research contributes to societal security by improving location perimeter protection, and future researchers can utilize the developed model for human activity recognition from image datasets.
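To make the clustering step more concrete, here is a minimal sketch of standard DBSCAN using the Manhattan distance on low-dimensional points such as a t-SNE embedding. It is the textbook algorithm, not the enhanced variant or the epsilon estimation described above; the function names, `eps`, `min_pts`, and the toy points are all assumptions chosen for illustration.

```python
from collections import deque

NOISE = -1


def manhattan(a, b):
    """Manhattan (L1) distance between two equal-length tuples."""
    return sum(abs(x - y) for x, y in zip(a, b))


def dbscan(points, eps, min_pts):
    """Label each point with a cluster index, or NOISE for outliers."""
    n = len(points)
    labels = [None] * n

    def neighbors(i):
        # All points within eps of point i, including i itself.
        return [j for j in range(n) if manhattan(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(seeds)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nearby = neighbors(j)
            if len(nearby) >= min_pts:  # j is a core point: keep expanding
                queue.extend(nearby)
    return labels


# Toy 2-D embedding: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(points, eps=1.5, min_pts=2)
```

The isolated point is labeled as noise, while each dense group receives its own cluster index. A quality measure such as the silhouette score could then be computed on the non-noise labels.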
t-SNE embedding illustrates the distribution of cell populations.
plos.figshare.com
zip
Updated Jun 26, 2025
Cite
Changhai Long; Biao Ma; Xingshun Zhong; Mingzhi Zou; Kai Li; Sijing Liu (2025). t-SNE embedding illustrates the distribution of cell populations. [Dataset]. http://doi.org/10.1371/journal.pone.0326872.s007
Unique identifier
https://doi.org/10.1371/journal.pone.0326872.s007
Dataset updated
Jun 26, 2025
Dataset provided by
PLOS ONE
Authors
Changhai Long; Biao Ma; Xingshun Zhong; Mingzhi Zou; Kai Li; Sijing Liu
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
t-SNE embedding illustrates the distribution of cell populations.
Parameters for the t-distributed Stochastic Neighbor Embedding (t-SNE).
plos.figshare.com
xls
Updated Jun 14, 2023
Doina Bucur (2023). Parameters for the t-distributed Stochastic Neighbor Embedding (t-SNE). [Dataset]. http://doi.org/10.1371/journal.pone.0272270.t002
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0272270.t002
Dataset updated
Jun 14, 2023
Dataset provided by
PLOS ONE
Authors
Doina Bucur
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Parameters for the t-distributed Stochastic Neighbor Embedding (t-SNE).

Sara Mahallati; James C. Bezdek; Milos R. Popovic; Taufik A. Valiante (2023). Cluster tendency assessment in neuronal spike data [Dataset]. http://doi.org/10.1371/journal.pone.0224547

Cluster tendency assessment in neuronal spike data

10 scholarly articles cite this dataset

Available download formats: pdf

Unique identifier

https://doi.org/10.1371/journal.pone.0224547

Dataset updated

Jun 5, 2023

Dataset provided by

PLOS ONE

Authors

Sara Mahallati; James C. Bezdek; Milos R. Popovic; Taufik A. Valiante

License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Sorting spikes from extracellular recording into clusters associated with distinct single units (putative neurons) is a fundamental step in analyzing neuronal populations. Such spike sorting is intrinsically unsupervised, as the number of neurons is not known a priori. Therefore, any spike sorting is an unsupervised learning problem that requires one of two approaches: specification of a fixed value k for the number of clusters to seek, or generation of candidate partitions for several possible values of c, followed by selection of a best candidate based on various post-clustering validation criteria. In this paper, we investigate the first approach and evaluate the utility of several methods for providing lower dimensional visualization of the cluster structure and their effect on subsequent spike clustering. We also introduce a visualization technique called improved visual assessment of cluster tendency (iVAT) to estimate possible cluster structures in data without the need for dimensionality reduction. Experiments are conducted on two datasets with ground truth labels. In data with a relatively small number of clusters, iVAT is beneficial in estimating the number of clusters to inform the initialization of clustering algorithms. With larger numbers of clusters, iVAT gives a useful estimate of the coarse cluster structure but sometimes fails to indicate the presumptive number of clusters. We show that noise associated with recording extracellular neuronal potentials can disrupt computational clustering schemes, highlighting the benefit of probabilistic clustering models. Our results show that t-Distributed Stochastic Neighbor Embedding (t-SNE) provides representations of the data that yield more accurate visualization of potential cluster structure to inform the clustering stage.
Moreover, the clusters obtained using t-SNE features were more reliable than the clusters obtained using the other methods, which indicates that t-SNE can potentially be used both for visualization and to extract features for use by any clustering algorithm.
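To make the iVAT idea in the description concrete, here is a minimal sketch, not the authors' code. It assumes the usual formulation: VAT reorders a dissimilarity matrix with a Prim-style nearest-neighbour sweep, and iVAT first replaces each dissimilarity with the minimax path distance (the smallest achievable largest hop between two objects). The function names and the 1-D toy data are hypothetical, and the O(n^3) transform is written for clarity rather than speed.

```python
def vat_order(D):
    """Return a VAT ordering of the objects in a symmetric dissimilarity matrix D."""
    n = len(D)
    # Start from an object that is one end of the largest dissimilarity.
    start = max(range(n), key=lambda i: max(D[i]))
    order = [start]
    remaining = set(range(n)) - {start}
    while remaining:
        # Prim-style step: take the unvisited object closest to any visited one.
        nxt = min(remaining, key=lambda j: min(D[i][j] for i in order))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def minimax_transform(D):
    """Replace D[i][j] by the minimax path distance between objects i and j."""
    n = len(D)
    M = [row[:] for row in D]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                # Going through k costs the larger of the two hops.
                M[i][j] = min(M[i][j], max(M[i][k], M[k][j]))
    return M


def reorder(D, order):
    """Apply the same permutation to the rows and columns of D."""
    return [[D[i][j] for j in order] for i in order]


# Toy data: five 1-D values forming two groups, {0, 1, 2} and {10, 11}.
values = [0, 10, 1, 11, 2]
D = [[abs(a - b) for b in values] for a in values]

order = vat_order(D)
ivat_image = reorder(minimax_transform(D), order)
```

When the reordered matrix is displayed as a grayscale image, groups of mutually close objects appear as dark blocks along the diagonal, and the number of such blocks is what gets read off as an estimate of the cluster count. In this toy case the reordered matrix has a 3x3 block and a 2x2 block of small values, separated by entries equal to the gap between the groups.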

Cluster tendency assessment in neuronal spike data

DrCyZ: Techniques for analyzing and extracting useful information from CyZ.

File Information from DrCyZ-1.1

Supplementary Table 2: All Genes TSNE-clusters

Replication Data for the "Keratoconus severity identification using...

Additional file 5 of GECO: gene expression clustering optimization app for...

Scripts for Analysis

Data from: Reference transcriptomics of porcine peripheral immune cells...

Additional file 4 of GECO: gene expression clustering optimization app for...

Table_2_A Novel Computational Framework for Precision Diagnosis and Subtype...

GERDA datasets including NGS and SGA data

clustering and annotation metadata

Dataset name, reference, dimensions and cell type composition.

Cell and gene data for testicular single-cell RNA-Seq

Top 10 GO biological processes for BRCA1 coexpressed genes from an EnrichR...

Cell Atlas of the Xenopus Laevis at Single-Cell Resolution

Details of overall clustering results.

t-SNE embedding illustrates the distribution of cell populations.

Parameters for the t-distributed Stochastic Neighbor Embedding (t-SNE).
