Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Spectral flow cytometry provides greater insight into cellular heterogeneity by simultaneously measuring up to 50 markers. However, analyzing such high-dimensional (HD) data with traditional manual gating strategies is complex. To address this gap, we developed CAFE, an open-source Python-based web application with a graphical user interface. Built with Streamlit, CAFE incorporates libraries such as Scanpy for single-cell analysis, Pandas and PyArrow for efficient data handling, and Matplotlib, Seaborn, and Plotly for creating customizable figures. Its toolset includes density-based down-sampling, dimensionality reduction, batch correction, Leiden-based clustering, and cluster merging and annotation. Using CAFE, we analyzed a human PBMC dataset of 350,000 cells and identified 16 distinct cell clusters. CAFE can generate publication-ready figures in real time via interactive slider controls and dropdown menus, eliminating the need for coding expertise and making HD data analysis accessible to all. CAFE is licensed under MIT and is freely available at https://github.com/mhbsiam/cafe.
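As a rough illustration of what density-based down-sampling can look like, here is a minimal NumPy sketch. It is not CAFE's implementation: the function name `density_downsample`, the fixed-radius neighbour count used as the density estimate, and the `target_density` parameter are all assumptions made for this example. The brute-force distance matrix also only suits small inputs, not a 350,000-cell dataset.

```python
import numpy as np

def density_downsample(points, radius, target_density, seed=0):
    """Return indices of points kept after density-dependent thinning.

    Hypothetical helper: local density is the number of points (self included)
    within `radius`. Points in regions denser than `target_density` are kept
    with probability target_density / density; sparser points are always kept.
    """
    rng = np.random.default_rng(seed)
    # Pairwise Euclidean distances (O(n^2) memory, fine for a small demo only).
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    density = (dists <= radius).sum(axis=1)  # always >= 1 because of self
    keep_prob = np.minimum(1.0, target_density / density)
    keep = rng.random(len(points)) < keep_prob
    return np.flatnonzero(keep)
```

Calling `density_downsample(cells, radius=1.0, target_density=10)` on an array of shape `(n_cells, n_markers)` thins crowded populations while leaving rare ones intact, which is the usual motivation for this kind of down-sampling before clustering.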
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides a structured workflow for Lattice Light-Sheet Microscopy image processing. It covers raw data acquisition (.czi), summarised data (available after extracting the compressed .zarr file), metadata extraction, and image enhancement techniques such as deskewing and deconvolution, which are implemented in a script (main.py). The dataset is intended for researchers working with high-resolution microscopy data.
Raw Data: Original microscopy images in CZI format
Metadata: Embedded metadata written by the Zeiss software is available directly after processing the .czi file, while external metadata is synthetically generated (https://github.com/onionsp/Synthetic-WGS-Dataset-Generator/).
Processing Scripts: Python scripts (as found in main.py) for deskewing, deconvolution, and data summarization.
Summarized Data: Processed image outputs in .zarr/.tiff format, reducing storage overhead while maintaining key insights.
Data Transfer Agreement: Documentation regarding data sharing policies and agreements.
Deskewing: Corrects image distortions caused during acquisition.
Deconvolution: Enhances image clarity and sharpness.
Downsampling: Reduces resolution for efficient processing and sharing.
Conversion: CZI to Zarr or TIFF format for optimized storage and computational use.
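The deskewing and downsampling steps listed above can be sketched in NumPy as follows. This is an illustrative sketch, not the code in main.py: the function names, the integer per-slice shift, and the block-mean reduction are assumptions. A real light-sheet pipeline would typically derive the shift from the sheet angle and step size and use interpolation.

```python
import numpy as np

def deskew_integer_shear(volume, shift_per_slice):
    """Shift slice z by z * shift_per_slice pixels along x (the last axis).

    Hypothetical helper; expects a (z, y, x) array and a non-negative
    integer shift. The output is zero-padded so no pixels are lost.
    """
    nz, ny, nx = volume.shape
    out_nx = nx + shift_per_slice * (nz - 1)
    out = np.zeros((nz, ny, out_nx), dtype=volume.dtype)
    for z in range(nz):
        offset = z * shift_per_slice
        out[z, :, offset:offset + nx] = volume[z]
    return out

def downsample_block_mean(volume, factor):
    """Reduce resolution by averaging non-overlapping factor^3 blocks."""
    nz, ny, nx = volume.shape
    # Trim each axis so it divides evenly by the factor.
    nz2, ny2, nx2 = (n - n % factor for n in (nz, ny, nx))
    trimmed = volume[:nz2, :ny2, :nx2]
    blocks = trimmed.reshape(
        nz2 // factor, factor, ny2 // factor, factor, nx2 // factor, factor
    )
    return blocks.mean(axis=(1, 3, 5))
```

Block averaging is chosen here because it needs no extra dependencies; it trades fine detail for a volume that is `factor**3` times smaller, which suits quick testing before working with full-resolution data.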
The dataset, including raw and processed files, is hosted on Zenodo.
Users are encouraged to download downsampled versions for testing before using full-resolution data.
Processing scripts enable reproducibility and customization for different research applications.
Data transfer policies are outlined in the included Data Transfer Agreement.
https://github.com/DBK333/Omero-DataPortal/tree/main/OmeroImageSamples
https://choosealicense.com/licenses/odc-by/
Tulu 3 Unfiltered
This is an 'unfiltered' version of the Tulu 3 SFT mixture, created by collating the original Tulu 3 sources and avoiding downsampling.
Details
The dataset consists of a mix of:
CoCoNot (ODC-BY-1.0) (Brahman et al., 2024)
FLAN v2 (Apache 2.0) (Longpre et al., 2023)
No Robots (CC-BY-NC-4.0) (Rajani et al., 2023)
OpenAssistant Guanaco (Apache 2.0) (Kopf et al., 2024)
Tulu 3 Persona MATH (ODC-BY-1.0)
Tulu 3 Persona GSM (ODC-BY-1.0)
Tulu 3 Persona Python…
See the full description on the dataset page: https://huggingface.co/datasets/hamishivi/tulu-3-unfiltered.
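Collating sources without downsampling amounts to concatenating every example from every source. The sketch below illustrates the idea with plain Python dictionaries; the function name `collate_sources`, the `source` field, and the toy records are hypothetical, and the actual dataset may have been assembled with different tooling.

```python
from typing import Dict, List

def collate_sources(sources: Dict[str, List[dict]]) -> List[dict]:
    """Concatenate all examples from all sources, tagging each with its origin.

    Hypothetical helper: no per-source sampling cap is applied, so every
    example in every source ends up in the mixture.
    """
    mixture = []
    for name, examples in sources.items():
        for example in examples:
            # Copy the record and record which source it came from.
            mixture.append({**example, "source": name})
    return mixture

# Toy records standing in for real source datasets.
toy_sources = {
    "source_a": [{"text": "first"}, {"text": "second"}],
    "source_b": [{"text": "third"}],
}
toy_mixture = collate_sources(toy_sources)
```

Keeping a `source` tag on each record makes it possible to apply a different sampling or filtering policy later without rebuilding the mixture.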