3 datasets found
  1. Downsampled data from FlowRepository: FR-FCM-Z3WR

    • figshare.com
    csv
    Updated Dec 2, 2024
    Daniel Tyrrell (2024). Downsampled data from FlowRepository: FR-FCM-Z3WR [Dataset]. http://doi.org/10.6084/m9.figshare.27940719.v1
    Available download formats: csv
    Dataset updated
    Dec 2, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Daniel Tyrrell
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Spectral flow cytometry provides greater insight into cellular heterogeneity by measuring up to 50 markers simultaneously. However, analyzing such high-dimensional (HD) data with a traditional manual gating strategy is complex. To address this gap, we developed CAFE, an open-source Python-based web application with a graphical user interface. Built with Streamlit, CAFE incorporates libraries such as Scanpy for single-cell analysis, Pandas and PyArrow for efficient data handling, and Matplotlib, Seaborn, and Plotly for creating customizable figures. Its toolset includes density-based down-sampling, dimensionality reduction, batch correction, Leiden-based clustering, and cluster merging and annotation. Using CAFE, we analyzed a human PBMC dataset of 350,000 cells and identified 16 distinct cell clusters. CAFE can generate publication-ready figures in real time via interactive slider controls and dropdown menus, eliminating the need for coding expertise and making HD data analysis accessible to all. CAFE is licensed under MIT and is freely available at https://github.com/mhbsiam/cafe.
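
The description mentions density-based down-sampling without saying how it is done. The following is a minimal sketch of one common approach, not CAFE's actual implementation: each cell's local density is approximated by the distance to its k-th nearest neighbour, and cells in sparse regions are sampled with higher probability so that rare populations survive the reduction. The function name `density_downsample` and the parameters `n_keep`, `k`, and `seed` are hypothetical, and the brute-force distance matrix only suits small inputs.

```python
import numpy as np


def density_downsample(X, n_keep, k=10, seed=0):
    """Return sorted row indices of X to keep, favouring sparse regions.

    Hypothetical helper for illustration; not part of CAFE or Scanpy.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 0 < n_keep <= n:
        raise ValueError("n_keep must be between 1 and the number of rows")
    k = min(k, n - 1)

    # Brute-force pairwise squared distances (O(n^2) memory, sketch only).
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.maximum(d2, 0.0, out=d2)  # clamp tiny negative rounding errors

    # Distance to the k-th nearest neighbour; position 0 is the point itself.
    kth = np.sqrt(np.partition(d2, k, axis=1)[:, k])

    # A large k-NN distance means low density, so it gets a larger weight.
    weights = kth + 1e-12
    probs = weights / weights.sum()

    rng = np.random.default_rng(seed)
    return np.sort(rng.choice(n, size=n_keep, replace=False, p=probs))


# Usage on synthetic data: one dense population and one rare, spread-out one.
rng = np.random.default_rng(42)
cells = np.vstack([
    rng.normal(0.0, 0.1, size=(450, 4)),   # abundant, tightly packed
    rng.normal(6.0, 1.5, size=(50, 4)),    # rare, diffuse
])
kept = density_downsample(cells, n_keep=100)
print("kept", kept.size, "cells;", int(np.sum(kept >= 450)), "from the rare group")
```

Weighting by the k-NN distance is a deliberate simplification; real tools may use kernel density estimates or approximate nearest-neighbour search to scale to hundreds of thousands of cells.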

  2. Lattice Light-Sheet Microscopy Datasets and Workflow for Omero

    • zenodo.org
    bin, csv +2
    Updated Feb 5, 2025
    Zenodo (2025). Lattice Light-Sheet Microscopy Datasets and Workflow for Omero [Dataset]. http://doi.org/10.5281/zenodo.14807429
    Available download formats: zip, text/x-python, bin, csv
    Dataset updated
    Feb 5, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Feb 5, 2025
    Description

    Dataset Overview

    This dataset provides a structured workflow for Lattice Light-Sheet Microscopy image processing. It covers raw data acquisition (.czi), summarised data (extract the compressed .zarr file to use it), metadata extraction, and image enhancement techniques such as deskewing and deconvolution, which are implemented in a script (main.py). The dataset is intended for researchers working with high-resolution microscopy data.

    Contents

    1. Raw Data: Original microscopy images in CZI format

      • It is recommended to store the raw data (e.g., CZI files) as a baseline for reproducibility. If raw data is too large (e.g., 500 GB), consider downsampling it for testing and archival purposes.
    2. Metadata: Embedded metadata written by the Zeiss software is available directly after processing the .czi file, while the external metadata is synthetically generated (https://github.com/onionsp/Synthetic-WGS-Dataset-Generator/).

    3. Processing Scripts: Python scripts (as found in main.py) for deskewing, deconvolution, and data summarization.

      • Use the provided processing scripts to perform deskewing, deconvolution, and other preprocessing steps. Note that processed data can become significantly larger (e.g., a 500 GB raw dataset may expand to 700 GB after processing).
    4. Summarized Data: Processed image outputs in .zarr/.tiff format, reducing storage overhead while maintaining key insights.

      • Save summarized data to reduce storage requirements. Summarized data could include key metrics, visualizations, or compressed outputs.
    5. Data Transfer Agreement: Documentation regarding data sharing policies and agreements.

    Workflow Overview

    • Deskewing: Corrects image distortions caused during acquisition.

    • Deconvolution: Enhances image clarity and sharpness.

    • Downsampling: Reduces resolution for efficient processing and sharing.

    • Conversion: CZI to Zarr or TIFF format for optimized storage and computational use.
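
The downsampling step in the list above can be illustrated with a short sketch. It is not taken from the dataset's main.py, whose contents are not shown here; it assumes the image has already been loaded as a NumPy array with axes ordered (z, y, x) and reduces resolution by averaging non-overlapping blocks. The function name `downsample_volume` and the `factor` parameter are hypothetical.

```python
import numpy as np


def downsample_volume(volume, factor=(2, 2, 2)):
    """Block-average a (z, y, x) volume by an integer factor per axis.

    Hypothetical helper for illustration; edges that do not fill a
    whole block are cropped before averaging.
    """
    volume = np.asarray(volume)
    if volume.ndim != 3:
        raise ValueError("expected a 3D volume with axes (z, y, x)")
    fz, fy, fx = factor
    if min(fz, fy, fx) < 1:
        raise ValueError("downsampling factors must be positive integers")

    # Crop so every axis length is a multiple of its factor.
    z = volume.shape[0] - volume.shape[0] % fz
    y = volume.shape[1] - volume.shape[1] % fy
    x = volume.shape[2] - volume.shape[2] % fx
    cropped = volume[:z, :y, :x]

    # Split each axis into (blocks, block_size) and average within blocks.
    blocks = cropped.reshape(z // fz, fz, y // fy, fy, x // fx, fx)
    return blocks.mean(axis=(1, 3, 5))


# Usage: a synthetic 8 x 64 x 64 stack reduced by 2 along every axis.
stack = np.random.default_rng(0).random((8, 64, 64))
small = downsample_volume(stack, factor=(2, 2, 2))
print(stack.shape, "->", small.shape)
```

Block averaging is only one option; a pipeline might instead use Gaussian smoothing followed by striding, or build a multi-resolution pyramid when writing Zarr output.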

    Data Access & Usage

    • The dataset, including raw and processed files, is hosted on Zenodo.

    • Users are encouraged to download downsampled versions for testing before using full-resolution data.

    • Processing scripts enable reproducibility and customization for different research applications.

    • Data transfer policies are outlined in the included Data Transfer Agreement.

    https://github.com/DBK333/Omero-DataPortal/tree/main/OmeroImageSamples

    https://github.com/BioimageAnalysisCoreWEHI/napari_lattice

  3. tulu-3-unfiltered

    • huggingface.co
    Updated Feb 26, 2025
    Hamish Ivison (2025). tulu-3-unfiltered [Dataset]. https://huggingface.co/datasets/hamishivi/tulu-3-unfiltered
    Available download formats: Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Feb 26, 2025
    Authors
    Hamish Ivison
    License

    https://choosealicense.com/licenses/odc-by/

    Description

    Tulu 3 Unfiltered

    This is an 'unfiltered' version of the Tulu 3 SFT mixture, created by collating the original Tulu 3 sources and avoiding downsampling.

    Details

    The dataset consists of a mix of:

    • CoCoNot (ODC-BY-1.0) (Brahman et al., 2024)
    • FLAN v2 (Apache 2.0) (Longpre et al., 2023)
    • No Robots (CC-BY-NC-4.0) (Rajani et al., 2023)
    • OpenAssistant Guanaco (Apache 2.0) (Kopf et al., 2024)
    • Tulu 3 Persona MATH (ODC-BY-1.0)
    • Tulu 3 Persona GSM (ODC-BY-1.0)
    • Tulu 3 Persona Python…

    See the full description on the dataset page: https://huggingface.co/datasets/hamishivi/tulu-3-unfiltered.

