89 datasets found
  1. C-MAPSS Aircraft Engine Simulator Data - Dataset - NASA Open Data Portal

    • data.nasa.gov
    Updated Sep 22, 2010
    Cite
    nasa.gov (2010). C-MAPSS Aircraft Engine Simulator Data - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/c-mapss-aircraft-engine-simulator-data
    Dataset updated
    Sep 22, 2010
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    SPECIAL NOTE: C-MAPSS and C-MAPSS40K ARE CURRENTLY UNAVAILABLE FOR DOWNLOAD. Glenn Research Center management is reviewing the availability requirements for these software packages. We are working with Center management to get the review completed and issues resolved in a timely manner. We will post updates on this website when the issues are resolved. We apologize for any inconvenience. Please contact Jonathan Litt, jonathan.s.litt@nasa.gov, if you have any questions in the meantime.

    Subject Area: Engine Health

    Description: This data set was generated with the C-MAPSS simulator. C-MAPSS stands for 'Commercial Modular Aero-Propulsion System Simulation' and it is a tool for the simulation of realistic large commercial turbofan engine data. Each flight is a combination of a series of flight conditions with a reasonable linear transition period to allow the engine to change from one flight condition to the next. The flight conditions are arranged to cover a typical ascent from sea level to 35K ft and descent back down to sea level. The fault was injected at a given time in one of the flights and persists throughout the remaining flights, effectively increasing the age of the engine. The intent is to identify which flight and when in the flight the fault occurred.

    How Data Was Acquired: The data provided is from a high fidelity system level engine simulation designed to simulate nominal and fault engine degradation over a series of flights. The simulated data was created with a Matlab Simulink tool called C-MAPSS.

    Sample Rates and Parameter Description: The flights are full flight recordings sampled at 1 Hz and consist of 30 engine and flight condition parameters. Each flight contains 7 unique flight conditions for an approximately 90 min flight including ascent to cruise at 35K ft and descent back to sea level. The parameters for each flight are the flight conditions, health indicators, measurement temperatures and pressure measurements.

    Faults/Anomalies: Faults arose from the inlet engine fan, the low pressure compressor, the high pressure compressor, the high pressure turbine and the low pressure turbine.

  2. Hepatitis C – Advanced Liver Disease Disparities Dashboard

    • catalog.data.gov
    • data.va.gov
    • +4 more
    Updated Aug 2, 2025
    Cite
    Department of Veterans Affairs (2025). Hepatitis C – Advanced Liver Disease Disparities Dashboard [Dataset]. https://catalog.data.gov/dataset/hepatitis-c-advanced-liver-disease-disparities-dashboard
    Dataset updated
    Aug 2, 2025
    Dataset provided by
    Department of Veterans Affairs
    Description

    The Office of Health Equity (OHE) created the Hepatitis C-Advanced Liver Disease (HCV-ALD) dashboard to raise awareness of potential disparities among vulnerable Veterans for this life-threatening condition. The purpose of the HCV-ALD Dashboard is to promote equitable diagnosis and treatment of underserved Veterans with hepatitis C and advanced liver disease. The Hepatitis C-ALD Dashboard uses a set of criteria - age, gender, geography, service era, race/ethnicity - to characterize Veteran groups with ALD due to hepatitis C who may require targeted intervention to improve their health. The dashboard advances the vision for quality care and improved access to care as identified in the VHA Blueprint for Excellence.

  3. Data from: C Project Dataset

    • universe.roboflow.com
    zip
    Updated Sep 29, 2022
    Cite
    seoyh rb (2022). C Project Dataset [Dataset]. https://universe.roboflow.com/seoyh-rb-gdiwi/c-project/dataset/2
    Available download formats: zip
    Dataset updated
    Sep 29, 2022
    Dataset authored and provided by
    seoyh rb
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Tooth Bounding Boxes
    Description

    C Project

    ## Overview
    
    C Project is a dataset for object detection tasks - it contains Tooth annotations for 713 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  4. Dataset inventory

    • catalog.data.gov
    • data.sfgov.org
    • +1 more
    Updated Nov 30, 2025
    Cite
    data.sfgov.org (2025). Dataset inventory [Dataset]. https://catalog.data.gov/dataset/dataset-inventory
    Dataset updated
    Nov 30, 2025
    Dataset provided by
    data.sfgov.org
    Description

    A. SUMMARY
    The dataset inventory provides a list of data maintained by departments that are candidates for open data publishing or have already been published and is collected in accordance with Chapter 22D of the Administrative Code. The inventory will be used in conjunction with department publishing plans to track progress toward meeting plan goals for each department.

    B. HOW THE DATASET IS CREATED
    This dataset is collated in 2 ways:
    1. Ongoing updates are made throughout the year to reflect new datasets; this process involves DataSF staff reconciling publishing records after datasets are published
    2. Annual bulk updates - departments review their inventories, identify changes and updates, and submit those to DataSF for a once-a-year bulk update - not all departments will have changes, or their changes will already have been captured over the course of the prior year as ongoing updates

    C. UPDATE PROCESS
    The dataset is synced automatically daily, but the underlying data changes manually throughout the year as needed

    D. HOW TO USE THIS DATASET
    Interpreting dates in this dataset
    This dataset has 2 dates:
    1. Date Added - when the dataset was added to the inventory itself
    2. First Published - the open data portal automatically captures the date the dataset was first created; this is that system-generated date
    Note that in certain cases we may have published a dataset prior to it being added to the inventory. We do our best to have an accurate accounting of when something was added to this inventory and when it was published. In most cases the inventory addition will happen prior to publishing, but in certain cases it will be published and we will have missed updating the inventory as this is a manual process. First Published gives an accounting of when the dataset was actually available on the open data catalog, and Date Added of when it was added to this list.

    E. RELATED DATASETS
    Inventory of citywide enterprise systems of record
    Dataset Inventory: Column-Level Details

  5. ARISE C-130 Aircraft Merge Data Files - Dataset - NASA Open Data Portal

    • data.nasa.gov
    Updated Apr 1, 2025
    Cite
    nasa.gov (2025). ARISE C-130 Aircraft Merge Data Files - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/arise-c-130-aircraft-merge-data-files-e402f
    Dataset updated
    Apr 1, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    ARISE_Merge_Data_1 is the Arctic Radiation - IceBridge Sea & Ice Experiment (ARISE) 2014 pre-generated aircraft (C-130) merge data files. This product is a result of a joint effort of the Radiation Sciences, Cryospheric Sciences and Airborne Sciences programs of the Earth Science Division in NASA's Science Mission Directorate in Washington. Data collection is complete.

    ARISE was NASA's first Arctic airborne campaign designed to take simultaneous measurements of ice, clouds and the levels of incoming and outgoing radiation, the balance of which determined the degree of climate warming. Over the past few decades, an increase in global temperatures led to decreased Arctic summer sea ice. Typically, Arctic sea ice reflects sunlight from the Earth. However, a loss of sea ice means there is more open water to absorb heat from the sun, enhancing warming in the region. More open water can also cause the release of more moisture into the atmosphere. This additional moisture could affect cloud formation and the exchange of heat from Earth’s surface to space.

    Conducted during the peak of summer ice melt (August 28, 2014-October 1, 2014), ARISE was designed to study and collect data on thinning sea ice, measure cloud and atmospheric properties in the Arctic, and to address questions about the relationship between retreating sea ice and the Arctic climate. During the campaign, instruments on NASA’s C-130 aircraft conducted measurements of spectral and broadband radiative flux profiles, quantified surface characteristics, cloud properties, and atmospheric state parameters under a variety of Arctic atmospheric and surface conditions (e.g. open water, sea ice, and land ice). When possible, C-130 flights were coordinated to fly under satellite overpasses. The primary aerial focus of ARISE was over Arctic sea ice and open water, with minor coverage over Greenland land ice. Through these efforts, the ARISE field campaign helped improve cloud and sea ice computer modeling in the Arctic.

  6. Data from: Blood C Dataset

    • universe.roboflow.com
    zip
    Updated May 10, 2023
    Cite
    LuigiAIRLABAPC (2023). Blood C Dataset [Dataset]. https://universe.roboflow.com/luigiairlabapc/blood-c
    Available download formats: zip
    Dataset updated
    May 10, 2023
    Dataset authored and provided by
    LuigiAIRLABAPC
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Blood Cells Bounding Boxes
    Description

    BLOOD C

    ## Overview
    
    BLOOD C is a dataset for object detection tasks - it contains Blood Cells annotations for 364 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
    ## License
    
    This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  7. Leash-Bio-processed-dataset

    • kaggle.com
    Updated May 26, 2024
    Cite
    hengck23 (2024). Leash-Bio-processed-dataset [Dataset]. https://www.kaggle.com/datasets/hengck23/leash-bio-processed-dataset
    Dataset updated
    May 26, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    hengck23
    Description

    Processed dataset for https://www.kaggle.com/competitions/leash-BELKA.

    For any b2z file, it is recommended to use a parallel bzip decompressor (https://github.com/mxmlnkn/indexed_bzip2) for speed.

    Last update : 22-may-2024

    In summary:

    See forum discussion for details of [1],[2]: https://www.kaggle.com/competitions/leash-BELKA/discussion/492846

    [1] reduced data

    • train.reduced.parquet : 98_415_610 training SMILES and their information
    • train.bind.npz : 98_415_610 x 3 target matrix
    • test.reduced.parquet : 878_022 test SMILES
    • all_buildingblock.csv: building blocks id used in train.reduced.parquet/test.reduced.parquet
    • fold0.parquet: train_share,valid_share,valid_nonshare splits for the experiments in the discussion

    [2] extracted ECFP4 fingerprints

    • train.ecfp4.packed.npz : Features extracted using rdkit
      • AllChem.GetMorganFingerprintAsBitVect(mol, 2, 2048)
      • repack with np.packbits() to give 98_415_610 x 256 feature matrix
    • test.ecfp4.packed.npz : similarly processed for the test SMILES
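The repacking step in [2] can be sketched with NumPy alone. This is a minimal illustration, not the author's script: it assumes each fingerprint is already available as a 2048-element array of 0/1 values, and random bits stand in for real rdkit output. The variable names are hypothetical.

```python
import numpy as np

# Stand-in for 4 fingerprints of 2048 bits each (random bits, not real molecules).
rng = np.random.default_rng(0)
bits = rng.integers(0, 2, size=(4, 2048), dtype=np.uint8)

# Pack 8 bits into each byte along the last axis: 2048 bits -> 256 bytes per row.
packed = np.packbits(bits, axis=-1)

# Unpack to recover the original bit matrix before feeding it to a model.
restored = np.unpackbits(packed, axis=-1)
```

Packing is what turns a 2048-bit fingerprint into the 256-column row described above; unpacking with the same axis reverses it exactly.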

    This is somewhat obsolete now that the competition has progressed: ecfp6 gives better results and can be extracted quickly with scikit-fingerprints.

    See forum discussion for details of [3]: https://www.kaggle.com/competitions/leash-BELKA/discussion/498858 https://www.kaggle.com/code/hengck23/lb6-02-graph-nn-example

    [3] graph NN processed data

    • test/train-replace-c.smiles.bytestring.bz2 : replace linker [Dy] with C. Note that these are bytestrings and not strings.
    • train-replace-c-30m.graph.pickle.**.b2z : 98_415_610 molecule graphs split into 3 files. Test graphs are not provided as they can be generated on the fly.
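The note about bytestrings matters in practice, because `bytes` methods need `bytes` arguments. The sketch below shows the linker replacement and the decode step on a single made-up record; the SMILES value is hypothetical and is not taken from the dataset.

```python
# Hypothetical record containing the [Dy] linker, held as bytes rather than str.
raw = b"C#CCOc1ccc(CNc2nc(N[Dy])nc(N)n2)cc1"

# bytes.replace requires bytes arguments; passing str here would raise TypeError.
replaced = raw.replace(b"[Dy]", b"C")

# Decode to str before handing the SMILES to string-based tools.
smiles = replaced.decode("ascii")
```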

    See forum discussion for details of [4]: https://www.kaggle.com/competitions/leash-BELKA/discussion/505985 https://www.kaggle.com/code/hengck23/conforge-open-source-conformer-generator

    [4] conformer. i.e. molecule estimated xyz data

    • test-replace-c.conforge.sdf.bz2 : conformers in an sdf file. You can read the file using rdkit Chem.SDMolSupplier().
    • test-replace-c.conforge.status.parquet:
      • 'status col' shows the status of the conformer. 0 means success. For failure cases, the sdf stores a dummy 'CC' molecule.
      • 'idx col' shows the idx (primary key) into test.reduced.parquet. Use this to retrieve SMILES strings. Note that the conformer is based on test-replace-c.smiles.bytestring.bz2, i.e. [Dy] is replaced by C.
    • train-replace-c.sub-[split].conforge.sdf.bz2/status.parquet: similar format to that described above. [split] values are:
      • train: 1000250+(1001610*3) molecules
      • valid: 40000
      • nonshare: about 61674
  8. NDN Attack Traffic: Tree & DFN

    • kaggle.com
    zip
    Updated Oct 16, 2025
    Cite
    Muhammad Raga Titipan (2025). NDN Attack Traffic: Tree & DFN [Dataset]. https://www.kaggle.com/datasets/ragatitipan/ndn-attack-traffic-tree-and-dfn
    Available download formats: zip (32733333 bytes)
    Dataset updated
    Oct 16, 2025
    Authors
    Muhammad Raga Titipan
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Description

    This dataset contains comprehensive network traffic data captured during simulated attacks on Named Data Networking (NDN) environments across two distinct network topologies: Tree and DFN (Deutsches ForschungsNetz). All data was generated through controlled experiments using miniNDN simulation on Ubuntu.

    Dataset Overview

    Named Data Networking (NDN) represents a future internet architecture that focuses on content retrieval rather than host-to-host communication. As this architecture gains traction, understanding its security vulnerabilities becomes increasingly important. This dataset provides researchers with real traffic patterns observed during various attack scenarios on NDN networks.

    The dataset captures traffic parameters across:

    1. Tree Topology: A hierarchical network structure commonly used in organizational networks
    2. DFN Topology: Based on the German Research Network topology, representing a more complex, real-world network configuration

    Data Collection Methodology

    All data was systematically collected through:

    1. Setting up miniNDN environments on Ubuntu
    2. Configuring both Tree and DFN network topologies
    3. Executing controlled attack scenarios
    4. Capturing comprehensive network traffic parameters
    5. Labeling data with attack types and relevant metadata

    Features

    [Images: Tree topology (tree.png) and DFN topology (dfn.png)]

    The dataset includes essential NDN traffic parameters:

    1. Packet type (Interest, Data, Nack)
    2. Node and interface identifiers
    3. Packet size and hop count metrics
    4. Interest lifetime values
    5. Content Store, PIT, and FIB entries
    6. Attack classification labels
    7. Topology identifiers

    Applications

    This dataset is valuable for:

    • Developing NDN-specific intrusion detection systems
    • Comparing attack propagation across different network architectures
    • Training machine learning models for attack detection
    • Benchmarking security solutions for content-centric networks
    • Understanding how topology affects security vulnerability
  9. fineweb-c

    • huggingface.co
    Updated Jan 14, 2025
    Cite
    Data Is Better Together (2025). fineweb-c [Dataset]. https://huggingface.co/datasets/data-is-better-together/fineweb-c
    Dataset updated
    Jan 14, 2025
    Dataset authored and provided by
    Data Is Better Together
    Description

    FineWeb-C: Educational content in many languages, labelled by the community

    Multilingual data is better together!

    Note: We are not actively working on this project anymore. You can continue to contribute annotations and we'll occasionally refresh the exported data.

      What is this?
    

    FineWeb-C is a collaborative, community-driven project that expands upon the FineWeb2 dataset. The goal is to create high-quality educational content annotations across hundreds of… See the full description on the dataset page: https://huggingface.co/datasets/data-is-better-together/fineweb-c.

  10. VDH-COVID-19-PublicUseDataset-MIS-C - RETIRED Dataset

    • data.virginia.gov
    • opendata.winchesterva.gov
    csv
    Updated Dec 2, 2025
    Cite
    Virginia Department of Health (2025). VDH-COVID-19-PublicUseDataset-MIS-C - RETIRED Dataset [Dataset]. https://data.virginia.gov/dataset/vdh-covid-19-publicusedataset-mis-c
    Available download formats: csv
    Dataset updated
    Dec 2, 2025
    Dataset authored and provided by
    Virginia Department of Health (https://www.vdh.virginia.gov/)
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    This dataset was retired on 2/7/2024.

    This dataset switched to a weekly M-F cadence on 12/27/2022.

    This data set includes the cumulative (total) number of Multisystem Inflammatory Syndrome in Children (MIS-C) cases and deaths in Virginia by report date. This data set was first published on May 24, 2020. When you download the data set, the dates will be sorted in ascending order, meaning that the earliest date will be at the top. To see data for the most recent date, please scroll down to the bottom of the data set. The Virginia Department of Health’s Thomas Jefferson Health District (TJHD) will be renamed to Blue Ridge Health District (BRHD), effective January 2021. More information about this change can be found here: https://www.vdh.virginia.gov/blue-ridge/name-change/

  11. Kaggle Data Science Survey 2017-2021

    • kaggle.com
    zip
    Updated Nov 26, 2021
    Cite
    Andrada (2021). Kaggle Data Science Survey 2017-2021 [Dataset]. https://www.kaggle.com/datasets/andradaolteanu/kaggle-data-science-survey-20172021/code
    Available download formats: zip (18555433 bytes)
    Dataset updated
    Nov 26, 2021
    Authors
    Andrada
    Description

    Context

    I have created this dataset for an easier way to analyse the progression of answers from the respondents that are participating each year in the very famous Data Science Kaggle Survey.

    The sources of the present data are:

    • 2017: https://www.kaggle.com/kaggle/kaggle-survey-2017
    • 2018: https://www.kaggle.com/kaggle/kaggle-survey-2018
    • 2019: https://www.kaggle.com/c/kaggle-survey-2019/data
    • 2020: https://www.kaggle.com/c/kaggle-survey-2020/data
    • 2021: https://www.kaggle.com/c/kaggle-survey-2021/data

    Methodology

    This dataset was created by manually aggregating each of the 5 tables mentioned above. The full methodology was as follows:

    • The 2021 table was taken as the reference, as it is the latest and most "up to date" with regard to the questions and the overall evolution of the Data Science industry.
    • Each year, in descending order, was fully analysed one by one in order to find all questions (and answers) that were the same as the ones found in 2021.
    • As we go back in time, the questions lose their completeness more and more, so I would highly suggest analysing percentages on Year, rather than absolute numbers.
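The suggestion to compare percentages within each year, rather than absolute counts, can be sketched as follows. This is only an illustration on made-up (year, answer) pairs; the function name and the sample rows are hypothetical, and the real file has many more columns.

```python
from collections import Counter, defaultdict

# Hypothetical (year, answer) pairs standing in for one survey question.
rows = [
    (2017, "Python"), (2017, "R"),
    (2021, "Python"), (2021, "Python"), (2021, "R"), (2021, "SQL"),
]

def share_by_year(rows):
    """Return {year: {answer: percent of that year's responses}}."""
    counts = defaultdict(Counter)
    for year, answer in rows:
        counts[year][answer] += 1
    return {
        year: {a: 100.0 * n / sum(c.values()) for a, n in c.items()}
        for year, c in counts.items()
    }

shares = share_by_year(rows)
```

In this toy sample Python has 1 response in 2017 and 2 in 2021, yet its share is 50% in both years, which is why normalizing per year avoids reading growth in respondents as growth in popularity.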

    The aggregation was done manually, as the question order, naming and types of answers differ from one year to another. Hence, the most accurate way (although not the most efficient) was to read, order and pick the questions with regard to the base table (which was the 2021 Survey).

    Content

    This dataset contains the following:

    • kaggle_survey_2017_2021.csv: the tabular dataset containing the aggregated data from 2017 to 2021.
    • style.css: a file that serves as custom styling for my notebook on this competition.
    • images folder: all images I have used for my notebook on this competition.

    Note: Notebook can be found here.

    Acknowledgements

    Thank you so much to the Kaggle Team for hosting these surveys and sharing with us all the data, so we can take the pulse of the community each year.

    Inspiration

    The Kaggle Survey is rich in information as is, but what can you find by adding another layer of information - the year? Evolutions in time could be fascinating.

  12. CMAPSS Jet Engine Simulated Data - Dataset - NASA Open Data Portal

    • data.nasa.gov
    Updated Oct 15, 2008
    Cite
    nasa.gov (2008). CMAPSS Jet Engine Simulated Data - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/cmapss-jet-engine-simulated-data
    Dataset updated
    Oct 15, 2008
    Dataset provided by
    NASA (http://nasa.gov/)
    Description

    Data sets consist of multiple multivariate time series. Each data set is further divided into training and test subsets. Each time series is from a different engine, i.e., the data can be considered to be from a fleet of engines of the same type. Each engine starts with different degrees of initial wear and manufacturing variation which is unknown to the user. This wear and variation is considered normal, i.e., it is not considered a fault condition. There are three operational settings that have a substantial effect on engine performance. These settings are also included in the data. The data is contaminated with sensor noise. The engine is operating normally at the start of each time series, and develops a fault at some point during the series. In the training set, the fault grows in magnitude until system failure. In the test set, the time series ends some time prior to system failure. The objective of the competition is to predict the number of remaining operational cycles before failure in the test set, i.e., the number of operational cycles after the last cycle that the engine will continue to operate. Also provided is a vector of true Remaining Useful Life (RUL) values for the test data.

    The data are provided as a zip-compressed text file with 26 columns of numbers, separated by spaces. Each row is a snapshot of data taken during a single operational cycle; each column is a different variable. The columns correspond to:
    1) unit number
    2) time, in cycles
    3) operational setting 1
    4) operational setting 2
    5) operational setting 3
    6) sensor measurement 1
    7) sensor measurement 2
    ...
    26) sensor measurement 21

    Data Set: FD001
    Train trajectories: 100
    Test trajectories: 100
    Conditions: ONE (Sea Level)
    Fault Modes: ONE (HPC Degradation)

    Data Set: FD002
    Train trajectories: 260
    Test trajectories: 259
    Conditions: SIX
    Fault Modes: ONE (HPC Degradation)

    Data Set: FD003
    Train trajectories: 100
    Test trajectories: 100
    Conditions: ONE (Sea Level)
    Fault Modes: TWO (HPC Degradation, Fan Degradation)

    Data Set: FD004
    Train trajectories: 248
    Test trajectories: 249
    Conditions: SIX
    Fault Modes: TWO (HPC Degradation, Fan Degradation)

    Reference: A. Saxena, K. Goebel, D. Simon, and N. Eklund, ‘Damage Propagation Modeling for Aircraft Engine Run-to-Failure Simulation’, in the Proceedings of the 1st International Conference on Prognostics and Health Management (PHM08), Denver CO, Oct 2008.
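The file layout described above can be read with a few lines of standard-library Python. The sketch below is an assumption-laden illustration, not an official loader: the column names are my own labels (the files have no header row), the 26 columns are taken to be 5 leading fields plus 21 sensor channels, and the training-set RUL is derived with the common convention that the last recorded cycle of each unit has RUL 0, which the description implies but does not state.

```python
from collections import defaultdict

# Hypothetical column labels: 5 leading fields plus 21 sensor channels = 26.
COLUMNS = (["unit", "cycle", "setting_1", "setting_2", "setting_3"]
           + [f"sensor_{i}" for i in range(1, 22)])

def parse_rows(text):
    """Parse space-separated rows of 26 numbers into dicts keyed by COLUMNS."""
    rows = []
    for line in text.splitlines():
        values = line.split()
        if not values:
            continue  # skip blank lines
        if len(values) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} columns, got {len(values)}")
        rows.append(dict(zip(COLUMNS, map(float, values))))
    return rows

def add_training_rul(rows):
    """Training series run to failure, so RUL = last cycle of the unit - cycle."""
    last_cycle = defaultdict(float)
    for row in rows:
        last_cycle[row["unit"]] = max(last_cycle[row["unit"]], row["cycle"])
    for row in rows:
        row["rul"] = last_cycle[row["unit"]] - row["cycle"]
    return rows

# Tiny synthetic sample: unit 1 has 3 cycles, unit 2 has 2; other columns are zero.
sample = "\n".join(
    " ".join([str(unit), str(cycle)] + ["0.0"] * 24)
    for unit, cycle in [(1, 1), (1, 2), (1, 3), (2, 1), (2, 2)]
)
rows = add_training_rul(parse_rows(sample))
```

This derivation applies only to the training subsets; for the test subsets the series stop before failure, so the provided RUL vector is needed instead.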

  13. Physical Gene Regulatory Networks in C.elegans

    • kaggle.com
    zip
    Updated Feb 10, 2023
    Cite
    The Devastator (2023). Physical Gene Regulatory Networks in C.elegans [Dataset]. https://www.kaggle.com/datasets/thedevastator/physical-gene-regulatory-networks-in-c-elegans
    Available download formats: zip (543510 bytes)
    Dataset updated
    Feb 10, 2023
    Authors
    The Devastator
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Physical Gene Regulatory Networks in C.elegans

    239,001 Regulatory Interactions from 289 Wild-type Young Adult Datasets

    By [source]

    About this dataset

    This dataset provides highly complex physical gene regulatory networks in young adult wild-type (WT) C.elegans worms. With a total of 239,001 regulatory interactions collected from 289 datasets, this dataset is a great resource for studying gene regulation and exploring how this gene activity contributes to organism function under varying bio-environmental conditions. Our collection of datasets contains 126 genes and 495 transcription factors, along with functional knockdown data that has been used to validate the physical gene regulatory networks present in the young adult C.elegans worms. Moreover, researchers and biologists can leverage this data to gain valuable insights on how various genotypes, ages and strains are associated with different perturbations in their biological features and ultimately uncover new discoveries about the network of relationships that exist between these genes inside animals. This comprehensive dataset will be essential for conducting research related to such topics as life development processes or age-related diseases - further enriching our understanding of life!


    How to use the dataset

    This guide will help you understand how to use this dataset of physical gene regulatory networks to research and analyze young adult C.elegans worms.

    • Understand the columns in the dataset: In this dataset, there are 239,001 regulatory interactions from 289 datasets consisting of 126 genes and 495 transcription factors registered with their genotype, age, strain, perturbation type, data type, data source and source used. Additionally, comments and regulator are also included in the columns for more information about each interaction.

    • Know your research goal: Decide what you want to discover before working with this dataset so that you can sort and explore the data efficiently. Knowing your goals helps you decide which columns are likely to provide useful insights when you filter or sort the file.

    • Analyzing Specific Types Of Data: Once your goals are established, start analyzing the specific types of data that are relevant to them (for example, transcription factor levels). Viewed together, these individual components can offer insight into how regulation may change within a cell’s environment and which pathways could become activated or deactivated by their presence or absence under different conditions.

    • Keeping Logs And Documents Up To Date: After sorting or filtering on certain columns, make sure your logs and documents stay up to date and match any changes made during analysis, so that usage is not mixed up across different documents or sessions over the life of the project. An organized record-keeping system helps ensure accuracy when dealing with large volumes of information over time, so that nothing gets overlooked accidentally.

    We hope these tips help you get started exploring Physical Gene Regulatory Networks in C.elegans! If you have any questions, feel free to reach out via message. We would love to hear how things go after putting them into practice!

    Research Ideas

    • Training machine-learning algorithms to develop automated approaches in predicting gene expression levels of individual regulatory networks.
    • Using this dataset alongside data from RNA-seq experiments to investigate how genetic mutations, environmental changes, and other factors can affect gene regulation across C.elegans populations.
    • Exploring the correlation between transcription factor binding sites and gene expression levels to predict potential target genes for a given transcription factor

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, m...

  14. Comprehensive battery aging dataset: capacity and impedance fade...

    • radar.kit.edu
    • service.tib.eu
    • +1 more
    tar
    Updated Mar 7, 2024
    Cite
    Matthias Luh; Thomas Blank (2024). Comprehensive battery aging dataset: capacity and impedance fade measurements of a lithium-ion NMC/C-SiO cell [dataset] [Dataset]. http://doi.org/10.35097/1947
    Available download formats: tar (69375563264 bytes)
    Dataset updated
    Mar 7, 2024
    Dataset provided by
    Karlsruhe Institute of Technology
    Authors
    Matthias Luh; Thomas Blank
    Description

    The data is described in detail in the open-access publication "Comprehensive battery aging dataset: capacity and impedance fade measurements of a lithium-ion NMC/C-SiO cell" published in Nature Scientific Data under the DOI: 10.1038/s41597-024-03831-x, also see “Related identifier”. An updated dataset is published under the DOI 10.35097/1969 (result data, e.g., capacity fade and impedance increase) and 10.35097/kww7jv8ajuvchcah (log data), also see “Related identifier”. Python example code to read, process, and visualize the data is provided in the GitHub repository: https://github.com/energystatusdata/bat-age-data-scripts/ Note: The "cell_eisv2.zip" file in this dataset is incomplete and only contains data for cells P001_1 to P044_2. The corrected file "cell_eisv2_fixed.zip" containing data for all 228 cells P001_1 to P076_3 can be found in the dataset “Addendum to "Comprehensive battery aging dataset: capacity and impedance fade measurements of a lithium-ion NMC/C-SiO cell [dataset]"” with the DOI 10.35097/krk531nmj4bsshha (see “Related identifier”).

  15. Data from: LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive...

    • data.niaid.nih.gov
    Updated Oct 20, 2022
    Cite
    Yfantidou, Sofia; Karagianni, Christina; Efstathiou, Stefanos; Vakali, Athena; Palotti, Joao; Giakatos, Dimitrios Panteleimon; Marchioro, Thomas; Kazlouski, Andrei; Ferrari, Elena; Girdzijauskas, Šarūnas (2022). LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive snapshots of our lives in the wild [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6826682
    Explore at:
    Dataset updated
    Oct 20, 2022
    Dataset provided by
    University of Insubria
    Foundation for Research and Technology Hellas
    Earkick
    KTH Royal Institute of Technology
    Aristotle University of Thessaloniki
    Authors
    Yfantidou, Sofia; Karagianni, Christina; Efstathiou, Stefanos; Vakali, Athena; Palotti, Joao; Giakatos, Dimitrios Panteleimon; Marchioro, Thomas; Kazlouski, Andrei; Ferrari, Elena; Girdzijauskas, Šarūnas
    Description

    LifeSnaps Dataset Documentation

    Ubiquitous self-tracking technologies have penetrated various aspects of our lives, from physical and mental health monitoring to fitness and entertainment. Yet, limited data exist on the association between in the wild large-scale physical activity patterns, sleep, stress, and overall health, and behavioral patterns and psychological measurements due to challenges in collecting and releasing such datasets, such as waning user engagement, privacy considerations, and diversity in data modalities. In this paper, we present the LifeSnaps dataset, a multi-modal, longitudinal, and geographically-distributed dataset, containing a plethora of anthropological data, collected unobtrusively for the total course of more than 4 months by n=71 participants, under the European H2020 RAIS project. LifeSnaps contains more than 35 different data types from second to daily granularity, totaling more than 71M rows of data. The participants contributed their data through numerous validated surveys, real-time ecological momentary assessments, and a Fitbit Sense smartwatch, and consented to make these data available openly to empower future research. We envision that releasing this large-scale dataset of multi-modal real-world data, will open novel research opportunities and potential applications in the fields of medical digital innovations, data privacy and valorization, mental and physical well-being, psychology and behavioral sciences, machine learning, and human-computer interaction.

    The following instructions will get you started with the LifeSnaps dataset and are complementary to the original publication.

    Data Import: Reading CSV

    For ease of use, we provide CSV files containing Fitbit, SEMA, and survey data at daily and/or hourly granularity. You can read the files via any programming language. For example, in Python, you can read the files into a Pandas DataFrame with the pandas.read_csv() command.
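
    As a minimal sketch of the CSV route, the snippet below reads a small in-memory table with pandas.read_csv(). The column names (id, date, steps) and the values are assumptions made up for illustration; the real file names and columns are not listed here, so substitute the path and columns of the file you downloaded.

```python
import io

import pandas as pd

# Hypothetical stand-in for one of the provided CSV files; the real file
# names and columns are not listed here, so these are assumptions.
sample_csv = io.StringIO(
    "id,date,steps\n"
    "user_a,2021-05-24,8123\n"
    "user_a,2021-05-25,10450\n"
)

# For a real file, pass its path instead of the in-memory buffer.
daily = pd.read_csv(sample_csv, parse_dates=["date"])

# Per-user mean of the (assumed) steps column.
mean_steps = daily.groupby("id")["steps"].mean()
print(mean_steps)
```

    Parsing the date column on load keeps later resampling or merging by day straightforward.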

    Data Import: Setting up a MongoDB (Recommended)

    To take full advantage of the LifeSnaps dataset, we recommend that you use the raw, complete data via importing the LifeSnaps MongoDB database.

    To do so, open the terminal/command prompt and run the following command for each collection in the DB. Ensure you have MongoDB Database Tools installed from here.

    For the Fitbit data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c fitbit

    For the SEMA data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c sema

    For surveys data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c surveys

    If you have access control enabled, then you will need to add the --username and --password parameters to the above commands.
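
    The commands as shown here end at the collection name. mongorestore normally also takes the location of the dump to restore as a final argument, so the sketch below assembles one command per collection with a placeholder path. The dump/<collection>.bson paths are hypothetical, and the helper name mongorestore_args is introduced only for this illustration. The script prints the commands and does not run them.

```python
import shlex

COLLECTIONS = ["fitbit", "sema", "surveys"]


def mongorestore_args(collection, bson_path, username=None, password=None):
    """Build the argument list for one mongorestore call (does not run it)."""
    args = [
        "mongorestore",
        "--host", "localhost:27017",
        "-d", "rais_anonymized",
        "-c", collection,
    ]
    # Only needed when access control is enabled on the server.
    if username is not None and password is not None:
        args += ["--username", username, "--password", password]
    # Assumed: the BSON dump to restore is passed as the last argument.
    args.append(bson_path)
    return args


for name in COLLECTIONS:
    # Hypothetical dump location; substitute the real path to each BSON file.
    print(shlex.join(mongorestore_args(name, f"dump/{name}.bson")))
```

    Building the argument list first makes it easy to inspect each command before passing it to subprocess.run() or pasting it into a terminal.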

    Data Availability

    The MongoDB database contains three collections, fitbit, sema, and surveys, containing the Fitbit, SEMA3, and survey data, respectively. Similarly, the CSV files contain related information to these collections. Each document in any collection follows the format shown below:

    {
      _id: <MongoDB primary key>,
      id (or user_id): <user ID>,
      type: <data type>,
      data: <embedded object>
    }

    Each document consists of four fields: _id, id (also found as user_id in the sema and surveys collections), type, and data. The _id field is the MongoDB-defined primary key and can be ignored. The id field refers to a user-specific ID used to uniquely identify each user across all collections. The type field refers to the specific data type within the collection, e.g., steps, heart rate, calories, etc. The data field contains the actual information about the document, e.g., the step count at a specific timestamp for the steps type, in the form of an embedded object. The contents of the data object are type-dependent, meaning that the fields within the data object differ between types of data. As mentioned previously, all times are stored in local time, and user IDs are common across different collections. For more information on the available data types, see the related publication.
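
    Because the user ID appears as id in some documents and as user_id in others, code that mixes collections needs to handle both keys. The sketch below works on plain Python dictionaries shaped like the documents described above. The helper name user_id_of, the sample values, and the keys inside data are all made up for illustration.

```python
def user_id_of(doc):
    """Return the user ID whether it is stored as 'id' or 'user_id'."""
    if "id" in doc:
        return doc["id"]
    return doc["user_id"]


# Illustrative documents; the values and the keys inside "data" are made up.
fitbit_doc = {
    "_id": 1,
    "id": "u01",
    "type": "steps",
    "data": {"dateTime": "2021-05-24T10:00:00", "value": 42},
}
sema_doc = {
    "_id": 2,
    "user_id": "u01",
    "type": "ema",
    "data": {"mood": "happy"},
}

# Group the data types seen for each user across both collections.
by_user = {}
for doc in [fitbit_doc, sema_doc]:
    by_user.setdefault(user_id_of(doc), []).append(doc["type"])
print(by_user)
```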

    Surveys Encoding

    BREQ2

    Why do you engage in exercise?

        Code             Text
        engage[SQ001]    I exercise because other people say I should
        engage[SQ002]    I feel guilty when I don’t exercise
        engage[SQ003]    I value the benefits of exercise
        engage[SQ004]    I exercise because it’s fun
        engage[SQ005]    I don’t see why I should have to exercise
        engage[SQ006]    I take part in exercise because my friends/family/partner say I should
        engage[SQ007]    I feel ashamed when I miss an exercise session
        engage[SQ008]    It’s important to me to exercise regularly
        engage[SQ009]    I can’t see why I should bother exercising
        engage[SQ010]    I enjoy my exercise sessions
        engage[SQ011]    I exercise because others will not be pleased with me if I don’t
        engage[SQ012]    I don’t see the point in exercising
        engage[SQ013]    I feel like a failure when I haven’t exercised in a while
        engage[SQ014]    I think it is important to make the effort to exercise regularly
        engage[SQ015]    I find exercise a pleasurable activity
        engage[SQ016]    I feel under pressure from my friends/family to exercise
        engage[SQ017]    I get restless if I don’t exercise regularly
        engage[SQ018]    I get pleasure and satisfaction from participating in exercise
        engage[SQ019]    I think exercising is a waste of time
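
    The item codes in this section all share the shape name[SQnnn], for example engage[SQ001]. As a small sketch, the helper below splits such a code into its questionnaire prefix and item number, which can be handy when survey columns need to be sorted or grouped. The function name parse_code and the regular expression are assumptions written for this illustration, not part of the dataset's tooling.

```python
import re

# Matches codes such as "engage[SQ001]" or "P1[SQ020]".
CODE_PATTERN = re.compile(r"^(?P<scale>\w+)\[SQ(?P<item>\d+)\]$")


def parse_code(code):
    """Split a survey item code into (questionnaire prefix, item number)."""
    match = CODE_PATTERN.match(code)
    if match is None:
        raise ValueError(f"unrecognised survey code: {code!r}")
    return match.group("scale"), int(match.group("item"))


for code in ["engage[SQ001]", "P1[SQ020]", "ipip[SQ050]", "STAI[SQ007]"]:
    print(code, "->", parse_code(code))
```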
    

    PANAS

    Indicate the extent you have felt this way over the past week

        P1[SQ001]    Interested
        P1[SQ002]    Distressed
        P1[SQ003]    Excited
        P1[SQ004]    Upset
        P1[SQ005]    Strong
        P1[SQ006]    Guilty
        P1[SQ007]    Scared
        P1[SQ008]    Hostile
        P1[SQ009]    Enthusiastic
        P1[SQ010]    Proud
        P1[SQ011]    Irritable
        P1[SQ012]    Alert
        P1[SQ013]    Ashamed
        P1[SQ014]    Inspired
        P1[SQ015]    Nervous
        P1[SQ016]    Determined
        P1[SQ017]    Attentive
        P1[SQ018]    Jittery
        P1[SQ019]    Active
        P1[SQ020]    Afraid
    

    Personality

    How Accurately Can You Describe Yourself?

        Code           Text
        ipip[SQ001]    Am the life of the party.
        ipip[SQ002]    Feel little concern for others.
        ipip[SQ003]    Am always prepared.
        ipip[SQ004]    Get stressed out easily.
        ipip[SQ005]    Have a rich vocabulary.
        ipip[SQ006]    Don't talk a lot.
        ipip[SQ007]    Am interested in people.
        ipip[SQ008]    Leave my belongings around.
        ipip[SQ009]    Am relaxed most of the time.
        ipip[SQ010]    Have difficulty understanding abstract ideas.
        ipip[SQ011]    Feel comfortable around people.
        ipip[SQ012]    Insult people.
        ipip[SQ013]    Pay attention to details.
        ipip[SQ014]    Worry about things.
        ipip[SQ015]    Have a vivid imagination.
        ipip[SQ016]    Keep in the background.
        ipip[SQ017]    Sympathize with others' feelings.
        ipip[SQ018]    Make a mess of things.
        ipip[SQ019]    Seldom feel blue.
        ipip[SQ020]    Am not interested in abstract ideas.
        ipip[SQ021]    Start conversations.
        ipip[SQ022]    Am not interested in other people's problems.
        ipip[SQ023]    Get chores done right away.
        ipip[SQ024]    Am easily disturbed.
        ipip[SQ025]    Have excellent ideas.
        ipip[SQ026]    Have little to say.
        ipip[SQ027]    Have a soft heart.
        ipip[SQ028]    Often forget to put things back in their proper place.
        ipip[SQ029]    Get upset easily.
        ipip[SQ030]    Do not have a good imagination.
        ipip[SQ031]    Talk to a lot of different people at parties.
        ipip[SQ032]    Am not really interested in others.
        ipip[SQ033]    Like order.
        ipip[SQ034]    Change my mood a lot.
        ipip[SQ035]    Am quick to understand things.
        ipip[SQ036]    Don't like to draw attention to myself.
        ipip[SQ037]    Take time out for others.
        ipip[SQ038]    Shirk my duties.
        ipip[SQ039]    Have frequent mood swings.
        ipip[SQ040]    Use difficult words.
        ipip[SQ041]    Don't mind being the centre of attention.
        ipip[SQ042]    Feel others' emotions.
        ipip[SQ043]    Follow a schedule.
        ipip[SQ044]    Get irritated easily.
        ipip[SQ045]    Spend time reflecting on things.
        ipip[SQ046]    Am quiet around strangers.
        ipip[SQ047]    Make people feel at ease.
        ipip[SQ048]    Am exacting in my work.
        ipip[SQ049]    Often feel blue.
        ipip[SQ050]    Am full of ideas.
    

    STAI

    Indicate how you feel right now

        Code           Text
        STAI[SQ001]    I feel calm
        STAI[SQ002]    I feel secure
        STAI[SQ003]    I am tense
        STAI[SQ004]    I feel strained
        STAI[SQ005]    I feel at ease
        STAI[SQ006]    I feel upset
        STAI[SQ007]    I am presently worrying over possible misfortunes
        STAI[SQ008]    I feel satisfied
        STAI[SQ009]    I feel frightened
        STAI[SQ010]    I feel comfortable
        STAI[SQ011]    I feel self-confident
        STAI[SQ012]    I feel nervous
        STAI[SQ013]    I am jittery
        STAI[SQ014]    I feel indecisive
        STAI[SQ015]    I am relaxed
        STAI[SQ016]    I feel content
        STAI[SQ017]    I am worried
        STAI[SQ018]    I feel confused
        STAI[SQ019]    I feel steady
        STAI[SQ020]    I feel pleasant
    

    TTM

    Do you engage in regular physical activity according to the definition above? How frequently did each event or experience occur in the past month?

        Code                Text
        processes[SQ002]    I read articles to learn more about physical
    
  16. Data from: Global hydrological dataset of daily streamflow data from the...

    • catalogue.ceh.ac.uk
    • hosted-metadata.bgs.ac.uk
    • +3more
    zip
    Updated May 28, 2024
    Cite
    S. Turner; J. Hannaford; L.J. Barker; G. Suman; R. Armitage; A. Killeen; A. Griffin; H. Davies; A. Kumar; H. Dixon; M.T.D. Albuquerque; N. Almeida Ribeiro; C. Alvarez-Garreton; E. Amoussou; B. Arheimer; Y. Asano; T. Berezowski; A. Bodian; H. Boutaghane; R. Capell; H. Dakhaoui; J. Daňhelka; H.X. Do; C. Ekkawatpanit; E.M. El Khalki; A.K. Fleig; R. Fonseca; J.D. Giraldo-Osorio; A.B.T. Goula; M. Hanel; G Hodgkins; S. Horton; C. Kan; D.G. Kingston; G. Laaha; R. Laugesen; W. Lopes; S. Mager; Y. Markonis; L. Mediero; G. Midgley; C. Murphy; P. O'Connor; A.I. Pedersen; H.T. Pham; M. Piniewski; M. Rachdane; B. Renard; M.E. Saidi; P. Schmocker-Facker; K. Stahl; M. Thyler; M. Toucher; Y. Tramblay; J. Uusikivi; N. Venegas-Cordero; S. Vissesri; A. Watson; S. Westra; P.H. Whitfield (2024). Global hydrological dataset of daily streamflow data from the Reference Observatory of Basins for INternational hydrological climate change detection (ROBIN), 1863 - 2022 [Dataset]. http://doi.org/10.5285/3b077711-f183-42f1-bac6-c892922c81f4
    Explore at:
    Available download formats: zip
    Dataset updated
    May 28, 2024
    Dataset provided by
    NERC EDS Environmental Information Data Centre
    Authors
    S. Turner; J. Hannaford; L.J. Barker; G. Suman; R. Armitage; A. Killeen; A. Griffin; H. Davies; A. Kumar; H. Dixon; M.T.D. Albuquerque; N. Almeida Ribeiro; C. Alvarez-Garreton; E. Amoussou; B. Arheimer; Y. Asano; T. Berezowski; A. Bodian; H. Boutaghane; R. Capell; H. Dakhaoui; J. Daňhelka; H.X. Do; C. Ekkawatpanit; E.M. El Khalki; A.K. Fleig; R. Fonseca; J.D. Giraldo-Osorio; A.B.T. Goula; M. Hanel; G Hodgkins; S. Horton; C. Kan; D.G. Kingston; G. Laaha; R. Laugesen; W. Lopes; S. Mager; Y. Markonis; L. Mediero; G. Midgley; C. Murphy; P. O'Connor; A.I. Pedersen; H.T. Pham; M. Piniewski; M. Rachdane; B. Renard; M.E. Saidi; P. Schmocker-Facker; K. Stahl; M. Thyler; M. Toucher; Y. Tramblay; J. Uusikivi; N. Venegas-Cordero; S. Vissesri; A. Watson; S. Westra; P.H. Whitfield
    License

    https://eidc.ac.uk/licences/ogl/plain

    Time period covered
    Jan 1, 1863 - Dec 31, 2022
    Area covered
    Earth
    Dataset funded by
    Natural Environment Research Council (https://www.ukri.org/councils/nerc)
    Description

    The Reference Observatory of Basins for INternational hydrological climate change detection (ROBIN) dataset is a global hydrological dataset containing publicly available daily flow data for 2,386 gauging stations across the globe which have natural or near-natural catchments. Metadata is also provided alongside these stations for the Full ROBIN Dataset consisting of 3,060 gauging stations. Data were quality controlled by the central ROBIN team before being added to the dataset, and two levels of data quality are applied to guide users towards appropriate data usage. Most records span at least 40 years with minimal missing data; some sites have records starting in the late 19th century, and records run through to 2022. ROBIN represents a significant advance in global-scale, accessible streamflow data. The project was funded by the UK Natural Environment Research Council Global Partnership Seedcorn Fund (NE/W004038/1) and the NC-International programme (NE/X006247/1) delivering National Capability.

  17. Simple download service (Atom) of the dataset: Type C maps of areas where...

    • data.europa.eu
    Updated Feb 18, 2022
    Cite
    (2022). Simple download service (Atom) of the dataset: Type C maps of areas where limit values are exceeded (night) for the Côte-d’Or rail network (3rd deadline) [Dataset]. https://data.europa.eu/data/datasets/fr-120066022-srv-dbd03929-90b1-44b4-ab98-104727685fb1/
    Explore at:
    Available download formats: INSPIRE download service
    Dataset updated
    Feb 18, 2022
    Description

    These maps, also referred to as “type C maps”, show the parts of the territory likely to contain buildings exceeding the limit values referred to in Article L571-6 of the Environmental Code and laid down by Article 7 of the Order of 4 April 2006.

    They concern the Côte-d’Or rail network.

    Geographic objects have been aggregated and cut together to avoid overlap.

    The limit values correspond to an Ln (night-time noise indicator) of 62 dB(A). These limit values apply to residential buildings, as well as healthcare and educational establishments.

    These aggregated data are published for mapping purposes. For more precise use, it is advisable to load the detailed data.

  18. dataset-openmoji

    • huggingface.co
    Updated Aug 2, 2024
    Cite
    RayT (2024). dataset-openmoji [Dataset]. https://huggingface.co/datasets/Kray-C/dataset-openmoji
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 2, 2024
    Authors
    RayT
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Dataset OpenMoji

    Creator: https://www.kaggle.com/krayc81
    This is based on https://openmoji.org/
    License: https://creativecommons.org/licenses/by-sa/4.0

    Files:

    README.md this :)
    data.csv containing all data (see description below)
    openmoji folder containing the image files

    The data.csv contains:

    idx the character as int
    character text representation
    bytes representation
    hex representation (replace 0x with U+ for Unicode)
    description of the emoji
    path_black path to the bw image…

    See the full description on the dataset page: https://huggingface.co/datasets/Kray-C/dataset-openmoji.
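
    The note about swapping the hex prefix for U+ can be sketched as follows. This assumes the hex value is stored as a string such as 0x1f600, which the listing does not confirm, and the helper names are hypothetical.

```python
def hex_to_codepoint_label(hex_value):
    """Turn a string such as '0x1f600' into the label 'U+1F600'."""
    if not hex_value.lower().startswith("0x"):
        raise ValueError(f"expected a 0x-prefixed value, got {hex_value!r}")
    return "U+" + hex_value[2:].upper()


def hex_to_character(hex_value):
    """Return the character for a 0x-prefixed code point string."""
    return chr(int(hex_value, 16))


example = "0x1f600"  # assumed format of a value in the hex column
print(hex_to_codepoint_label(example), hex_to_character(example))
```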

  19. Data from: LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive...

    • data.europa.eu
    • zenodo.org
    unknown
    Updated Jul 12, 2022
    Cite
    Zenodo (2022). LifeSnaps: a 4-month multi-modal dataset capturing unobtrusive snapshots of our lives in the wild [Dataset]. https://data.europa.eu/data/datasets/oai-zenodo-org-6832242?locale=fr
    Explore at:
    Available download formats: unknown (642961582)
    Dataset updated
    Jul 12, 2022
    Dataset authored and provided by
    Zenodo (http://zenodo.org/)
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    LifeSnaps Dataset Documentation

    Ubiquitous self-tracking technologies have penetrated various aspects of our lives, from physical and mental health monitoring to fitness and entertainment. Yet, limited data exist on the association between in the wild large-scale physical activity patterns, sleep, stress, and overall health, and behavioral patterns and psychological measurements due to challenges in collecting and releasing such datasets, such as waning user engagement, privacy considerations, and diversity in data modalities. In this paper, we present the LifeSnaps dataset, a multi-modal, longitudinal, and geographically-distributed dataset, containing a plethora of anthropological data, collected unobtrusively for the total course of more than 4 months by n=71 participants, under the European H2020 RAIS project. LifeSnaps contains more than 35 different data types from second to daily granularity, totaling more than 71M rows of data. The participants contributed their data through numerous validated surveys, real-time ecological momentary assessments, and a Fitbit Sense smartwatch, and consented to make these data available openly to empower future research. We envision that releasing this large-scale dataset of multi-modal real-world data, will open novel research opportunities and potential applications in the fields of medical digital innovations, data privacy and valorization, mental and physical well-being, psychology and behavioral sciences, machine learning, and human-computer interaction.

    The following instructions will get you started with the LifeSnaps dataset and are complementary to the original publication.

    Data Import: Reading CSV

    For ease of use, we provide CSV files containing Fitbit, SEMA, and survey data at daily and/or hourly granularity. You can read the files via any programming language. For example, in Python, you can read the files into a Pandas DataFrame with the pandas.read_csv() command.

    Data Import: Setting up a MongoDB (Recommended)

    To take full advantage of the LifeSnaps dataset, we recommend that you use the raw, complete data via importing the LifeSnaps MongoDB database.

    To do so, open the terminal/command prompt and run the following command for each collection in the DB. Ensure you have MongoDB Database Tools installed from here.

    For the Fitbit data, run the following:

    mongorestore --host localhost:27017 -d rais_anonymized -c fitbit

  20. WormSwin: C. elegans Video Datasets

    • data.niaid.nih.gov
    Updated Jan 31, 2024
    Cite
    Deserno, Maurice; Bozek, Katarzyna (2024). WormSwin: C. elegans Video Datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7456802
    Explore at:
    Dataset updated
    Jan 31, 2024
    Dataset provided by
    University of Cologne
    Authors
    Deserno, Maurice; Bozek, Katarzyna
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data used for our paper "WormSwin: Instance Segmentation of C. elegans using Vision Transformer". This publication is divided into three parts:

    CSB-1 Dataset

    Synthetic Images Dataset

    MD Dataset

    The CSB-1 Dataset consists of frames extracted from videos of Caenorhabditis elegans (C. elegans) annotated with binary masks. Each C. elegans is separately annotated, providing accurate annotations even for overlapping instances. All annotations are provided in binary mask format and as COCO Annotation JSON files (see COCO website).

    The videos are named after the following pattern:

    <"worm age in hours"_"mutation"_"irradiated (binary)"_"video index (zero based)">

    For mutation the following values are possible:

    wild type

    csb-1 mutant

    csb-1 with rescue mutation

    An example video name would be 24_1_1_2, meaning it shows 24-hour-old C. elegans with the csb-1 mutation that were irradiated.
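
    The naming pattern can be decoded with a few lines of Python. The numeric codes for the mutation field are not spelled out in this listing; the zero-based mapping below (0, 1, 2 in list order) is an assumption inferred from the example, where 1 stands for the csb-1 mutant. The names VideoName and parse_video_name are hypothetical.

```python
from dataclasses import dataclass

# Assumed zero-based mapping, inferred from the list order and the
# example name 24_1_1_2; it is not stated explicitly in the description.
MUTATIONS = {
    0: "wild type",
    1: "csb-1 mutant",
    2: "csb-1 with rescue mutation",
}


@dataclass
class VideoName:
    age_hours: int
    mutation: str
    irradiated: bool
    video_index: int


def parse_video_name(name):
    """Decode '<age>_<mutation>_<irradiated>_<video index>'."""
    age, mutation, irradiated, index = (int(part) for part in name.split("_"))
    return VideoName(age, MUTATIONS[mutation], bool(irradiated), index)


print(parse_video_name("24_1_1_2"))
```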

    Video data was provided by M. Rieckher; Instance Segmentation Annotations were created under supervision of K. Bozek and M. Deserno.

    The Synthetic Images Dataset was created by cutting out C. elegans (foreground objects) from the CSB-1 Dataset and placing them randomly on background images also taken from the CSB-1 Dataset. Foreground objects were flipped, rotated and slightly blurred before being placed on the background images. The same was done with the binary mask annotations taken from the CSB-1 Dataset so that they match the foreground objects in the synthetic images. Additionally, we added rings of random color, size, thickness and position to the background images to simulate petri-dish edges.

    This synthetic dataset was generated by M. Deserno.

    The Mating Dataset (MD) consists of 450 grayscale image patches of 1,012 x 1,012 px showing C. elegans with high overlap, crawling on a petri-dish. We took the patches from a 10 min. long video of size 3,036 x 3,036 px. The video was downsampled from 25 fps to 5 fps before selecting 50 random frames for annotating and patching. Like the other datasets, worms were annotated with binary masks and annotations are provided as COCO Annotation JSON files.

    The video data was provided by X.-L. Chu; Instance Segmentation Annotations were created under supervision of K. Bozek and M. Deserno.
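
    The Mating Dataset figures are mutually consistent if each selected frame is cut into a grid of non-overlapping patches. That tiling is an assumption, since the description does not say how the patches were cut. A quick check of the arithmetic:

```python
frame_side_px = 3036      # side length of each video frame
patch_side_px = 1012      # side length of each image patch
frames_annotated = 50     # randomly selected frames

# Assumed: frames are tiled into a regular grid without overlap.
assert frame_side_px % patch_side_px == 0
patches_per_side = frame_side_px // patch_side_px
patches_per_frame = patches_per_side ** 2
total_patches = frames_annotated * patches_per_frame

# Frame counts of the 10-minute video before and after downsampling.
video_seconds = 10 * 60
frames_at_25_fps = video_seconds * 25
frames_at_5_fps = video_seconds * 5

print(patches_per_frame, total_patches, frames_at_25_fps, frames_at_5_fps)
```

    Under this assumption each frame yields a 3 x 3 grid, and 50 frames give the 450 patches stated in the description.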

    Further details about the datasets can be found in our paper.

nasa.gov (2010). C-MAPSS Aircraft Engine Simulator Data - Dataset - NASA Open Data Portal [Dataset]. https://data.nasa.gov/dataset/c-mapss-aircraft-engine-simulator-data

C-MAPSS Aircraft Engine Simulator Data - Dataset - NASA Open Data Portal

Explore at:
25 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Sep 22, 2010
Dataset provided by
NASA (http://nasa.gov/)
Description

SPECIAL NOTE: C-MAPSS and C-MAPSS40K ARE CURRENTLY UNAVAILABLE FOR DOWNLOAD. Glenn Research Center management is reviewing the availability requirements for these software packages. We are working with Center management to get the review completed and issues resolved in a timely manner. We will post updates on this website when the issues are resolved. We apologize for any inconvenience. Please contact Jonathan Litt, jonathan.s.litt@nasa.gov, if you have any questions in the meantime.

Subject Area: Engine Health

Description: This data set was generated with the C-MAPSS simulator. C-MAPSS stands for 'Commercial Modular Aero-Propulsion System Simulation' and it is a tool for the simulation of realistic large commercial turbofan engine data. Each flight is a combination of a series of flight conditions with a reasonable linear transition period to allow the engine to change from one flight condition to the next. The flight conditions are arranged to cover a typical ascent from sea level to 35K ft and descent back down to sea level. The fault was injected at a given time in one of the flights and persists throughout the remaining flights, effectively increasing the age of the engine. The intent is to identify which flight and when in the flight the fault occurred.

How Data Was Acquired: The data provided is from a high fidelity system level engine simulation designed to simulate nominal and fault engine degradation over a series of flights. The simulated data was created with a Matlab Simulink tool called C-MAPSS.

Sample Rates and Parameter Description: The flights are full flight recordings sampled at 1 Hz and consist of 30 engine and flight condition parameters. Each flight contains 7 unique flight conditions for an approximately 90 min flight including ascent to cruise at 35K ft and descent back to sea level. The parameters for each flight are the flight conditions, health indicators, measurement temperatures and pressure measurements.

Faults/Anomalies: Faults arose from the inlet engine fan, the low pressure compressor, the high pressure compressor, the high pressure turbine and the low pressure turbine.
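
As a rough sizing sketch, the stated sampling rate, flight duration and parameter count imply the following per-flight volume. The 90-minute figure is approximate in the description, so the results are estimates, and the variable names are chosen only for this illustration.

```python
sample_rate_hz = 1        # one sample per second
flight_minutes = 90       # approximate flight duration
parameters = 30           # engine and flight condition parameters

# Rows recorded per flight at the stated sampling rate.
rows_per_flight = sample_rate_hz * flight_minutes * 60

# Individual values per flight (rows times parameters).
values_per_flight = rows_per_flight * parameters

print(rows_per_flight, values_per_flight)
```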
