9 datasets found
  1. UWB Motion Detection Data Set

    • data.niaid.nih.gov
    Updated Feb 11, 2022
    Cite
    Klemen Bregar (2022). UWB Motion Detection Data Set [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4613124
    Dataset updated
    Feb 11, 2022
    Dataset provided by
    Mihael Mohorčič
    Andrej Hrovat
    Klemen Bregar
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction

    This data set includes a collection of measurements taken with DecaWave DW1000 UWB radios in two indoor environments, used for motion detection. Measurements include channel impulse response (CIR) samples in the form of power delay profiles (PDP) with corresponding timestamps, for three channels in each indoor environment.

    The data set includes Python code and Jupyter notebooks for data loading and analysis, and for reproducing the results of the paper "UWB Radio Based Motion Detection System for Assisted Living" submitted to MDPI Sensors.

    The data set will require around 10 GB of total free space after extraction.

    The code included in the data set was written and tested on Linux (Ubuntu 20.04) and requires 16 GB of RAM plus an additional swap partition to run properly. The code could be modified to consume less memory, but that would require additional work. If the .npy format is compatible with your NumPy version, you won't need to regenerate the .npy data from the .csv files.

    Data Set Structure

    The resulting folder after extracting the uwb_motion_detection.zip file is organized as follows:

    data subfolder: contains all original .csv and intermediate .npy data files.

    models

    pdp: this folder contains 4 .csv files with raw PDP measurements (timestamp + PDP). The data format will be discussed in the following section.

    pdp_diff: this folder contains .npy files with PDP samples and .npy files with timestamps. Those files are generated by running the generate_pdp_diff.py script.

    generate_pdp_diff.py

    validation subfolder: contains data for motion detection validation

    events: contains .npy files with motion events for validation. The .npy files are generated using generate_event_x.py files or notebooks inside the /Process/validation folder.

    pdp: this folder contains raw PDP measurements in .csv format.

    pdp_diff: this folder contains .npy files with PDP samples and .npy files with timestamps. Those files are generated by running the generate_pdp_diff.py script.

    generate_events_0.py

    generate_events_1.py

    generate_events_2.py

    generate_pdp_diff.py

    figures subfolder: contains all figures generated in Jupyter notebooks inside the "Process" folder.

    Process subfolder: contains Jupyter notebooks with data processing and motion detection code.

    MotionDetection: contains notebook comparing standard score motion detection with windowed standard score motion detection

    OnlineModels: presents the development process of online models definitions

    PDP_diff: presents the basic principle of PDP differences used in the motion detection

    Validation: presents a motion detection validation process

    Raw data structure

    All .csv files in the data folder contain raw PDP measurements with a timestamp for each PDP sample. The structure of each file is as follows:

    unix timestamp, cir0 [dBm], cir1 [dBm], cir2 [dBm], ..., cir149 [dBm]
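
    As a rough illustration only (not part of the provided scripts; the file name below is hypothetical and the presence of a header row may differ), such a .csv file could be loaded in Python like this:

    import pandas as pd

    # Hypothetical file name; each row is a unix timestamp followed by 150 PDP bins in dBm.
    pdp = pd.read_csv("data/pdp/example_channel.csv")  # adjust header=/names= if the files have no header row

    timestamps = pdp.iloc[:, 0].to_numpy()
    pdp_samples = pdp.iloc[:, 1:].to_numpy()   # shape: (n_rows, 150)
    print(pdp_samples.shape)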

  2. LGAR code in Python as of publication of related paper

    • beta.hydroshare.org
    • hydroshare.org
    zip
    Updated Dec 19, 2024
    Cite
    Peter La Follette; Fred L. Ogden (2024). LGAR code in Python as of publication of related paper [Dataset]. http://doi.org/10.4211/hs.90951d952b034e7aa592898ab6d264eb
    Available download formats: zip (43.3 MB)
    Dataset updated
    Dec 19, 2024
    Dataset provided by
    HydroShare
    Authors
    Peter La Follette; Fred L. Ogden
    Description

    This resource contains the Python version of the Layered Green & Ampt with Redistribution (LGAR) model at the time of publication of the LGAR paper: "Layered Green and Ampt Infiltration With Redistribution", La Follette et al. 2023. While LGAR was initially developed in Python, we no longer maintain the Python version; development and maintenance have switched to C. LGAR as implemented in C can be found at https://github.com/NOAA-OWP/LGAR-C. We strongly recommend the use of the C version: it has undergone substantial stability testing and development since the publication of the manuscript, and the Python version is no longer supported. LGAR as implemented in Python can also be found on GitHub: https://github.com/NOAA-OWP/LGAR-Py/tree/LGAR-Py_public.

    Description

    LGAR is a model which partitions precipitation into infiltration and runoff, and is designed for use in arid or semi-arid climates. LGAR's main selling point is that it closely mimics precipitation partitioning results as simulated by the Richards equation (RE), without the reliability and stability challenges inherent to the RE. Therefore, this model is useful when accurate, stable precipitation partitioning simulations are desired in arid or semi-arid areas. LGAR as implemented in Python is BMI compatible. LGAR has been converted to C as well; we recommend the use of the C version (https://github.com/NOAA-OWP/LGAR-C) as the Python version is no longer maintained.

    Config file contents

    The contents of the first two blocks of the config file for a model run are described here; an illustrative sketch follows the two lists below.

    First block:

    time_step: this is the model time step, expressed in hours. It defaults to a value of 300/3600, or 5 minutes expressed in hours.

    initial_psi: this is the uniform capillary head throughout the model domain expressed in mm. Note that LGAR uses absolute values for capillary head, such that a value of 20000 mm for initial_psi physically represents soil with a capillary head of -20000 mm.

    verbose: this can be True or False; if False, no output is printed to the screen during a model run.

    length_of_simulation: This is the length of the simulation, in time steps.

    closed_form_capillary_drive_term: set to True or False. This determines whether the capillary drive term G is calculated with the numeric integral of hydraulic conductivity with respect to capillary head, or whether the equivalence relations between the van Genuchten and Brooks-Corey hydraulic models are used to obtain an approximate, closed-form equation for G. Setting this value to True generally increases speed significantly while altering model results only negligibly.

    Second block:

    output_file_name_fluxes: this will be the path and name of the output file which contains the fluxes for each time step of the simulation

    params_file: this is the path and name of the parameters file

    forcing_data_file: this is the forcing data file that is in the correct format for LGAR-Py
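
    As a rough illustration only (not an actual LGAR-Py config file; the format, example values, and paths below are assumptions), the options described above can be pictured as follows:

    # Hypothetical sketch of the config options described above, written as a Python dict;
    # the real LGAR-Py config file format and required values may differ.
    config = {
        # first block
        "time_step": 300 / 3600,                  # model time step in hours (default: 5 minutes)
        "initial_psi": 20000,                     # uniform capillary head magnitude, in mm
        "verbose": False,                         # print output to the screen during the run?
        "length_of_simulation": 1000,             # number of time steps (example value)
        "closed_form_capillary_drive_term": True, # closed-form G instead of numeric integration
        # second block
        "output_file_name_fluxes": "outputs/fluxes.csv",                 # example path
        "params_file": "parameter_files/example_params.py",              # example path
        "forcing_data_file": "forcing_data_files/example/forcing.csv",   # example path
    }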

    parameter_files

    Each parameter file, in the parameter_files folder, only has to be edited in the first block, which contains options related to soil hydraulic parameters, number of layers, maximum ponded head, and options related to the PET -> AET correction.

    forcing_data_files

    This folder contains subfolders for each model run. All that is necessary to run LGAR is correctly formatted forcing data as a .csv file. Raw USDA SCAN data and notebooks that convert these raw data to the format usable by LGAR are also provided. Currently, the forcing data resolution must be the same as the time_step specified in the config file.

    other useful files

    LGAR also requires the files LGAR_compute.py, test_env.py, and BMI_base_class.py to be in the same directory.

    The Jupyter notebooks (in vis_files) are useful for visualization of results. HYDRUS_files contains HYDRUS-1D model runs which are set up to simulate the same soil hydraulic conditions and forcing data as various LGAR runs.

    Running the model

    In order to run LGAR, ensure that the correct config file is indicated in test_env.py, then navigate in a terminal to the directory containing test_env.py and enter "python test_env.py".

    How to test the software

    In the outputs folder, there are 6 complete simulation outputs: 3 simulations of USDA SCAN sites and 3 simulations with synthetically generated forcing datasets. All of the files necessary to run these simulations are also included. To check whether LGAR is working properly for you, run these simulations and compare your results against the outputs stored in this repository.

    Limitations

    LGAR is designed for arid and semi-arid areas only and should not be used for groundwater simulations. Please see "Layered Green and Ampt Infiltration With Redistribution" (La Follette et al., 2023) for more details.

  3. Data from: UA - Gaussian Depth Disc (GDD dataset)

    • data.niaid.nih.gov
    • zenodo.org
    Updated Dec 19, 2023
    Cite
    Van Der Jeught, Sam (2023). UA - Gaussian Depth Disc (GDD dataset) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10404433
    Dataset updated
    Dec 19, 2023
    Dataset provided by
    Van Der Jeught, Sam
    Dirckx, Joris
    Devlieghere, Ester
    Keijzer, Robrecht
    Evans, Rhys
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dear reader,

    Welcome! You must be an avid profilometry person to be interested in downloading our dataset.

    Before you start tinkering with the dataset package, please install the libraries in requirements.txt to make it easier to get started with this system.

    We hope to have made the hierarchy of the package as clear as possible! Also note that this system was written in VS Code.

    Find your way to the examples folder, where you will find "entire_dataset". This folder contains a script to divide the original h5 file, which contains all data, into whatever sub-divisions you'd like. An example divided dataset has already been provided in the "example_dataset" folder, namely an 80/20 split into training and validation data.
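
    As a rough, hypothetical illustration of such a split (the file path and dataset keys below are assumptions, not the actual contents of the GDD h5 file):

    import h5py
    import numpy as np

    # Hypothetical path and keys; the real keys inside the h5 file may differ.
    with h5py.File("examples/entire_dataset/entire_dataset.h5", "r") as f:
        inputs = f["patterns"][:]    # e.g. recorded fringe images
        targets = f["depths"][:]     # e.g. corresponding depth maps

    # 80/20 split into training and validation, mirroring the provided example_dataset.
    rng = np.random.default_rng(seed=0)
    idx = rng.permutation(len(inputs))
    split = int(0.8 * len(inputs))
    train_x, val_x = inputs[idx[:split]], inputs[idx[split:]]
    train_y, val_y = targets[idx[:split]], targets[idx[split:]]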

    In the models folder you will find the two models mentioned in the publication related to this dataset. These two were published with the dataset because they had either the highest performance on the training and validation dataset (DenseNet) or on the random physical object test (UNet).

    A training script is included (training_script.py) to show you how these models were created, so if you wish to add new models to the networks.py file in the classes folder, you can!

    The validation Jupyter notebook contains two visualisation tools to quickly and neatly show the performance of your model on the recorded dataset.

    Lastly, to test on the recorded object you can run the "test_physical_data.py" script.

    We hope this helps you in your research and we hope it further improves any and all research within the single shot profilometry field! 😊

    Kind regards,

    Rhys Evans,

    InViLab,

    University of Antwerp, Belgium

  4. Replication Package: Unboxing Default Argument Breaking Changes in Scikit...

    • zenodo.org
    application/gzip
    Updated Aug 23, 2023
    Cite
    João Eduardo Montandon; Luciana Lourdes Silva; Cristiano Politowski; Ghizlane El Boussaidi; Marco Tulio Valente (2023). Replication Package: Unboxing Default Argument Breaking Changes in Scikit Learn [Dataset]. http://doi.org/10.5281/zenodo.8132450
    Available download formats: application/gzip
    Dataset updated
    Aug 23, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    João Eduardo Montandon; Luciana Lourdes Silva; Cristiano Politowski; Ghizlane El Boussaidi; Marco Tulio Valente
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Replication Package

    This repository contains data and source files needed to replicate our work described in the paper "Unboxing Default Argument Breaking Changes in Scikit Learn".

    Requirements

    We recommend the following requirements to replicate our study:

    1. Internet access
    2. At least 100GB of space
    3. Docker installed
    4. Git installed

    Package Structure

    We relied on Docker containers to provide a working environment that is easier to replicate. Specifically, we configure the following containers:

    • data-analysis, an R-based Container we used to run our data analysis.
    • data-collection, a Python Container we used to collect Scikit's default arguments and detect them in client applications.
    • database, a Postgres container we used to store clients' data, obtained from Grotov et al.
    • storage, a directory used to store the data processed in data-analysis and data-collection. This directory is shared in both containers.
    • docker-compose.yml, the Docker file that configures all containers used in the package.

    In the remainder of this document, we describe how to set up each container properly.

    Using VSCode to Setup the Package

    We selected VSCode as the IDE of choice because its extensions allow us to implement our scripts directly inside the containers. In this package, we provide configuration parameters for both the data-analysis and data-collection containers, so you can open and run each container directly from VSCode without any specific configuration.

    You first need to set up the containers:

    $ cd /replication/package/folder
    $ docker-compose build
    $ docker-compose up
    # Wait for Docker to create and start all containers
    

    Then, you can open them in Visual Studio Code:

    1. Open VSCode in project root folder
    2. Access the command palette and select "Dev Container: Reopen in Container"
      1. Select either Data Collection or Data Analysis.
    3. Start working

    If you want/need a more customized organization, the remainder of this file describes it in detail.

    Longest Road: Manual Package Setup

    Database Setup

    The database container will automatically restore the dump in dump_matroskin.tar in its first launch. To set up and run the container, you should:

    Build an image:

    $ cd ./database
    $ docker build --tag 'dabc-database' .
    $ docker image ls
    REPOSITORY      TAG     IMAGE ID       CREATED          SIZE
    dabc-database   latest  b6f8af99c90d   50 minutes ago   18.5GB
    

    Create and enter inside the container:

    $ docker run -it --name dabc-database-1 dabc-database
    $ docker exec -it dabc-database-1 /bin/bash
    root# psql -U postgres -h localhost -d jupyter-notebooks
    jupyter-notebooks=# \dt
               List of relations
     Schema |       Name        | Type  | Owner
    --------+-------------------+-------+-------
     public | Cell              | table | root
     public | Code_cell         | table | root
     public | Md_cell           | table | root
     public | Notebook          | table | root
     public | Notebook_features | table | root
     public | Notebook_metadata | table | root
     public | repository        | table | root
    

    If you got the table list as above, your database is properly set up.

    It is important to mention that this database is extended from the one provided by Grotov et al. Basically, we added three columns to the table Notebook_features (API_functions_calls, defined_functions_calls, and other_functions_calls) containing the function calls performed by each client in the database.
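
    For illustration, the added columns could be queried with a short Python sketch like the one below (the connection settings are assumptions based on the container defaults above and may need adjusting):

    # Hypothetical sketch; identifier quoting follows the mixed-case table name shown by \dt
    # and the column names described above.
    import psycopg2

    conn = psycopg2.connect(host="localhost", dbname="jupyter-notebooks", user="postgres")
    with conn, conn.cursor() as cur:
        cur.execute(
            'SELECT "API_functions_calls", defined_functions_calls, other_functions_calls '
            'FROM "Notebook_features" LIMIT 5;'
        )
        for row in cur.fetchall():
            print(row)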

    Data Collection Setup

    This container is responsible for collecting the data to answer our research questions. It has the following structure:

    • dabcs.py, extracts DABCs from Scikit Learn source code and exports them to a CSV file (a rough sketch of this idea follows the list below).
    • dabcs-clients.py, extracts function calls from clients and exports them to a CSV file. We rely on a modified version of Matroskin to leverage the function calls. You can find the tool's source code in the matroskin directory.
    • Makefile, commands to set up and run both dabcs.py and dabcs-clients.py
    • matroskin, the directory containing the modified version of matroskin tool. We extended the library to collect the function calls performed on the client notebooks of Grotov's dataset.
    • storage, a docker volume where the data-collection should save the exported data. This data will be used later in Data Analysis.
    • requirements.txt, Python dependencies adopted in this module.
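
    As a rough sketch of the idea behind dabcs.py (not the authors' implementation; the discovery helper and CSV layout are assumptions), default argument values can be collected from Scikit Learn estimators like this:

    # Hypothetical sketch, not the authors' dabcs.py; uses sklearn's estimator discovery helper.
    import csv
    import inspect

    from sklearn.utils import all_estimators

    with open("dabcs.csv", "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["class", "parameter", "default"])
        for name, cls in all_estimators():
            signature = inspect.signature(cls.__init__)
            for param in signature.parameters.values():
                if param.default is not inspect.Parameter.empty:
                    writer.writerow([name, param.name, repr(param.default)])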

    Note that the container will automatically configure this module for you, e.g., install dependencies, configure matroskin, download the scikit-learn source code, etc. For this, you must run the following commands:

    $ cd ./data-collection
    $ docker build --tag "data-collection" .
    $ docker run -it -d --name data-collection-1 -v $(pwd)/:/data-collection -v $(pwd)/../storage/:/data-collection/storage/ data-collection
    $ docker exec -it data-collection-1 /bin/bash
    $ ls
    Dockerfile Makefile config.yml dabcs-clients.py dabcs.py matroskin storage requirements.txt utils.py
    

    If you see project files, it means the container is configured accordingly.

    Data Analysis Setup

    We use this container to conduct the analysis over the data produced by the Data Collection container. It has the following structure:

    • dependencies.R, an R script containing the dependencies used in our data analysis.
    • data-analysis.Rmd, the R notebook we used to perform our data analysis
    • datasets, a docker volume pointing to the storage directory.

    Execute the following commands to run this container:

    $ cd ./data-analysis
    $ docker build --tag "data-analysis" .
    $ docker run -it -d --name data-analysis-1 -v $(pwd)/:/data-analysis -v $(pwd)/../storage/:/data-collection/datasets/ data-analysis
    $ docker exec -it data-analysis-1 /bin/bash
    $ ls
    data-analysis.Rmd datasets dependencies.R Dockerfile figures Makefile
    

    If you see project files, it means the container is configured accordingly.

    A note on storage shared folder

    As mentioned, the storage folder is mounted as a volume and shared between the data-collection and data-analysis containers. We compressed the contents of this folder due to space constraints. Therefore, before starting work on Data Collection or Data Analysis, make sure you have extracted the compressed files. You can do this by running the Makefile inside the storage folder.

    $ make unzip # extract files
    $ ls
    clients-dabcs.csv clients-validation.csv dabcs.csv Makefile scikit-learn-versions.csv versions.csv
    $ make zip # compress files
    $ ls
    csv-files.tar.gz Makefile
  5. Supplementary material to the journal article: An active learning framework...

    • figshare.com
    zip
    Updated Jul 25, 2023
    Cite
    John van Osta (2023). Supplementary material to the journal article: An active learning framework and assessment of inter-annotator agreement facilitate automated recogniser development for vocalisations of a rare species, the southern black-throated finch (Poephila cincta cincta) [Dataset]. http://doi.org/10.6084/m9.figshare.23053382.v1
    Available download formats: zip
    Dataset updated
    Jul 25, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    John van Osta
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Primary author details

    John van Osta
    ORCID: 0000-0001-6196-1241
    Institution: Griffith University and E2M Pty Ltd, Queensland, Australia
    Email: john.vanosta@griffithuni.edu.au

    Researchers and practitioners applying and adapting the data and code provided here are encouraged to contact the primary author should they require further information.

    Sharing/Access information: Licence CC BY 4.0. You are free to share and adapt the material provided, provided appropriate attribution is given to the authors.

    Data and File Overview

    This repository provides the code base, supplementary results and example audio data to reproduce the findings of the research article: 'An active learning framework and assessment of inter-annotator agreement facilitate automated recogniser development for vocalisations of a rare species, the southern black-throated finch (Poephila cincta cincta)', published in the Journal of Ecological Informatics. Data included within this repository are listed below.

    Code base

    The code base includes:
    - train_resnet.ipynb: trains a resnet34 model on target and non-target audio segments (each 1.8 seconds in duration) and outputs a trained model (as a .pth file).
    - predict.ipynb: applies the trained model to unlabelled data (a rough sketch of this step is shown after the note below).
    - BTF_detector_v1.5: the latest version of the model, termed the 'final model' in the research article.
    - audio_file_extract.ipynb: extracts audio frames in accordance with the active learning function, for the purpose of manual review and inclusion in the next iteration of model training.
    - stratified_subsample.ipynb: used to subsample predictions on unlabelled data, stratified across the model prediction confidence scores (aka logits).
    - macro_averaged_error.ipynb: calculates and plots the macro-averaged error of the model predictions against annotator labels.
    - inter_annotator_agreement.ipynb: calculates and plots Krippendorff's alpha (a measure of inter-annotator agreement) among the model's active learning iterations and human annotators.
    - requirements.txt: Python package requirements to run the code base.

    Note: The code base has been written in Jupyter Notebooks and tested in Python version 3.6.9
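
    As a rough illustration of applying the trained model to a single preprocessed frame (a sketch only, not the authors' predict.ipynb; the checkpoint format, class order, and input preprocessing are assumptions):

    import torch
    import torchvision

    # Hypothetical sketch: the .pth file name comes from this repository, but how the
    # checkpoint was saved and how audio frames are preprocessed are assumptions.
    model = torchvision.models.resnet34(num_classes=2)            # BTF vs. not-BTF
    state = torch.load("BTF_trained_model_v1.5.pth", map_location="cpu")
    model.load_state_dict(state)
    model.eval()

    frame = torch.rand(1, 3, 224, 224)  # placeholder for one preprocessed 1.8 s audio frame
    with torch.no_grad():
        confidence = torch.softmax(model(frame), dim=1)[0, 1].item()
    print(f"BTF confidence score: {confidence:.2f}")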

    Supplementary files

    The file Stratified_subsample_inter_annotator_agreement.xlsx contains predictions from each model iteration and annotator labels for each of the 12,278 audio frames included in the model evaluation process, as described in the research article.

    Example audio data

    Example audio data provided include:
    - Target audio files (containing black-throated finch (BTF) calls) and non-target audio files (containing other environmental noises). These are split into training and validation sets. To follow an active learning process, each active learning 'iteration' gets added to a new folder (i.e. IT_1, IT_2, etc.).
    - Field recordings (10 minutes each), the majority of which contain BTF calls. These audio data were collected from a field site within the Desert Uplands Bioregion of Queensland, Australia, as described and mapped in the research article. Audio data were collected using two devices, Audiomoths and Bioacoustic Recorders (Frontier Labs), which have been separated into separate folders in 'Field_recordings'.

    Steps to reproduce

    General recommendations

    The code base has been written in Jupyter Notebooks and tested in Python version 3.6.9.
    1. Download the .zip file and extract it to a folder on your machine.
    2. Open a code editor that is suitable for working with Jupyter Notebook files. We recommend Microsoft's free software Visual Studio Code (https://code.visualstudio.com/). If using Visual Studio Code, ensure the 'Python' and 'Jupyter' extensions are installed (https://code.visualstudio.com/docs/datascience/jupyter-notebooks).
    3. Within the code editor, open the downloaded file.
    4. Set up the Python environment by installing the package requirements identified in the requirements.txt file contained within the repository. The steps to set up a Python environment in Visual Studio Code are described at https://code.visualstudio.com/docs/python/environments, or more generally for Python at https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/. This will download the necessary Python packages to support the code below.

    Note: We recommend running the following steps on a Windows computer with an Nvidia graphics processing unit (GPU). The code has also been tested on a Windows computer with an Intel central processing unit (CPU), with a substantially slower runtime. Edits to the code may be required to run on a Macintosh computer or a non-Nvidia GPU; however, the core functionality will remain the same.

    Active learning iterations to develop the final model:
    1. Run train_resnet.ipynb to train a model from the initial target (BTF) and non-target (other environmental sounds) audio provided. The default name for the output model is 'model.pth', but this may be adjusted manually by changing the 'MODEL_NAME' variable. The script also provides performance metrics and a confusion matrix against the validation dataset.
    2. Run predict.ipynb to make predictions on unlabelled data. The default code uses the final model (BTF_trained_model_v1.5.pth), as described in the research article, but this may be adjusted to link to the model created in step 4 (by changing the 'model_path' variable). Results of this step are saved in the Sample_files\Predict_results folder.
    3. Run audio_file_extract.ipynb to extract 1.8 second audio snips that have a 'BTF' confidence score of >= 0.5. These are the sounds that range from most uncertain to the model to most likely to be BTF. The logic for this cutoff is discussed in the research article's methods section. The default extraction location is 'Sample_files\Predict_results\Audio_frames_for_review'.
    4. Manually review the extracted audio frames and move them to the appropriate folder of the training data, e.g. for audio frames that are reviewed to contain:
       - BTF calls, move them to 'Sample_files\Training_clips\Train\BTF\IT_2'
       - Not BTF calls, move them to 'Sample_files\Training_clips\Train\Not BTF\IT_2'
       IT_2 represents the second active learning iteration. Ensure 30% of the files are allocated to the validation set ('Sample_files\Training_clips\Val'). Note that users will need to create subfolders for each successive iteration.
    5. Repeat steps 1 to 4, making sure to update the 'iterations' variable in the train_resnet.ipynb code to include all active learning iterations undertaken. For example, to include iterations 1 and 2 in the model, set the variable 'iterations' to equal ['IT_1', 'IT_2']. An example is provided in the train_resnet.ipynb code.
    6. Stop the active learning process when the stopping criterion is reached (e.g. when the F1 score plateaus).

    Model evaluation steps
    1. Run predict.ipynb using the final model on an unlabelled test dataset. By default, the unlabelled audio data used are the example data saved at 'Sample_files\Field_recordings\Audiomoth'. However, this should be changed to data not used to train the model, such as 'Sample_files\Field_recordings\BAR', or your own audio data.
    2. Run stratified_subsample.ipynb to subsample the predictions that the final model made on the unlabelled data. A stratified subsample approach is used, whereby samples are stratified across confidence scores, as described in the research article. The default output file is 'stratified_subsample_predictions.csv'.
    3. We then manually reviewed the subsamples, including a cross review by experts on the species, as detailed in the research article. We have provided the results of our model evaluation in 'Study_results\Stratified_subsample_inter_annotator_agreement.xlsx'.
    4. Run macro_averaged_error.ipynb and inter_annotator_agreement.ipynb to reproduce the results and plots contained within the paper (a rough sketch of the agreement calculation follows below).
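
    For reference (a sketch only; the krippendorff package and the toy ratings matrix below are assumptions, not the authors' code), inter-annotator agreement of the kind computed in inter_annotator_agreement.ipynb can be calculated like this:

    import numpy as np
    import krippendorff  # third-party package; its use here is an assumption

    # Rows are raters (model iterations and human annotators), columns are audio frames;
    # labels are nominal (1 = BTF, 0 = not BTF), with np.nan marking missing ratings.
    ratings = np.array([
        [1, 0, 1, 1, 0],
        [1, 0, 1, 0, 0],
        [1, np.nan, 1, 1, 0],
    ])
    alpha = krippendorff.alpha(reliability_data=ratings, level_of_measurement="nominal")
    print(f"Krippendorff's alpha: {alpha:.3f}")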

    Using the model on your own data

    The predict.ipynb code may be adapted to run the BTF call detection model on data outside of this repository.

    Notes for running on your own data:
    - Accepts wav or flac files
    - Accepts files from Audiomoth devices, using the file naming format: 'AM###_YYYMMDD_HHMMSS'
    - Accepts files from Bioacoustic Recorder devices (Frontier Labs), using the file naming format: 'BAR##_YYYMMDDTHHMMSS+TZ_REC'

  6. Experiments of the Paper "MORTY: A Toolbox for Mode Recognition and Tonic...

    • data.niaid.nih.gov
    Updated Jan 24, 2020
    Cite
    Sertan Şentürk (2020). Experiments of the Paper "MORTY: A Toolbox for Mode Recognition and Tonic Identification" [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_57999
    Dataset updated
    Jan 24, 2020
    Dataset authored and provided by
    Sertan Şentürk
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    This package contains the complete experimental data explained in:

    Karakurt, A., Şentürk S., & Serra X. (In Press). MORTY: A Toolbox for Mode Recognition and Tonic Identification. 3rd International Digital Libraries for Musicology Workshop.

    Please cite the paper above, if you are using the data in your work.

    The zip file includes the folds, features, training and testing data, results and evaluation files. It is part of the experiments hosted on GitHub (https://github.com/sertansenturk/makam_recognition_experiments/tree/dlfm2016) in the folder called "./data". We host the experimental data on Zenodo (http://dx.doi.org/10.5281/zenodo.57999) separately due to the file size limitations on GitHub.

    The files generated from audio recordings are labeled with 16-character-long MusicBrainz IDs (in short, "MBID"s). Please check http://musicbrainz.org/ for more information about the unique identifiers. The structure of the data in the zip file is explained below. In the paths given below, task is the computational task ("tonic," "mode" or "joint"), training_type is either "single" (single-distribution per mode) or "multi" (multi-distribution per mode), distribution is either "pcd" (pitch class distribution) or "pd" (pitch distribution), bin_size is the bin size of the distribution in cents, kernel_width is the standard deviation of the Gaussian kernel used in smoothing the distribution, distance is either the distance or the dissimilarity metric, num_neighbors is the number of neighbors checked in k-nearest neighbor classification, and min_peak is the minimum peak ratio. A kernel_width of 0 implies no smoothing. min_peak always takes the value 0.15. For a thorough explanation please refer to the companion page (http://compmusic.upf.edu/node/319) and the paper itself.

    folds.json: Divides the test dataset (https://github.com/MTG/otmm_makam_recognition_dataset/releases) into training and testing sets according to stratified 10-fold scheme. The annotations are also distributed to sets accordingly. The file is generated by the Jupyter notebook setup_feature_training.ipynb (4th code block) in the github experiments repository (https://github.com/sertansenturk/makam_recognition_experiments/blob/master/setup_feature_training.ipynb).

    Features: The path is data/features/[distribution--bin_size--kernel_width]/[MBID--(hist or pdf)].json. "pdf" stands for probability density function, which is used to obtain the multi-distribution models in the training step and "hist" stands for the histogram, which is used to obtain the single-distribution models in the training step. The features are extracted using the Jupyter notebook setup_feature_training.ipynb (5th code block) in the github experiments repository (https://github.com/sertansenturk/makam_recognition_experiments/blob/master/setup_feature_training.ipynb)
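
    As a rough illustration of reading these files in Python (the parameter values and the MBID in the example path below are placeholders, not actual experiment settings):

    import json

    # folds.json divides the dataset into training and testing sets (stratified 10-fold).
    with open("data/folds.json") as f:
        folds = json.load(f)

    # Feature path pattern: data/features/[distribution--bin_size--kernel_width]/[MBID--(hist or pdf)].json
    feature_path = "data/features/pcd--25.0--7.5/some-mbid--pdf.json"  # placeholder values
    with open(feature_path) as f:
        feature = json.load(f)

    print(type(folds), type(feature))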

    Training: The path is data/training/[training_type--distribution--bin_size--kernel_width]/fold(0:9).json. There are 10 folds in each folder, each of which stores the training model (file paths of the distributions for the "multi" training_type or the distributions themselves for the "single" training_type) trained for the fold using the parameter set. The training files are generated by the Jupyter notebook setup_feature_training.ipynb (6th code block) in the github experiments repository (https://github.com/sertansenturk/makam_recognition_experiments/blob/master/setup_feature_training.ipynb).

    Testing: The path is data/testing/[task]/[training_type--distribution--bin_size--kernel_width--distance--num_neighbors--min_peak]. Each path has the folders fold(0:9), which hold the evaluation and results files obtained from each fold. The path also has the overall_eval.json file, which stores the overall evaluation of the experiment. The optimal value of min_peak is selected in the 4th code block, testing is carried out in the 6th code block, and the evaluation is done in the 7th code block of the Jupyter notebook testing_evaluation.ipynb in the github experiments repository (https://github.com/sertansenturk/makam_recognition_experiments/blob/master/testing_evaluation.ipynb). The data/testing/ folder also contains a summary of all the experiments in the files data/testing/evaluation_overall.json and data/testing/evaluation_perfold.json. These files are created in MATLAB while running the statistical significance scripts. data/testing/evaluation_perfold.mat is the same as the json file of the same filename, stored for fast reading.

    For additional information please contact the authors.

    This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

  7. Data and code from: Healthy young adults use distinct gait strategies to...

    • explore.openaire.eu
    • data.niaid.nih.gov
    Updated Oct 28, 2021
    Cite
    Mary-Elise L MacDonald; Tarique Siragy; Allen Hill; Julie Nantel (2021). Data and code from: Healthy young adults use distinct gait strategies to enhance stability when walking on mild slopes and when altering arm swing [Dataset]. http://doi.org/10.5281/zenodo.5608534
    Dataset updated
    Oct 28, 2021
    Authors
    Mary-Elise L MacDonald; Tarique Siragy; Allen Hill; Julie Nantel
    Description

    This repository contains the Julia code, Jupyter notebook, and data used in the study “Healthy young adults use distinct gait strategies to enhance stability when walking on mild slopes and when altering arm swing” by MacDonald et al.

    Instructions

    To run this analysis on your computer, both Julia and Jupyter Notebook must be installed. A version of Julia appropriate for your OS can be downloaded from the Julia website, and Jupyter can be installed from within Julia (in the REPL) with "] add IJulia". Alternate instructions for installing Jupyter can be found on the IJulia GitHub page or the Jupyter homepage (not recommended). From within the main repository directory, start Julia and then start Jupyter in the Julia REPL with "using IJulia" followed by "notebook(;dir=pwd())", or, if using a system Jupyter installation, start Jupyter from your favorite available shell (e.g. PowerShell on Windows, bash on any *nix variant, etc.). In Jupyter, open the notebooks/analysis.ipynb notebook. Running all cells will reproduce the results for this paper.

    Description of data

    The data directory contains all the data used in the production of the results which were statistically tested. Each .mat file contains events and data generated in Visual3D:

    Events
    LTO/RTO (left/right toe-off)
    LHS/RHS (left/right heel-strike)
    HIST/HIEN (hilly start/end)
    ROST/ROEN (rocky start/end)
    MLST/MLEN (ML translation start/end)

    Data
    LFootPos/RFootPos (left/right foot COM position)
    TrunkPos/TrunkVel/TrunkAcc (trunk COM position, velocity, and acceleration)
    HeadPos/HeadVel/HeadAcc (head COM position, velocity, and acceleration)
    COG (whole-body COM/COG)

    The .csv files contain the system state of the CAREN system produced by the D-Flow software, which includes various system and software settings, most pertinent of which is the treadmill speed. The .c3d files contain the raw motion capture data from Vicon Nexus. The results of the notebooks/analysis.ipynb notebook are found in the results folder. Please see the paper for a list of the dependent variables and statistical analyses.

    This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery grant RGPIN-2016-04928, NSERC Accelerator supplement RGPAS 493045-2016, and by the Ontario Ministry of Research, Innovation and Science Early Researcher Award (ERA) 16-12-206.
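
    Although the analysis code in this repository is written in Julia, a rough Python sketch of inspecting one of the .mat files described above might look like this (the file name is hypothetical; the variable names follow the list above):

    from scipy.io import loadmat

    # Hypothetical file name; variables follow the Visual3D export described above.
    trial = loadmat("data/participant01_trial01.mat")

    lhs_events = trial["LHS"]        # left heel-strike event times
    trunk_pos = trial["TrunkPos"]    # trunk COM position
    print(lhs_events.shape, trunk_pos.shape)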

  8. Accident Detection Model Dataset

    • universe.roboflow.com
    zip
    Updated Apr 8, 2024
    Cite
    Accident detection model (2024). Accident Detection Model Dataset [Dataset]. https://universe.roboflow.com/accident-detection-model/accident-detection-model/dataset/2
    Available download formats: zip
    Dataset updated
    Apr 8, 2024
    Dataset authored and provided by
    Accident detection model
    Variables measured
    Accident Bounding Boxes
    Description

    Accident-Detection-Model

    Accident Detection Model is built using YOLOv8, Google Colab, Python, Roboflow, deep learning, OpenCV, machine learning, and artificial intelligence. It can detect an accident from a live camera feed, an image, or a video. The model is trained on a dataset of 3200+ images annotated on Roboflow.

    Problem Statement

    • Road accidents are a major problem in India, with thousands of people losing their lives and many more suffering serious injuries every year.
    • According to the Ministry of Road Transport and Highways, India witnessed around 4.5 lakh road accidents in 2019, which resulted in the deaths of more than 1.5 lakh people.
    • The age range that is most severely hit by road accidents is 18 to 45 years old, which accounts for almost 67 percent of all accidental deaths.

    Accidents survey

    Survey figure: https://user-images.githubusercontent.com/78155393/233774342-287492bb-26c1-4acf-bc2c-9462e97a03ca.png

    Literature Survey

    • Sreyan Ghosh (Mar 2019): developed a system using a deep learning convolutional neural network trained to classify video frames as accident or non-accident.
    • Deeksha Gour (Sep 2019): used computer vision technology, neural networks, deep learning, and various approaches and algorithms to detect objects.

    Research Gap

    • Lack of real-world data - we trained the model on more than 3200 images.
    • Large interpretability time and space needed - we used Google Colab to reduce the time and space required.
    • Outdated versions in previous works - we are using the latest version, YOLOv8.

    Proposed methodology

    • We are using YOLOv8 to train our custom dataset, which contains 3200+ images collected from different platforms.
    • After training for 25 iterations, the model is ready to detect an accident with a significant probability.

    Model Set-up

    Preparing Custom dataset

    • We collected 1200+ images from different sources like YouTube, Google Images, Kaggle.com, etc.
    • Then we annotated all of them individually on a tool called Roboflow.
    • During annotation we marked the images with no accident as NULL, and we drew a bounding box on the site of the accident in the images containing an accident.
    • Then we divided the dataset into train, val, and test sets in the ratio 8:1:1.
    • At the final step we downloaded the dataset in YOLOv8 format.
      #### Using Google Colab
    • We are using Google Colaboratory to code this model because Google Colab provides a GPU, which is faster than local environments.
    • You can use Jupyter notebooks, which let you blend code, text, and visualisations in a single document, to write and run Python code in Google Colab.
    • Users can run individual code cells in Jupyter notebooks and quickly view the results, which is helpful for experimenting and debugging. Additionally, they enable the development of visualisations that make use of well-known frameworks like Matplotlib, Seaborn, and Plotly.
    • In Google Colab, we first changed the runtime from TPU to GPU.
    • We cross-checked it by running the command ‘!nvidia-smi’.
      #### Coding
    • First of all, we installed YOLOv8 with the command ‘!pip install ultralytics==8.0.20’.
    • We then imported YOLOv8 with ‘from ultralytics import YOLO’ and ‘from IPython.display import display, Image’.
    • Then we connected and mounted our Google Drive account with the code ‘from google.colab import drive; drive.mount('/content/drive')’.
    • Then we ran our main command to start the training process: ‘%cd /content/drive/MyDrive/Accident Detection model’ followed by ‘!yolo task=detect mode=train model=yolov8s.pt data=data.yaml epochs=1 imgsz=640 plots=True’.
    • After the training we ran commands to test and validate our model: ‘!yolo task=detect mode=val model=runs/detect/train/weights/best.pt data=data.yaml’ and ‘!yolo task=detect mode=predict model=runs/detect/train/weights/best.pt conf=0.25 source=data/test/images’.
    • Further, to get results from any video or image, we ran this command: ‘!yolo task=detect mode=predict model=runs/detect/train/weights/best.pt source="/content/drive/MyDrive/Accident-Detection-model/data/testing1.jpg/mp4"’.
    • The results are stored in the runs/detect/predict folder.
      Hence our model is trained, validated, and tested, and is able to detect accidents in any video or image (a minimal prediction sketch using the Ultralytics Python API is shown below).
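
    As a minimal sketch of the prediction step using the Ultralytics Python API (the paths are placeholders rather than the exact Drive paths used in the notebook):

    from ultralytics import YOLO

    # Load the weights produced by the training run and predict on a folder of test images.
    model = YOLO("runs/detect/train/weights/best.pt")
    results = model.predict(source="data/test/images", conf=0.25, save=True)
    print(len(results), "images processed; outputs saved under runs/detect/predict")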

    Challenges I ran into

    I ran into three main problems while making this model:

    • I had difficulty saving the results to a folder; as YOLOv8 is the latest version, it is still under development. I read some blogs and referred to Stack Overflow, and learned that in the new v8 we need to add an extra argument, 'save=True', which made the results save to a folder.
    • I was facing a problem on the CVAT website because I was not sure what
  9. Data from: BeeDNA: microfluidic environmental DNA metabarcoding as a tool...

    • data.niaid.nih.gov
    Updated Sep 8, 2022
    Cite
    Niemiller, Matthew L. (2022). BeeDNA: microfluidic environmental DNA metabarcoding as a tool for connecting plant and pollinator communities [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5667638
    Dataset updated
    Sep 8, 2022
    Dataset provided by
    Davis, Mark A.
    Niemiller, Matthew L.
    Harper, Lynsey R.
    Paddock, Lauren E.
    Benito, Joseph B.
    Molano-Flores, Brenda
    Knittle, E.
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data repository accompanying the paper 'BeeDNA: microfluidic environmental DNA metabarcoding as a tool for connecting plant and pollinator communities' by Harper et al. (2021).

    1_Raw_Data.zip

    This zipped folder contains the raw sequence data (sorted by primer set and demultiplexed) for both sequencing runs (2019-10-24 and 2019-11-11). To decompress each file, run:

    tar -xvf filename.bz2

    This will create a folder for each primer set containing the raw reads for each sample/control.

    2_Anacapa_Bioinformatic_Processing.zip

    This zipped folder contains all files needed to perform bioinformatic processing with Anacapa. Please process sequence data belonging to each primer set individually (i.e. do not process sequence data belonging to different primer sets together).

    3_metaBEAT_Bioinformatic_Processing.zip

    This zipped folder contains the scripts and files needed to perform bioinformatic processing with metaBEAT. Before running the scripts, move the raw reads for each sample belonging to each primer set into the dedicated folder within metaBEAT_Bioinformatic_Processing, e.g. all .fastq files in Raw_Data > BF1_BR1 should be moved to metaBEAT_Bioinformatic_Processing > BF1-BR1 > raw_reads.

    To run metaBEAT, you will have to install Docker on your computer. Docker is compatible with all major operating systems, but see the Docker documentation for details. On Ubuntu, installing Docker should be as easy as:

    sudo apt-get install docker.io

    Once Docker is installed, you can enter the environment by typing:

    sudo docker run -i -t --net=host --name metaBEAT -v $(pwd):/home/working chrishah/metabeat /bin/bash

    This will download the metaBEAT image (if not yet present on your computer) and enter the 'container', i.e. the self-contained environment (NB: sudo may be necessary in some cases). With the above command, the container's directory /home/working will be mounted to your current working directory (as instructed by $(pwd)). In other words, anything you do in the container's /home/working directory will be synced with your current working directory on your local machine.

    Please process sequence data belonging to each primer set individually (i.e. do not process sequence data belonging to different primer sets together). An example of expected outputs can be seen in the Jupyter Notebook for the BF1/BR1 primer set from the 2019-11-11 sequencing run.

    4_Illinois_Invert_Reference_Database.zip

    This zipped folder contains all files that were used to generate the custom COI and 16S reference databases for invertebrates that occur in Illinois, U.S. You will need to have metaBEAT installed (see above) before you try to run any Jupyter Notebooks (.ipynb files).

    5_ecoPCR.zip

    This zipped folder contains all files used to perform ecoPCR for each primer set evaluated for microfluidic eDNA metabarcoding. You will need to install ecoPCR before running any shell scripts.

    6_Tidied_Data.zip

    This zipped folder contains the taxonomically assigned data for both sequencing runs produced by metaBEAT and Anacapa. These were copied over from the folders 2_Anacapa_Bioinformatic_Processing and 3_metaBEAT_Bioinformatic_Processing and rearranged into a more logical order. These files are used as the input for data analysis using R.

    7_Data_Analysis.zip

    This zipped folder contains all scripts and metadata required to summarise and statistically analyse data in R.

    Please contact Dr Lynsey Harper (lynsey.harper2@gmail.com) or Dr Mark Davis (davis63@illinois.edu) if you encounter any issues!
