Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction
This data set includes a collection of measurements made with DecaWave DW1000 UWB radios in two indoor environments, collected for motion detection. Measurements include channel impulse response (CIR) samples in the form of power delay profiles (PDP), with corresponding timestamps, for three channels in each indoor environment.
The data set also includes Python code and Jupyter notebooks for data loading, analysis, and reproduction of the results of the paper "UWB Radio Based Motion Detection System for Assisted Living" submitted to MDPI Sensors.
The data set will require around 10 GB of total free space after extraction.
The code included in the data set was written and tested on Linux (Ubuntu 20.04) and requires 16 GB of RAM plus an additional swap partition to run properly. The code can be modified to consume less memory, but doing so requires additional work. If the provided .npy files are compatible with your numpy version, you will not need to regenerate them from the .csv files.
Data Set Structure
The resulting folder after extracting the uwb_motion_detection.zip file is organized as follows:
data subfolder: contains all original .csv and intermediate .npy data files.
models
pdp: this folder contains 4 .csv files with raw PDP measurements (timestamp + PDP). The data format will be discussed in the following section.
pdp_diff: this folder contains .npy files with PDP samples and .npy files with timestamps. Those files are generated by running the generate_pdp_diff.py script.
generate_pdp_diff.py
validation subfolder: contains data for motion detection validation
events: contains .npy files with motion events for validation. The .npy files are generated using the generate_events_x.py scripts or the notebooks inside the /Process/validation folder.
pdp: this folder contains raw PDP measurements in .csv format.
pdp_diff: this folder contains .npy files with PDP samples and .npy files with timestamps. Those files are generated by running the generate_pdp_diff.py script.
generate_events_0.py
generate_events_1.py
generate_events_2.py
generate_pdp_diff.py
figures subfolder: contains all figures generated in Jupyter notebooks inside the "Process" folder.
Process subfolder: contains Jupyter notebooks with data processing and motion detection code.
MotionDetection: contains a notebook comparing standard score motion detection with windowed standard score motion detection
OnlineModels: presents the development process of the online model definitions
PDP_diff: presents the basic principle of PDP differences used in the motion detection
Validation: presents a motion detection validation process
Raw data structure
All .csv files in the data folder contain raw PDP measurements with a timestamp for each PDP sample. Each row has the following structure:
unix timestamp, cir0 [dBm], cir1 [dBm], cir2 [dBm], ..., cir149 [dBm]
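As a minimal loading sketch (assuming pandas is installed, that the files have no header row, and that the timestamp is in Unix seconds; the file path below is a placeholder):
import pandas as pd
# Illustrative loader for one raw PDP .csv file: a Unix timestamp column
# followed by 150 CIR power values (cir0..cir149) in dBm.
columns = ["timestamp"] + [f"cir{i}" for i in range(150)]
pdp = pd.read_csv("data/models/pdp/example.csv", header=None, names=columns)  # placeholder path
timestamps = pd.to_datetime(pdp["timestamp"], unit="s")  # assumes seconds
pdp_samples = pdp.iloc[:, 1:].to_numpy()                 # shape: (N, 150)
print(pdp_samples.shape, timestamps.iloc[0])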
This resource contains the Python version of the Layered Green & Ampt with Redistribution (LGAR) model as it existed at the time of publication of the LGAR paper "Layered Green and Ampt Infiltration With Redistribution" (La Follette et al., 2023). Although LGAR was initially developed in Python, the Python version is no longer maintained or supported; development and maintenance have moved to C. The C implementation, which has undergone substantial stability testing and development since the publication of the manuscript, is available at https://github.com/NOAA-OWP/LGAR-C and is strongly recommended. The Python implementation is also available on GitHub: https://github.com/NOAA-OWP/LGAR-Py/tree/LGAR-Py_public .
LGAR is a model that partitions precipitation into infiltration and runoff and is designed for use in arid or semi-arid climates. Its main advantage is that it closely mimics precipitation partitioning results as simulated by the Richards equation (RE) without the inherent reliability and stability challenges the RE poses, which makes it useful when accurate, stable precipitation partitioning simulations are needed in arid or semi-arid areas. The Python implementation of LGAR is BMI compatible. As noted above, the C version ( https://github.com/NOAA-OWP/LGAR-C ) is recommended because the Python version is no longer maintained.
The contents of the first two blocks of the config file for a model run are described here.
First block:
time_step: this is the model time step, expressed in hours. It defaults to a value of 300/3600, or 5 minutes expressed in hours.
initial_psi: this is the uniform capillary head throughout the model domain expressed in mm. Note that LGAR uses absolute values for capillary head, such that a value of 20000 mm for initial_psi physically represents soil with a capillary head of -20000 mm.
verbose: this can be True or False; if False, no output is printed to the screen during a model run.
length_of_simulation: This is the length of the simulation, in time steps.
closed_form_capillary_drive_term: set to True or False. This determines whether the capillary drive term G is calculated with the numeric integral of hydraulic conductivity with respect to capillary head, or whether the equivalence relations between the van Genuchten and Brooks-Corey hydraulic models are used to obtain an approximate, closed-form expression for G. Setting this value to True generally increases speed significantly while changing model results only slightly.
Second block:
output_file_name_fluxes: this will be the path and name of the output file containing the fluxes for each time step of the simulation.
params_file: this is the path and name of the parameters file.
forcing_data_file: this is the forcing data file, which must be in the correct format for LGAR-Py.
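For orientation only, the parameters described in both blocks can be pictured as key-value pairs; the sketch below uses Python-dict form rather than the actual LGAR-Py config syntax, and all paths and numeric values except the documented time_step default are placeholders:
# Illustrative summary of the two config blocks described above; not the real config file.
config = {
    # First block
    "time_step": 300 / 3600,                   # model time step in hours (documented default: 5 minutes)
    "initial_psi": 20000,                      # |capillary head| in mm, representing -20000 mm (placeholder value)
    "verbose": False,                          # True prints progress to the screen
    "length_of_simulation": 10000,             # number of time steps (placeholder value)
    "closed_form_capillary_drive_term": True,  # closed-form G: faster, results nearly unchanged
    # Second block (all paths are placeholders)
    "output_file_name_fluxes": "outputs/fluxes.csv",
    "params_file": "parameter_files/params.py",
    "forcing_data_file": "forcing_data.csv",
}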
Each parameter file in the parameter_files folder only has to be edited in its first block, which contains options related to soil hydraulic parameters, the number of layers, the maximum ponded head, and the PET -> AET correction.
This folder contains subfolders for each model run. All that is necessary to run LGAR is correctly formatted forcing data as a .csv file. Raw USDA SCAN data and notebooks that convert these raw data to the format usable by LGAR are also provided. Currently, the forcing data resolution must be the same as the time_step specified in the config file.
LGAR also requires the files LGAR_compute.py, test_env.py, and BMI_base_class.py to be in the same directory.
The Jupyter notebooks (in vis_files) are useful for visualization of results. HYDRUS_files contains HYDRUS-1D model runs which are set up to simulate the same soil hydraulic conditions and forcing data as various LGAR runs.
In order to run LGAR, ensure that the correct config file is indicated in test_env.py, and then navigate in a terminal to the directory containing test_env.py and enter "python test_env.py".
In the outputs folder, there are 6 complete simulation outputs, including 3 simulations of USDA SCAN sites and 3 simulations with synthetically generated forcing datasets. All of the files necessary to run these simulations are also included. To check whether LGAR is working properly for you, you can run these simulations and compare your results against the outputs stored in this repo.
LGAR is designed for arid and semi-arid areas only; it should not be used for groundwater simulations. Please see "Layered Green and Ampt Infiltration With Redistribution" (La Follette et al., 2023) for more details.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dear reader,
Welcome! You must be an avid profilometry person to be interested in downloading our dataset.
Before you start tinkering with the dataset package, please install the libraries listed in requirements.txt for an easier start with operating this system.
We hope to have made the hierarchy of the package as clear as possible! Also note that this system was written in VS Code.
Find your way to the examples folder; there you will find "entire_dataset". This folder contains a script to divide the original h5 file containing all data into whatever subdivisions you'd like. An example divided dataset has already been provided, namely an 80/20 split into training and validation data, in the "example_dataset" folder.
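The sketch below gives a rough idea of such a split; it is hypothetical (the actual script in "entire_dataset", the file names, and the dataset keys "images" and "labels" are placeholders) and assumes h5py and numpy are installed:
import h5py
import numpy as np
# Hypothetical 80/20 split of an h5 file into training and validation groups.
with h5py.File("entire_dataset.h5", "r") as src:          # placeholder file name
    n = src["images"].shape[0]                            # placeholder dataset key
    idx = np.random.permutation(n)
    split = int(0.8 * n)
    train_idx, val_idx = np.sort(idx[:split]), np.sort(idx[split:])
    with h5py.File("example_dataset.h5", "w") as dst:     # placeholder file name
        for name, sel in [("train", train_idx), ("val", val_idx)]:
            grp = dst.create_group(name)
            grp.create_dataset("images", data=src["images"][sel])
            grp.create_dataset("labels", data=src["labels"][sel])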
In the models folder you will find the two models mentioned in the publication related to this dataset. These two were published with the dataset because they had the highest performance on either the training and validation dataset (DenseNet) or the random physical object test (UNet).
A training script is included (training_script.py) to show you how these models were created, so if you wish to add new models to the networks.py file in the classes folder, you can!
The validation jupyter notebook contains two visualisation tools to quickly and neatly show the performance of your model on the recorded dataset.
Lastly, to test on the recorded object, you can run the "test_physical_data.py" script.
We hope this helps you in your research and we hope it further improves any and all research within the single shot profilometry field! 😊
Kind regards,
Rhys Evans,
InViLab,
University of Antwerp, Belgium
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Replication Package
This repository contains data and source files needed to replicate our work described in the paper "Unboxing Default Argument Breaking Changes in Scikit Learn".
Requirements
To replicate our study we recommend Docker and Docker Compose (used to build and run all containers) and, optionally, Visual Studio Code.
Package Structure
We relied on Docker containers to provide a working environment that is easier to replicate. Specifically, we configure the following containers:
- data-analysis, an R-based container we used to run our data analysis.
- data-collection, a Python container we used to collect Scikit's default arguments and detect them in client applications.
- database, a Postgres container we used to store clients' data, obtained from Grotov et al.
- storage, a directory used to store the data processed in data-analysis and data-collection. This directory is shared by both containers.
- docker-compose.yml, the Docker Compose file that configures all containers used in the package.
In the remainder of this document, we describe how to set up each container properly.
Using VSCode to Set Up the Package
We selected VSCode as the IDE of choice because its extensions allow us to implement our scripts directly inside the containers. In this package, we provide configuration parameters for both the data-analysis and data-collection containers. This way you can directly access and run each container inside VSCode without any specific configuration.
You first need to set up the containers:
$ cd /replication/package/folder
$ docker-compose build
$ docker-compose up
# Wait for Docker to create and run all containers
Then, you can open them in Visual Studio Code.
If you want/need a more customized organization, the remainder of this file describes it in detail.
Longest Road: Manual Package Setup
Database Setup
The database container will automatically restore the dump in dump_matroskin.tar on its first launch. To set up and run the container, you should:
Build an image:
$ cd ./database
$ docker build --tag 'dabc-database' .
$ docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
dabc-database latest b6f8af99c90d 50 minutes ago 18.5GB
Create and enter the container:
$ docker run -it --name dabc-database-1 dabc-database
$ docker exec -it dabc-database-1 /bin/bash
root# psql -U postgres -h localhost -d jupyter-notebooks
jupyter-notebooks=# \dt
List of relations
Schema | Name | Type | Owner
--------+-------------------+-------+-------
public | Cell | table | root
public | Code_cell | table | root
public | Md_cell | table | root
public | Notebook | table | root
public | Notebook_features | table | root
public | Notebook_metadata | table | root
public | repository | table | root
If you get the table list shown above, your database is properly set up.
It is important to mention that this database is extended from the one provided by Grotov et al. Basically, we added three columns to the table Notebook_features (API_functions_calls, defined_functions_calls, and other_functions_calls) containing the function calls performed by each client in the database.
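As a quick sanity check, the added columns can also be queried from Python; this is a minimal sketch assuming the psycopg2 driver and the same connection settings as the psql example above (authentication details may differ in your setup):
import psycopg2
# Inspect the three columns added on top of Grotov et al.'s dump.
conn = psycopg2.connect(host="localhost", dbname="jupyter-notebooks", user="postgres")
with conn, conn.cursor() as cur:
    cur.execute(
        'SELECT "API_functions_calls", "defined_functions_calls", "other_functions_calls" '
        'FROM "Notebook_features" LIMIT 5'
    )
    for row in cur.fetchall():
        print(row)
conn.close()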
Data Collection Setup
This container is responsible for collecting the data to answer our research questions. It has the following structure:
- dabcs.py, extracts DABCs from the Scikit Learn source code and exports them to a CSV file.
- dabcs-clients.py, extracts function calls from clients and exports them to a CSV file. We rely on a modified version of Matroskin to extract the function calls. You can find the tool's source code in the matroskin directory.
- Makefile, commands to set up and run both dabcs.py and dabcs-clients.py.
- matroskin, the directory containing the modified version of the matroskin tool. We extended the library to collect the function calls performed in the client notebooks of Grotov's dataset.
- storage, a Docker volume where data-collection should save the exported data. These data will be used later in Data Analysis.
- requirements.txt, Python dependencies adopted in this module.
Note that the container will automatically configure this module for you, e.g., install dependencies, configure matroskin, download the scikit-learn source code, etc. For this, you must run the following commands:
$ cd ./data-collection
$ docker build --tag "data-collection" .
$ docker run -it -d --name data-collection-1 -v $(pwd)/:/data-collection -v $(pwd)/../storage/:/data-collection/storage/ data-collection
$ docker exec -it data-collection-1 /bin/bash
$ ls
Dockerfile Makefile config.yml dabcs-clients.py dabcs.py matroskin storage requirements.txt utils.py
If you see the project files, the container is configured correctly.
Data Analysis Setup
We use this container to conduct the analysis over the data produced by the Data Collection container. It has the following structure:
- dependencies.R, an R script containing the dependencies used in our data analysis.
- data-analysis.Rmd, the R notebook we used to perform our data analysis.
- datasets, a Docker volume pointing to the storage directory.
Execute the following commands to run this container:
$ cd ./data-analysis
$ docker build --tag "data-analysis" .
$ docker run -it -d --name data-analysis-1 -v $(pwd)/:/data-analysis -v $(pwd)/../storage/:/data-collection/datasets/ data-analysis
$ docker exec -it data-analysis-1 /bin/bash
$ ls
data-analysis.Rmd datasets dependencies.R Dockerfile figures Makefile
If you see the project files, the container is configured correctly.
A note on the storage shared folder
As mentioned, the storage folder is mounted as a volume and shared between the data-collection and data-analysis containers. We compressed the content of this folder due to space constraints. Therefore, before starting to work on Data Collection or Data Analysis, make sure you extract the compressed files. You can do this by running the Makefile inside the storage folder.
$ make unzip # extract files
$ ls
clients-dabcs.csv clients-validation.csv dabcs.csv Makefile scikit-learn-versions.csv versions.csv
$ make zip # compress files
$ ls
csv-files.tar.gz Makefile
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Primary author details
John van Osta (ORCID: 0000-0001-6196-1241), Griffith University and E2M Pty Ltd, Queensland, Australia. Email: john.vanosta@griffithuni.edu.au
Researchers and practitioners applying and adapting the data and code provided here are encouraged to contact the primary author should they require further information.
Sharing/Access information: Licence CC BY 4.0. You are free to share and adapt the material provided here, provided appropriate attribution is given to the authors.
Data and File Overview
This repository provides the code base, supplementary results and example audio data to reproduce the findings of the research article: 'An active learning framework and assessment of inter-annotator agreement facilitate automated recogniser development for vocalisations of a rare species, the southern black-throated finch (Poephila cincta cincta)', published in the Journal of Ecological Informatics. Data included within this repository are listed below.
Code base
The code base includes:
- train_resnet.ipynb: Trains a resnet34 model on target and non-target audio segments (each 1.8 seconds in duration). Outputs a trained model (as a .pth file).
- predict.ipynb: Applies the trained model to unlabelled data.
- BTF_detector_v1.5: the latest version of the model, termed the 'final model' in the research article.
- audio_file_extract.ipynb: Extracts audio frames in accordance with the active learning function, for the purpose of manual review and inclusion in the next iteration of model training.
- stratified_subsample.ipynb: Used to subsample predictions on unlabelled data, stratified across the model prediction confidence scores (logits).
- macro_averaged_error.ipynb: Calculates and plots the macro-averaged error of the model predictions against annotator labels.
- inter_annotator_agreement.ipynb: Calculates and plots Krippendorff's alpha (a measure of inter-annotator agreement) among the model's active learning iterations and human annotators.
- requirements.txt: Python package requirements to run the code base.
Note: The code base has been written in Jupyter Notebooks and tested in Python version 3.6.9
Supplementary files The file Stratified_subsample_inter_annotator_agreement.xlsx contains predictions from each model iteration and annotator labels for each of the 12,278 audio frames included in the model evaluation process, as described in the research article.
Example audio data
Example audio data provided include:
- Target audio files (containing black-throated finch (BTF) calls) and non-target audio files (containing other environmental noises). These are split into Training and Validation sets. To follow an active learning process, each active learning 'iteration' is added to a new folder (i.e. IT_1, IT_2, etc.).
- Field recordings (10 minutes each), the majority of which contain BTF calls. These audio data were collected from a field site within the Desert Uplands Bioregion of Queensland, Australia, as described and mapped in the research article. Audio data were collected using two devices, AudioMoths and Bioacoustic Recorders (Frontier Labs), which have been separated into separate folders in 'Field_recordings'.
Steps to reproduce
General recommendations
The code base has been written in Jupyter Notebooks and tested in Python version 3.6.9.
1. Download the .zip file and extract it to a folder on your machine.
2. Open a code editor that is suitable for working with Jupyter Notebook files. We recommend Microsoft's free software, Visual Studio Code (https://code.visualstudio.com/). If using Visual Studio Code, ensure the 'Python' and 'Jupyter' extensions are installed (https://code.visualstudio.com/docs/datascience/jupyter-notebooks).
3. Within the code editor, open the downloaded folder.
4. Set up the Python environment by installing the package requirements identified within the requirements.txt file contained within the repository. The steps to set up a Python environment in Visual Studio Code are described here: https://code.visualstudio.com/docs/python/environments, or more generally for Python here: https://packaging.python.org/en/latest/guides/installing-using-pip-and-virtual-environments/. This will download the necessary Python packages to support the code below.
Note: We recommend running the following steps on a Windows computer with an Nvidia graphics processing unit (GPU). The code has also been tested on a Windows computer with an Intel central processing unit (CPU), with a substantially slower runtime. Edits to the code may be required to run on a Macintosh computer or a non-Nvidia GPU; however, the core functionality will remain the same.
Active learning iterations to develop the final model:
1. Run train_resnet.ipynb to train a model from the initial target (BTF) and non-target (other environmental sounds) audio provided. The default name for the output model is 'model.pth', but this may be adjusted manually by changing the 'MODEL_NAME' variable. The script also provides performance metrics and a confusion matrix against the validation dataset.
2. Run predict.ipynb to make predictions on unlabelled data. The default code uses the final model (BTF_trained_model_v1.5.pth), as described in the research article, but this may be adjusted to point to the model created by train_resnet.ipynb (by changing the 'model_path' variable). Results of this step are saved in the Sample_files\Predict_results folder.
3. Run audio_file_extract.ipynb to extract 1.8 second audio snips that have a 'BTF' confidence score of >= 0.5. These are the sounds that range from most uncertain to the model to most likely to be BTF. The logic for this cutoff is discussed in the research article's methods section. The default extraction location is 'Sample_files\Predict_results\Audio_frames_for_review'.
4. Manually review extracted audio frames and move them to the appropriate folder of the training data, e.g. for audio frames that are reviewed to contain:
- BTF calls, move them to 'Sample_files\Training_clips\Train\BTF\IT_2'
- Not BTF calls, move them to 'Sample_files\Training_clips\Train\Not BTF\IT_2'
IT_2 represents the second active learning iteration. Ensure 30% of the files are allocated to the validation set ('Sample_files\Training_clips\Val'). Note that users will need to create subfolders for each successive iteration.
5. Repeat steps 1 to 4, making sure to update the 'iterations' variable in the train_resnet.ipynb code to include all active learning iterations undertaken. For example, to include iterations 1 and 2 in the model, set the variable 'iterations' to equal ['IT_1', 'IT_2']. An example is provided in the train_resnet.ipynb code.
6. Stop the active learning process when the stopping criterion is reached (e.g. when the F1 score plateaus).
Model evaluation steps
1. Run predict.ipynb using the final model on an unlabelled test dataset. By default the unlabelled audio data used is the example data saved at 'Sample_files\Field_recordings\Audiomoth'. However, this should be changed to data not used to train the model, such as 'Sample_files\Field_recordings\BAR', or your own audio data.
2. Run stratified_subsample.ipynb to subsample the predictions that the final model made on the unlabelled data. A stratified subsample approach is used, whereby samples are stratified across confidence scores, as described in the research article. The default output file is 'stratified_subsample_predictions.csv'.
3. We then manually reviewed the subsamples, including a cross review by experts on the species, as detailed in the research article. We have provided the results of our model evaluation: 'Study_results\Stratified_subsample_inter_annotator_agreement.xlsx'.
4. Run macro_averaged_error.ipynb and inter_annotator_agreement.ipynb to reproduce the results and plots contained within the paper.
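For intuition, stratified subsampling across confidence scores can be sketched as below; this is a conceptual illustration, not the code in stratified_subsample.ipynb, and the input file, the column name 'confidence', the bin count, and the per-bin sample size are placeholders:
import pandas as pd
# Conceptual sketch: sample evenly across model confidence bins.
predictions = pd.read_csv("predictions.csv")            # placeholder input file
bins = pd.cut(predictions["confidence"], bins=10)       # placeholder column name and bin count
subsample = (
    predictions.groupby(bins, observed=True, group_keys=False)
    .apply(lambda g: g.sample(min(len(g), 50), random_state=0))
)
subsample.to_csv("stratified_subsample_predictions.csv", index=False)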
Using the model on your own data The predict.ipynb code may be adapted to run the BTF call detection model on data outside of this repository.
Notes for running on your own data:
- Accepts wav or flac files
- Accepts files from AudioMoth devices, using the file naming format 'AM###_YYYYMMDD_HHMMSS'
- Accepts files from Bioacoustic Recorder devices (Frontier Labs), using the file naming format 'BAR##_YYYYMMDDTHHMMSS+TZ_REC'
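As a small illustration of the naming convention, the recording start time can be recovered from an AudioMoth-style file name; this is a sketch, the example file name is made up, and it assumes the timestamp follows the standard YYYYMMDD_HHMMSS pattern:
import re
from datetime import datetime
# Hypothetical file name following the 'AM###_YYYYMMDD_HHMMSS' convention.
filename = "AM001_20210315_063000.wav"
match = re.match(r"AM(\d+)_(\d{8})_(\d{6})", filename)
if match:
    device_id = match.group(1)
    start_time = datetime.strptime(match.group(2) + match.group(3), "%Y%m%d%H%M%S")
    print(device_id, start_time)  # 001 2021-03-15 06:30:00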
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This package contains the complete experimental data explained in:
Karakurt, A., Şentürk S., & Serra X. (In Press). MORTY: A Toolbox for Mode Recognition and Tonic Identification. 3rd International Digital Libraries for Musicology Workshop.
Please cite the paper above if you use the data in your work.
The zip file includes the folds, features, training and testing data, results, and evaluation file. It is part of the experiments hosted on GitHub (https://github.com/sertansenturk/makam_recognition_experiments/tree/dlfm2016) in the folder called "./data". We host the experimental data on Zenodo (http://dx.doi.org/10.5281/zenodo.57999) separately due to the file size limitations on GitHub.
The files generated from audio recordings are labeled with 16 character long MusicBrainz IDs ("MBID"s for short). Please check http://musicbrainz.org/ for more information about the unique identifiers. The structure of the data in the zip file is explained below. In the paths given below, task is the computational task ("tonic," "mode" or "joint"), training_type is either "single" (a single distribution per mode) or "multi" (multiple distributions per mode), distribution is either "pcd" (pitch class distribution) or "pd" (pitch distribution), bin_size is the bin size of the distribution in cents, kernel_width is the standard deviation of the Gaussian kernel used in smoothing the distribution, distance is either the distance or the dissimilarity metric, num_neighbors is the number of neighbors checked in k-nearest neighbor classification, and min_peak is the minimum peak ratio. A kernel_width of 0 implies no smoothing. min_peak always takes the value 0.15. For a thorough explanation please refer to the companion page (http://compmusic.upf.edu/node/319) and the paper itself.
folds.json: Divides the test dataset (https://github.com/MTG/otmm_makam_recognition_dataset/releases) into training and testing sets according to a stratified 10-fold scheme. The annotations are also distributed to the sets accordingly. The file is generated by the Jupyter notebook setup_feature_training.ipynb (4th code block) in the github experiments repository (https://github.com/sertansenturk/makam_recognition_experiments/blob/master/setup_feature_training.ipynb).
Features: The path is data/features/[distribution--bin_size--kernel_width]/[MBID--(hist or pdf)].json. "pdf" stands for probability density function, which is used to obtain the multi-distribution models in the training step and "hist" stands for the histogram, which is used to obtain the single-distribution models in the training step. The features are extracted using the Jupyter notebook setup_feature_training.ipynb (5th code block) in the github experiments repository (https://github.com/sertansenturk/makam_recognition_experiments/blob/master/setup_feature_training.ipynb)
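For illustration, a single feature file could be located and read as sketched below; only the path pattern comes from the description above, while the MBID, the parameter values, and the JSON layout inside the file are placeholders:
import json
from pathlib import Path
# Illustrative path following data/features/[distribution--bin_size--kernel_width]/[MBID--(hist or pdf)].json
feature_dir = Path("data/features/pcd--25.0--7.5")                              # placeholder parameter values
feature_file = feature_dir / "00000000-0000-0000-0000-000000000000--pdf.json"   # placeholder MBID
with feature_file.open() as f:
    feature = json.load(f)  # the JSON layout depends on the toolbox
print(type(feature))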
Training: The path is data/training/[training_type--distribution--bin_size--kernel_width]/fold(0:9).json. There are 10 folds in each folder, each of which stores the training model (the file paths of the distributions for the "multi" training_type, or the distributions themselves for the "single" training_type) trained for the fold using the parameter set. The training files are generated by the Jupyter notebook setup_feature_training.ipynb (6th code block) in the github experiments repository (https://github.com/sertansenturk/makam_recognition_experiments/blob/master/setup_feature_training.ipynb).
Testing: The path is data/testing/[task]/[training_type--distribution--bin_size--kernel_width--distance--num_neighbors--min_peak]. Each path has the folders fold(0:9), which contain the evaluation and results files obtained from each fold. The path also has the overall_eval.json file, which stores the overall evaluation of the experiment. The optimal value of min_peak is selected in the 4th code block, testing is carried out in the 6th code block, and the evaluation is done in the 7th code block of the Jupyter notebook testing_evaluation.ipynb in the github experiments repository (https://github.com/sertansenturk/makam_recognition_experiments/blob/master/testing_evaluation.ipynb). The data/testing/ folder also contains a summary of all the experiments in the files data/testing/evaluation_overall.json and data/testing/evaluation_perfold.json. These files are created in MATLAB while running the statistical significance scripts. data/testing/evaluation_perfold.mat is the same as the json file of the same name, stored for fast reading.
For additional information please contact the authors.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This repository contains the Julia code, Jupyter notebook, and data used in the study “Healthy young adults use distinct gait strategies to enhance stability when walking on mild slopes and when altering arm swing” by MacDonald et al.
Instructions
To run this analysis on your computer, both Julia and Jupyter Notebook must be installed. A version of Julia appropriate for your OS can be downloaded from the Julia website, and Jupyter can be installed from within Julia (in the REPL) with "] add IJulia". Alternate instructions for installing Jupyter can be found on the IJulia github or the Jupyter homepage (not recommended). From within the main repository directory, start Julia and then start Jupyter in the Julia REPL with "using IJulia" followed by "notebook(;dir=pwd())", or, if using a system Jupyter installation, start Jupyter from your favorite available shell (e.g. Powershell on Windows, bash on any *nix variant, etc.). In Jupyter, open the notebooks/analysis.ipynb notebook. Running all cells will reproduce the results for this paper.
Description of data
The data directory contains all the data used in the production of the results which were statistically tested. Each .mat file contains events and data generated in Visual3D:
Events
- LTO/RTO (Left/right toe-off)
- LHS/RHS (Left/right heel-strike)
- HIST/HIEN (Hilly start/end)
- ROST/ROEN (Rocky start/end)
- MLST/MLEN (ML translation start/end)
Data
- LFootPos/RFootPos (Left/right foot COM position)
- TrunkPos/TrunkVel/TrunkAcc (Trunk COM position, velocity, and acceleration)
- HeadPos/HeadVel/HeadAcc (Head COM position, velocity, and acceleration)
- COG (Whole-body COM/COG)
The .csv files contain the system state of the CAREN system produced by D-Flow software, which includes various system and software settings, the most pertinent of which is the treadmill speed. The .c3d files contain the raw motion capture data from Vicon Nexus. The results of the notebooks/analysis.ipynb notebook are found in the results folder. Please see the paper for a list of the dependent variables and statistical analyses.
This work was supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery grant RGPIN-2016-04928, NSERC Accelerator supplement RGPAS 493045-2016, and by the Ontario Ministry of Research, Innovation and Science Early Researcher Award (ERA) 16-12-206.
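If you only want to peek at the contents of one .mat file outside the Julia workflow, a minimal Python sketch using SciPy is shown below; the file path is a placeholder and it assumes the Visual3D export stores the events and signals under the variable names listed above:
from scipy.io import loadmat
# Inspect one Visual3D .mat export; the path and exact variable layout are assumptions.
mat = loadmat("data/example_trial.mat")  # placeholder path
events = {k: mat[k] for k in ("LTO", "RTO", "LHS", "RHS") if k in mat}
signals = {k: mat[k] for k in ("TrunkPos", "TrunkVel", "COG") if k in mat}
print(sorted(k for k in mat if not k.startswith("__")))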
The Accident Detection Model is built using YOLOv8, Google Colab, Python, Roboflow, deep learning, OpenCV, machine learning, and artificial intelligence. It can detect an accident from a live camera feed, an image, or a video. The model is trained on a dataset of 3200+ images; these images were annotated on Roboflow.
Survey image: https://user-images.githubusercontent.com/78155393/233774342-287492bb-26c1-4acf-bc2c-9462e97a03ca.png
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data repository accompanying the paper 'BeeDNA: microfluidic environmental DNA metabarcoding as a tool for connecting plant and pollinator communities' by Harper et al. (2021).
1_Raw_Data.zip
This zipped folder contains the raw sequence data (sorted by primer set and demultiplexed) for both sequencing runs (2019-10-24 and 2019-11-11). To decompress each file, run:
tar -xvf filename.bz2
This will create a folder for each primer set containing the raw reads for each sample/control.
2_Anacapa_Bioinformatic_Processing.zip
This zipped folder contains all files needed to perform bioinformatic processing with Anacapa. Please process sequence data belonging to each primer set individually (i.e. do not process sequence data belonging to different primer sets together).
3_metaBEAT_Bioinformatic_Processing.zip
This zipped folder contains the scripts and files needed to perform bioinformatic processing with metaBEAT. Before running the scripts, move the raw reads for each sample belonging to each primer set into the dedicated folder within metaBEAT_Bioinformatic_Processing, e.g. all .fastq files in Raw_Data > BF1_BR1 should be moved to metaBEAT_Bioinformatic_Processing > BF1-BR1 > raw_reads.
To run metaBEAT, you will have to install Docker on your computer. Docker is compatible with all major operating systems, but see the Docker documentation for details. On Ubuntu, installing Docker should be as easy as:
sudo apt-get install docker.io
Once Docker is installed, you can enter the environment by typing:
sudo docker run -i -t --net=host --name metaBEAT -v $(pwd):/home/working chrishah/metabeat /bin/bash
This will download the metaBEAT image (if not yet present on your computer) and enter the 'container', i.e. the self contained environment (NB: sudo may be necessary in some cases). With the above command, the container's directory /home/working will be mounted to your current working directory (as instructed by $(pwd)). In other words, anything you do in the container's /home/working directory will be synced with your current working directory on your local machine.
Please process sequence data belonging to each primer set individually (i.e. do not process sequence data belonging to different primer sets together). An example of expected outputs can be seen in the Jupyter Notebook for the BF1/BR1 primer set from the 2019-11-11 sequencing run.
4_Illinois_Invert_Reference_Database.zip
This zipped folder contains all files that were used to generate the custom COI and 16S reference databases for invertebrates that occur in Illinois, U.S. You will need to have metaBEAT installed (see above) before you try to run any Jupyter Notebooks (.ipynb files).
5_ecoPCR.zip
This zipped folder contains all files used to perform ecoPCR for each primer set evaluated for microfluidic eDNA metabarcoding. You will need to install ecoPCR before running any shell scripts.
6_Tidied_Data.zip
This zipped folder contains the taxonomically assigned data for both sequencing runs produced by metaBEAT and Anacapa. These were copied over from the folders 2_Anacapa_Bioinformatic_Processing and 3_metaBEAT_Bioinformatic_Processing and rearranged into a more logical order. These files are used as the input for data analysis using R.
7_Data_Analysis.zip
This zipped folder contains all scripts and metadata required to summarise and statistically analyse data in R.
Please contact Dr Lynsey Harper (lynsey.harper2@gmail.com) or Dr Mark Davis (davis63@illinois.edu) if you encounter any issues!