39 datasets found

Jupyter Notebook Activity Dataset (rsds-20241113)
zenodo.org
application/gzip, zip
Updated Jan 18, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Tomoki Nakamaru; Tomoki Nakamaru; Tomomasa Matsunaga; Tetsuro Yamazaki; Tomomasa Matsunaga; Tetsuro Yamazaki (2025). Jupyter Notebook Activity Dataset (rsds-20241113) [Dataset]. http://doi.org/10.5281/zenodo.13357570
Explore at:
zip, application/gzipAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.13357570
Dataset updated
Jan 18, 2025
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Tomoki Nakamaru; Tomoki Nakamaru; Tomomasa Matsunaga; Tetsuro Yamazaki; Tomomasa Matsunaga; Tetsuro Yamazaki
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
List of data

rsds-20241113.zip: Collection of SQLite database files

image.tar.gz: Docker image provided in our data collection experiment

redspot-341ffa5.zip: Redspot source code (redspot@341ffa5)

Extended version of Section 2D of our paper
Redspot is a Jupyter extension (i.e., Python package) that records activity signals. However, it also offers interfaces to read recorded signals. The following shows the most basic usage of its command-line interface:

redspot replay

This command generates snapshots (.ipynb files) restored from the signal records. Note that this command does not produce a snapshot for every signal. Since the change represented by a single signal is typically minimal (e.g., one keystroke), generating a snapshot for each signal results in a meaninglessly large number of snapshots. However, we want to obtain signal-level snapshots for some analyses. In such cases, one can analyze them using the application programming interfaces:

from redspot import database

from redspot.notebook import Notebook

nbk = Notebook()

for signal in database.get("path-to-db"):

time, panel, kind, args = signal

nbk.apply(kind, args) # apply change

print(nbk) # print notebook

To record activities, one needs to run the Redspot command in the recording mode as follows:

redspot record

This command launches Jupyter Notebook with Redspot enabled. Activities made in the launched environment are stored in an SQLite file named ``redspot.db'' under the current path.

To launch the environment we provided to the participants, one first needs to download and import the image (image.tar.gz). One can then run the image with the following command:

docker run --rm -it -p8888:8888

Note that the SQLite file is generated in the running container. The file can be downloaded into the host machine via the file viewer of Jupyter Notebook.
Speedtest Open Data - Four International cities - MEL, BKK, SHG, LAX plus...
figshare.com
txt
Updated May 30, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Richard Ferrers; Speedtest Global Index (2023). Speedtest Open Data - Four International cities - MEL, BKK, SHG, LAX plus ALC - 2020, 2022 [Dataset]. http://doi.org/10.6084/m9.figshare.13621169.v24
Explore at:
txtAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.13621169.v24
Dataset updated
May 30, 2023
Dataset provided by
Figsharehttp://figshare.com/
Authors
Richard Ferrers; Speedtest Global Index
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset compares four cities FIXED-line broadband internet speeds: - Melbourne, AU - Bangkok, TH - Shanghai, CN - Los Angeles, US - Alice Springs, AU

ERRATA: 1.Data is for Q3 2020, but some files are labelled incorrectly as 02-20 of June 20. They all should read Sept 20, or 09-20 as Q3 20, rather than Q2. Will rename and reload. Amended in v7.

LAX file named 0320, when should be Q320. Amended in v8.

*lines of data for each geojson file; a line equates to a 600m^2 location, inc total tests, devices used, and average upload and download speed - MEL 16181 locations/lines => 0.85M speedtests (16.7 tests per 100people) - SHG 31745 lines => 0.65M speedtests (2.5/100pp) - BKK 29296 lines => 1.5M speedtests (14.3/100pp) - LAX 15899 lines => 1.3M speedtests (10.4/100pp) - ALC 76 lines => 500 speedtests (2/100pp)

Geojsons of these 2* by 2* extracts for MEL, BKK, SHG now added, and LAX added v6. Alice Springs added v15.

This dataset unpacks, geospatially, data summaries provided in Speedtest Global Index (linked below). See Jupyter Notebook (*.ipynb) to interrogate geo data. See link to install Jupyter.

** To Do Will add Google Map versions so everyone can see without installing Jupyter. - Link to Google Map (BKK) added below. Key:Green > 100Mbps(Superfast). Black > 500Mbps (Ultrafast). CSV provided. Code in Speedtestv1.1.ipynb Jupyter Notebook. - Community (Whirlpool) surprised [Link: https://whrl.pl/RgAPTl] that Melb has 20% at or above 100Mbps. Suggest plot Top 20% on map for community. Google Map link - now added (and tweet).

** Python melb = au_tiles.cx[144:146 , -39:-37] #Lat/Lon extract shg = tiles.cx[120:122 , 30:32] #Lat/Lon extract bkk = tiles.cx[100:102 , 13:15] #Lat/Lon extract lax = tiles.cx[-118:-120, 33:35] #lat/Lon extract ALC=tiles.cx[132:134, -22:-24] #Lat/Lon extract

Histograms (v9), and data visualisations (v3,5,9,11) will be provided. Data Sourced from - This is an extract of Speedtest Open data available at Amazon WS (link below - opendata.aws).

**VERSIONS v.24 Add tweet and google map of Top 20% (over 100Mbps locations) in Mel Q322. Add v.1.5 MEL-Superfast notebook, and CSV of results (now on Google Map; link below). v23. Add graph of 2022 Broadband distribution, and compare 2020 - 2022. Updated v1.4 Jupyter notebook. v22. Add Import ipynb; workflow-import-4cities. v21. Add Q3 2022 data; five cities inc ALC. Geojson files. (2020; 4.3M tests 2022; 2.9M tests)

Melb 14784 lines Avg download speed 69.4M Tests 0.39M

SHG 31207 lines Avg 233.7M Tests 0.56M

ALC 113 lines Avg 51.5M Test 1092

BKK 29684 lines Avg 215.9M Tests 1.2M

LAX 15505 lines Avg 218.5M Tests 0.74M

v20. Speedtest - Five Cities inc ALC. v19. Add ALC2.ipynb. v18. Add ALC line graph. v17. Added ipynb for ALC. Added ALC to title.v16. Load Alice Springs Data Q221 - csv. Added Google Map link of ALC. v15. Load Melb Q1 2021 data - csv. V14. Added Melb Q1 2021 data - geojson. v13. Added Twitter link to pics. v12 Add Line-Compare pic (fastest 1000 locations) inc Jupyter (nbn-intl-v1.2.ipynb). v11 Add Line-Compare pic, plotting Four Cities on a graph. v10 Add Four Histograms in one pic. v9 Add Histogram for Four Cities. Add NBN-Intl.v1.1.ipynb (Jupyter Notebook). v8 Renamed LAX file to Q3, rather than 03. v7 Amended file names of BKK files to correctly label as Q3, not Q2 or 06. v6 Added LAX file. v5 Add screenshot of BKK Google Map. v4 Add BKK Google map(link below), and BKK csv mapping files. v3 replaced MEL map with big key version. Prev key was very tiny in top right corner. v2 Uploaded MEL, SHG, BKK data and Jupyter Notebook v1 Metadata record

** LICENCE AWS data licence on Speedtest data is "CC BY-NC-SA 4.0", so use of this data must be: - non-commercial (NC) - reuse must be share-alike (SA)(add same licence). This restricts the standard CC-BY Figshare licence.

** Other uses of Speedtest Open Data; - see link at Speedtest below.
Dataset of A Large-scale Study about Quality and Reproducibility of Jupyter...
zenodo.org
application/gzip
Updated Mar 16, 2021
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
João Felipe; João Felipe; Leonardo; Leonardo; Vanessa; Vanessa; Juliana; Juliana (2021). Dataset of A Large-scale Study about Quality and Reproducibility of Jupyter Notebooks / Understanding and Improving the Quality and Reproducibility of Jupyter Notebooks [Dataset]. http://doi.org/10.5281/zenodo.3519618
Explore at:
application/gzipAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.3519618
Dataset updated
Mar 16, 2021
Dataset provided by
Zenodohttp://zenodo.org/
Authors
João Felipe; João Felipe; Leonardo; Leonardo; Vanessa; Vanessa; Juliana; Juliana
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The self-documenting aspects and the ability to reproduce results have been touted as significant benefits of Jupyter Notebooks. At the same time, there has been growing criticism that the way notebooks are being used leads to unexpected behavior, encourages poor coding practices and that their results can be hard to reproduce. To understand good and bad practices used in the development of real notebooks, we analyzed 1.4 million notebooks from GitHub. Based on the results, we proposed and evaluated Julynter, a linting tool for Jupyter Notebooks.

Papers:

PIMENTEL, J. F.; MURTA, L.; BRAGANHOLO, V.; FREIRE, J.; A large-scale study about quality and reproducibility of jupyter notebooks. In: International Conference on Mining Software Repositories (MSR), 2019, Montreal, Canada.

PIMENTEL, J. F.; MURTA, L.; BRAGANHOLO, V.; FREIRE, J.; Understanding and Improving the Quality and Reproducibility of Jupyter Notebooks. Empirical Software Engineering, 2021 (in press)

This repository contains three files:

db2020-09-22.dump.gz

sample.tar.gz

julynter_reproducility.tar.gz

Reproducing the Notebook Study

The db2020-09-22.dump.gz file contains a PostgreSQL dump of the database, with all the data we extracted from notebooks. For loading it, run:

gunzip -c db2020-09-22.dump.gz | psql jupyter

Note that this file contains only the database with the extracted data. The actual repositories are available in a google drive folder, which also contains the docker images we used in the reproducibility study. The repositories are stored as content/{hash_dir1}/{hash_dir2}.tar.bz2, where hash_dir1 and hash_dir2 are columns of repositories in the database.

For scripts, notebooks, and detailed instructions on how to analyze or reproduce the data collection, please check the instructions on the Jupyter Archaeology repository (tag 1.0.0)

The sample.tar.gz file contains the repositories obtained during the manual sampling.

Reproducing the Julynter Experiment

The julynter_reproducility.tar.gz file contains all the data collected in the Julynter experiment and the analysis notebooks. Reproducing the analysis is straightforward:

Uncompress the file: $ tar zxvf julynter_reproducibility.tar.gz

Install the dependencies: $ pip install julynter/requirements.txt

Run the notebooks in order: J1.Data.Collection.ipynb; J2.Recommendations.ipynb; J3.Usability.ipynb.

The collected data is stored in the julynter/data folder.

Changelog

2019/01/14 - Version 1 - Initial version
2019/01/22 - Version 2 - Update N8.Execution.ipynb to calculate the rate of failure for each reason
2019/03/13 - Version 3 - Update package for camera ready. Add columns to db to detect duplicates, change notebooks to consider them, and add N1.Skip.Notebook.ipynb and N11.Repository.With.Notebook.Restriction.ipynb.
2021/03/15 - Version 4 - Add Julynter experiment; Update database dump to include new data collected for the second paper; remove scripts and analysis notebooks from this package (moved to GitHub), add a link to Google Drive with collected repository files
Galaxy Training Material for the 'Use Jupyter notebooks in Galaxy' tutorial
zenodo.org
csv
Updated Apr 22, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Delphine Lariviere; Delphine Lariviere; Teresa Müller; Teresa Müller (2025). Galaxy Training Material for the 'Use Jupyter notebooks in Galaxy' tutorial [Dataset]. http://doi.org/10.5281/zenodo.15263830
Explore at:
csvAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.15263830
Dataset updated
Apr 22, 2025
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Delphine Lariviere; Delphine Lariviere; Teresa Müller; Teresa Müller
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset was originally curated by Software Carpentry, a branch of The Carpentries non-profit organization, and is based on data from the Gapminder Foundation. It consists of six tabular CSV files containing GDP data for various countries across different years. The dataset was initially prepared for the Software Carpentry tutorial "Plotting and Programming in Python" and is also reused in the Galaxy Training Network (GTN) tutorial "Use Jupyter Notebooks in Galaxy."

This GTN tutorial provides an introduction to launching a Jupyter Notebook in Galaxy, installing dependencies, and importing and exporting data. It serves as a setup guide for a Jupyter Notebook environment that can be used to follow the Software Carpentry tutorial "Plotting and Programming in Python."
o
Demographic Analysis Workflow using Census API in Jupyter Notebook:...
openicpsr.org
delimited
Updated Jul 23, 2020
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Donghwan Gu; Nathanael Rosenheim (2020). Demographic Analysis Workflow using Census API in Jupyter Notebook: 1990-2000 Population Size and Change [Dataset]. http://doi.org/10.3886/E120381V1
Explore at:
delimitedAvailable download formats
Unique identifier
https://doi.org/10.3886/E120381V1
Dataset updated
Jul 23, 2020
Dataset provided by
Texas A&M University
Authors
Donghwan Gu; Nathanael Rosenheim
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Kentucky, Boone County, US Counties
Description
This archive reproduces a table titled "Table 3.1 Boone county population size, 1990 and 2000" from Wang and vom Hofe (2007, p.58). The archive provides a Jupyter Notebook that uses Python and can be run in Google Colaboratory. The workflow uses Census API to retrieve data, reproduce the table, and ensure reproducibility for anyone accessing this archive.The Python code was developed in Google Colaboratory, or Google Colab for short, which is an Integrated Development Environment (IDE) of JupyterLab and streamlines package installation, code collaboration and management. The Census API is used to obtain population counts from the 1990 and 2000 Decennial Census (Summary File 1, 100% data). All downloaded data are maintained in the notebook's temporary working directory while in use. The data are also stored separately with this archive.The notebook features extensive explanations, comments, code snippets, and code output. The notebook can be viewed in a PDF format or downloaded and opened in Google Colab. References to external resources are also provided for the various functional components. The notebook features code to perform the following functions:install/import necessary Python packagesintroduce a Census API Querydownload Census data via CensusAPI manipulate Census tabular data calculate absolute change and percent changeformatting numbersexport the table to csvThe notebook can be modified to perform the same operations for any county in the United States by changing the State and County FIPS code parameters for the Census API downloads. The notebook could be adapted for use in other environments (i.e., Jupyter Notebook) as well as reading and writing files to a local or shared drive, or cloud drive (i.e., Google Drive).
h
iris
huggingface.co
Updated Jul 1, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
National Energy Research Scientific Computing Center (2025). iris [Dataset]. https://huggingface.co/datasets/NERSC/iris
Explore at:
Dataset updated
Jul 1, 2025
Dataset authored and provided by
National Energy Research Scientific Computing Center
Description
Iris

The following code can be used to load the dataset from its stored location at NERSC. You may also access this code via a NERSC-hosted Jupyter notebook here.

Iris data loader

import pandas as pd iris_dat = pd.read_csv('/global/cfs/cdirs/dasrepo/www/ai_ready_datasets/iris/data/iris.csv')

If you would like to download the data, visit the following link: https://portal.nersc.gov/cfs/dasrepo/ai_ready_datasets/iris/data
d
Hydroinformatics Instruction Module Example Code: Programmatic Data Access...
search.dataone.org
hydroshare.org
+1more
Updated Dec 30, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Amber Spackman Jones; Jeffery S. Horsburgh (2023). Hydroinformatics Instruction Module Example Code: Programmatic Data Access with USGS Data Retrieval [Dataset]. https://search.dataone.org/view/sha256%3A3b301506bb2be439d8f330b89de9c36ab074976044b6e5905593f2b7f5be772e
Explore at:
Dataset updated
Dec 30, 2023
Dataset provided by
Hydroshare
Authors
Amber Spackman Jones; Jeffery S. Horsburgh
Description
This resource contains Jupyter Notebooks with examples for accessing USGS NWIS data via web services and performing subsequent analysis related to drought with particular focus on sites in Utah and the southwestern United States (could be modified to any USGS sites). The code uses the Python DataRetrieval package. The resource is part of set of materials for hydroinformatics and water data science instruction. Complete learning module materials are found in HydroLearn: Jones, A.S., Horsburgh, J.S., Bastidas Pacheco, C.J. (2022). Hydroinformatics and Water Data Science. HydroLearn. https://edx.hydrolearn.org/courses/course-v1:USU+CEE6110+2022/about.

This resources consists of 6 example notebooks: 1. Example 1: Import and plot daily flow data 2. Example 2: Import and plot instantaneous flow data for multiple sites 3. Example 3: Perform analyses with USGS annual statistics data 4. Example 4: Retrieve data and find daily flow percentiles 3. Example 5: Further examination of drought year flows 6. Coding challenge: Assess drought severity
f
Cognitive Fatigue
figshare.com
csv
Updated Jun 4, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Rui Varandas; Inês Silveira; Hugo Gamboa (2025). Cognitive Fatigue [Dataset]. http://doi.org/10.6084/m9.figshare.28188143.v3
Explore at:
csvAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.28188143.v3
Dataset updated
Jun 4, 2025
Dataset provided by
figshare
Authors
Rui Varandas; Inês Silveira; Hugo Gamboa
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Cognitive Fatigue2.1. Experimental designCognitive fatigue (CF) is a phenomenon that arises following the prolonged engagement in mentally demanding cognitive tasks. Thus, we developed an experimental procedure that involved three demanding tasks: a digital lesson in Jupyter Notebook format, three repetitions of Corsi-Block task, and two repetitions of a concentration test.Before the Corsi-Block task and after the concentration task there were periods of baseline of two min. In our analysis, the first baseline period, although not explicitly present in the dataset, was designated as representing no CF, whereas the final baseline period was designated as representing the presence of CF. Between repetitions of the Corsi-Block task, there were periods of baseline of 15 s after the task and of 30 s before the beginning of each repetition of the task.2.2. Data recordingA data sample of 10 volunteer participants (4 females) aged between 22 and 48 years old (M = 28.2, SD = 7.6) took part in this study. All volunteers were recruited at NOVA School of Science and Technology, fluent in English, right-handed, none reported suffering from psychological disorders, and none reported taking regular medication. Written informed consent was obtained before participating and all Ethical Procedures approved by the Ethics Committee of NOVA University of Lisbon were thoroughly followed.In this study, we omitted the data from one participant due to the insufficient duration of data acquisition.2.3. Data labellingThe labels easy, difficult, very difficult and repeat found in the ECG_lesson_answers.txt files represent the subjects' opinion of the content read in the ECG lesson. The repeat label represents the most difficult level. It's called repeat because when you press it, the answer to the question is shown again. This system is based on the Anki system, which has been proposed and used to memorise information effectively. In addition, the PB description JSON files include timestamps indicating the start and end of cognitive tasks, baseline periods, and other events, which are useful for defining CF states as we defined in 2.1.2.4. Data descriptionBiosignals include EEG, fNIRS (not converted to oxi and deoxiHb), ECG, EDA, respiration (RIP), accelerometer (ACC), and push-button data (PB). All signals have already been converted to physical units. In each biosignal file, the first column corresponds to the timestamps.HCI features encompass keyboard, mouse, and screenshot data. Below is a Python code snippet for extracting screenshot files from the screenshots CSV file.import base64from os import mkdirfrom os.path import joinfile = '...'with open(file, 'r') as f: lines = f.readlines()for line in lines[1:]: timestamp = line.split(',')[0] code = line.split(',')[-1][:-2] imgdata = base64.b64decode(code) filename = str(timestamp) + '.jpeg' mkdir('screenshot') with open(join('screenshot', filename), 'wb') as f: f.write(imgdata)A characterization file containing age and gender information for all subjects in each dataset is provided within the respective dataset folder (e.g., D2_subject-info.csv). Other complementary files include (i) description of the pushbuttons to help segment the signals (e.g., D2_S2_PB_description.json) and (ii) labelling (e.g., D2_S2_ECG_lesson_results.txt). The files D2_Sx_results_corsi-block_board_1.json and D2_Sx_results_corsi-block_board_2.json show the results for the first and second iterations of the corsi-block task, where, for example, row_0_1 = 12 means that the subject got 12 pairs right in the first row of the first board, and row_0_2 = 12 means that the subject got 12 pairs right in the first row of the second board.
I
Data from: Enhancing Carrier Mobility In Monolayer MoS2 Transistors With...
databank.illinois.edu
aws-databank-alb.library.illinois.edu
Updated Mar 29, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Yue Zhang; Helin Zhao; Siyuan Huang; Mohhamad Abir Hossain; Arend van der Zande (2024). Enhancing Carrier Mobility In Monolayer MoS2 Transistors With Process Induced Strain [Dataset]. http://doi.org/10.13012/B2IDB-4074704_V1
Explore at:
Unique identifier
https://doi.org/10.13012/B2IDB-4074704_V1
Dataset updated
Mar 29, 2024
Authors
Yue Zhang; Helin Zhao; Siyuan Huang; Mohhamad Abir Hossain; Arend van der Zande
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Read me file for the data repository ******************************************************************************* This repository has raw data for the publication "Enhancing Carrier Mobility In Monolayer MoS2 Transistors With Process Induced Strain". We arrange the data following the figure in which it first appeared. For all electrical transfer measurement, we provide the up-sweep and down-sweep data, with voltage units in V and conductance unit in S. All Raman modes have unit of cm^-1. ******************************************************************************* How to use this dataset All data in this dataset is stored in binary Numpy array format as .npy file. To read a .npy file: use the Numpy module of the python language, and use np.load() command. Example: suppose the filename is example_data.npy. To load it into a python program, open a Jupyter notebook, or in the python program, run: import numpy as np data = np.load("example_data.npy") Then the example file is stored in the data object. *******************************************************************************
Z
UWB Motion Detection Data Set
data.niaid.nih.gov
Updated Feb 11, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Mihael Mohorčič (2022). UWB Motion Detection Data Set [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4613124
Explore at:
Dataset updated
Feb 11, 2022
Dataset provided by
Mihael Mohorčič
Andrej Hrovat
Klemen Bregar
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction

This data set includes a collection of measurements using DecaWave DW1000 UWB radios in two indoor environments used for motion detection functionality. Measurements include channel impulse response (CIR) samples in form of power delay profile (PDP) with corresponding timestamps for three channels for each indoor environment.

Data set includes pieces of Python code and Jupyter notebooks for data loading, analysis and to reproduce the results of a paper entitled "UWB Radio Based Motion Detection System for Assisted Living" submitted to MDPI Sensors.

The data set will require around 10 GB of total free space after extraction.

The code included in the data set is written and tested on Linux (Ubuntu 20.04) and requires 16 GB of RAM and additional SWAP partition to run properly. The code can be modified to consume less memory but it requires unnecessary additional work. If the .npy format is compatible with your numpy version, you won't need to regenerate npy data from .csv files.

Data Set Structure

The resulting folder after extracting the uwb_motion_detection.zip file is organized as follows:

data subfolder: contains all original .csv and intermediate .npy data files.

models

pdp: this folder contains 4 .csv files with raw PDP measurements (timestamp + PDP). The data format will be discussed in the following section.

pdp_diff: this folder contains .npy files with PDP samples and .npy files with timestamps. Those files are generated by running the generate_pdp_diff.py script.

generate_pdp_diff.py

validation subfolder: contains data for motion detection validation

events: contains .npy files with motion events for validation. The .npy files are generated using generate_event_x.py files or notebooks inside the /Process/validation folder.

pdp: this folder contains raw PDP measurements in .csv format.

pdp_diff: this folder contains .npy files with PDP samples and .npy files with timestamps. Those files are generated by running the generate_pdp_diff.py script.

generate_events_0.py

generate_events_1.py

generate_events_2.py

generate_pdp_diff.py

figures subfolder: contains all figures generated in Jupyter notebooks inside the "Process" folder.

Process subfolder: contains Jupyter notebooks with data processing and motion detection code.

MotionDetection: contains notebook comparing standard score motion detection with windowed standard score motion detection

OnlineModels: presents the development process of online models definitions

PDP_diff: presents the basic principle of PDP differences used in the motion detection

Validation: presents a motion detection validation process

Raw data structure

All .csv files in data folder contain raw PDP measurements with timestamps for each PDP sample. The structure of file goes as follows:

unix timestamp, cir0 [dBm], cir1 [dBm], cir2[dBm] ... cir149[dBm]
c
Data Repository for paper "Direct visualisation of domain wall pinning in...
research-data.cardiff.ac.uk
zip
Updated Apr 11, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Joseph Askey; Sam Ladak; Wolfgang Langbein; Arjen Van Den Berg; Matthew Hunt; Lukas Payne; Ioannis Pitsios; Alaa Hejazi (2025). Data Repository for paper "Direct visualisation of domain wall pinning in sub-100nm 3D magnetic nanowires with cross-sectional curvature", Joseph Askey and Matthew Hunt et al. 2024 [Dataset]. http://doi.org/10.17035/cardiff.26763172.v3
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.17035/cardiff.26763172.v3
Dataset updated
Apr 11, 2025
Dataset provided by
Cardiff University
Authors
Joseph Askey; Sam Ladak; Wolfgang Langbein; Arjen Van Den Berg; Matthew Hunt; Lukas Payne; Ioannis Pitsios; Alaa Hejazi
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In this paper we fabricate 3D ferromagnetic nanowires with sub-100nm using two photon lithography at wavelength of 405 nm. We demonstrate a range of novel domain wall textures via micromagnetics, and characterise our experimental systems using SEM, qDIC, AFM and MFM and observe domain wall pinning largely influenced by roughness and thickness gradients of the deposited magnetic material.In the dataset we provide the raw and processed data for each figure in the publication. The data repository folder is organised by figure, and subsequent directories containing raw data, processed data, blender file for all schematics (organised by figure in the filetree), python analysis scripts in the form of jupyter notebooks for all relevant figures, readme.txt files for directory navigation, etc.fig1 -master_blender_file_figs1-3.blend: blender master file for generating schematics shown in figs1-3 (organised by relevant folders in .blend file tree)-fig1a-d: saved as jpgs-fig1_README.txtfig2 -raw_sems: raw sem images as .tiff files-features: sem feature sizes as .txt files-fig2b.ipynb: sem feature size analysis notebook-fig2a-e: panels saved as .jpgs and .pngs-fig2_README.txtfig3 -fig3_vtk/ctw_hh.vtk: .vtk file for visualising the ctw domain wall in figure 3 of main paper-paraview/paraview_state_fig3_s3-5.pvsm: .pvsm file to load into paraview to visualise all domain wall types, other .vtks can be found in the appropriate folders-nmag_raw/: directory containing raw NMag python, data, mesh, h5 and q files. Used in the relaxation of head to head domain walls.-psf/: directory containing raw data of the 405 voxel comprised of a tiff stack (405_psf.tiff), accompanying coordinates (405_psf.txt), and colormap-fig3a.ipynb: jupyter notebook for loading the data and generating figure 3a-fig3a-g: all fig 3 panels saved as .jpgs and .pngs-fig3_README.txtfig4 -afm/: raw data (.gwy file)-mfm/: raw data (.gwy file)-fig4a & fig4b.png: of the analysed 2D and 3D afm images-fig4c & fig4d.png: of the analysed 2D and 3D mfm images-fig4e_4f.ipynb: jupyter notebook for generating figures 4e and 4f-fig4e & fig4f.png: binarized image and normalised count plot-img.png: raw binarized image of the mfm shown in figure 4c and 4d-peaks.csv: raw data peak fit data, pixel number versus normalised counts (see jupyter notebook for relevant columns)-fig4_README.txtfig5 -fig5a-b_sems/: folder containing raw uncropped .tiffs angled sem views of the sinusoidal nanowires analysed via mfm-fig5c-d_heatmaps/: folder containing raw data .csv data of pinning probability as function of position and in-plane field for l5 and l2 wires-fig5e-j_afm_tot_bk/: folder containing raw data (afm files, .gwy, .txt with z-profiles, roughness and waviness, and collated .csv files) for l5 and l2 wires-fig5c-d.ipynb: jupyter notebook for loading relevant data and generating fig5c and 5d heatmaps-fig5e-j.ipynb: jupyter notebook for loading relevant data and generating fig5e to fig 5j afm, projected total field and depinning fields-fig5a-5j: all sub-panels saved as .tif or .pngs-fig5_README.txts1 -k500_lw40.csv: raw data containing phase data and pixel number taken across line profile shown in fig_s1a.png for \kappa = 500 and linewidth 40 pixels -k5000_lw40.csv: raw data containing phase data and pixel number taken across line profile shown in fig_s1a.png for \kappa = 5000 and linewidth 40 pixels -fig_s1.ipynb: jupyter notebook for loading .csv data, analysing and generating fig_s1c.png to fig_s1f.png -fig_s1a-s1f.png: sub-panels of fig_s1 -phi_500.dat and phi_5000.dat: raw .dat files of the phase images taken from qDIC imaging, can be loaded as text image in ImageJ for processing. Images are rotated by -135.1 degrees with bicubic interpolation. -fig_s1_README.txts2 -psf/: directory containing raw data of the 405 voxel comprised of a tiff stack (405_psf.tiff), accompanying coordinates (405_psf.txt), and colormap-fig_s2.ipynb: jupyter notebook for loading the data and generating figure s2-fig_s2.png: raw .png file of fig_s2-fig_s2_README.txts3 -fig_s3_vtk/: folder containing .vtk file used for the avw shown in supplementary figure s3, use fig3/paraview/paraview_state_fig3_s3-5.pvsm to load and visualise data-fig_s3a-s3d.png: .png files of the sub-panels of supplementary figure s3-fig_s3_README.txts4 -fig_s4_vtk/: folder containing .vtk file used for the vw shown in supplementary figure s4, use fig3/paraview/paraview_state_fig3_s3-5.pvsm to load and visualise data-fig_s4a-s4d.png: .png files of the sub-panels of supplementary figure s4-fig_s4_README.txts5 -fig_s5_vtk/: folder containing .vtk file used for the cvw shown in supplementary figure s5, use fig3/paraview/paraview_state_fig3_s3-5.pvsm to load and visualise data-fig_s5a-s5d.png: .png files of the sub-panels of supplementary figure s5-fig_s5_README.txts6 -matt_sss_190903ja_inplane_oct19th.007: raw afm/mfm data-fig_s6a.png: processed image of the raw mfm data above-matt_sss_190903ja_inplane_oct19th_supp_fig6_afm.txt: raw afm profile data-matt_sss_190903ja_inplane_oct19th_supp_fig6_afm.txt: raw mfm profile data-fig_s6.ipynb: jupyter notebook used to load the above data and generate the z-profile and normalised phase plots-fig_s6d-e.png: .png files of the sub-panels shown in supplementary figure s6-fig_s6_README.txts7 -matt_sss_190903ja_inplane_oct14th_010_supp_mfm_bot.txt: raw mfm profile data of bottom blue region-matt_sss_190903ja_inplane_oct14th_010_supp_mfm_bot.txt: raw mfm profile data of top red region-fig_s7.ipynb: jupyter notebook used to load above data and generate the z-profiles and normalised phase plots-fig_s7f-g.png: .png files of the sub-panels shown in supplementary figure s7-fig_s7_README.txt-NOTE: the mfm image shown in this supplementary figure is identical to the one in figure 4d-c)s8 -l1/: folder containing raw data (.txt with z-profiles, roughness and waviness, and collated .csv files) for l1 wires-fig_s8.ipynb: jupyter notebook used to generate above data and generate the z-profile, total projected field and depinning fields-fig_s8.png: .png of supplementary figure s8.-fig_s8_README.txtst1 -st1_energies_pops.csv: contains the energy densities and population statistics of the relaxed dws in the micromagnetic simulations, shown in supplementary table 1.-st1_README.txt
Z
Data from: Long-Term Tracing of Indoor Solar Harvesting
data.niaid.nih.gov
zenodo.org
Updated Jul 22, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sigrist, Lukas (2024). Long-Term Tracing of Indoor Solar Harvesting [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3346975
Explore at:
Dataset updated
Jul 22, 2024
Dataset provided by
Thiele, Lothar
Sigrist, Lukas
Gomez, Andres
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset Information

This dataset presents long-term term indoor solar harvesting traces and jointly monitored with the ambient conditions. The data is recorded at 6 indoor positions with diverse characteristics at our institute at ETH Zurich in Zurich, Switzerland.

The data is collected with a measurement platform [3] consisting of a solar panel (AM-5412) connected to a bq25505 energy harvesting chip that stores the harvested energy in a virtual battery circuit. Two TSL45315 light sensors placed on opposite sides of the solar panel monitor the illuminance level and a BME280 sensor logs ambient conditions like temperature, humidity and air pressure.

The dataset contains the measurement of the energy flow at the input and the output of the bq25505 harvesting circuit, as well as the illuminance, temperature, humidity and air pressure measurements of the ambient sensors. The following timestamped data columns are available in the raw measurement format, as well as preprocessed and filtered HDF5 datasets:

V_in - Converter input/solar panel output voltage, in volt

I_in - Converter input/solar panel output current, in ampere

V_bat - Battery voltage (emulated through circuit), in volt

I_bat - Net Battery current, in/out flowing current, in ampere

Ev_left - Illuminance left of solar panel, in lux

Ev_right - Illuminance left of solar panel, in lux

P_amb - Ambient air pressure, in pascal

RH_amb - Ambient relative humidity, unit-less between 0 and 1

T_amb - Ambient temperature, in centigrade Celsius

The following publication presents and overview of the dataset and more details on the deployment used for data collection. A copy of the abstract is included in this dataset, see the file abstract.pdf.

L. Sigrist, A. Gomez, and L. Thiele. "Dataset: Tracing Indoor Solar Harvesting." In Proceedings of the 2nd Workshop on Data Acquisition To Analysis (DATA '19), 2019.

Folder Structure and Files

processed/ - This folder holds the imported, merged and filtered datasets of the power and sensor measurements. The datasets are stored in HDF5 format and split by measurement position posXX and and power and ambient sensor measurements. The files belonging to this folder are contained in archives named yyyy_mm_processed.tar, where yyyy and mm represent the year and month the data was published. A separate file lists the exact content of each archive (see below).

raw/ - This folder holds the raw measurement files recorded with the RocketLogger [1, 2] and using the measurement platform available at [3]. The files belonging to this folder are contained in archives named yyyy_mm_raw.tar, where yyyy and mmrepresent the year and month the data was published. A separate file lists the exact content of each archive (see below).

LICENSE - License information for the dataset.

README.md - The README file containing this information.

abstract.pdf - A copy of the above mentioned abstract submitted to the DATA '19 Workshop, introducing this dataset and the deployment used to collect it.

raw_import.ipynb [open in nbviewer] - Jupyter Python notebook to import, merge, and filter the raw dataset from the raw/ folder. This is the exact code used to generate the processed dataset and store it in the HDF5 format in the processed/folder.

raw_preview.ipynb [open in nbviewer] - This Jupyter Python notebook imports the raw dataset directly and plots a preview of the full power trace for all measurement positions.

processing_python.ipynb [open in nbviewer] - Jupyter Python notebook demonstrating the import and use of the processed dataset in Python. Calculates column-wise statistics, includes more detailed power plots and the simple energy predictor performance comparison included in the abstract.

processing_r.ipynb [open in nbviewer] - Jupyter R notebook demonstrating the import and use of the processed dataset in R. Calculates column-wise statistics and extracts and plots the energy harvesting conversion efficiency included in the abstract. Furthermore, the harvested power is analyzed as a function of the ambient light level.

Dataset File Lists

Processed Dataset Files

The list of the processed datasets included in the yyyy_mm_processed.tar archive is provided in yyyy_mm_processed.files.md. The markdown formatted table lists the name of all files, their size in bytes, as well as the SHA-256 sums.

Raw Dataset Files

A list of the raw measurement files included in the yyyy_mm_raw.tar archive(s) is provided in yyyy_mm_raw.files.md. The markdown formatted table lists the name of all files, their size in bytes, as well as the SHA-256 sums.

Dataset Revisions

v1.0 (2019-08-03)

Initial release. Includes the data collected from 2017-07-27 to 2019-08-01. The dataset archive files related to this revision are 2019_08_raw.tar and 2019_08_processed.tar. For position pos06, the measurements from 2018-01-06 00:00:00 to 2018-01-10 00:00:00 are filtered (data inconsistency in file indoor1_p27.rld).

v1.1 (2019-09-09)

Revision of the processed dataset v1.0 and addition of the final dataset abstract. Updated processing scripts reduce the timestamp drift in the processed dataset, the archive 2019_08_processed.tar has been replaced. For position pos06, the measurements from 2018-01-06 16:00:00 to 2018-01-10 00:00:00 are filtered (indoor1_p27.rld data inconsistency).

v2.0 (2020-03-20)

Addition of new data. Includes the raw data collected from 2019-08-01 to 2019-03-16. The processed data is updated with full coverage from 2017-07-27 to 2019-03-16. The dataset archive files related to this revision are 2020_03_raw.tar and 2020_03_processed.tar.

Dataset Authors, Copyright and License

Authors: Lukas Sigrist, Andres Gomez, and Lothar Thiele

Contact: Lukas Sigrist (lukas.sigrist@tik.ee.ethz.ch)

Copyright: (c) 2017-2019, ETH Zurich, Computer Engineering Group

License: Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/)

References

[1] L. Sigrist, A. Gomez, R. Lim, S. Lippuner, M. Leubin, and L. Thiele. Measurement and validation of energy harvesting IoT devices. In Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017.

[2] ETH Zurich, Computer Engineering Group. RocketLogger Project Website, https://rocketlogger.ethz.ch/.

[3] L. Sigrist. Solar Harvesting and Ambient Tracing Platform, 2019. https://gitlab.ethz.ch/tec/public/employees/sigristl/harvesting_tracing
s
Data for: Emergent ferromagnetism near three-quarters filling in twisted...
purl.stanford.edu
Updated Sep 2, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Aaron Sharpe; Eli Fox; Arthur Barnard; Joe Finney; Kenji Watanabe; Takashi Taniguchi; Marc Kastner; David Goldhaber-Gordon (2022). Data for: Emergent ferromagnetism near three-quarters filling in twisted bilayer graphene [Dataset]. http://doi.org/10.25740/bg095cp1548
Explore at:
Unique identifier
https://doi.org/10.25740/bg095cp1548
Dataset updated
Sep 2, 2022
Authors
Aaron Sharpe; Eli Fox; Arthur Barnard; Joe Finney; Kenji Watanabe; Takashi Taniguchi; Marc Kastner; David Goldhaber-Gordon
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
This archive contains the data and Python code generating figures for the article "Emergent ferromagnetism near three-quarters filling in twisted bilayer graphene" by Aaron L. Sharpe, Eli J. Fox, Arthur W. Barnard, Joe Finney, Kenji Watanabe, Takashi Taniguchi, M. A. Kastner, and David Goldhaber-Gordon and available at https://arxiv.org/abs/1901.03520. This archive contains the following: 1) TBG_ferromagnetism_figures.ipynb, a Jupyter notebook loading data and generating figures. The notebook has been tested with Python version 3.6.7 and Jupyter notebook server version 5.5.0. 2) HTML_notebook directory that contains 'TBG_ferromagnetism_figures.html' an HTML file generated from the Jupyter notebook, and PNG files loaded by the HTML file, 3) scripts directory that contains additional files used by the Jupyter notebook, and 4) data directory, containing all data used to generate figures for the manuscript, stored as JSON objects. Refer to the notebook for figure captions describing the data.
d
Data from: A hydroclimatological approach to predicting regional landslide...
search.dataone.org
hydroshare.org
Updated Dec 5, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ronda Strauch; Erkan Istanbulluoglu; Sai Siddhartha Nudurupati; Christina Bandaragoda; Nicole Gasparini; Greg Tucker (2021). A hydroclimatological approach to predicting regional landslide probability using Landlab [Dataset]. http://doi.org/10.4211/hs.27d34fc967be4ee6bc1f1ae92657bf2b
Explore at:
Unique identifier
https://doi.org/10.4211/hs.27d34fc967be4ee6bc1f1ae92657bf2b
Dataset updated
Dec 5, 2021
Dataset provided by
Hydroshare
Authors
Ronda Strauch; Erkan Istanbulluoglu; Sai Siddhartha Nudurupati; Christina Bandaragoda; Nicole Gasparini; Greg Tucker
Area covered

Description
This resource supports the work published in Strauch et al., (2018) "A hydroclimatological approach to predicting regional landslide probability using Landlab", Earth Surf. Dynam., 6, 1-26 . It demonstrates a hydroclimatological approach to modeling of regional shallow landslide initiation based on the infinite slope stability model coupled with a steady-state subsurface flow representation. The model component is available as the LandslideProbability component in Landlab, an open-source, Python-based landscape earth systems modeling environment described in Hobley et al. (2017, Earth Surf. Dynam., 5, 21–46, https://doi.org/10.5194/esurf-5-21-2017). The model operates on a digital elevation model (DEM) grid to which local field parameters, such as cohesion and soil depth, are attached. A Monte Carlo approach is used to account for parameter uncertainty and calculate probability of shallow landsliding as well as the probability of soil saturation based on annual maximum recharge. The model is demonstrated in a steep mountainous region in northern Washington, U.S.A., using 30-m grid resolution over 2,700 km2.

This resource contains a 1) User Manual that describes the Landlab LandslideProbability Component design, parameters, and step-by-step guidance on using the component in a model, and 2) two Landlab driver codes (notebooks) and customized component code to run Landlab's LandslideProbability component for 2a) synthetic recharge and 2b) modeled recharge published in Strauch et al., (2018). The Jupyter Notebooks use HydroShare code libraries to import data located at this resource: https://www.hydroshare.org/resource/a5b52c0e1493401a815f4e77b09d352b/.

The Synthetic Recharge Jupyter Notebook

The Modeled Recharge Jupyter Notebook
f
NFDITalk (15 July 2024): Jupyter4NFDI - a central Jupyter Hub for the NFDI
meta4ds.fokus.fraunhofer.de
html
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
NFDI, NFDITalk (15 July 2024): Jupyter4NFDI - a central Jupyter Hub for the NFDI [Dataset]. https://meta4ds.fokus.fraunhofer.de/datasets/1pgpjyjs8tq
Explore at:
htmlAvailable download formats
Dataset authored and provided by
NFDI
Description
In our NFDITalks, scientists from different disciplines present exciting topics around NFDI and research data management. In this episode, Björn Hagemeier will talk about "Jupyter4NFDI - a central Jupyter Hub for the NFDI".

Jupyter Notebooks are widespread across scientific disciplines today. However, their deployment across various NFDI consortia currently occurs through individual JupyterHubs, resulting in access barriers to computational and data resources. Whereas some of the services are widely available, others are barricaded within VPNs or otherwise inaccessible for a wider audience. Our ambition is to improve the user experience by offering a centralized service to extend the reach of Jupyter to a broader audience within the NFDI and beyond. The technical foundation for our service will be the versatile configuration frontend that has been proven to meet user needs for the past seven years at JSC. It is continuously extended and traces and ever growing set of backend resources ranging from Cloud based, small-scale JupyterLabs to full-scale remote desktop environments on high-performance computing systems such as Germany's highest-ranked TOP500 system JUWELS Booster.

Importantly, the centralized system will not only simplify access but also support the import of projects along with their necessary dependencies, fostering an ecosystem conducive to creating reproducible FAIR Digital Objects (FDOs), possibly along with notebook identifiers supported by PID4NFDI.

In this talk, we'll revisit the history of the current solution, the landscape in which we intend to make it available, and give an outlook on the future of the service.
Z
Gaia EDR3 Catalogs of Machine-Learned Radial Velocities
data.niaid.nih.gov
Updated Jun 10, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Dropulic, Adriana (2022). Gaia EDR3 Catalogs of Machine-Learned Radial Velocities [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6558082
Explore at:
Dataset updated
Jun 10, 2022
Dataset provided by
Dropulic, Adriana
Liu, Hongwan
Ostdiek, Bryan
Lisanti, Mariangela
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Gaia EDR3 Catalogs of Machine-Learned Radial Velocities

Spatially complete Test-Set and Machine-Learned Radial Velocity (ML-RV) Catalogs described in Dropulic et al., arXiv:2205.12278. The spatially complete Test-Set Catalog contains a total of 4,332,657 stars, while the spatially complete ML-RV Catalog contains 91,840,346 stars. We provide Gaia EDR3 Source IDs, the network-predicted line-of-sight velocity in km/s, and the network-predicted uncertainty in km/s.

We have included a simple Jupyter notebook demonstrating how to import the data, and make a simple histogram with it.

If you find this catalog useful in your work, please cite Dropulic et al. arXiv:2205.12278, as well as Dropulic et al. ApJL 915, L14 (2021) arXiv:2103.14039.
d
Development and Implementation of Database and Analyses for High Frequency...
search.dataone.org
beta.hydroshare.org
+1more
Updated Dec 5, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Hyrum Tennant; Amber Spackman Jones (2021). Development and Implementation of Database and Analyses for High Frequency Data [Dataset]. https://search.dataone.org/view/sha256%3A9aa0ca8c359b1855ae314a46176643253c13fa977464b57c8dbb32848b18699c
Explore at:
Dataset updated
Dec 5, 2021
Dataset provided by
Hydroshare
Authors
Hyrum Tennant; Amber Spackman Jones
Time period covered
Jan 1, 2014 - May 22, 2018
Area covered

Description
For environmental data measured by a variety of sensors and compiled from various sources, practitioners need tools that facilitate data access and data analysis. Data are often organized in formats that are incompatible with each other and that prevent full data integration. Furthermore, analyses of these data are hampered by the inadequate mechanisms for storage and organization. Ideally, data should be centrally housed and organized in an intuitive structure with established patterns for analyses. However, in reality, the data are often scattered in multiple files without uniform structure that must be transferred between users and called individually and manually for each analysis. This effort describes a process for compiling environmental data into a single, central database that can be accessed for analyses. We use the Logan River watershed and observed water level, discharge, specific conductance, and temperature as a test case. Of interest is analysis of flow partitioning. We formatted data files and organized them into a hierarchy, and we developed scripts that import the data to a database with structure designed for hydrologic time series data. Scripts access the populated database to determine baseflow separation, flow balance, and mass balance and visualize the results. The analyses were compiled into a package of scripts in Python, which can be modified and run by scientists and researchers to determine gains and losses in reaches of interest. To facilitate reproducibility, the database and associated scripts were shared to HydroShare as Jupyter Notebooks so that any user can access the data and perform the analyses, which facilitates standardization of these operations.
f
MCCN Case Study 3 - Select optimal survey locality
adelaide.figshare.com
zip
Updated May 29, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Donald Hobern; Alisha Aneja; Hoang Son Le; Rakesh David; Lili Andres Hernandez (2025). MCCN Case Study 3 - Select optimal survey locality [Dataset]. http://doi.org/10.25909/29176451.v1
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.25909/29176451.v1
Dataset updated
May 29, 2025
Dataset provided by
The University of Adelaide
Authors
Donald Hobern; Alisha Aneja; Hoang Son Le; Rakesh David; Lili Andres Hernandez
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The MCCN project is to deliver tools to assist the agricultural sector to understand crop-environment relationships, specifically by facilitating generation of data cubes for spatiotemporal data. This repository contains Jupyter notebooks to demonstrate the functionality of the MCCN data cube components.The dataset contains input files for the case study (source_data), RO-Crate metadata (ro-crate-metadata.json), results from the case study (results), and Jupyter Notebook (MCCN-CASE 3.ipynb)Research Activity Identifier (RAiD)RAiD: https://doi.org/10.26292/8679d473Case StudiesThis repository contains code and sample data for the following case studies. Note that the analyses here are to demonstrate the software and result should not be considered scientifically or statistically meaningful. No effort has been made to address bias in samples, and sample data may not be available at sufficient density to warrant analysis. All case studies end with generation of an RO-Crate data package including the source data, the notebook and generated outputs, including netcdf exports of the datacubes themselves.Case Study 3 - Select optimal survey localityGiven a set of existing survey locations across a variable landscape, determine the optimal site to add to increase the range of surveyed environments. This study demonstrates: 1) Loading heterogeneous data sources into a cube, and 2) Analysis and visualisation using numpy and matplotlib.Data SourcesThe primary goal for this case study is to demonstrate being able to import a set of environmental values for different sites and then use these to identify a subset that maximises spread across the various environmental dimensions.This is a simple implementation that uses four environmental attributes imported for all Australia (or a subset like NSW) at a moderate grid scale:Digital soil maps for key soil properties over New South Wales, version 2.0 - SEED - see https://esoil.io/TERNLandscapes/Public/Pages/SLGA/ProductDetails-SoilAttributes.htmlANUCLIM Annual Mean Rainfall raster layer - SEED - see https://datasets.seed.nsw.gov.au/dataset/anuclim-annual-mean-rainfall-raster-layerANUCLIM Annual Mean Temperature raster layer - SEED - see https://datasets.seed.nsw.gov.au/dataset/anuclim-annual-mean-temperature-raster-layerDependenciesThis notebook requires Python 3.10 or higherInstall relevant Python libraries with: pip install mccn-engine rocrateInstalling mccn-engine will install other dependenciesOverviewGenerate STAC metadata for layers from predefined configuratiionLoad data cube and exclude nodata valuesScale all variables to a 0.0-1.0 rangeSelect four layers for comparison (soil organic carbon 0-30 cm, soil pH 0-30 cm, mean annual rainfall, mean annual temperature)Select 10 random points within NSWGenerate 10 new layers representing standardised environmental distance between one of the selected points and all other points in NSWFor every point in NSW, find the lowest environmental distance to any of the selected pointsSelect the point in NSW that has the highest value for the lowest environmental distance to any selected point - this is the most different pointClean up and save results to RO-Crate
Z
Benchmark-Tasks: Duffing Oscillator Response Analysis (DORA)
data.niaid.nih.gov
Updated Feb 11, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Yadav, Manish (2025). Benchmark-Tasks: Duffing Oscillator Response Analysis (DORA) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_14851013
Explore at:
Dataset updated
Feb 11, 2025
Dataset provided by
Yadav, Manish
Stender, Merten
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
🔹 Release v1.0 - Duffing Oscillator Response Analysis (DORA)

This release provides a collection of benchmark tasks and datasets, accompanied by minimal code to generate, import, and plot the data. The primary focus is on the Duffing Oscillator Response Analysis (DORA) prediction task, which evaluates machine learning models' ability to generalize system responses in unseen parameter regimes.

🚀 Key Features:

Duffing Oscillator Response Analysis (DORA) Prediction Task:

Objective: Predict the response of a forced Duffing oscillator using a minimal training dataset. This task assesses a model's capability to extrapolate system behavior in unseen parameter regimes, specifically varying amplitudes of external periodic forcing.

Expectation: A proficient model should qualitatively capture the system's response, such as identifying the exact number of cycles in a limit-cycle regime or chaotic trajectories when the system transitions to a chaotic regime, all trained on limited datasets.

Comprehensive Dataset:

Training Data (DORA_Train.csv): Contains data for two external forcing amplitudes, ( f $\in$ [0.46, 0.49] ).

Testing Data (DORA_Test.csv): Includes data for five forcing amplitudes, ( f $\in$ [0.2, 0.35, 0.48, 0.58, 0.75] ).

📊 Data Description:

Each dataset comprises five columns:

Column Description

t Time variable

q1(t) Time evolution of the Duffing oscillator's position

q2(t) Time evolution of the Duffing oscillator's velocity

f(t) Time evolution of external periodic forcing

f_amplitude Constant amplitude during system evaluation (default: 250)

🛠 Utility Scripts and Notebooks:

Data Generation and Visualization:

DORA_generator.py: Generates, plots, and saves training and testing data.Usage:

python DORA_generator.py -time 250 -plots 1

DORA.ipynb: A Jupyter Notebook for dataset generation, loading, and plotting.

Data Loading and Plotting:

ReadData.py: Loads and plots the provided datasets (DORA_Train.csv and DORA_Test.csv).

📈 Model Evaluation:

The prediction model's success is determined by its ability to extrapolate system behavior outside the training data.System response characteristics for external forcing are quantified in terms of amplitude and mean of ( q1^2(t) ).These can be obtained using the provided Signal_Characteristic function.

🔹 Performance Metrics:

Response Amplitude Error:MSE[max(q1_prediction²(t > t)), max(q1_original²(t > t))]

Response Mean Error:MSE[Mean(q1_prediction²(t > t)), Mean(q1_original²(t > t))]

Note: ( t* = 20s ) denotes the steady-state time.

📌 Reference Implementation:

An exemplar solution using reservoir computing is detailed in the following:📖 Yadav et al., 2025 – Springer Nonlinear Dynamics

📄 Citation:

If you utilize this dataset or code in your research, please cite:

@article{Yadav2024, author = {Manish Yadav and Swati Chauhan and Manish Dev Shrimali and Merten Stender}, title = {Predicting multi-parametric dynamics of an externally forced oscillator using reservoir computing and minimal data}, journal = {Nonlinear Dynamics}, year = {2024}, doi = {10.1007/s11071-024-10720-w}}
D
Data-driven analysis of structural instabilities in electroactive polymer...
darus.uni-stuttgart.de
Updated Feb 16, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Siddharth Sriram (2024). Data-driven analysis of structural instabilities in electroactive polymer bilayers based on a variational saddle-point principle: Datasets and ML codes [Dataset]. http://doi.org/10.18419/DARUS-3881
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.18419/DARUS-3881
Dataset updated
Feb 16, 2024
Dataset provided by
DaRUS
Authors
Siddharth Sriram
License
https://darus.uni-stuttgart.de/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.18419/DARUS-3881https://darus.uni-stuttgart.de/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.18419/DARUS-3881
Dataset funded by
DFG
Description
The datasets and codes provided here are associated with our article entitled "Data-driven analysis of structural instabilities in electroactive polymer bilayers based on a variational saddle-point principle". The main idea of the work is to develop surrogate models using the concepts of machine learning (ML) to predict the onset of wrinkling instabilities in dielectric elastomer (DE) bilayers as a function of its tunable geometric and material parameters. The required datasets for building the surrogate models are generated using a finite-element-based framework for structural stability analysis of DE specimens that is rooted in a saddle-point-based variational principle. For a detailed description of this finite-element framework, the sampling of data points for the training/test sets and some brief notes regarding our implementation of the ML-based surrogates, kindly refer to our article mentioned above. Here, the datasets 'training_set.xlsx' and 'test_set.xlsx' contain the values of the critical buckling load (critical electric-charge density) and critical wrinkle count for the DE bilayer for the sampled data points, where each data point represents a unique set of four tunable input-feature values. The article above provides a description of these features, their physical units and their considered domain of values. The individual Jupyter notebooks import the training dataset and develop ML models for the different problems that are described in the article. The developed models are cross-validated and then tested on the test dataset. Extensive comments describing the ML workflow have been made in the notebooks for the user's reference. The conda environment containing all the necessary packages and dependencies for the execution of the Jupyter notebooks is provided in the file 'de_instabilities.yml'.

Facebook

Twitter

Click to copy link

Link copied

Cite

Tomoki Nakamaru; Tomoki Nakamaru; Tomomasa Matsunaga; Tetsuro Yamazaki; Tomomasa Matsunaga; Tetsuro Yamazaki (2025). Jupyter Notebook Activity Dataset (rsds-20241113) [Dataset]. http://doi.org/10.5281/zenodo.13357570

Jupyter Notebook Activity Dataset (rsds-20241113)

Explore at:

zip, application/gzipAvailable download formats

Unique identifier

https://doi.org/10.5281/zenodo.13357570

Dataset updated

Jan 18, 2025

Dataset provided by

Zenodohttp://zenodo.org/

Authors

Tomoki Nakamaru; Tomoki Nakamaru; Tomomasa Matsunaga; Tetsuro Yamazaki; Tomomasa Matsunaga; Tetsuro Yamazaki

License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

List of data

rsds-20241113.zip: Collection of SQLite database files
image.tar.gz: Docker image provided in our data collection experiment
redspot-341ffa5.zip: Redspot source code (redspot@341ffa5)

Extended version of Section 2D of our paper

Redspot is a Jupyter extension (i.e., Python package) that records activity signals. However, it also offers interfaces to read recorded signals. The following shows the most basic usage of its command-line interface:

redspot replay

This command generates snapshots (.ipynb files) restored from the signal records. Note that this command does not produce a snapshot for every signal. Since the change represented by a single signal is typically minimal (e.g., one keystroke), generating a snapshot for each signal results in a meaninglessly large number of snapshots. However, we want to obtain signal-level snapshots for some analyses. In such cases, one can analyze them using the application programming interfaces:

from redspot import database

from redspot.notebook import Notebook

nbk = Notebook()

for signal in database.get("path-to-db"):

time, panel, kind, args = signal

nbk.apply(kind, args) # apply change

print(nbk) # print notebook

To record activities, one needs to run the Redspot command in the recording mode as follows:

redspot record

This command launches Jupyter Notebook with Redspot enabled. Activities made in the launched environment are stored in an SQLite file named ``redspot.db'' under the current path.

To launch the environment we provided to the participants, one first needs to download and import the image (image.tar.gz). One can then run the image with the following command:

docker run --rm -it -p8888:8888

Note that the SQLite file is generated in the running container. The file can be downloaded into the host machine via the file viewer of Jupyter Notebook.

Clear search

Close search

Google apps

Main menu

Jupyter Notebook Activity Dataset (rsds-20241113)

List of data

Extended version of Section 2D of our paper

Speedtest Open Data - Four International cities - MEL, BKK, SHG, LAX plus...

Melb 14784 lines Avg download speed 69.4M Tests 0.39M

SHG 31207 lines Avg 233.7M Tests 0.56M

ALC 113 lines Avg 51.5M Test 1092

BKK 29684 lines Avg 215.9M Tests 1.2M

LAX 15505 lines Avg 218.5M Tests 0.74M

Dataset of A Large-scale Study about Quality and Reproducibility of Jupyter...

Galaxy Training Material for the 'Use Jupyter notebooks in Galaxy' tutorial

Demographic Analysis Workflow using Census API in Jupyter Notebook:...

iris

Iris data loader

Hydroinformatics Instruction Module Example Code: Programmatic Data Access...

Cognitive Fatigue

Data from: Enhancing Carrier Mobility In Monolayer MoS2 Transistors With...

UWB Motion Detection Data Set

Data Repository for paper "Direct visualisation of domain wall pinning in...

Data from: Long-Term Tracing of Indoor Solar Harvesting

Data for: Emergent ferromagnetism near three-quarters filling in twisted...

Data from: A hydroclimatological approach to predicting regional landslide...

NFDITalk (15 July 2024): Jupyter4NFDI - a central Jupyter Hub for the NFDI

Gaia EDR3 Catalogs of Machine-Learned Radial Velocities

Development and Implementation of Database and Analyses for High Frequency...

MCCN Case Study 3 - Select optimal survey locality

Benchmark-Tasks: Duffing Oscillator Response Analysis (DORA)

Data-driven analysis of structural instabilities in electroactive polymer...

Jupyter Notebook Activity Dataset (rsds-20241113)

List of data

Extended version of Section 2D of our paper