39 datasets found
  1. Jupyter Notebook Activity Dataset (rsds-20241113)

    • zenodo.org
    application/gzip, zip
    Updated Jan 18, 2025
    Cite
    Tomoki Nakamaru; Tomomasa Matsunaga; Tetsuro Yamazaki (2025). Jupyter Notebook Activity Dataset (rsds-20241113) [Dataset]. http://doi.org/10.5281/zenodo.13357570
    Available download formats: zip, application/gzip
    Dataset updated: Jan 18, 2025
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Tomoki Nakamaru; Tomomasa Matsunaga; Tetsuro Yamazaki
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    List of data

    • rsds-20241113.zip: Collection of SQLite database files
    • image.tar.gz: Docker image provided in our data collection experiment
    • redspot-341ffa5.zip: Redspot source code (redspot@341ffa5)

    Extended version of Section 2D of our paper

    Redspot is a Jupyter extension (i.e., a Python package) that records activity signals. It also offers interfaces for reading the recorded signals. The following shows the most basic usage of its command-line interface:
    redspot replay


    This command generates snapshots (.ipynb files) restored from the signal records. Note that it does not produce a snapshot for every signal: because the change represented by a single signal is typically minimal (e.g., one keystroke), generating a snapshot for each signal would result in a meaninglessly large number of snapshots. For analyses that do require signal-level snapshots, one can process the records directly using the application programming interface:

    from redspot import database
    from redspot.notebook import Notebook

    nbk = Notebook()
    for signal in database.get("path-to-db"):
        time, panel, kind, args = signal
        nbk.apply(kind, args)  # apply change
        print(nbk)             # print notebook

    To record activities, one needs to run the Redspot command in the recording mode as follows:

    redspot record

    This command launches Jupyter Notebook with Redspot enabled. Activities performed in the launched environment are stored in an SQLite file named "redspot.db" under the current path.

    To launch the environment we provided to the participants, one first needs to download and import the image (image.tar.gz). One can then run the image with the following command:

    docker run --rm -it -p8888:8888

    Note that the SQLite file is generated in the running container. The file can be downloaded into the host machine via the file viewer of Jupyter Notebook.

  2. Speedtest Open Data - Four International cities - MEL, BKK, SHG, LAX plus...

    • figshare.com
    txt
    Updated May 30, 2023
    Cite
    Richard Ferrers; Speedtest Global Index (2023). Speedtest Open Data - Four International cities - MEL, BKK, SHG, LAX plus ALC - 2020, 2022 [Dataset]. http://doi.org/10.6084/m9.figshare.13621169.v24
    Available download formats: txt
    Dataset updated: May 30, 2023
    Dataset provided by: Figshare (http://figshare.com/)
    Authors: Richard Ferrers; Speedtest Global Index
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset compares FIXED-line broadband internet speeds in four international cities plus Alice Springs:
    • Melbourne, AU
    • Bangkok, TH
    • Shanghai, CN
    • Los Angeles, US
    • Alice Springs, AU

    ERRATA:
    1. Data is for Q3 2020, but some files are labelled incorrectly as 02-20 or June 20. They should all read Sept 20, or 09-20, i.e. Q3 20 rather than Q2. Will rename and reload. Amended in v7.
    2. LAX file named 0320 when it should be Q320. Amended in v8.

    Lines of data for each geojson file (a line equates to a 600m^2 location, including total tests, devices used, and average upload and download speed):
    • MEL: 16181 locations/lines => 0.85M speedtests (16.7 tests per 100 people)
    • SHG: 31745 lines => 0.65M speedtests (2.5/100pp)
    • BKK: 29296 lines => 1.5M speedtests (14.3/100pp)
    • LAX: 15899 lines => 1.3M speedtests (10.4/100pp)
    • ALC: 76 lines => 500 speedtests (2/100pp)

    Geojsons of these 2° by 2° extracts for MEL, BKK and SHG now added; LAX added v6; Alice Springs added v15.

    This dataset unpacks, geospatially, data summaries provided in Speedtest Global Index (linked below). See Jupyter Notebook (*.ipynb) to interrogate geo data. See link to install Jupyter.

    ** To Do: Will add Google Map versions so everyone can see without installing Jupyter. Link to Google Map (BKK) added below. Key: Green > 100Mbps (Superfast), Black > 500Mbps (Ultrafast). CSV provided. Code in Speedtestv1.1.ipynb Jupyter Notebook. Community (Whirlpool) surprised [Link: https://whrl.pl/RgAPTl] that Melb has 20% at or above 100Mbps. Suggest plotting Top 20% on map for the community. Google Map link now added (and tweet).

    ** Python (lat/lon extracts)
    melb = au_tiles.cx[144:146, -39:-37]  # Lat/Lon extract
    shg = tiles.cx[120:122, 30:32]        # Lat/Lon extract
    bkk = tiles.cx[100:102, 13:15]        # Lat/Lon extract
    lax = tiles.cx[-118:-120, 33:35]      # Lat/Lon extract
    ALC = tiles.cx[132:134, -22:-24]      # Lat/Lon extract
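    The extracts above assume the tiles have already been read into a GeoDataFrame. A minimal sketch of that step with geopandas is shown below; it is not taken from the notebooks in this record, the file name is illustrative only, and the column names follow the usual Ookla open-data schema (they may differ in these extracts):

    import geopandas as gpd

    # Read a quarterly tile extract (file name is illustrative only)
    tiles = gpd.read_file("gps_fixed_tiles_q3_2020.geojson")

    # Bounding-box extract with the .cx indexer, as in the snippets above
    melb = tiles.cx[144:146, -39:-37].copy()

    # Ookla open data reports average download speed in kbps; convert to Mbps
    # before plotting (column name may vary by extract)
    melb["avg_d_mbps"] = melb["avg_d_kbps"] / 1000
    melb.plot(column="avg_d_mbps", legend=True)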

    Histograms (v9) and data visualisations (v3, 5, 9, 11) will be provided. Data sourced from: this is an extract of Speedtest Open Data available at Amazon AWS (link below - opendata.aws).

    ** VERSIONS
    v24. Add tweet and google map of Top 20% (over 100Mbps locations) in Mel Q322. Add v.1.5 MEL-Superfast notebook, and CSV of results (now on Google Map; link below).
    v23. Add graph of 2022 Broadband distribution, and compare 2020 - 2022. Updated v1.4 Jupyter notebook.
    v22. Add Import ipynb; workflow-import-4cities.
    v21. Add Q3 2022 data; five cities inc ALC. Geojson files. (2020: 4.3M tests; 2022: 2.9M tests)

    Melb 14784 lines Avg download speed 69.4M Tests 0.39M

    SHG 31207 lines Avg 233.7M Tests 0.56M

    ALC 113 lines Avg 51.5M Test 1092

    BKK 29684 lines Avg 215.9M Tests 1.2M

    LAX 15505 lines Avg 218.5M Tests 0.74M

    v20. Speedtest - Five Cities inc ALC.
    v19. Add ALC2.ipynb.
    v18. Add ALC line graph.
    v17. Added ipynb for ALC. Added ALC to title.
    v16. Load Alice Springs Data Q221 - csv. Added Google Map link of ALC.
    v15. Load Melb Q1 2021 data - csv.
    v14. Added Melb Q1 2021 data - geojson.
    v13. Added Twitter link to pics.
    v12. Add Line-Compare pic (fastest 1000 locations) inc Jupyter (nbn-intl-v1.2.ipynb).
    v11. Add Line-Compare pic, plotting Four Cities on a graph.
    v10. Add Four Histograms in one pic.
    v9. Add Histogram for Four Cities. Add NBN-Intl.v1.1.ipynb (Jupyter Notebook).
    v8. Renamed LAX file to Q3, rather than 03.
    v7. Amended file names of BKK files to correctly label as Q3, not Q2 or 06.
    v6. Added LAX file.
    v5. Add screenshot of BKK Google Map.
    v4. Add BKK Google map (link below), and BKK csv mapping files.
    v3. Replaced MEL map with big key version. Prev key was very tiny in top right corner.
    v2. Uploaded MEL, SHG, BKK data and Jupyter Notebook.
    v1. Metadata record.

    ** LICENCE: AWS data licence on Speedtest data is "CC BY-NC-SA 4.0", so use of this data must be:
    • non-commercial (NC)
    • share-alike (SA) - reuse must add the same licence
    This restricts the standard CC-BY Figshare licence.

    ** Other uses of Speedtest Open Data: see link at Speedtest below.

  3. Dataset of A Large-scale Study about Quality and Reproducibility of Jupyter...

    • zenodo.org
    application/gzip
    Updated Mar 16, 2021
    + more versions
    Cite
    João Felipe; Leonardo; Vanessa; Juliana (2021). Dataset of A Large-scale Study about Quality and Reproducibility of Jupyter Notebooks / Understanding and Improving the Quality and Reproducibility of Jupyter Notebooks [Dataset]. http://doi.org/10.5281/zenodo.3519618
    Available download formats: application/gzip
    Dataset updated: Mar 16, 2021
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: João Felipe; Leonardo; Vanessa; Juliana
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The self-documenting aspects and the ability to reproduce results have been touted as significant benefits of Jupyter Notebooks. At the same time, there has been growing criticism that the way notebooks are being used leads to unexpected behavior, encourages poor coding practices and that their results can be hard to reproduce. To understand good and bad practices used in the development of real notebooks, we analyzed 1.4 million notebooks from GitHub. Based on the results, we proposed and evaluated Julynter, a linting tool for Jupyter Notebooks.

    Papers:

    This repository contains three files:

    Reproducing the Notebook Study

    The db2020-09-22.dump.gz file contains a PostgreSQL dump of the database, with all the data we extracted from notebooks. For loading it, run:

    gunzip -c db2020-09-22.dump.gz | psql jupyter

    Note that this file contains only the database with the extracted data. The actual repositories are available in a Google Drive folder, which also contains the docker images we used in the reproducibility study. The repositories are stored as content/{hash_dir1}/{hash_dir2}.tar.bz2, where hash_dir1 and hash_dir2 are columns of the repositories table in the database.
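    As a minimal sketch of how the extracted data connects to the archived repositories, the query below assumes the dump has been restored into a local PostgreSQL database named jupyter (as in the command above) and that the repositories table exposes the hash_dir1 and hash_dir2 columns mentioned here; adjust connection details and column names to your setup:

    import psycopg2

    conn = psycopg2.connect(dbname="jupyter", host="localhost")
    cur = conn.cursor()
    # Look up the archive location for a handful of repositories
    cur.execute("SELECT id, hash_dir1, hash_dir2 FROM repositories LIMIT 5;")
    for repo_id, hash_dir1, hash_dir2 in cur.fetchall():
        archive = f"content/{hash_dir1}/{hash_dir2}.tar.bz2"
        print(repo_id, "->", archive)
    cur.close()
    conn.close()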

    For scripts, notebooks, and detailed instructions on how to analyze or reproduce the data collection, please check the instructions on the Jupyter Archaeology repository (tag 1.0.0)

    The sample.tar.gz file contains the repositories obtained during the manual sampling.

    Reproducing the Julynter Experiment

    The julynter_reproducibility.tar.gz file contains all the data collected in the Julynter experiment and the analysis notebooks. Reproducing the analysis is straightforward:

    • Uncompress the file: $ tar zxvf julynter_reproducibility.tar.gz
    • Install the dependencies: $ pip install -r julynter/requirements.txt
    • Run the notebooks in order: J1.Data.Collection.ipynb; J2.Recommendations.ipynb; J3.Usability.ipynb.

    The collected data is stored in the julynter/data folder.

    Changelog

    2019/01/14 - Version 1 - Initial version
    2019/01/22 - Version 2 - Update N8.Execution.ipynb to calculate the rate of failure for each reason
    2019/03/13 - Version 3 - Update package for camera ready. Add columns to db to detect duplicates, change notebooks to consider them, and add N1.Skip.Notebook.ipynb and N11.Repository.With.Notebook.Restriction.ipynb.
    2021/03/15 - Version 4 - Add Julynter experiment; Update database dump to include new data collected for the second paper; remove scripts and analysis notebooks from this package (moved to GitHub), add a link to Google Drive with collected repository files

  4. Galaxy Training Material for the 'Use Jupyter notebooks in Galaxy' tutorial

    • zenodo.org
    csv
    Updated Apr 22, 2025
    Cite
    Delphine Lariviere; Teresa Müller (2025). Galaxy Training Material for the 'Use Jupyter notebooks in Galaxy' tutorial [Dataset]. http://doi.org/10.5281/zenodo.15263830
    Available download formats: csv
    Dataset updated: Apr 22, 2025
    Dataset provided by: Zenodo (http://zenodo.org/)
    Authors: Delphine Lariviere; Teresa Müller
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was originally curated by Software Carpentry, a branch of The Carpentries non-profit organization, and is based on data from the Gapminder Foundation. It consists of six tabular CSV files containing GDP data for various countries across different years. The dataset was initially prepared for the Software Carpentry tutorial "Plotting and Programming in Python" and is also reused in the Galaxy Training Network (GTN) tutorial "Use Jupyter Notebooks in Galaxy."

    This GTN tutorial provides an introduction to launching a Jupyter Notebook in Galaxy, installing dependencies, and importing and exporting data. It serves as a setup guide for a Jupyter Notebook environment that can be used to follow the Software Carpentry tutorial "Plotting and Programming in Python."

  5. Demographic Analysis Workflow using Census API in Jupyter Notebook:...

    • openicpsr.org
    delimited
    Updated Jul 23, 2020
    + more versions
    Cite
    Donghwan Gu; Nathanael Rosenheim (2020). Demographic Analysis Workflow using Census API in Jupyter Notebook: 1990-2000 Population Size and Change [Dataset]. http://doi.org/10.3886/E120381V1
    Available download formats: delimited
    Dataset updated: Jul 23, 2020
    Dataset provided by: Texas A&M University
    Authors: Donghwan Gu; Nathanael Rosenheim
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically
    Area covered: Kentucky, Boone County, US Counties
    Description

    This archive reproduces a table titled "Table 3.1 Boone county population size, 1990 and 2000" from Wang and vom Hofe (2007, p.58). The archive provides a Jupyter Notebook that uses Python and can be run in Google Colaboratory. The workflow uses the Census API to retrieve data, reproduce the table, and ensure reproducibility for anyone accessing this archive.

    The Python code was developed in Google Colaboratory (Google Colab for short), an Integrated Development Environment (IDE) of JupyterLab that streamlines package installation, code collaboration and management. The Census API is used to obtain population counts from the 1990 and 2000 Decennial Census (Summary File 1, 100% data). All downloaded data are maintained in the notebook's temporary working directory while in use. The data are also stored separately with this archive.

    The notebook features extensive explanations, comments, code snippets, and code output. The notebook can be viewed in PDF format or downloaded and opened in Google Colab. References to external resources are also provided for the various functional components. The notebook features code to perform the following functions:

    • install/import necessary Python packages
    • introduce a Census API query
    • download Census data via the Census API
    • manipulate Census tabular data
    • calculate absolute change and percent change
    • format numbers
    • export the table to csv

    The notebook can be modified to perform the same operations for any county in the United States by changing the State and County FIPS code parameters for the Census API downloads. The notebook could also be adapted for use in other environments (i.e., Jupyter Notebook), as well as for reading and writing files to a local or shared drive, or a cloud drive (i.e., Google Drive).
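    For orientation, a minimal sketch of the kind of Census API request such a workflow builds on is shown below. It is not taken from the archive: the endpoint path, variable code, and the 1990 placeholder value are illustrative only (Boone County, Kentucky is state FIPS 21, county FIPS 015), so consult the Census API documentation for the exact 1990 and 2000 Summary File 1 endpoints and variables:

    import requests

    # 2000 Decennial Census, Summary File 1, total population (variable code illustrative)
    url = "https://api.census.gov/data/2000/dec/sf1"
    params = {"get": "P001001,NAME", "for": "county:015", "in": "state:21"}
    resp = requests.get(url, params=params, timeout=30)
    header, row = resp.json()          # first element is the header row
    pop_2000 = int(row[header.index("P001001")])

    # Percent change relative to a 1990 count obtained the same way
    pop_1990 = 50000                   # placeholder; retrieve from the 1990 endpoint in practice
    pct_change = (pop_2000 - pop_1990) / pop_1990 * 100
    print(f"Population change 1990-2000: {pct_change:.1f}%")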

  6. iris

    • huggingface.co
    Updated Jul 1, 2025
    + more versions
    Cite
    National Energy Research Scientific Computing Center (2025). iris [Dataset]. https://huggingface.co/datasets/NERSC/iris
    Dataset updated: Jul 1, 2025
    Dataset authored and provided by: National Energy Research Scientific Computing Center
    Description

    Iris

    The following code can be used to load the dataset from its stored location at NERSC. You may also access this code via a NERSC-hosted Jupyter notebook here.

    Iris data loader

    import pandas as pd
    iris_dat = pd.read_csv('/global/cfs/cdirs/dasrepo/www/ai_ready_datasets/iris/data/iris.csv')

    If you would like to download the data, visit the following link: https://portal.nersc.gov/cfs/dasrepo/ai_ready_datasets/iris/data

  7. Hydroinformatics Instruction Module Example Code: Programmatic Data Access...

    • search.dataone.org
    • hydroshare.org
    • +1more
    Updated Dec 30, 2023
    Cite
    Amber Spackman Jones; Jeffery S. Horsburgh (2023). Hydroinformatics Instruction Module Example Code: Programmatic Data Access with USGS Data Retrieval [Dataset]. https://search.dataone.org/view/sha256%3A3b301506bb2be439d8f330b89de9c36ab074976044b6e5905593f2b7f5be772e
    Dataset updated: Dec 30, 2023
    Dataset provided by: Hydroshare
    Authors: Amber Spackman Jones; Jeffery S. Horsburgh
    Description

    This resource contains Jupyter Notebooks with examples for accessing USGS NWIS data via web services and performing subsequent analysis related to drought with particular focus on sites in Utah and the southwestern United States (could be modified to any USGS sites). The code uses the Python DataRetrieval package. The resource is part of set of materials for hydroinformatics and water data science instruction. Complete learning module materials are found in HydroLearn: Jones, A.S., Horsburgh, J.S., Bastidas Pacheco, C.J. (2022). Hydroinformatics and Water Data Science. HydroLearn. https://edx.hydrolearn.org/courses/course-v1:USU+CEE6110+2022/about.

    This resource consists of 6 example notebooks:
    1. Example 1: Import and plot daily flow data
    2. Example 2: Import and plot instantaneous flow data for multiple sites
    3. Example 3: Perform analyses with USGS annual statistics data
    4. Example 4: Retrieve data and find daily flow percentiles
    5. Example 5: Further examination of drought year flows
    6. Coding challenge: Assess drought severity
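    A minimal sketch of the kind of call these notebooks make with the Python dataretrieval package is shown below; the site number, parameter code (00060 = discharge), date range, and the returned column name are illustrative and may differ from what the notebooks actually use:

    from dataretrieval import nwis

    # Daily-value discharge for one site over several water years
    df, meta = nwis.get_dv(sites="10109000",
                           parameterCd="00060",
                           start="2015-10-01",
                           end="2020-09-30")
    print(df.head())

    # Rough daily-flow percentiles over the retrieved period
    # (the column name depends on the site's available statistics)
    percentiles = df["00060_Mean"].quantile([0.1, 0.5, 0.9])
    print(percentiles)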

  8. Cognitive Fatigue

    • figshare.com
    csv
    Updated Jun 4, 2025
    Cite
    Rui Varandas; Inês Silveira; Hugo Gamboa (2025). Cognitive Fatigue [Dataset]. http://doi.org/10.6084/m9.figshare.28188143.v3
    Available download formats: csv
    Dataset updated: Jun 4, 2025
    Dataset provided by: figshare
    Authors: Rui Varandas; Inês Silveira; Hugo Gamboa
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description
    1. Cognitive Fatigue

    2.1. Experimental design
    Cognitive fatigue (CF) is a phenomenon that arises following prolonged engagement in mentally demanding cognitive tasks. We therefore developed an experimental procedure involving three demanding tasks: a digital lesson in Jupyter Notebook format, three repetitions of a Corsi-Block task, and two repetitions of a concentration test. Before the Corsi-Block task and after the concentration task there were two-minute baseline periods. In our analysis, the first baseline period, although not explicitly present in the dataset, was designated as representing no CF, whereas the final baseline period was designated as representing the presence of CF. Between repetitions of the Corsi-Block task, there were baseline periods of 15 s after the task and of 30 s before the beginning of each repetition of the task.

    2.2. Data recording
    A data sample of 10 volunteer participants (4 females) aged between 22 and 48 years old (M = 28.2, SD = 7.6) took part in this study. All volunteers were recruited at NOVA School of Science and Technology, were fluent in English and right-handed, and none reported suffering from psychological disorders or taking regular medication. Written informed consent was obtained before participation, and all ethical procedures approved by the Ethics Committee of NOVA University of Lisbon were thoroughly followed. In this study, we omitted the data from one participant due to the insufficient duration of data acquisition.

    2.3. Data labelling
    The labels easy, difficult, very difficult and repeat found in the ECG_lesson_answers.txt files represent the subjects' opinion of the content read in the ECG lesson. The repeat label represents the most difficult level; it is called repeat because, when pressed, the answer to the question is shown again. This system is based on the Anki system, which has been proposed and used to memorise information effectively. In addition, the PB description JSON files include timestamps indicating the start and end of cognitive tasks, baseline periods, and other events, which are useful for defining CF states as described in 2.1.

    2.4. Data description
    Biosignals include EEG, fNIRS (not converted to oxy- and deoxyHb), ECG, EDA, respiration (RIP), accelerometer (ACC), and push-button (PB) data. All signals have already been converted to physical units. In each biosignal file, the first column corresponds to the timestamps. HCI features encompass keyboard, mouse, and screenshot data. Below is a Python code snippet for extracting screenshot files from the screenshots CSV file:

    import base64
    from os import mkdir
    from os.path import join

    file = '...'
    with open(file, 'r') as f:
        lines = f.readlines()

    mkdir('screenshot')  # create the output folder once, before the loop
    for line in lines[1:]:
        timestamp = line.split(',')[0]
        code = line.split(',')[-1][:-2]
        imgdata = base64.b64decode(code)
        filename = str(timestamp) + '.jpeg'
        with open(join('screenshot', filename), 'wb') as f:
            f.write(imgdata)

    A characterization file containing age and gender information for all subjects in each dataset is provided within the respective dataset folder (e.g., D2_subject-info.csv). Other complementary files include (i) descriptions of the pushbuttons to help segment the signals (e.g., D2_S2_PB_description.json) and (ii) labelling (e.g., D2_S2_ECG_lesson_results.txt).
    The files D2_Sx_results_corsi-block_board_1.json and D2_Sx_results_corsi-block_board_2.json show the results for the first and second iterations of the Corsi-Block task, where, for example, row_0_1 = 12 means that the subject got 12 pairs right in the first row of the first board, and row_0_2 = 12 means that the subject got 12 pairs right in the first row of the second board.
  9. Data from: Enhancing Carrier Mobility In Monolayer MoS2 Transistors With...

    • databank.illinois.edu
    • aws-databank-alb.library.illinois.edu
    Updated Mar 29, 2024
    Cite
    Yue Zhang; Helin Zhao; Siyuan Huang; Mohhamad Abir Hossain; Arend van der Zande (2024). Enhancing Carrier Mobility In Monolayer MoS2 Transistors With Process Induced Strain [Dataset]. http://doi.org/10.13012/B2IDB-4074704_V1
    Dataset updated: Mar 29, 2024
    Authors: Yue Zhang; Helin Zhao; Siyuan Huang; Mohhamad Abir Hossain; Arend van der Zande
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Read me file for the data repository

    This repository has raw data for the publication "Enhancing Carrier Mobility In Monolayer MoS2 Transistors With Process Induced Strain". We arrange the data following the figure in which it first appeared. For all electrical transfer measurements, we provide the up-sweep and down-sweep data, with voltage in units of V and conductance in units of S. All Raman modes have units of cm^-1.

    How to use this dataset

    All data in this dataset is stored in binary NumPy array format as .npy files. To read a .npy file, use the NumPy module of the Python language and the np.load() command. For example, suppose the filename is example_data.npy. To load it into a Python program, open a Jupyter notebook or, in the Python program, run:

    import numpy as np
    data = np.load("example_data.npy")

    The example file is then stored in the data object.

  10. UWB Motion Detection Data Set

    • data.niaid.nih.gov
    Updated Feb 11, 2022
    Cite
    Mihael Mohorčič (2022). UWB Motion Detection Data Set [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4613124
    Dataset updated: Feb 11, 2022
    Dataset provided by: Mihael Mohorčič; Andrej Hrovat; Klemen Bregar
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction

    This data set includes a collection of measurements using DecaWave DW1000 UWB radios in two indoor environments, used for motion detection functionality. Measurements include channel impulse response (CIR) samples in the form of power delay profiles (PDP) with corresponding timestamps for three channels in each indoor environment.

    The data set includes Python code and Jupyter notebooks for data loading and analysis, and for reproducing the results of the paper "UWB Radio Based Motion Detection System for Assisted Living" submitted to MDPI Sensors.

    The data set will require around 10 GB of total free space after extraction.

    The code included in the data set was written and tested on Linux (Ubuntu 20.04) and requires 16 GB of RAM plus an additional swap partition to run properly. The code can be modified to consume less memory, but doing so requires additional work. If the .npy format is compatible with your NumPy version, you won't need to regenerate the .npy data from the .csv files.

    Data Set Structure

    The resulting folder after extracting the uwb_motion_detection.zip file is organized as follows:

    data subfolder: contains all original .csv and intermediate .npy data files.

    models

    pdp: this folder contains 4 .csv files with raw PDP measurements (timestamp + PDP). The data format will be discussed in the following section.

    pdp_diff: this folder contains .npy files with PDP samples and .npy files with timestamps. Those files are generated by running the generate_pdp_diff.py script.

    generate_pdp_diff.py

    validation subfolder: contains data for motion detection validation

    events: contains .npy files with motion events for validation. The .npy files are generated using generate_event_x.py files or notebooks inside the /Process/validation folder.

    pdp: this folder contains raw PDP measurements in .csv format.

    pdp_diff: this folder contains .npy files with PDP samples and .npy files with timestamps. Those files are generated by running the generate_pdp_diff.py script.

    generate_events_0.py

    generate_events_1.py

    generate_events_2.py

    generate_pdp_diff.py

    figures subfolder: contains all figures generated in Jupyter notebooks inside the "Process" folder.

    Process subfolder: contains Jupyter notebooks with data processing and motion detection code.

    MotionDetection: contains notebook comparing standard score motion detection with windowed standard score motion detection

    OnlineModels: presents the development process of online models definitions

    PDP_diff: presents the basic principle of PDP differences used in the motion detection

    Validation: presents a motion detection validation process

    Raw data structure

    All .csv files in the data folder contain raw PDP measurements with timestamps for each PDP sample. The structure of each file is as follows:

    unix timestamp, cir0 [dBm], cir1 [dBm], cir2[dBm] ... cir149[dBm]
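    Given that structure, a minimal loading sketch is shown below; the file path is illustrative, and the header handling should be adjusted if the files include a header row:

    import pandas as pd

    # One unix-timestamp column followed by 150 CIR/PDP bins in dBm, as described above
    cols = ["timestamp"] + [f"cir{i}" for i in range(150)]
    pdp = pd.read_csv("data/pdp/example.csv", names=cols, header=None)

    timestamps = pd.to_datetime(pdp["timestamp"], unit="s")  # unix timestamps
    pdp_matrix = pdp.iloc[:, 1:].to_numpy()                  # shape: (samples, 150)
    print(pdp_matrix.shape)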

  11. Data Repository for paper "Direct visualisation of domain wall pinning in...

    • research-data.cardiff.ac.uk
    zip
    Updated Apr 11, 2025
    Cite
    Joseph Askey; Sam Ladak; Wolfgang Langbein; Arjen Van Den Berg; Matthew Hunt; Lukas Payne; Ioannis Pitsios; Alaa Hejazi (2025). Data Repository for paper "Direct visualisation of domain wall pinning in sub-100nm 3D magnetic nanowires with cross-sectional curvature", Joseph Askey and Matthew Hunt et al. 2024 [Dataset]. http://doi.org/10.17035/cardiff.26763172.v3
    Available download formats: zip
    Dataset updated: Apr 11, 2025
    Dataset provided by: Cardiff University
    Authors: Joseph Askey; Sam Ladak; Wolfgang Langbein; Arjen Van Den Berg; Matthew Hunt; Lukas Payne; Ioannis Pitsios; Alaa Hejazi
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this paper we fabricate 3D ferromagnetic nanowires with sub-100 nm dimensions using two-photon lithography at a wavelength of 405 nm. We demonstrate a range of novel domain wall textures via micromagnetics, characterise our experimental systems using SEM, qDIC, AFM and MFM, and observe domain wall pinning largely influenced by roughness and thickness gradients of the deposited magnetic material.

    In the dataset we provide the raw and processed data for each figure in the publication. The data repository is organised by figure, with subsequent directories containing raw data, processed data, a blender file for all schematics (organised by figure in the file tree), Python analysis scripts in the form of Jupyter notebooks for all relevant figures, readme.txt files for directory navigation, etc.

    fig1
    - master_blender_file_figs1-3.blend: blender master file for generating schematics shown in figs 1-3 (organised by relevant folders in the .blend file tree)
    - fig1a-d: saved as jpgs
    - fig1_README.txt

    fig2
    - raw_sems: raw sem images as .tiff files
    - features: sem feature sizes as .txt files
    - fig2b.ipynb: sem feature size analysis notebook
    - fig2a-e: panels saved as .jpgs and .pngs
    - fig2_README.txt

    fig3
    - fig3_vtk/ctw_hh.vtk: .vtk file for visualising the ctw domain wall in figure 3 of the main paper
    - paraview/paraview_state_fig3_s3-5.pvsm: .pvsm file to load into paraview to visualise all domain wall types; other .vtks can be found in the appropriate folders
    - nmag_raw/: directory containing raw NMag python, data, mesh, h5 and q files, used in the relaxation of head-to-head domain walls
    - psf/: directory containing raw data of the 405 voxel comprised of a tiff stack (405_psf.tiff), accompanying coordinates (405_psf.txt), and colormap
    - fig3a.ipynb: jupyter notebook for loading the data and generating figure 3a
    - fig3a-g: all fig 3 panels saved as .jpgs and .pngs
    - fig3_README.txt

    fig4
    - afm/: raw data (.gwy file)
    - mfm/: raw data (.gwy file)
    - fig4a & fig4b.png: the analysed 2D and 3D afm images
    - fig4c & fig4d.png: the analysed 2D and 3D mfm images
    - fig4e_4f.ipynb: jupyter notebook for generating figures 4e and 4f
    - fig4e & fig4f.png: binarized image and normalised count plot
    - img.png: raw binarized image of the mfm shown in figures 4c and 4d
    - peaks.csv: raw peak fit data, pixel number versus normalised counts (see jupyter notebook for relevant columns)
    - fig4_README.txt

    fig5
    - fig5a-b_sems/: folder containing raw uncropped .tiffs of angled sem views of the sinusoidal nanowires analysed via mfm
    - fig5c-d_heatmaps/: folder containing raw .csv data of pinning probability as a function of position and in-plane field for l5 and l2 wires
    - fig5e-j_afm_tot_bk/: folder containing raw data (afm files, .gwy, .txt with z-profiles, roughness and waviness, and collated .csv files) for l5 and l2 wires
    - fig5c-d.ipynb: jupyter notebook for loading relevant data and generating the fig5c and 5d heatmaps
    - fig5e-j.ipynb: jupyter notebook for loading relevant data and generating fig5e to fig5j afm, projected total field and depinning fields
    - fig5a-5j: all sub-panels saved as .tif or .pngs
    - fig5_README.txt

    s1
    - k500_lw40.csv: raw data containing phase data and pixel number taken across the line profile shown in fig_s1a.png for \kappa = 500 and linewidth 40 pixels
    - k5000_lw40.csv: raw data containing phase data and pixel number taken across the line profile shown in fig_s1a.png for \kappa = 5000 and linewidth 40 pixels
    - fig_s1.ipynb: jupyter notebook for loading the .csv data, analysing and generating fig_s1c.png to fig_s1f.png
    - fig_s1a-s1f.png: sub-panels of fig_s1
    - phi_500.dat and phi_5000.dat: raw .dat files of the phase images taken from qDIC imaging; can be loaded as a text image in ImageJ for processing. Images are rotated by -135.1 degrees with bicubic interpolation.
    - fig_s1_README.txt

    s2
    - psf/: directory containing raw data of the 405 voxel comprised of a tiff stack (405_psf.tiff), accompanying coordinates (405_psf.txt), and colormap
    - fig_s2.ipynb: jupyter notebook for loading the data and generating figure s2
    - fig_s2.png: raw .png file of fig_s2
    - fig_s2_README.txt

    s3
    - fig_s3_vtk/: folder containing the .vtk file used for the avw shown in supplementary figure s3; use fig3/paraview/paraview_state_fig3_s3-5.pvsm to load and visualise the data
    - fig_s3a-s3d.png: .png files of the sub-panels of supplementary figure s3
    - fig_s3_README.txt

    s4
    - fig_s4_vtk/: folder containing the .vtk file used for the vw shown in supplementary figure s4; use fig3/paraview/paraview_state_fig3_s3-5.pvsm to load and visualise the data
    - fig_s4a-s4d.png: .png files of the sub-panels of supplementary figure s4
    - fig_s4_README.txt

    s5
    - fig_s5_vtk/: folder containing the .vtk file used for the cvw shown in supplementary figure s5; use fig3/paraview/paraview_state_fig3_s3-5.pvsm to load and visualise the data
    - fig_s5a-s5d.png: .png files of the sub-panels of supplementary figure s5
    - fig_s5_README.txt

    s6
    - matt_sss_190903ja_inplane_oct19th.007: raw afm/mfm data
    - fig_s6a.png: processed image of the raw mfm data above
    - matt_sss_190903ja_inplane_oct19th_supp_fig6_afm.txt: raw afm profile data
    - matt_sss_190903ja_inplane_oct19th_supp_fig6_afm.txt: raw mfm profile data
    - fig_s6.ipynb: jupyter notebook used to load the above data and generate the z-profile and normalised phase plots
    - fig_s6d-e.png: .png files of the sub-panels shown in supplementary figure s6
    - fig_s6_README.txt

    s7
    - matt_sss_190903ja_inplane_oct14th_010_supp_mfm_bot.txt: raw mfm profile data of the bottom blue region
    - matt_sss_190903ja_inplane_oct14th_010_supp_mfm_bot.txt: raw mfm profile data of the top red region
    - fig_s7.ipynb: jupyter notebook used to load the above data and generate the z-profiles and normalised phase plots
    - fig_s7f-g.png: .png files of the sub-panels shown in supplementary figure s7
    - fig_s7_README.txt
    - NOTE: the mfm image shown in this supplementary figure is identical to the one in figures 4c and 4d

    s8
    - l1/: folder containing raw data (.txt with z-profiles, roughness and waviness, and collated .csv files) for l1 wires
    - fig_s8.ipynb: jupyter notebook used to generate the above data and the z-profile, total projected field and depinning fields
    - fig_s8.png: .png of supplementary figure s8
    - fig_s8_README.txt

    st1
    - st1_energies_pops.csv: contains the energy densities and population statistics of the relaxed dws in the micromagnetic simulations, shown in supplementary table 1
    - st1_README.txt

  12. Data from: Long-Term Tracing of Indoor Solar Harvesting

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 22, 2024
    Cite
    Sigrist, Lukas (2024). Long-Term Tracing of Indoor Solar Harvesting [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3346975
    Dataset updated: Jul 22, 2024
    Dataset provided by: Thiele, Lothar; Sigrist, Lukas; Gomez, Andres
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset Information

    This dataset presents long-term indoor solar harvesting traces jointly monitored with the ambient conditions. The data were recorded at 6 indoor positions with diverse characteristics at our institute at ETH Zurich in Zurich, Switzerland.

    The data is collected with a measurement platform [3] consisting of a solar panel (AM-5412) connected to a bq25505 energy harvesting chip that stores the harvested energy in a virtual battery circuit. Two TSL45315 light sensors placed on opposite sides of the solar panel monitor the illuminance level and a BME280 sensor logs ambient conditions like temperature, humidity and air pressure.

    The dataset contains the measurement of the energy flow at the input and the output of the bq25505 harvesting circuit, as well as the illuminance, temperature, humidity and air pressure measurements of the ambient sensors. The following timestamped data columns are available in the raw measurement format, as well as preprocessed and filtered HDF5 datasets:

    V_in - Converter input/solar panel output voltage, in volt

    I_in - Converter input/solar panel output current, in ampere

    V_bat - Battery voltage (emulated through circuit), in volt

    I_bat - Net Battery current, in/out flowing current, in ampere

    Ev_left - Illuminance left of solar panel, in lux

    Ev_right - Illuminance right of solar panel, in lux

    P_amb - Ambient air pressure, in pascal

    RH_amb - Ambient relative humidity, unit-less between 0 and 1

    T_amb - Ambient temperature, in centigrade Celsius
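    As a minimal sketch of working with the processed data, assuming the HDF5 tables can be read directly with pandas (the file name and HDF5 key below are illustrative; check the processed archive and the processing notebooks for the actual names):

    import pandas as pd

    df = pd.read_hdf("processed/pos01_power.h5", key="data")  # illustrative path and key

    # Input power of the harvesting circuit from the columns described above
    df["P_in"] = df["V_in"] * df["I_in"]
    print(df["P_in"].describe())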

    The following publication presents an overview of the dataset and more details on the deployment used for data collection. A copy of the abstract is included in this dataset; see the file abstract.pdf.

    L. Sigrist, A. Gomez, and L. Thiele. "Dataset: Tracing Indoor Solar Harvesting." In Proceedings of the 2nd Workshop on Data Acquisition To Analysis (DATA '19), 2019.

    Folder Structure and Files

    processed/ - This folder holds the imported, merged and filtered datasets of the power and sensor measurements. The datasets are stored in HDF5 format and split by measurement position posXX and by power and ambient sensor measurements. The files belonging to this folder are contained in archives named yyyy_mm_processed.tar, where yyyy and mm represent the year and month the data was published. A separate file lists the exact content of each archive (see below).

    raw/ - This folder holds the raw measurement files recorded with the RocketLogger [1, 2] and using the measurement platform available at [3]. The files belonging to this folder are contained in archives named yyyy_mm_raw.tar, where yyyy and mm represent the year and month the data was published. A separate file lists the exact content of each archive (see below).

    LICENSE - License information for the dataset.

    README.md - The README file containing this information.

    abstract.pdf - A copy of the above mentioned abstract submitted to the DATA '19 Workshop, introducing this dataset and the deployment used to collect it.

    raw_import.ipynb [open in nbviewer] - Jupyter Python notebook to import, merge, and filter the raw dataset from the raw/ folder. This is the exact code used to generate the processed dataset and store it in the HDF5 format in the processed/ folder.

    raw_preview.ipynb [open in nbviewer] - This Jupyter Python notebook imports the raw dataset directly and plots a preview of the full power trace for all measurement positions.

    processing_python.ipynb [open in nbviewer] - Jupyter Python notebook demonstrating the import and use of the processed dataset in Python. Calculates column-wise statistics, includes more detailed power plots and the simple energy predictor performance comparison included in the abstract.

    processing_r.ipynb [open in nbviewer] - Jupyter R notebook demonstrating the import and use of the processed dataset in R. Calculates column-wise statistics and extracts and plots the energy harvesting conversion efficiency included in the abstract. Furthermore, the harvested power is analyzed as a function of the ambient light level.

    Dataset File Lists

    Processed Dataset Files

    The list of the processed datasets included in the yyyy_mm_processed.tar archive is provided in yyyy_mm_processed.files.md. The markdown formatted table lists the name of all files, their size in bytes, as well as the SHA-256 sums.

    Raw Dataset Files

    A list of the raw measurement files included in the yyyy_mm_raw.tar archive(s) is provided in yyyy_mm_raw.files.md. The markdown formatted table lists the name of all files, their size in bytes, as well as the SHA-256 sums.

    Dataset Revisions

    v1.0 (2019-08-03)

    Initial release. Includes the data collected from 2017-07-27 to 2019-08-01. The dataset archive files related to this revision are 2019_08_raw.tar and 2019_08_processed.tar. For position pos06, the measurements from 2018-01-06 00:00:00 to 2018-01-10 00:00:00 are filtered (data inconsistency in file indoor1_p27.rld).

    v1.1 (2019-09-09)

    Revision of the processed dataset v1.0 and addition of the final dataset abstract. Updated processing scripts reduce the timestamp drift in the processed dataset, the archive 2019_08_processed.tar has been replaced. For position pos06, the measurements from 2018-01-06 16:00:00 to 2018-01-10 00:00:00 are filtered (indoor1_p27.rld data inconsistency).

    v2.0 (2020-03-20)

    Addition of new data. Includes the raw data collected from 2019-08-01 to 2020-03-16. The processed data is updated with full coverage from 2017-07-27 to 2020-03-16. The dataset archive files related to this revision are 2020_03_raw.tar and 2020_03_processed.tar.

    Dataset Authors, Copyright and License

    Authors: Lukas Sigrist, Andres Gomez, and Lothar Thiele

    Contact: Lukas Sigrist (lukas.sigrist@tik.ee.ethz.ch)

    Copyright: (c) 2017-2019, ETH Zurich, Computer Engineering Group

    License: Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/)

    References

    [1] L. Sigrist, A. Gomez, R. Lim, S. Lippuner, M. Leubin, and L. Thiele. Measurement and validation of energy harvesting IoT devices. In Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017.

    [2] ETH Zurich, Computer Engineering Group. RocketLogger Project Website, https://rocketlogger.ethz.ch/.

    [3] L. Sigrist. Solar Harvesting and Ambient Tracing Platform, 2019. https://gitlab.ethz.ch/tec/public/employees/sigristl/harvesting_tracing

  13. Data for: Emergent ferromagnetism near three-quarters filling in twisted...

    • purl.stanford.edu
    Updated Sep 2, 2022
    Cite
    Aaron Sharpe; Eli Fox; Arthur Barnard; Joe Finney; Kenji Watanabe; Takashi Taniguchi; Marc Kastner; David Goldhaber-Gordon (2022). Data for: Emergent ferromagnetism near three-quarters filling in twisted bilayer graphene [Dataset]. http://doi.org/10.25740/bg095cp1548
    Dataset updated: Sep 2, 2022
    Authors: Aaron Sharpe; Eli Fox; Arthur Barnard; Joe Finney; Kenji Watanabe; Takashi Taniguchi; Marc Kastner; David Goldhaber-Gordon
    License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This archive contains the data and Python code generating figures for the article "Emergent ferromagnetism near three-quarters filling in twisted bilayer graphene" by Aaron L. Sharpe, Eli J. Fox, Arthur W. Barnard, Joe Finney, Kenji Watanabe, Takashi Taniguchi, M. A. Kastner, and David Goldhaber-Gordon, available at https://arxiv.org/abs/1901.03520. This archive contains the following:
    1) TBG_ferromagnetism_figures.ipynb, a Jupyter notebook loading data and generating figures. The notebook has been tested with Python version 3.6.7 and Jupyter notebook server version 5.5.0.
    2) HTML_notebook directory that contains TBG_ferromagnetism_figures.html, an HTML file generated from the Jupyter notebook, and PNG files loaded by the HTML file.
    3) scripts directory that contains additional files used by the Jupyter notebook.
    4) data directory, containing all data used to generate figures for the manuscript, stored as JSON objects.
    Refer to the notebook for figure captions describing the data.

  14. Data from: A hydroclimatological approach to predicting regional landslide...

    • search.dataone.org
    • hydroshare.org
    Updated Dec 5, 2021
    Cite
    Ronda Strauch; Erkan Istanbulluoglu; Sai Siddhartha Nudurupati; Christina Bandaragoda; Nicole Gasparini; Greg Tucker (2021). A hydroclimatological approach to predicting regional landslide probability using Landlab [Dataset]. http://doi.org/10.4211/hs.27d34fc967be4ee6bc1f1ae92657bf2b
    Dataset updated: Dec 5, 2021
    Dataset provided by: Hydroshare
    Authors: Ronda Strauch; Erkan Istanbulluoglu; Sai Siddhartha Nudurupati; Christina Bandaragoda; Nicole Gasparini; Greg Tucker
    Description

    This resource supports the work published in Strauch et al., (2018) "A hydroclimatological approach to predicting regional landslide probability using Landlab", Earth Surf. Dynam., 6, 1-26 . It demonstrates a hydroclimatological approach to modeling of regional shallow landslide initiation based on the infinite slope stability model coupled with a steady-state subsurface flow representation. The model component is available as the LandslideProbability component in Landlab, an open-source, Python-based landscape earth systems modeling environment described in Hobley et al. (2017, Earth Surf. Dynam., 5, 21–46, https://doi.org/10.5194/esurf-5-21-2017). The model operates on a digital elevation model (DEM) grid to which local field parameters, such as cohesion and soil depth, are attached. A Monte Carlo approach is used to account for parameter uncertainty and calculate probability of shallow landsliding as well as the probability of soil saturation based on annual maximum recharge. The model is demonstrated in a steep mountainous region in northern Washington, U.S.A., using 30-m grid resolution over 2,700 km2.
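    For orientation, a commonly used form of the infinite-slope factor of safety that this class of model builds on is given below, with cohesion C, soil depth h, soil and water densities \rho_s and \rho_w, gravitational acceleration g, slope angle \theta, internal friction angle \phi, and relative wetness R_w; this is a general textbook form, so see the User Manual in this resource for the exact parameterization used by the LandslideProbability component:

    FS = \frac{C}{h\,\rho_s\,g\,\sin\theta\cos\theta}
         + \frac{\tan\phi}{\tan\theta}\left(1 - R_w\,\frac{\rho_w}{\rho_s}\right)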

    This resource contains: 1) a User Manual that describes the Landlab LandslideProbability component design, parameters, and step-by-step guidance on using the component in a model, and 2) two Landlab driver codes (notebooks) and customized component code to run Landlab's LandslideProbability component for 2a) synthetic recharge and 2b) modeled recharge, as published in Strauch et al. (2018). The Jupyter Notebooks use HydroShare code libraries to import data located at this resource: https://www.hydroshare.org/resource/a5b52c0e1493401a815f4e77b09d352b/.

    The Synthetic Recharge Jupyter Notebook

    The Modeled Recharge Jupyter Notebook

  15. NFDITalk (15 July 2024): Jupyter4NFDI - a central Jupyter Hub for the NFDI

    • meta4ds.fokus.fraunhofer.de
    html
    Cite
    NFDI, NFDITalk (15 July 2024): Jupyter4NFDI - a central Jupyter Hub for the NFDI [Dataset]. https://meta4ds.fokus.fraunhofer.de/datasets/1pgpjyjs8tq
    Available download formats: html
    Dataset authored and provided by: NFDI
    Description

    In our NFDITalks, scientists from different disciplines present exciting topics around NFDI and research data management. In this episode, Björn Hagemeier will talk about "Jupyter4NFDI - a central Jupyter Hub for the NFDI".

    Jupyter Notebooks are widespread across scientific disciplines today. However, their deployment across the various NFDI consortia currently occurs through individual JupyterHubs, resulting in access barriers to computational and data resources. Whereas some of these services are widely available, others are barricaded within VPNs or otherwise inaccessible to a wider audience. Our ambition is to improve the user experience by offering a centralized service that extends the reach of Jupyter to a broader audience within the NFDI and beyond. The technical foundation for our service will be the versatile configuration frontend that has been proven to meet user needs for the past seven years at JSC. It is continuously extended and traces an ever-growing set of backend resources, ranging from cloud-based, small-scale JupyterLabs to full-scale remote desktop environments on high-performance computing systems such as Germany's highest-ranked TOP500 system, JUWELS Booster.

    Importantly, the centralized system will not only simplify access but also support the import of projects along with their necessary dependencies, fostering an ecosystem conducive to creating reproducible FAIR Digital Objects (FDOs), possibly along with notebook identifiers supported by PID4NFDI.

    In this talk, we'll revisit the history of the current solution, describe the landscape in which we intend to make it available, and give an outlook on the future of the service.

  16. Gaia EDR3 Catalogs of Machine-Learned Radial Velocities

    • data.niaid.nih.gov
    Updated Jun 10, 2022
    Cite
    Dropulic, Adriana (2022). Gaia EDR3 Catalogs of Machine-Learned Radial Velocities [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6558082
    Dataset updated: Jun 10, 2022
    Dataset provided by: Dropulic, Adriana; Liu, Hongwan; Ostdiek, Bryan; Lisanti, Mariangela
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Gaia EDR3 Catalogs of Machine-Learned Radial Velocities

    Spatially complete Test-Set and Machine-Learned Radial Velocity (ML-RV) Catalogs described in Dropulic et al., arXiv:2205.12278. The spatially complete Test-Set Catalog contains a total of 4,332,657 stars, while the spatially complete ML-RV Catalog contains 91,840,346 stars. We provide Gaia EDR3 Source IDs, the network-predicted line-of-sight velocity in km/s, and the network-predicted uncertainty in km/s.

    We have included a simple Jupyter notebook demonstrating how to import the data, and make a simple histogram with it.
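    A minimal sketch of the kind of histogram that notebook produces is shown below; the file name and column name are assumptions, not the catalog's actual schema, so check the included notebook for the correct ones:

    import pandas as pd
    import matplotlib.pyplot as plt

    cat = pd.read_csv("mlrv_catalog.csv")    # illustrative file name

    # Histogram of network-predicted line-of-sight velocities in km/s
    plt.hist(cat["v_los_pred"], bins=100, histtype="step")
    plt.xlabel("Predicted line-of-sight velocity [km/s]")
    plt.ylabel("Stars")
    plt.show()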

    If you find this catalog useful in your work, please cite Dropulic et al. arXiv:2205.12278, as well as Dropulic et al. ApJL 915, L14 (2021) arXiv:2103.14039.

  17. Development and Implementation of Database and Analyses for High Frequency...

    • search.dataone.org
    • beta.hydroshare.org
    • +1more
    Updated Dec 5, 2021
    Cite
    Hyrum Tennant; Amber Spackman Jones (2021). Development and Implementation of Database and Analyses for High Frequency Data [Dataset]. https://search.dataone.org/view/sha256%3A9aa0ca8c359b1855ae314a46176643253c13fa977464b57c8dbb32848b18699c
    Dataset updated: Dec 5, 2021
    Dataset provided by: Hydroshare
    Authors: Hyrum Tennant; Amber Spackman Jones
    Time period covered: Jan 1, 2014 - May 22, 2018
    Description

    For environmental data measured by a variety of sensors and compiled from various sources, practitioners need tools that facilitate data access and data analysis. Data are often organized in formats that are incompatible with each other and that prevent full data integration. Furthermore, analyses of these data are hampered by the inadequate mechanisms for storage and organization. Ideally, data should be centrally housed and organized in an intuitive structure with established patterns for analyses. However, in reality, the data are often scattered in multiple files without uniform structure that must be transferred between users and called individually and manually for each analysis. This effort describes a process for compiling environmental data into a single, central database that can be accessed for analyses. We use the Logan River watershed and observed water level, discharge, specific conductance, and temperature as a test case. Of interest is analysis of flow partitioning. We formatted data files and organized them into a hierarchy, and we developed scripts that import the data to a database with structure designed for hydrologic time series data. Scripts access the populated database to determine baseflow separation, flow balance, and mass balance and visualize the results. The analyses were compiled into a package of scripts in Python, which can be modified and run by scientists and researchers to determine gains and losses in reaches of interest. To facilitate reproducibility, the database and associated scripts were shared to HydroShare as Jupyter Notebooks so that any user can access the data and perform the analyses, which facilitates standardization of these operations.

  18. MCCN Case Study 3 - Select optimal survey locality

    • adelaide.figshare.com
    zip
    Updated May 29, 2025
    Cite
    Donald Hobern; Alisha Aneja; Hoang Son Le; Rakesh David; Lili Andres Hernandez (2025). MCCN Case Study 3 - Select optimal survey locality [Dataset]. http://doi.org/10.25909/29176451.v1
    Available download formats: zip
    Dataset updated: May 29, 2025
    Dataset provided by: The University of Adelaide
    Authors: Donald Hobern; Alisha Aneja; Hoang Son Le; Rakesh David; Lili Andres Hernandez
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The MCCN project is to deliver tools to assist the agricultural sector to understand crop-environment relationships, specifically by facilitating generation of data cubes for spatiotemporal data. This repository contains Jupyter notebooks to demonstrate the functionality of the MCCN data cube components. The dataset contains input files for the case study (source_data), RO-Crate metadata (ro-crate-metadata.json), results from the case study (results), and the Jupyter Notebook (MCCN-CASE 3.ipynb).

    Research Activity Identifier (RAiD): https://doi.org/10.26292/8679d473

    Case Studies

    This repository contains code and sample data for the following case studies. Note that the analyses here are to demonstrate the software, and the results should not be considered scientifically or statistically meaningful. No effort has been made to address bias in samples, and sample data may not be available at sufficient density to warrant analysis. All case studies end with generation of an RO-Crate data package including the source data, the notebook and generated outputs, including netcdf exports of the datacubes themselves.

    Case Study 3 - Select optimal survey locality

    Given a set of existing survey locations across a variable landscape, determine the optimal site to add to increase the range of surveyed environments. This study demonstrates: 1) loading heterogeneous data sources into a cube, and 2) analysis and visualisation using numpy and matplotlib.

    Data Sources

    The primary goal for this case study is to demonstrate importing a set of environmental values for different sites and then using these to identify a subset that maximises spread across the various environmental dimensions. This is a simple implementation that uses four environmental attributes imported for all of Australia (or a subset like NSW) at a moderate grid scale:
    • Digital soil maps for key soil properties over New South Wales, version 2.0 - SEED - see https://esoil.io/TERNLandscapes/Public/Pages/SLGA/ProductDetails-SoilAttributes.html
    • ANUCLIM Annual Mean Rainfall raster layer - SEED - see https://datasets.seed.nsw.gov.au/dataset/anuclim-annual-mean-rainfall-raster-layer
    • ANUCLIM Annual Mean Temperature raster layer - SEED - see https://datasets.seed.nsw.gov.au/dataset/anuclim-annual-mean-temperature-raster-layer

    Dependencies

    This notebook requires Python 3.10 or higher. Install the relevant Python libraries with: pip install mccn-engine rocrate. Installing mccn-engine will install the other dependencies.

    Overview

    1. Generate STAC metadata for layers from a predefined configuration
    2. Load the data cube and exclude nodata values
    3. Scale all variables to a 0.0-1.0 range
    4. Select four layers for comparison (soil organic carbon 0-30 cm, soil pH 0-30 cm, mean annual rainfall, mean annual temperature)
    5. Select 10 random points within NSW
    6. Generate 10 new layers representing standardised environmental distance between one of the selected points and all other points in NSW
    7. For every point in NSW, find the lowest environmental distance to any of the selected points
    8. Select the point in NSW that has the highest value for the lowest environmental distance to any selected point - this is the most different point
    9. Clean up and save results to RO-Crate
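    A minimal numpy sketch of the scaling and environmental-distance steps in the Overview above is given below; it illustrates the arithmetic only, does not use the mccn-engine API, and the array shapes and nodata handling are assumptions:

    import numpy as np

    # layers: (n_layers, ny, nx) array of environmental values, with NaN as nodata
    layers = np.random.rand(4, 100, 100)

    # Scale every layer to the 0.0-1.0 range
    lo = np.nanmin(layers, axis=(1, 2), keepdims=True)
    hi = np.nanmax(layers, axis=(1, 2), keepdims=True)
    scaled = (layers - lo) / (hi - lo)

    # Standardised environmental distance from one survey point to all grid cells
    iy, ix = 50, 50                                # a selected survey location
    ref = scaled[:, iy, ix][:, None, None]
    dist = np.sqrt(np.nansum((scaled - ref) ** 2, axis=0))

    # With several survey points, keep the minimum distance per cell and pick the
    # cell with the largest minimum distance as the most different candidate site
    print(np.unravel_index(np.nanargmax(dist), dist.shape))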

  19. Benchmark-Tasks: Duffing Oscillator Response Analysis (DORA)

    • data.niaid.nih.gov
    Updated Feb 11, 2025
    Cite
    Yadav, Manish (2025). Benchmark-Tasks: Duffing Oscillator Response Analysis (DORA) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_14851013
    Explore at:
    Dataset updated
    Feb 11, 2025
    Dataset provided by
    Yadav, Manish
    Stender, Merten
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    🔹 Release v1.0 - Duffing Oscillator Response Analysis (DORA)

    This release provides a collection of benchmark tasks and datasets, accompanied by minimal code to generate, import, and plot the data. The primary focus is on the Duffing Oscillator Response Analysis (DORA) prediction task, which evaluates machine learning models' ability to generalize system responses in unseen parameter regimes.

    🚀 Key Features:

    Duffing Oscillator Response Analysis (DORA) Prediction Task:

    Objective: Predict the response of a forced Duffing oscillator using a minimal training dataset. This task assesses a model's capability to extrapolate system behavior in unseen parameter regimes, specifically varying amplitudes of external periodic forcing.

    Expectation: A proficient model should qualitatively capture the system's response, such as identifying the exact number of cycles in a limit-cycle regime or chaotic trajectories when the system transitions to a chaotic regime, all trained on limited datasets.
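    For context, the data describe the standard forced Duffing equation, shown here in its generic form (the specific coefficient values used to generate this dataset are those set in DORA_generator.py and are not restated here):

    \ddot{q}_1 + \delta\,\dot{q}_1 + \alpha\,q_1 + \beta\,q_1^3 = f\cos(\omega t), \qquad q_2 = \dot{q}_1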

    Comprehensive Dataset:

    Training Data (DORA_Train.csv): Contains data for two external forcing amplitudes, f ∈ {0.46, 0.49}.

    Testing Data (DORA_Test.csv): Includes data for five forcing amplitudes, f ∈ {0.2, 0.35, 0.48, 0.58, 0.75}.

    📊 Data Description:

    Each dataset comprises five columns:

    Column       Description
    t            Time variable
    q1(t)        Time evolution of the Duffing oscillator's position
    q2(t)        Time evolution of the Duffing oscillator's velocity
    f(t)         Time evolution of external periodic forcing
    f_amplitude  Constant amplitude during system evaluation (default: 250)

    🛠 Utility Scripts and Notebooks:

    Data Generation and Visualization:

    DORA_generator.py: Generates, plots, and saves training and testing data. Usage:

    python DORA_generator.py -time 250 -plots 1

    DORA.ipynb: A Jupyter Notebook for dataset generation, loading, and plotting.

    Data Loading and Plotting:

    ReadData.py: Loads and plots the provided datasets (DORA_Train.csv and DORA_Test.csv).
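    As a rough illustration (not the contents of ReadData.py), the CSV files can be loaded and plotted with pandas and matplotlib; the column labels used below follow the table in the Data Description section and may need adjusting to the actual headers:

    import matplotlib.pyplot as plt
    import pandas as pd

    # Assumed column labels: "t", "q1(t)", "f_amplitude" (see the table above).
    train = pd.read_csv("DORA_Train.csv")

    fig, ax = plt.subplots()
    for amplitude, group in train.groupby("f_amplitude"):
        ax.plot(group["t"], group["q1(t)"], label=f"f = {amplitude}")
    ax.set_xlabel("t")
    ax.set_ylabel("q1(t)")
    ax.legend()
    plt.show()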

    📈 Model Evaluation:

    The prediction model's success is determined by its ability to extrapolate system behavior outside the training data. System response characteristics for external forcing are quantified in terms of the amplitude and mean of q1^2(t). These can be obtained using the provided Signal_Characteristic function.

    🔹 Performance Metrics:

    Response Amplitude Error: MSE[max(q1_prediction²(t > t*)), max(q1_original²(t > t*))]

    Response Mean Error: MSE[Mean(q1_prediction²(t > t*)), Mean(q1_original²(t > t*))]

    Note: t* = 20 s denotes the steady-state time.
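    A minimal sketch of these two metrics, assuming q1_pred and q1_true are 1-D numpy arrays of q1(t) sampled on a common time grid t (the function name and signature are illustrative, not part of the released code):

    import numpy as np

    def response_errors(t, q1_pred, q1_true, t_star=20.0):
        """Squared errors of the steady-state amplitude and mean of q1^2(t)."""
        mask = t > t_star                        # keep only the steady state (t > t*)
        p2, o2 = q1_pred[mask] ** 2, q1_true[mask] ** 2
        amplitude_error = (p2.max() - o2.max()) ** 2
        mean_error = (p2.mean() - o2.mean()) ** 2
        return amplitude_error, mean_error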

    📌 Reference Implementation:

    An exemplar solution using reservoir computing is detailed in the following: 📖 Yadav et al., 2025 – Springer Nonlinear Dynamics

    📄 Citation:

    If you utilize this dataset or code in your research, please cite:

    @article{Yadav2024,
      author  = {Manish Yadav and Swati Chauhan and Manish Dev Shrimali and Merten Stender},
      title   = {Predicting multi-parametric dynamics of an externally forced oscillator using reservoir computing and minimal data},
      journal = {Nonlinear Dynamics},
      year    = {2024},
      doi     = {10.1007/s11071-024-10720-w}
    }

  20. Data-driven analysis of structural instabilities in electroactive polymer...

    • darus.uni-stuttgart.de
    Updated Feb 16, 2024
    + more versions
    Cite
    Siddharth Sriram (2024). Data-driven analysis of structural instabilities in electroactive polymer bilayers based on a variational saddle-point principle: Datasets and ML codes [Dataset]. http://doi.org/10.18419/DARUS-3881
    Explore at:
    Croissant (a format for machine-learning datasets; learn more about this at mlcommons.org/croissant)
    Dataset updated
    Feb 16, 2024
    Dataset provided by
    DaRUS
    Authors
    Siddharth Sriram
    License

    https://darus.uni-stuttgart.de/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.18419/DARUS-3881

    Dataset funded by
    DFG
    Description

    The datasets and codes provided here are associated with our article entitled "Data-driven analysis of structural instabilities in electroactive polymer bilayers based on a variational saddle-point principle". The main idea of the work is to develop surrogate models using the concepts of machine learning (ML) to predict the onset of wrinkling instabilities in dielectric elastomer (DE) bilayers as a function of their tunable geometric and material parameters. The required datasets for building the surrogate models are generated using a finite-element-based framework for structural stability analysis of DE specimens that is rooted in a saddle-point-based variational principle. For a detailed description of this finite-element framework, the sampling of data points for the training/test sets and some brief notes regarding our implementation of the ML-based surrogates, kindly refer to the article mentioned above.

    The datasets 'training_set.xlsx' and 'test_set.xlsx' contain the values of the critical buckling load (critical electric-charge density) and the critical wrinkle count for the DE bilayer at the sampled data points, where each data point represents a unique set of four tunable input-feature values. The article above describes these features, their physical units and their considered domain of values. The individual Jupyter notebooks import the training dataset and develop ML models for the different problems described in the article. The developed models are cross-validated and then tested on the test dataset. Extensive comments describing the ML workflow are included in the notebooks for the user's reference. The conda environment containing all the necessary packages and dependencies for executing the Jupyter notebooks is provided in the file 'de_instabilities.yml'.
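    As an illustration of the described workflow (not the authors' notebooks), a surrogate for the critical electric-charge density could be fitted and cross-validated roughly as follows; the feature and target column names are placeholders, and the choice of regressor is an assumption:

    import pandas as pd
    from sklearn.ensemble import RandomForestRegressor
    from sklearn.model_selection import cross_val_score

    # Placeholder column names; replace them with the actual spreadsheet headers.
    train = pd.read_excel("training_set.xlsx")
    test = pd.read_excel("test_set.xlsx")

    features = ["feature_1", "feature_2", "feature_3", "feature_4"]
    target = "critical_charge_density"

    model = RandomForestRegressor(n_estimators=200, random_state=0)
    scores = cross_val_score(model, train[features], train[target], cv=5, scoring="r2")
    print("cross-validated R^2:", scores.mean())

    model.fit(train[features], train[target])
    print("test-set R^2:", model.score(test[features], test[target]))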
