Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
redspot replay

from redspot import database
from redspot.notebook import Notebook

nbk = Notebook()
for signal in database.get("path-to-db"):
    time, panel, kind, args = signal
    nbk.apply(kind, args)  # apply change
    print(nbk)  # print notebook

redspot record

docker run --rm -it -p8888:8888
This dataset compares fixed-line broadband internet speeds across five cities:
- Melbourne, AU
- Bangkok, TH
- Shanghai, CN
- Los Angeles, US
- Alice Springs, AU
ERRATA: 1. Data is for Q3 2020, but some files were incorrectly labelled 06-20 or June 20. They should all read Sept 20 (09-20), i.e. Q3 20 rather than Q2. Files renamed and reloaded; amended in v7.
*Lines of data for each geojson file; a line equates to a 600m^2 location, including total tests, devices used, and average upload and download speed:
- MEL: 16,181 locations/lines => 0.85M speedtests (16.7 tests per 100 people)
- SHG: 31,745 lines => 0.65M speedtests (2.5/100pp)
- BKK: 29,296 lines => 1.5M speedtests (14.3/100pp)
- LAX: 15,899 lines => 1.3M speedtests (10.4/100pp)
- ALC: 76 lines => 500 speedtests (2/100pp)
Geojsons of these 2-degree by 2-degree extracts for MEL, BKK and SHG are now added; LAX was added in v6 and Alice Springs in v15.
This dataset unpacks, geospatially, data summaries provided in Speedtest Global Index (linked below). See Jupyter Notebook (*.ipynb) to interrogate geo data. See link to install Jupyter.
** To Do
Will add Google Map versions so everyone can see without installing Jupyter.
- Link to Google Map (BKK) added below. Key: green > 100Mbps (superfast); black > 500Mbps (ultrafast). CSV provided. Code in Speedtestv1.1.ipynb Jupyter Notebook.
- Community (Whirlpool) surprised [Link: https://whrl.pl/RgAPTl] that Melb has 20% at or above 100Mbps. Suggest plotting the Top 20% on a map for the community. Google Map link now added (and tweet).
** Python
melb = au_tiles.cx[144:146, -39:-37]  # lat/lon extract
shg = tiles.cx[120:122, 30:32]  # lat/lon extract
bkk = tiles.cx[100:102, 13:15]  # lat/lon extract
lax = tiles.cx[-118:-120, 33:35]  # lat/lon extract
alc = tiles.cx[132:134, -22:-24]  # lat/lon extract
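The .cx slices above select tiles whose centroids fall inside a latitude/longitude bounding box. The same filtering can be sketched in plain Python (the tile values below are made up for illustration, not drawn from the dataset):

```python
# Each tile: (longitude, latitude, average download speed in kbps)
tiles = [
    (144.9, -37.8, 48_000),   # inside the Melbourne box
    (100.5, 13.7, 60_000),    # Bangkok: outside
    (145.2, -38.1, 120_000),  # inside
]

def bbox_filter(rows, lon_min, lon_max, lat_min, lat_max):
    """Keep tiles whose centroid falls inside the bounding box."""
    return [r for r in rows
            if lon_min <= r[0] <= lon_max and lat_min <= r[1] <= lat_max]

melb = bbox_filter(tiles, 144, 146, -39, -37)
# Convert kbps to Mbps for readability
speeds_mbps = [kbps / 1000 for _, _, kbps in melb]
print(speeds_mbps)
```

With GeoPandas, `tiles.cx[144:146, -39:-37]` performs the equivalent spatial slice directly on a GeoDataFrame.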
Histograms (v9) and data visualisations (v3, 5, 9, 11) are provided. Data source: this is an extract of Speedtest open data available at Amazon AWS (link below - opendata.aws).
** VERSIONS
v24. Add tweet and Google Map of Top 20% (over 100Mbps locations) in Mel Q3 22. Add v1.5 MEL-Superfast notebook and CSV of results (now on Google Map; link below).
v23. Add graph of 2022 broadband distribution and compare 2020 - 2022. Updated v1.4 Jupyter notebook.
v22. Add Import ipynb; workflow-import-4cities.
v21. Add Q3 2022 data; five cities inc ALC. Geojson files. (2020: 4.3M tests; 2022: 2.9M tests)
v20. Speedtest - Five Cities inc ALC.
v19. Add ALC2.ipynb.
v18. Add ALC line graph.
v17. Added ipynb for ALC. Added ALC to title.
v16. Load Alice Springs data Q2 21 - csv. Added Google Map link of ALC.
v15. Load Melb Q1 2021 data - csv.
v14. Added Melb Q1 2021 data - geojson.
v13. Added Twitter link to pics.
v12. Add Line-Compare pic (fastest 1000 locations) inc Jupyter (nbn-intl-v1.2.ipynb).
v11. Add Line-Compare pic, plotting four cities on a graph.
v10. Add four histograms in one pic.
v9. Add histogram for four cities. Add NBN-Intl.v1.1.ipynb (Jupyter Notebook).
v8. Renamed LAX file to Q3, rather than 03.
v7. Amended file names of BKK files to correctly label as Q3, not Q2 or 06.
v6. Added LAX file.
v5. Add screenshot of BKK Google Map.
v4. Add BKK Google Map (link below) and BKK csv mapping files.
v3. Replaced MEL map with big-key version. Previous key was very tiny in the top right corner.
v2. Uploaded MEL, SHG, BKK data and Jupyter Notebook.
v1. Metadata record.
** LICENCE
The AWS licence on Speedtest data is CC BY-NC-SA 4.0, so use of this data must be:
- non-commercial (NC)
- share-alike (SA): reuse must carry the same licence
This restricts the standard CC-BY Figshare licence.
** Other uses of Speedtest Open Data: see link at Speedtest below.
The self-documenting aspects and the ability to reproduce results have been touted as significant benefits of Jupyter Notebooks. At the same time, there has been growing criticism that the way notebooks are used leads to unexpected behavior and encourages poor coding practices, and that their results can be hard to reproduce. To understand good and bad practices used in the development of real notebooks, we analyzed 1.4 million notebooks from GitHub. Based on the results, we proposed and evaluated Julynter, a linting tool for Jupyter Notebooks.
Papers:
This repository contains three files:
Reproducing the Notebook Study
The db2020-09-22.dump.gz file contains a PostgreSQL dump of the database, with all the data we extracted from the notebooks. To load it into an existing PostgreSQL database named jupyter, run:
gunzip -c db2020-09-22.dump.gz | psql jupyter
Note that this file contains only the database with the extracted data. The actual repositories are available in a Google Drive folder, which also contains the Docker images we used in the reproducibility study. The repositories are stored as content/{hash_dir1}/{hash_dir2}.tar.bz2, where hash_dir1 and hash_dir2 are columns of the repositories table in the database.
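The archive layout described above can be reconstructed programmatically. A minimal sketch (only the content/{hash_dir1}/{hash_dir2}.tar.bz2 layout comes from this description; the example hash values are made up):

```python
def repository_archive_path(hash_dir1: str, hash_dir2: str) -> str:
    """Build the path of a repository archive from the two hash columns
    of the repositories table: content/{hash_dir1}/{hash_dir2}.tar.bz2"""
    return f"content/{hash_dir1}/{hash_dir2}.tar.bz2"

# Hypothetical hash values for illustration
print(repository_archive_path("ab", "cdef0123"))  # → content/ab/cdef0123.tar.bz2
```

In practice the two hash values would come from a query against the loaded database.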
For scripts, notebooks, and detailed instructions on how to analyze or reproduce the data collection, please check the instructions on the Jupyter Archaeology repository (tag 1.0.0).
The sample.tar.gz file contains the repositories obtained during the manual sampling.
Reproducing the Julynter Experiment
The julynter_reproducility.tar.gz file contains all the data collected in the Julynter experiment and the analysis notebooks. Reproducing the analysis is straightforward:
The collected data is stored in the julynter/data folder.
Changelog
2019/01/14 - Version 1 - Initial version
2019/01/22 - Version 2 - Update N8.Execution.ipynb to calculate the rate of failure for each reason
2019/03/13 - Version 3 - Update package for camera ready. Add columns to db to detect duplicates, change notebooks to consider them, and add N1.Skip.Notebook.ipynb and N11.Repository.With.Notebook.Restriction.ipynb.
2021/03/15 - Version 4 - Add Julynter experiment; Update database dump to include new data collected for the second paper; remove scripts and analysis notebooks from this package (moved to GitHub), add a link to Google Drive with collected repository files
This dataset was originally curated by Software Carpentry, a branch of The Carpentries non-profit organization, and is based on data from the Gapminder Foundation. It consists of six tabular CSV files containing GDP data for various countries across different years. The dataset was initially prepared for the Software Carpentry tutorial "Plotting and Programming in Python" and is also reused in the Galaxy Training Network (GTN) tutorial "Use Jupyter Notebooks in Galaxy."
This GTN tutorial provides an introduction to launching a Jupyter Notebook in Galaxy, installing dependencies, and importing and exporting data. It serves as a setup guide for a Jupyter Notebook environment that can be used to follow the Software Carpentry tutorial "Plotting and Programming in Python."
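As used in the tutorials above, the CSV files load directly with pandas. A minimal sketch (the inline data below is a tiny stand-in; the real files use a similar one-row-per-country layout, but the exact column names here are assumptions):

```python
import io
import pandas as pd

# Tiny stand-in for one of the Gapminder GDP files
csv_text = """country,gdpPercap_1952,gdpPercap_1957
Australia,10039.6,10949.65
New Zealand,10556.6,12247.4
"""

gdp = pd.read_csv(io.StringIO(csv_text), index_col="country")
# Mean GDP per capita across countries, per year
print(gdp.mean())
```

With the actual dataset, `pd.read_csv("gapminder_gdp_<region>.csv", index_col="country")` follows the same pattern.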
This archive reproduces a table titled "Table 3.1 Boone county population size, 1990 and 2000" from Wang and vom Hofe (2007, p. 58). The archive provides a Jupyter Notebook that uses Python and can be run in Google Colaboratory. The workflow uses the Census API to retrieve data, reproduce the table, and ensure reproducibility for anyone accessing this archive.

The Python code was developed in Google Colaboratory, or Google Colab for short, an Integrated Development Environment (IDE) for JupyterLab that streamlines package installation, code collaboration and management. The Census API is used to obtain population counts from the 1990 and 2000 Decennial Census (Summary File 1, 100% data). All downloaded data are maintained in the notebook's temporary working directory while in use. The data are also stored separately with this archive.

The notebook features extensive explanations, comments, code snippets, and code output. It can be viewed in PDF format or downloaded and opened in Google Colab. References to external resources are provided for the various functional components. The notebook features code to perform the following functions:
- install/import necessary Python packages
- introduce a Census API query
- download Census data via the Census API
- manipulate Census tabular data
- calculate absolute change and percent change
- format numbers
- export the table to csv

The notebook can be modified to perform the same operations for any county in the United States by changing the State and County FIPS code parameters for the Census API downloads. The notebook could also be adapted for use in other environments (i.e., Jupyter Notebook), as well as for reading and writing files to a local or shared drive, or a cloud drive (e.g., Google Drive).
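As a sketch of what such a Census API query looks like: the FIPS codes 21/015 are Kentucky/Boone County, and P001001 is the total-population variable in the 2000 SF1 dataset; treat the exact endpoint and variable names as assumptions to verify against the Census API documentation (the notebook itself carries the authoritative query):

```python
# Build a Decennial Census (2000 SF1) API query for Boone County, KY
base = "https://api.census.gov/data/2000/dec/sf1"
params = {
    "get": "P001001",     # total population (assumed variable name)
    "for": "county:015",  # Boone County FIPS code
    "in": "state:21",     # Kentucky FIPS code
}
query = base + "?" + "&".join(f"{k}={v}" for k, v in params.items())
print(query)
```

Changing the state and county codes in `params` retargets the same query to any other US county, which is exactly the parameterization the notebook exposes.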
Iris
The following code can be used to load the dataset from its stored location at NERSC. You may also access this code via a NERSC-hosted Jupyter notebook here.
import pandas as pd
iris_dat = pd.read_csv('/global/cfs/cdirs/dasrepo/www/ai_ready_datasets/iris/data/iris.csv')
If you would like to download the data, visit the following link: https://portal.nersc.gov/cfs/dasrepo/ai_ready_datasets/iris/data
This resource contains Jupyter Notebooks with examples for accessing USGS NWIS data via web services and performing subsequent analysis related to drought, with particular focus on sites in Utah and the southwestern United States (the code could be modified for any USGS sites). The code uses the Python DataRetrieval package. The resource is part of a set of materials for hydroinformatics and water data science instruction. Complete learning module materials are found in HydroLearn: Jones, A.S., Horsburgh, J.S., Bastidas Pacheco, C.J. (2022). Hydroinformatics and Water Data Science. HydroLearn. https://edx.hydrolearn.org/courses/course-v1:USU+CEE6110+2022/about.
This resource consists of six example notebooks:
1. Example 1: Import and plot daily flow data
2. Example 2: Import and plot instantaneous flow data for multiple sites
3. Example 3: Perform analyses with USGS annual statistics data
4. Example 4: Retrieve data and find daily flow percentiles
5. Example 5: Further examination of drought year flows
6. Coding challenge: Assess drought severity
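Under the hood, the DataRetrieval package issues requests against the NWIS web services. A minimal sketch of the kind of daily-values request involved (the site number 10109000 is a Logan River, UT gage and 00060 is the discharge parameter code, but treat the exact query as illustrative; the notebooks use the package rather than raw URLs):

```python
# Sketch: build a USGS NWIS daily-values web-service request for discharge
base = "https://waterservices.usgs.gov/nwis/dv/"
params = {
    "format": "json",
    "sites": "10109000",     # Logan River, UT gage (illustrative)
    "parameterCd": "00060",  # discharge, cubic feet per second
    "startDT": "2020-01-01",
    "endDT": "2020-12-31",
}
url = base + "?" + "&".join(f"{k}={v}" for k, v in params.items())
print(url)
```

Fetching `url` with any HTTP client returns the JSON time series that the notebooks then plot and analyze.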
Read me file for the data repository
*******************************************************************************
This repository has raw data for the publication "Enhancing Carrier Mobility In Monolayer MoS2 Transistors With Process Induced Strain". We arrange the data following the figure in which it first appeared. For all electrical transfer measurements, we provide the up-sweep and down-sweep data, with voltage in units of V and conductance in units of S. All Raman modes are in units of cm^-1.
*******************************************************************************
How to use this dataset
All data in this dataset are stored in binary NumPy array format as .npy files. To read a .npy file, use the NumPy module of the Python language and the np.load() command. For example, suppose the filename is example_data.npy. To load it, open a Jupyter notebook (or a Python program) and run:

import numpy as np
data = np.load("example_data.npy")

The example file is then stored in the data object.
*******************************************************************************
Introduction
This data set includes a collection of measurements using DecaWave DW1000 UWB radios in two indoor environments, used for motion detection. Measurements include channel impulse response (CIR) samples in the form of power delay profiles (PDP), with corresponding timestamps, for three channels in each indoor environment.

The data set includes Python code and Jupyter notebooks for data loading and analysis, and to reproduce the results of the paper "UWB Radio Based Motion Detection System for Assisted Living" submitted to MDPI Sensors.
The data set will require around 10 GB of total free space after extraction.
The code included in the data set was written and tested on Linux (Ubuntu 20.04) and requires 16 GB of RAM plus an additional swap partition to run properly. The code can be modified to consume less memory, but that requires additional work. If the .npy format is compatible with your NumPy version, you will not need to regenerate the .npy data from the .csv files.
Data Set Structure
The resulting folder after extracting the uwb_motion_detection.zip file is organized as follows:
data subfolder: contains all original .csv and intermediate .npy data files.
models
pdp: this folder contains 4 .csv files with raw PDP measurements (timestamp + PDP). The data format will be discussed in the following section.
pdp_diff: this folder contains .npy files with PDP samples and .npy files with timestamps. Those files are generated by running the generate_pdp_diff.py script.
generate_pdp_diff.py
validation subfolder: contains data for motion detection validation
events: contains .npy files with motion events for validation. The .npy files are generated using generate_event_x.py files or notebooks inside the /Process/validation folder.
pdp: this folder contains raw PDP measurements in .csv format.
pdp_diff: this folder contains .npy files with PDP samples and .npy files with timestamps. Those files are generated by running the generate_pdp_diff.py script.
generate_events_0.py
generate_events_1.py
generate_events_2.py
generate_pdp_diff.py
figures subfolder: contains all figures generated in Jupyter notebooks inside the "Process" folder.
Process subfolder: contains Jupyter notebooks with data processing and motion detection code.
MotionDetection: contains notebook comparing standard score motion detection with windowed standard score motion detection
OnlineModels: presents the development process of online models definitions
PDP_diff: presents the basic principle of PDP differences used in the motion detection
Validation: presents a motion detection validation process
Raw data structure
All .csv files in the data folder contain raw PDP measurements with a timestamp for each PDP sample. The structure of each file is as follows:
unix timestamp, cir0 [dBm], cir1 [dBm], cir2 [dBm], ..., cir149 [dBm]
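A minimal sketch of parsing one such row in plain Python (the sample row below is made up, and uses only 5 CIR taps instead of the real 150 to keep the example short):

```python
import csv
import io

# One made-up measurement row: unix timestamp followed by CIR taps in dBm
row_text = "1610000000.25,-88.1,-85.4,-80.2,-86.7,-90.3\n"

reader = csv.reader(io.StringIO(row_text))
row = next(reader)

timestamp = float(row[0])
pdp_dbm = [float(v) for v in row[1:]]  # power delay profile samples

print(timestamp, len(pdp_dbm), max(pdp_dbm))
```

The generate_pdp_diff.py script performs this kind of parsing over the full .csv files before writing the intermediate .npy arrays.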
In this paper we fabricate 3D ferromagnetic nanowires with sub-100 nm feature sizes using two-photon lithography at a wavelength of 405 nm. We demonstrate a range of novel domain wall textures via micromagnetics, characterise our experimental systems using SEM, qDIC, AFM and MFM, and observe domain wall pinning largely influenced by roughness and thickness gradients of the deposited magnetic material.

In the dataset we provide the raw and processed data for each figure in the publication. The data repository folder is organised by figure, with directories containing raw data, processed data, a blender file for all schematics (organised by figure in the file tree), python analysis scripts in the form of jupyter notebooks for all relevant figures, readme.txt files for directory navigation, etc.

fig1
-master_blender_file_figs1-3.blend: blender master file for generating schematics shown in figs 1-3 (organised by relevant folders in .blend file tree)
-fig1a-d: saved as jpgs
-fig1_README.txt

fig2
-raw_sems: raw sem images as .tiff files
-features: sem feature sizes as .txt files
-fig2b.ipynb: sem feature size analysis notebook
-fig2a-e: panels saved as .jpgs and .pngs
-fig2_README.txt

fig3
-fig3_vtk/ctw_hh.vtk: .vtk file for visualising the ctw domain wall in figure 3 of the main paper
-paraview/paraview_state_fig3_s3-5.pvsm: .pvsm file to load into paraview to visualise all domain wall types; other .vtks can be found in the appropriate folders
-nmag_raw/: directory containing raw NMag python, data, mesh, h5 and q files. Used in the relaxation of head-to-head domain walls.
-psf/: directory containing raw data of the 405 voxel comprised of a tiff stack (405_psf.tiff), accompanying coordinates (405_psf.txt), and colormap
-fig3a.ipynb: jupyter notebook for loading the data and generating figure 3a
-fig3a-g: all fig 3 panels saved as .jpgs and .pngs
-fig3_README.txt

fig4
-afm/: raw data (.gwy file)
-mfm/: raw data (.gwy file)
-fig4a & fig4b.png: the analysed 2D and 3D afm images
-fig4c & fig4d.png: the analysed 2D and 3D mfm images
-fig4e_4f.ipynb: jupyter notebook for generating figures 4e and 4f
-fig4e & fig4f.png: binarized image and normalised count plot
-img.png: raw binarized image of the mfm shown in figures 4c and 4d
-peaks.csv: raw peak fit data, pixel number versus normalised counts (see jupyter notebook for relevant columns)
-fig4_README.txt

fig5
-fig5a-b_sems/: folder containing raw uncropped .tiffs, angled sem views of the sinusoidal nanowires analysed via mfm
-fig5c-d_heatmaps/: folder containing raw .csv data of pinning probability as a function of position and in-plane field for l5 and l2 wires
-fig5e-j_afm_tot_bk/: folder containing raw data (afm files, .gwy, .txt with z-profiles, roughness and waviness, and collated .csv files) for l5 and l2 wires
-fig5c-d.ipynb: jupyter notebook for loading relevant data and generating fig5c and 5d heatmaps
-fig5e-j.ipynb: jupyter notebook for loading relevant data and generating fig5e to fig5j afm, projected total field and depinning fields
-fig5a-5j: all sub-panels saved as .tif or .pngs
-fig5_README.txt

s1
-k500_lw40.csv: raw data containing phase data and pixel number taken across the line profile shown in fig_s1a.png for \kappa = 500 and linewidth 40 pixels
-k5000_lw40.csv: raw data containing phase data and pixel number taken across the line profile shown in fig_s1a.png for \kappa = 5000 and linewidth 40 pixels
-fig_s1.ipynb: jupyter notebook for loading the .csv data, analysing and generating fig_s1c.png to fig_s1f.png
-fig_s1a-s1f.png: sub-panels of fig_s1
-phi_500.dat and phi_5000.dat: raw .dat files of the phase images taken from qDIC imaging; can be loaded as text images in ImageJ for processing. Images are rotated by -135.1 degrees with bicubic interpolation.
-fig_s1_README.txt

s2
-psf/: directory containing raw data of the 405 voxel comprised of a tiff stack (405_psf.tiff), accompanying coordinates (405_psf.txt), and colormap
-fig_s2.ipynb: jupyter notebook for loading the data and generating figure s2
-fig_s2.png: raw .png file of fig_s2
-fig_s2_README.txt

s3
-fig_s3_vtk/: folder containing the .vtk file used for the avw shown in supplementary figure s3; use fig3/paraview/paraview_state_fig3_s3-5.pvsm to load and visualise the data
-fig_s3a-s3d.png: .png files of the sub-panels of supplementary figure s3
-fig_s3_README.txt

s4
-fig_s4_vtk/: folder containing the .vtk file used for the vw shown in supplementary figure s4; use fig3/paraview/paraview_state_fig3_s3-5.pvsm to load and visualise the data
-fig_s4a-s4d.png: .png files of the sub-panels of supplementary figure s4
-fig_s4_README.txt

s5
-fig_s5_vtk/: folder containing the .vtk file used for the cvw shown in supplementary figure s5; use fig3/paraview/paraview_state_fig3_s3-5.pvsm to load and visualise the data
-fig_s5a-s5d.png: .png files of the sub-panels of supplementary figure s5
-fig_s5_README.txt

s6
-matt_sss_190903ja_inplane_oct19th.007: raw afm/mfm data
-fig_s6a.png: processed image of the raw mfm data above
-matt_sss_190903ja_inplane_oct19th_supp_fig6_afm.txt: raw afm profile data
-matt_sss_190903ja_inplane_oct19th_supp_fig6_afm.txt: raw mfm profile data
-fig_s6.ipynb: jupyter notebook used to load the above data and generate the z-profile and normalised phase plots
-fig_s6d-e.png: .png files of the sub-panels shown in supplementary figure s6
-fig_s6_README.txt

s7
-matt_sss_190903ja_inplane_oct14th_010_supp_mfm_bot.txt: raw mfm profile data of the bottom blue region
-matt_sss_190903ja_inplane_oct14th_010_supp_mfm_bot.txt: raw mfm profile data of the top red region
-fig_s7.ipynb: jupyter notebook used to load the above data and generate the z-profiles and normalised phase plots
-fig_s7f-g.png: .png files of the sub-panels shown in supplementary figure s7
-fig_s7_README.txt
-NOTE: the mfm image shown in this supplementary figure is identical to the one in figures 4c-4d

s8
-l1/: folder containing raw data (.txt with z-profiles, roughness and waviness, and collated .csv files) for l1 wires
-fig_s8.ipynb: jupyter notebook used to load the above data and generate the z-profile, total projected field and depinning fields
-fig_s8.png: .png of supplementary figure s8
-fig_s8_README.txt

st1
-st1_energies_pops.csv: contains the energy densities and population statistics of the relaxed dws in the micromagnetic simulations, shown in supplementary table 1
-st1_README.txt
Dataset Information
This dataset presents long-term indoor solar harvesting traces, jointly monitored with the ambient conditions. The data were recorded at 6 indoor positions with diverse characteristics at our institute at ETH Zurich in Zurich, Switzerland.
The data is collected with a measurement platform [3] consisting of a solar panel (AM-5412) connected to a bq25505 energy harvesting chip that stores the harvested energy in a virtual battery circuit. Two TSL45315 light sensors placed on opposite sides of the solar panel monitor the illuminance level and a BME280 sensor logs ambient conditions like temperature, humidity and air pressure.
The dataset contains the measurement of the energy flow at the input and the output of the bq25505 harvesting circuit, as well as the illuminance, temperature, humidity and air pressure measurements of the ambient sensors. The following timestamped data columns are available in the raw measurement format, as well as preprocessed and filtered HDF5 datasets:
V_in - Converter input/solar panel output voltage, in volt
I_in - Converter input/solar panel output current, in ampere
V_bat - Battery voltage (emulated through circuit), in volt
I_bat - Net Battery current, in/out flowing current, in ampere
Ev_left - Illuminance left of solar panel, in lux
Ev_right - Illuminance right of solar panel, in lux
P_amb - Ambient air pressure, in pascal
RH_amb - Ambient relative humidity, unit-less between 0 and 1
T_amb - Ambient temperature, in centigrade Celsius
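Given the columns above, the harvested input power follows directly as P = V_in x I_in (and net battery power as V_bat x I_bat). A minimal pandas sketch (the values below are made up; with the actual dataset the frame would instead be read from an HDF5 file in the processed/ folder):

```python
import pandas as pd

# Stand-in for a few rows of a processed measurement file
df = pd.DataFrame({
    "V_in": [0.52, 0.55, 0.50],        # converter input voltage, volt
    "I_in": [1.2e-3, 1.5e-3, 0.9e-3],  # converter input current, ampere
})

# Converter input power in watt
df["P_in"] = df["V_in"] * df["I_in"]
print(df["P_in"].mean())
```

The processing notebooks listed below compute column-wise statistics of exactly this kind on the full traces.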
The following publication presents an overview of the dataset and more details on the deployment used for data collection. A copy of the abstract is included in this dataset; see the file abstract.pdf.
L. Sigrist, A. Gomez, and L. Thiele. "Dataset: Tracing Indoor Solar Harvesting." In Proceedings of the 2nd Workshop on Data Acquisition To Analysis (DATA '19), 2019.
Folder Structure and Files
processed/ - This folder holds the imported, merged and filtered datasets of the power and sensor measurements. The datasets are stored in HDF5 format and split by measurement position posXX and by power and ambient sensor measurements. The files belonging to this folder are contained in archives named yyyy_mm_processed.tar, where yyyy and mm represent the year and month the data was published. A separate file lists the exact content of each archive (see below).
raw/ - This folder holds the raw measurement files recorded with the RocketLogger [1, 2] and using the measurement platform available at [3]. The files belonging to this folder are contained in archives named yyyy_mm_raw.tar, where yyyy and mm represent the year and month the data was published. A separate file lists the exact content of each archive (see below).
LICENSE - License information for the dataset.
README.md - The README file containing this information.
abstract.pdf - A copy of the above mentioned abstract submitted to the DATA '19 Workshop, introducing this dataset and the deployment used to collect it.
raw_import.ipynb [open in nbviewer] - Jupyter Python notebook to import, merge, and filter the raw dataset from the raw/ folder. This is the exact code used to generate the processed dataset and store it in the HDF5 format in the processed/ folder.
raw_preview.ipynb [open in nbviewer] - This Jupyter Python notebook imports the raw dataset directly and plots a preview of the full power trace for all measurement positions.
processing_python.ipynb [open in nbviewer] - Jupyter Python notebook demonstrating the import and use of the processed dataset in Python. Calculates column-wise statistics, includes more detailed power plots and the simple energy predictor performance comparison included in the abstract.
processing_r.ipynb [open in nbviewer] - Jupyter R notebook demonstrating the import and use of the processed dataset in R. Calculates column-wise statistics and extracts and plots the energy harvesting conversion efficiency included in the abstract. Furthermore, the harvested power is analyzed as a function of the ambient light level.
Dataset File Lists
Processed Dataset Files
The list of the processed datasets included in the yyyy_mm_processed.tar archive is provided in yyyy_mm_processed.files.md. The markdown formatted table lists the name of all files, their size in bytes, as well as the SHA-256 sums.
Raw Dataset Files
A list of the raw measurement files included in the yyyy_mm_raw.tar archive(s) is provided in yyyy_mm_raw.files.md. The markdown formatted table lists the name of all files, their size in bytes, as well as the SHA-256 sums.
Dataset Revisions
v1.0 (2019-08-03)
Initial release. Includes the data collected from 2017-07-27 to 2019-08-01. The dataset archive files related to this revision are 2019_08_raw.tar and 2019_08_processed.tar. For position pos06, the measurements from 2018-01-06 00:00:00 to 2018-01-10 00:00:00 are filtered (data inconsistency in file indoor1_p27.rld).
v1.1 (2019-09-09)
Revision of the processed dataset v1.0 and addition of the final dataset abstract. Updated processing scripts reduce the timestamp drift in the processed dataset, the archive 2019_08_processed.tar has been replaced. For position pos06, the measurements from 2018-01-06 16:00:00 to 2018-01-10 00:00:00 are filtered (indoor1_p27.rld data inconsistency).
v2.0 (2020-03-20)
Addition of new data. Includes the raw data collected from 2019-08-01 to 2020-03-16. The processed data is updated with full coverage from 2017-07-27 to 2020-03-16. The dataset archive files related to this revision are 2020_03_raw.tar and 2020_03_processed.tar.
Dataset Authors, Copyright and License
Authors: Lukas Sigrist, Andres Gomez, and Lothar Thiele
Contact: Lukas Sigrist (lukas.sigrist@tik.ee.ethz.ch)
Copyright: (c) 2017-2019, ETH Zurich, Computer Engineering Group
License: Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/)
References
[1] L. Sigrist, A. Gomez, R. Lim, S. Lippuner, M. Leubin, and L. Thiele. Measurement and validation of energy harvesting IoT devices. In Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017.
[2] ETH Zurich, Computer Engineering Group. RocketLogger Project Website, https://rocketlogger.ethz.ch/.
[3] L. Sigrist. Solar Harvesting and Ambient Tracing Platform, 2019. https://gitlab.ethz.ch/tec/public/employees/sigristl/harvesting_tracing
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This archive contains the data and Python code generating figures for the article "Emergent ferromagnetism near three-quarters filling in twisted bilayer graphene" by Aaron L. Sharpe, Eli J. Fox, Arthur W. Barnard, Joe Finney, Kenji Watanabe, Takashi Taniguchi, M. A. Kastner, and David Goldhaber-Gordon, available at https://arxiv.org/abs/1901.03520. This archive contains the following:
1) TBG_ferromagnetism_figures.ipynb, a Jupyter notebook loading data and generating figures. The notebook has been tested with Python version 3.6.7 and Jupyter notebook server version 5.5.0.
2) HTML_notebook directory, containing TBG_ferromagnetism_figures.html, an HTML file generated from the Jupyter notebook, and PNG files loaded by the HTML file.
3) scripts directory, containing additional files used by the Jupyter notebook.
4) data directory, containing all data used to generate figures for the manuscript, stored as JSON objects.
Refer to the notebook for figure captions describing the data.
This resource supports the work published in Strauch et al., (2018) "A hydroclimatological approach to predicting regional landslide probability using Landlab", Earth Surf. Dynam., 6, 1-26. It demonstrates a hydroclimatological approach to modeling of regional shallow landslide initiation based on the infinite slope stability model coupled with a steady-state subsurface flow representation. The model component is available as the LandslideProbability component in Landlab, an open-source, Python-based landscape earth systems modeling environment described in Hobley et al. (2017, Earth Surf. Dynam., 5, 21-46, https://doi.org/10.5194/esurf-5-21-2017). The model operates on a digital elevation model (DEM) grid to which local field parameters, such as cohesion and soil depth, are attached. A Monte Carlo approach is used to account for parameter uncertainty and calculate the probability of shallow landsliding as well as the probability of soil saturation based on annual maximum recharge. The model is demonstrated in a steep mountainous region in northern Washington, U.S.A., using 30-m grid resolution over 2,700 km^2.
This resource contains a 1) User Manual that describes the Landlab LandslideProbability Component design, parameters, and step-by-step guidance on using the component in a model, and 2) two Landlab driver codes (notebooks) and customized component code to run Landlab's LandslideProbability component for 2a) synthetic recharge and 2b) modeled recharge published in Strauch et al., (2018). The Jupyter Notebooks use HydroShare code libraries to import data located at this resource: https://www.hydroshare.org/resource/a5b52c0e1493401a815f4e77b09d352b/.
The Synthetic Recharge Jupyter Notebook
The Modeled Recharge Jupyter Notebook
In our NFDITalks, scientists from different disciplines present exciting topics around NFDI and research data management. In this episode, Björn Hagemeier will talk about "Jupyter4NFDI - a central Jupyter Hub for the NFDI".
Jupyter Notebooks are widespread across scientific disciplines today. However, their deployment across the various NFDI consortia currently occurs through individual JupyterHubs, resulting in access barriers to computational and data resources. Whereas some of the services are widely available, others are barricaded behind VPNs or otherwise inaccessible to a wider audience. Our ambition is to improve the user experience by offering a centralized service to extend the reach of Jupyter to a broader audience within the NFDI and beyond. The technical foundation for our service will be the versatile configuration frontend that has been proven to meet user needs over the past seven years at JSC. It is continuously extended and traces an ever-growing set of backend resources, ranging from cloud-based, small-scale JupyterLabs to full-scale remote desktop environments on high-performance computing systems such as Germany's highest-ranked TOP500 system, JUWELS Booster.
Importantly, the centralized system will not only simplify access but also support the import of projects along with their necessary dependencies, fostering an ecosystem conducive to creating reproducible FAIR Digital Objects (FDOs), possibly along with notebook identifiers supported by PID4NFDI.
In this talk, we'll revisit the history of the current solution, describe the landscape in which we intend to make it available, and give an outlook on the future of the service.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Gaia EDR3 Catalogs of Machine-Learned Radial Velocities
Spatially complete Test-Set and Machine-Learned Radial Velocity (ML-RV) Catalogs described in Dropulic et al., arXiv:2205.12278. The spatially complete Test-Set Catalog contains a total of 4,332,657 stars, while the spatially complete ML-RV Catalog contains 91,840,346 stars. We provide Gaia EDR3 Source IDs, the network-predicted line-of-sight velocity in km/s, and the network-predicted uncertainty in km/s.
We have included a simple Jupyter notebook demonstrating how to import the data and make a simple histogram from it.
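In the spirit of that notebook, a minimal histogram of the predicted line-of-sight velocities might look like the following. Synthetic stand-in data and column names are used here since the catalog files are large; real usage would read the published file instead (e.g. `cat = pd.read_csv("mlrv_catalog.csv")`, a hypothetical filename):

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the script runs headless
import matplotlib.pyplot as plt

# Synthetic stand-in for the catalog: source IDs, predicted velocity, uncertainty
rng = np.random.default_rng(1)
cat = pd.DataFrame({
    "source_id": np.arange(10_000),
    "v_los": rng.normal(0.0, 40.0, 10_000),       # km/s
    "v_los_err": rng.uniform(5.0, 25.0, 10_000),  # km/s
})

fig, ax = plt.subplots()
ax.hist(cat["v_los"], bins=100)
ax.set_xlabel("predicted line-of-sight velocity [km/s]")
ax.set_ylabel("stars")
fig.savefig("vlos_hist.png")
```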
If you find this catalog useful in your work, please cite Dropulic et al. arXiv:2205.12278, as well as Dropulic et al. ApJL 915, L14 (2021) arXiv:2103.14039.
For environmental data measured by a variety of sensors and compiled from various sources, practitioners need tools that facilitate data access and analysis. Data are often organized in mutually incompatible formats that prevent full data integration, and analyses are hampered by inadequate mechanisms for storage and organization. Ideally, data should be centrally housed and organized in an intuitive structure with established patterns for analyses. In reality, the data are often scattered across multiple files without a uniform structure, and these files must be transferred between users and loaded individually and manually for each analysis. This effort describes a process for compiling environmental data into a single, central database that can be accessed for analyses. We use the Logan River watershed and observed water level, discharge, specific conductance, and temperature as a test case; of particular interest is analysis of flow partitioning. We formatted data files and organized them into a hierarchy, and we developed scripts that import the data into a database structured for hydrologic time series. Scripts access the populated database to perform baseflow separation, flow balance, and mass balance calculations and to visualize the results. The analyses were compiled into a package of Python scripts, which scientists and researchers can modify and run to determine gains and losses in reaches of interest. To facilitate reproducibility, the database and associated scripts were shared on HydroShare as Jupyter Notebooks so that any user can access the data and perform the analyses, which promotes standardization of these operations.
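As a sketch of the kind of baseflow separation such scripts perform, the widely used Lyne-Hollick single-parameter digital filter can be written in a few lines. This is one common technique chosen for illustration; the HydroShare notebooks define the actual method used:

```python
import numpy as np

def baseflow_lyne_hollick(q, alpha=0.925):
    """One forward pass of the Lyne-Hollick digital filter.

    Splits a discharge series q into quickflow and baseflow; the
    quickflow component is clipped to [0, q] at each step.
    """
    quick = np.zeros_like(q, dtype=float)
    quick[0] = q[0] / 2.0
    for i in range(1, len(q)):
        quick[i] = alpha * quick[i - 1] + (1 + alpha) / 2.0 * (q[i] - q[i - 1])
        quick[i] = min(max(quick[i], 0.0), q[i])  # keep physically plausible
    base = q - quick
    return base, quick

# Synthetic storm hydrograph: steady baseflow, a rise, and a recession
q = np.concatenate([np.full(50, 1.0),
                    np.linspace(1.0, 8.0, 20),
                    np.linspace(8.0, 1.0, 60)])
base, quick = baseflow_lyne_hollick(q)
```

Multiple forward/backward passes and other filter parameters are common refinements of this basic scheme.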
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The MCCN project delivers tools to help the agricultural sector understand crop-environment relationships, specifically by facilitating the generation of data cubes for spatiotemporal data. This repository contains Jupyter notebooks that demonstrate the functionality of the MCCN data cube components. The dataset contains input files for the case study (source_data), RO-Crate metadata (ro-crate-metadata.json), results from the case study (results), and a Jupyter Notebook (MCCN-CASE 3.ipynb).

Research Activity Identifier (RAiD): https://doi.org/10.26292/8679d473

Case Studies
This repository contains code and sample data for the following case studies. Note that the analyses here demonstrate the software; the results should not be considered scientifically or statistically meaningful. No effort has been made to address bias in samples, and sample data may not be available at sufficient density to warrant analysis. All case studies end with generation of an RO-Crate data package including the source data, the notebook, and generated outputs, including NetCDF exports of the data cubes themselves.

Case Study 3 - Select optimal survey locality
Given a set of existing survey locations across a variable landscape, determine the optimal site to add to increase the range of surveyed environments.
This study demonstrates: 1) loading heterogeneous data sources into a cube, and 2) analysis and visualisation using numpy and matplotlib.

Data Sources
The primary goal for this case study is to demonstrate importing a set of environmental values for different sites and then using these to identify a subset that maximises spread across the various environmental dimensions. This is a simple implementation that uses four environmental attributes imported for all of Australia (or a subset such as NSW) at a moderate grid scale:
- Digital soil maps for key soil properties over New South Wales, version 2.0 - SEED - see https://esoil.io/TERNLandscapes/Public/Pages/SLGA/ProductDetails-SoilAttributes.html
- ANUCLIM Annual Mean Rainfall raster layer - SEED - see https://datasets.seed.nsw.gov.au/dataset/anuclim-annual-mean-rainfall-raster-layer
- ANUCLIM Annual Mean Temperature raster layer - SEED - see https://datasets.seed.nsw.gov.au/dataset/anuclim-annual-mean-temperature-raster-layer

Dependencies
This notebook requires Python 3.10 or higher. Install the relevant Python libraries with: pip install mccn-engine rocrate. Installing mccn-engine will install the other dependencies.

Overview
- Generate STAC metadata for layers from a predefined configuration
- Load the data cube and exclude nodata values
- Scale all variables to a 0.0-1.0 range
- Select four layers for comparison (soil organic carbon 0-30 cm, soil pH 0-30 cm, mean annual rainfall, mean annual temperature)
- Select 10 random points within NSW
- Generate 10 new layers representing the standardised environmental distance between one of the selected points and all other points in NSW
- For every point in NSW, find the lowest environmental distance to any of the selected points
- Select the point in NSW that has the highest value for the lowest environmental distance to any selected point - this is the most different point
- Clean up and save results to RO-Crate
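The core selection logic described above (scaled layers, nearest environmental distance per cell, then picking the cell whose nearest surveyed site is environmentally farthest) can be sketched with numpy alone. The random array below is a stand-in for the flattened, rescaled data cube:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the scaled cube: 4 environmental layers flattened to
# (n_cells, 4), each already rescaled to the 0.0-1.0 range
# (SOC, pH, rainfall, temperature).
env = rng.random((5000, 4))

# Indices of the existing survey cells (10 random points)
surveyed = rng.choice(len(env), size=10, replace=False)

# Environmental distance from every cell to each surveyed cell,
# then the nearest (minimum) distance per cell.
d = np.linalg.norm(env[:, None, :] - env[surveyed][None, :, :], axis=2)
nearest = d.min(axis=1)

# The cell whose nearest surveyed site is environmentally farthest away
# is the most novel place to add a survey.
best = int(np.argmax(nearest))
```

The notebook performs the equivalent computation on the real raster layers and writes the resulting distance layers back into the cube.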
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
🔹 Release v1.0 - Duffing Oscillator Response Analysis (DORA)
This release provides a collection of benchmark tasks and datasets, accompanied by minimal code to generate, import, and plot the data. The primary focus is on the Duffing Oscillator Response Analysis (DORA) prediction task, which evaluates machine learning models' ability to generalize system responses in unseen parameter regimes.
🚀 Key Features:
Duffing Oscillator Response Analysis (DORA) Prediction Task:
Objective: Predict the response of a forced Duffing oscillator using a minimal training dataset. This task assesses a model's capability to extrapolate system behavior in unseen parameter regimes, specifically varying amplitudes of external periodic forcing.
Expectation: A proficient model should qualitatively capture the system's response, such as identifying the exact number of cycles in a limit-cycle regime or chaotic trajectories when the system transitions to a chaotic regime, all trained on limited datasets.
Comprehensive Dataset:
Training Data (DORA_Train.csv): Contains data for two external forcing amplitudes, f ∈ {0.46, 0.49}.
Testing Data (DORA_Test.csv): Contains data for five forcing amplitudes, f ∈ {0.2, 0.35, 0.48, 0.58, 0.75}.
📊 Data Description:
Each dataset comprises five columns:
| Column | Description |
| --- | --- |
| t | Time variable |
| q1(t) | Time evolution of the Duffing oscillator's position |
| q2(t) | Time evolution of the Duffing oscillator's velocity |
| f(t) | Time evolution of the external periodic forcing |
| f_amplitude | Constant amplitude during system evaluation (default: 250) |
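To illustrate how rows with these columns might be produced, here is a minimal forced-Duffing integration. The coefficients and forcing frequency below are illustrative assumptions; the released DORA_generator.py defines the actual benchmark parameters:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Illustrative double-well Duffing parameters (assumed, not the benchmark's)
delta, alpha, beta, omega = 0.3, -1.0, 1.0, 1.2
f_amp = 0.48  # one of the training forcing amplitudes

def duffing(t, y):
    q1, q2 = y
    force = f_amp * np.cos(omega * t)
    return [q2, -delta * q2 - alpha * q1 - beta * q1**3 + force]

t_eval = np.linspace(0.0, 250.0, 5001)
sol = solve_ivp(duffing, (0.0, 250.0), [0.0, 0.0], t_eval=t_eval, rtol=1e-8)

# Columns matching the CSV layout: t, q1(t), q2(t), f(t), f_amplitude
data = np.column_stack([
    sol.t, sol.y[0], sol.y[1],
    f_amp * np.cos(omega * sol.t),
    np.full_like(sol.t, f_amp),
])
```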
🛠 Utility Scripts and Notebooks:
Data Generation and Visualization:
DORA_generator.py: Generates, plots, and saves training and testing data. Usage:
python DORA_generator.py -time 250 -plots 1
DORA.ipynb: A Jupyter Notebook for dataset generation, loading, and plotting.
Data Loading and Plotting:
ReadData.py: Loads and plots the provided datasets (DORA_Train.csv and DORA_Test.csv).
📈 Model Evaluation:
The prediction model's success is determined by its ability to extrapolate system behavior outside the training data. System response characteristics for external forcing are quantified in terms of the amplitude and mean of q1²(t). These can be obtained using the provided Signal_Characteristic function.
🔹 Performance Metrics:
Response Amplitude Error: MSE[max(q1_prediction²(t > t*)), max(q1_original²(t > t*))]
Response Mean Error: MSE[Mean(q1_prediction²(t > t*)), Mean(q1_original²(t > t*))]
Note: t* = 20 s denotes the steady-state time.
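These two metrics can be sketched as follows (the MSE of two scalars reduces to a squared difference). This is an illustrative re-implementation; the released Signal_Characteristic function is the reference:

```python
import numpy as np

def response_errors(t, q1_pred, q1_true, t_star=20.0):
    """Amplitude and mean errors of q1^2 over the steady state (t > t*)."""
    m = t > t_star                                  # steady-state mask
    a_pred, a_true = np.max(q1_pred[m] ** 2), np.max(q1_true[m] ** 2)
    m_pred, m_true = np.mean(q1_pred[m] ** 2), np.mean(q1_true[m] ** 2)
    amp_err = (a_pred - a_true) ** 2                # response amplitude error
    mean_err = (m_pred - m_true) ** 2               # response mean error
    return amp_err, mean_err
```

A perfect prediction yields (0.0, 0.0); larger values indicate worse extrapolation of the steady-state response.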
📌 Reference Implementation:
An exemplar solution using reservoir computing is detailed in the following: 📖 Yadav et al., 2025 – Springer Nonlinear Dynamics
📄 Citation:
If you utilize this dataset or code in your research, please cite:
@article{Yadav2024,
  author  = {Manish Yadav and Swati Chauhan and Manish Dev Shrimali and Merten Stender},
  title   = {Predicting multi-parametric dynamics of an externally forced oscillator using reservoir computing and minimal data},
  journal = {Nonlinear Dynamics},
  year    = {2024},
  doi     = {10.1007/s11071-024-10720-w}
}
https://darus.uni-stuttgart.de/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.18419/DARUS-3881
The datasets and codes provided here are associated with our article entitled "Data-driven analysis of structural instabilities in electroactive polymer bilayers based on a variational saddle-point principle". The main idea of the work is to develop surrogate models using machine learning (ML) to predict the onset of wrinkling instabilities in dielectric elastomer (DE) bilayers as a function of their tunable geometric and material parameters. The datasets required for building the surrogate models are generated using a finite-element-based framework for structural stability analysis of DE specimens that is rooted in a saddle-point-based variational principle. For a detailed description of this finite-element framework, the sampling of data points for the training/test sets, and brief notes on our implementation of the ML-based surrogates, please refer to the article mentioned above. The datasets 'training_set.xlsx' and 'test_set.xlsx' contain the values of the critical buckling load (critical electric-charge density) and critical wrinkle count of the DE bilayer for the sampled data points, where each data point represents a unique set of four tunable input-feature values. The article describes these features, their physical units, and their considered domain of values. The individual Jupyter notebooks import the training dataset and develop ML models for the different problems described in the article. The developed models are cross-validated and then tested on the test dataset. Extensive comments describing the ML workflow are included in the notebooks for the user's reference. The conda environment containing all necessary packages and dependencies for executing the Jupyter notebooks is provided in the file 'de_instabilities.yml'.
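The surrogate-modeling workflow (cross-validated regression on four input features) can be sketched as follows. Synthetic stand-in data and an arbitrary regressor are used here for illustration; the actual notebooks read the published Excel datasets and use the models described in the article:

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_score

# Synthetic stand-in: 200 samples of four tunable input features and a
# target standing in for the critical buckling load.
rng = np.random.default_rng(0)
X = rng.random((200, 4))
y = X @ np.array([2.0, -1.0, 0.5, 1.5]) + 0.05 * rng.normal(size=200)

# Cross-validate the surrogate on the training data, then fit on all of it;
# a held-out test set would then be evaluated with model.predict().
model = GradientBoostingRegressor(random_state=0)
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
model.fit(X, y)
```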
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
redspot replay

```python
from redspot import database
from redspot.notebook import Notebook

nbk = Notebook()
for signal in database.get("path-to-db"):
    time, panel, kind, args = signal
    nbk.apply(kind, args)  # apply change
print(nbk)  # print notebook
```

redspot record

```shell
docker run --rm -it -p8888:8888
```