Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Author: Andrew J. Felton
Date: 5/5/2024
This R project contains the primary code and data (following pre-processing in Python) used for data production, manipulation, visualization, analysis, and figure production for the study entitled:
"Global estimates of the storage and transit time of water through vegetation"
Please note that 'turnover' and 'transit' are used interchangeably in this project.
Data information:
The data folder contains key data sets used for analysis. In particular:
"data/turnover_from_python/updated/annual/multi_year_average/average_annual_turnover.nc" contains a global array summarizing five year (2016-2020) averages of annual transit, storage, canopy transpiration, and number of months of data. This is the core dataset for the analysis; however, each folder has much more data, including a dataset for each year of the analysis. Data are also available is separate .csv files for each land cover type. Oterh data can be found for the minimum, monthly, and seasonal transit time found in their respective folders. These data were produced using the python code found in the "supporting_code" folder given the ease of working with .nc and EASE grid in the xarray python module. R was used primarily for data visualization purposes. The remaining files in the "data" and "data/supporting_data"" folder primarily contain ground-based estimates of storage and transit found in public databases or through a literature search, but have been extensively processed and filtered here.
Code information:
Python scripts can be found in the "supporting_code" folder.
Each R script in this project has a particular function:
01_start.R: This script loads the R packages used in the analysis, sets the
directory, and imports custom functions for the project. You can also load in the
main transit time (turnover) datasets here using the `source()` function.
02_functions.R: This script contains the custom function for this analysis,
primarily to work with importing the seasonal transit data. Load this using the
`source()` function in the 01_start.R script.
03_generate_data.R: This script is not necessary to run and is primarily
for documentation. The main role of this code was to import and wrangle
the data needed to calculate ground-based estimates of aboveground water storage.
04_annual_turnover_storage_import.R: This script imports the annual turnover and
storage data for each landcover type. You load in these data from the 01_start.R script
using the `source()` function.
05_minimum_turnover_storage_import.R: This script imports the minimum turnover and
storage data for each landcover type. Minimum is defined as the lowest monthly
estimate. You load in these data from the 01_start.R script
using the `source()` function.
06_figures_tables.R: This is the main workhorse for figure/table production and
supporting analyses. This script generates the key figures and summary statistics
used in the study, which are then saved in the manuscript_figures folder. Note that all
maps were produced using Python code found in the "supporting_code" folder.
Ngoc Son Python Crocodile Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.
Dataset Card for Python-DPO
This dataset is the smaller version of the Python-DPO-Large dataset and has been created using Argilla.
Load with datasets
To load this dataset with datasets, you'll just need to install datasets as pip install datasets --upgrade and then use the following code:
from datasets import load_dataset
ds = load_dataset("NextWealth/Python-DPO")
Data Fields
Each data instance contains:
instruction: The problem description/requirements… See the full description on the dataset page: https://huggingface.co/datasets/NextWealth/Python-DPO.
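A short follow-up sketch (the 'train' split name and the 'instruction' column are assumptions based on the field description above; check the dataset page for the actual splits and columns):
# list the available splits, then peek at one record's instruction field
print(ds)
example = ds["train"][0]
print(example["instruction"])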
The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
To use this dataset:
import tensorflow_datasets as tfds
ds = tfds.load('cifar10', split='train')
for ex in ds.take(4):
  print(ex)
See the guide for more information on tensorflow_datasets.
Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/cifar10-3.0.2.png
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Open Context (https://opencontext.org) publishes free and open access research data for archaeology and related disciplines. An open source (but bespoke) Django (Python) application supports these data publishing services. The software repository is here: https://github.com/ekansa/open-context-py
The Open Context team runs ETL (extract, transform, load) workflows to import data contributed by researchers from various source relational databases and spreadsheets. Open Context uses a PostgreSQL (https://www.postgresql.org) relational database to manage these imported data in a graph-style schema. The Open Context Python application interacts with the PostgreSQL database via the Django Object-Relational Model (ORM).
This database dump includes all published structured data organized and used by Open Context (table names that start with 'oc_all_'). The binary media files referenced by these structured data records are stored elsewhere. Binary media files for some projects, still in preparation, are not yet archived with long term digital repositories.
These data comprehensively reflect the structured data currently published and publicly available on Open Context. Other data (such as user and group information) used to run the Website are not included.
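For illustration, a minimal sketch of inspecting the imported tables (assuming the dump has been restored into a local PostgreSQL database and the psycopg2 package is installed; the database name and connection parameters below are placeholders, not part of this archive):
import psycopg2
# connect to the restored database (adjust name, credentials and host as needed)
conn = psycopg2.connect(dbname="opencontext", user="postgres", password="postgres", host="localhost")
cur = conn.cursor()
# list the Open Context tables, which all start with 'oc_all_'
cur.execute("SELECT table_name FROM information_schema.tables WHERE table_name LIKE 'oc_all_%'")
for (table_name,) in cur.fetchall():
    print(table_name)
conn.close()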
IMPORTANT
This database dump contains data from more than 190 different projects. Each project dataset has its own metadata and citation expectations. If you use these data, you must cite each data contributor appropriately, not just this Zenodo archived database dump.
Python Llc Company Export Import Records. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
Subscribers can look up export and import data for 23 countries by HS code or product name. This demo is helpful for market analysis.
Antonin Python Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
https://www.gnu.org/licenses/gpl-3.0-standalone.html
This dataset contains infrared recordings from a series of drop tower experiments conducted at ZARM. Thin PMMA samples were combusted under controlled atmospheric conditions and forced opposed flow. An overview of all experiment conditions and sample types is given in the index.csv file. Two kinds of samples were investigated: "full samples", i.e. continuous fuel strips, and "gap samples", i.e. fuel strips separated by air gaps of different widths. Consequently, if the `Gaps [Sequence]` or `Gap Material [Material]` fields are empty, a "full sample" was tested. For the "gap samples", the sequence of gaps is given as the individual widths of each gap in mm separated by slashes, e.g. '3/4/5' refers to a sample with 3, 4 and 5 mm gaps.
The radial lens distortion of the individual frames was corrected, as was the temperature. The frames are stored as 3-dimensional arrays, with the first two dimensions being the spatial dimensions and the third the time (or frame count) dimension. The value of each pixel corresponds to the measured temperature in °C at the pixel's location.
The experiment overview is given in the index.csv file. Here, the metadata for each experiment is given. The corresponding IR data can be found in the `Data Location [File]` column.
The frames are stored in hdf5 files as the 'data' dataset. The corresponding metadata is given as serialized JSON as the 'metadata' dataset. An example for loading the files in Python is given below:
import json
import h5py
import numpy as np
data_file = "exp0.hdf5"
with h5py.File(data_file, 'r') as f:
    metadata = json.loads(f["metadata"][()])  # dictionary with date, sample and experiment conditions
    data = np.array(f["data"])  # individual IR frames as a 3-dimensional array of shape (x, y, t)
The authors thank Jan Heißmeier and Michael Peters for technical support during the design phase and the experimental campaigns. This research has been funded by the German Federal Ministry for Economic Affairs and Climate Action through the German Space Agency at DLR in the framework of FLARE-G II & III (grants 50WM2160 & 50WM2456).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data used in the various stage two experiments in: "Comparing Clustering Approaches for Smart Meter Time Series: Investigating the Influence of Dataset Properties on Performance". This includes datasets with varied characteristics. All datasets are stored in a dict with tuples of (time series array, class labels). To access the data in Python:
import pickle
filename = "dataset.txt"
with open(filename, 'rb') as f:
    data = pickle.load(f)
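A short follow-up sketch (the dictionary keys are the dataset names, which are not listed here, so the loop simply iterates over whatever is present):
# each entry maps a dataset name to a (time series array, class labels) tuple
for name, (series, labels) in data.items():
    print(name, len(series), len(labels))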
Ballroom Python South Company Export Import Records. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains hyperspectral images obtained using SPECIM IQ for the Munsell soil color chart (MSC).
The hyperspectral images are stored in ENVI format. For those who are only interested in the endmember spectra for the MSC, we also provide the spectral library (.sli and .hdr) inside the endmembers folder.
The acquisition details for each image can be found in the .hdr file and metadata folder inside the whole folder. For the whole image, the acquisition details are:
Table 1. Acquisition details
samples | 512
lines | 512
bands | 204
default bands | 70, 53, 19
binning | 1, 1
tint (integration time) | 10 ms
fps | 100
wavelength range | 397.32 - 1003.58 nm
The dataset is organized into several folders, each containing different types of datasets.
chips folder contains only the cropped 20×20 voxels for each color chip's reflectance. Each page has its own folder, and each folder contains a .hdr and .img file for each color chip.
endmembers folder contains the spectral library (.sli and .hdr). Each page in the MSC has its own .sli and .hdr.
Some code snippets that might help to read the dataset, using the Python spectral library to load the data:
from spectral import *
import matplotlib.pyplot as plt
# load the hyperspectral image .hdr and store it to a variable
hsi = open_image(PATH)
# get the natural RGB plotting of the hyperspectral image using the SPECIM main band
hsi_rgb = hsi[:,:,[70,53,19]]
# read the spectral library .sli and store it to a variable
sli = open_image(PATH)
# plot the first endmember
plt.plot(sli.spectra[0])
# get the endmember names
sli.names
If you have any questions, kindly reach me at riestiyf@stud.ntnu.no.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Multimodal Vision-Audio-Language Dataset is a large-scale dataset for multimodal learning. It contains 2M video clips with corresponding audio and a textual description of the visual and auditory content. The dataset is an ensemble of existing datasets and fills the gap of missing modalities. Details can be found in the attached report.
Annotation
The annotation files are provided as Parquet files. They can be read using Python with the pandas and pyarrow libraries. The split into train, validation and test set follows the split of the original datasets.
Installation
pip install pandas pyarrow
Example
import pandas as pd
df = pd.read_parquet('annotation_train.parquet', engine='pyarrow')
print(df.iloc[0])
dataset              AudioSet
filename             train/---2_BBVHAA.mp3
captions_visual      [a man in a black hat and glasses.]
captions_auditory    [a man speaks and dishes clank.]
tags                 [Speech]
Description
The annotation file consists of the following fields:
filename: Name of the corresponding file (video or audio file)
dataset: Source dataset associated with the data point
captions_visual: A list of captions related to the visual content of the video. Can be NaN in case of no visual content
captions_auditory: A list of captions related to the auditory content of the video
tags: A list of tags, classifying the sound of a file. It can be NaN if no tags are provided
Data files
The raw data files for most datasets are not released due to licensing issues. They must be downloaded from the source. However, due to missing files, we provide them on request. Please contact us at schaumloeffel@em.uni-frankfurt.de
Data model and generic query templates for translating and integrating a set of related CSV event logs into a single event graph, as used in https://dx.doi.org/10.1007/s13740-021-00122-1
Provides input data for 5 datasets (BPIC14, BPIC15, BPIC16, BPIC17, BPIC19)
Provides Python scripts to prepare and import each dataset into a Neo4j database instance through Cypher queries, representing behavioral information not globally (as in an event log), but locally per entity and per relation between entities.
Provides Python scripts to retrieve event data from a Neo4j database instance and render it using Graphviz dot.
The data model and queries are described in detail in: Stefan Esser, Dirk Fahland: Multi-Dimensional Event Data in Graph Databases (2020) https://arxiv.org/abs/2005.14552 and https://dx.doi.org/10.1007/s13740-021-00122-1
Fork the query code from Github: https://github.com/multi-dimensional-process-mining/graphdb-eventlogs
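For illustration only, a minimal sketch of querying an imported event graph (assuming a local Neo4j instance already loaded with one of the datasets and the official neo4j Python driver; the connection details and the Event node label are assumptions based on the description above, not the repository's own code):
from neo4j import GraphDatabase
# connect to the local Neo4j instance holding the imported event graph
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
with driver.session() as session:
    # count the imported event nodes (label assumed to be :Event)
    result = session.run("MATCH (e:Event) RETURN count(e) AS n")
    print(result.single()["n"])
driver.close()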
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A Benchmark Dataset for Deep Learning-based Methods for 3D Topology Optimization.
One can find a description of the provided dataset partitions in Section 3 of Dittmer, S., Erzmann, D., Harms, H., Maass, P., SELTO: Sample-Efficient Learned Topology Optimization (2022) https://arxiv.org/abs/2209.05098.
Every dataset container consists of multiple enumerated pairs of CSV files. Each pair describes a unique topology optimization problem and a corresponding binarized SIMP solution. Every file of the form {i}.csv contains all voxel-wise information about the sample i. Every file of the form {i}_info.csv contains scalar parameters of the topology optimization problem, such as material parameters.
This dataset represents topology optimization problems and solutions on the basis of voxels. We define all spatially varying quantities via the voxels' centers -- rather than via the vertices or surfaces of the voxels.
In {i}.csv files, each row corresponds to one voxel in the design space. The columns correspond to ['x', 'y', 'z', 'design_space', 'dirichlet_x', 'dirichlet_y', 'dirichlet_z', 'force_x', 'force_y', 'force_z', 'density'].
Any of these files with the index i can be imported using pandas by executing:
import pandas as pd
directory = ...
file_path = f'{directory}/{i}.csv'
column_names = ['x', 'y', 'z', 'design_space','dirichlet_x', 'dirichlet_y', 'dirichlet_z', 'force_x', 'force_y', 'force_z', 'density']
data = pd.read_csv(file_path, names=column_names)
From this pandas dataframe one can extract the torch tensors of forces F, Dirichlet conditions ω_Dirichlet, and design space information ω_design using the following functions:
import torch

def get_shape_and_voxels(data):
    shape = data[['x', 'y', 'z']].iloc[-1].values.astype(int) + 1
    vox_x = data['x'].values
    vox_y = data['y'].values
    vox_z = data['z'].values
    voxels = [vox_x, vox_y, vox_z]
    return shape, voxels

def get_forces_boundary_conditions_and_design_space(data, shape, voxels):
    F = torch.zeros(3, *shape, dtype=torch.float32)
    F[0, voxels[0], voxels[1], voxels[2]] = torch.tensor(data['force_x'].values, dtype=torch.float32)
    F[1, voxels[0], voxels[1], voxels[2]] = torch.tensor(data['force_y'].values, dtype=torch.float32)
    F[2, voxels[0], voxels[1], voxels[2]] = torch.tensor(data['force_z'].values, dtype=torch.float32)
    ω_Dirichlet = torch.zeros(3, *shape, dtype=torch.float32)
    ω_Dirichlet[0, voxels[0], voxels[1], voxels[2]] = torch.tensor(data['dirichlet_x'].values, dtype=torch.float32)
    ω_Dirichlet[1, voxels[0], voxels[1], voxels[2]] = torch.tensor(data['dirichlet_y'].values, dtype=torch.float32)
    ω_Dirichlet[2, voxels[0], voxels[1], voxels[2]] = torch.tensor(data['dirichlet_z'].values, dtype=torch.float32)
    ω_design = torch.zeros(1, *shape, dtype=int)
    ω_design[:, voxels[0], voxels[1], voxels[2]] = torch.from_numpy(data['design_space'].values.astype(int))
    return F, ω_Dirichlet, ω_design
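Putting these functions together, a short usage sketch (assuming data has been loaded with pandas as shown above):
shape, voxels = get_shape_and_voxels(data)
F, ω_Dirichlet, ω_design = get_forces_boundary_conditions_and_design_space(data, shape, voxels)
# each tensor has shape (channels, *shape), e.g. three force components per voxel
print(F.shape, ω_Dirichlet.shape, ω_design.shape)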
The corresponding {i}_info.csv files only have one row with column labels ['E', 'ν', 'σ_ys', 'vox_size', 'p_x', 'p_y', 'p_z'].
Analogously to above, one can import any {i}_info.csv file by executing:
file_path = f'{directory}/{i}_info.csv'
data_info_column_names = ['E', 'ν', 'σ_ys', 'vox_size', 'p_x', 'p_y', 'p_z']
data_info = pd.read_csv(file_path, names=data_info_column_names)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset Information
This dataset presents long-term indoor solar harvesting traces, jointly monitored with the ambient conditions. The data is recorded at 6 indoor positions with diverse characteristics at our institute at ETH Zurich in Zurich, Switzerland.
The data is collected with a measurement platform [3] consisting of a solar panel (AM-5412) connected to a bq25505 energy harvesting chip that stores the harvested energy in a virtual battery circuit. Two TSL45315 light sensors placed on opposite sides of the solar panel monitor the illuminance level and a BME280 sensor logs ambient conditions like temperature, humidity and air pressure.
The dataset contains the measurement of the energy flow at the input and the output of the bq25505 harvesting circuit, as well as the illuminance, temperature, humidity and air pressure measurements of the ambient sensors. The following timestamped data columns are available in the raw measurement format, as well as preprocessed and filtered HDF5 datasets:
V_in - Converter input/solar panel output voltage, in volt
I_in - Converter input/solar panel output current, in ampere
V_bat - Battery voltage (emulated through circuit), in volt
I_bat - Net Battery current, in/out flowing current, in ampere
Ev_left - Illuminance left of solar panel, in lux
Ev_right - Illuminance right of solar panel, in lux
P_amb - Ambient air pressure, in pascal
RH_amb - Ambient relative humidity, unit-less between 0 and 1
T_amb - Ambient temperature, in centigrade Celsius
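As a minimal loading sketch (assuming the processed per-position files can be read with pandas/PyTables; the file name and HDF5 key below are placeholders rather than actual names, which are listed in the yyyy_mm_processed.files.md files described below):
import pandas as pd
# placeholder path and key: substitute an actual processed file name and its HDF5 key
df = pd.read_hdf("processed/pos01_power.h5", key="data")
print(df.describe())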
The following publication presents an overview of the dataset and more details on the deployment used for data collection. A copy of the abstract is included in this dataset, see the file abstract.pdf.
L. Sigrist, A. Gomez, and L. Thiele. "Dataset: Tracing Indoor Solar Harvesting." In Proceedings of the 2nd Workshop on Data Acquisition To Analysis (DATA '19), 2019.
Folder Structure and Files
processed/ - This folder holds the imported, merged and filtered datasets of the power and sensor measurements. The datasets are stored in HDF5 format and split by measurement position posXX and by power and ambient sensor measurements. The files belonging to this folder are contained in archives named yyyy_mm_processed.tar, where yyyy and mm represent the year and month the data was published. A separate file lists the exact content of each archive (see below).
raw/ - This folder holds the raw measurement files recorded with the RocketLogger [1, 2] and using the measurement platform available at [3]. The files belonging to this folder are contained in archives named yyyy_mm_raw.tar, where yyyy and mm represent the year and month the data was published. A separate file lists the exact content of each archive (see below).
LICENSE - License information for the dataset.
README.md - The README file containing this information.
abstract.pdf - A copy of the above mentioned abstract submitted to the DATA '19 Workshop, introducing this dataset and the deployment used to collect it.
raw_import.ipynb [open in nbviewer] - Jupyter Python notebook to import, merge, and filter the raw dataset from the raw/ folder. This is the exact code used to generate the processed dataset and store it in the HDF5 format in the processed/ folder.
raw_preview.ipynb [open in nbviewer] - This Jupyter Python notebook imports the raw dataset directly and plots a preview of the full power trace for all measurement positions.
processing_python.ipynb [open in nbviewer] - Jupyter Python notebook demonstrating the import and use of the processed dataset in Python. Calculates column-wise statistics, includes more detailed power plots and the simple energy predictor performance comparison included in the abstract.
processing_r.ipynb [open in nbviewer] - Jupyter R notebook demonstrating the import and use of the processed dataset in R. Calculates column-wise statistics and extracts and plots the energy harvesting conversion efficiency included in the abstract. Furthermore, the harvested power is analyzed as a function of the ambient light level.
Dataset File Lists
Processed Dataset Files
The list of the processed datasets included in the yyyy_mm_processed.tar archive is provided in yyyy_mm_processed.files.md. The markdown formatted table lists the name of all files, their size in bytes, as well as the SHA-256 sums.
Raw Dataset Files
A list of the raw measurement files included in the yyyy_mm_raw.tar archive(s) is provided in yyyy_mm_raw.files.md. The markdown formatted table lists the name of all files, their size in bytes, as well as the SHA-256 sums.
Dataset Revisions
v1.0 (2019-08-03)
Initial release. Includes the data collected from 2017-07-27 to 2019-08-01. The dataset archive files related to this revision are 2019_08_raw.tar and 2019_08_processed.tar. For position pos06, the measurements from 2018-01-06 00:00:00 to 2018-01-10 00:00:00 are filtered (data inconsistency in file indoor1_p27.rld).
v1.1 (2019-09-09)
Revision of the processed dataset v1.0 and addition of the final dataset abstract. Updated processing scripts reduce the timestamp drift in the processed dataset, the archive 2019_08_processed.tar has been replaced. For position pos06, the measurements from 2018-01-06 16:00:00 to 2018-01-10 00:00:00 are filtered (indoor1_p27.rld data inconsistency).
v2.0 (2020-03-20)
Addition of new data. Includes the raw data collected from 2019-08-01 to 2020-03-16. The processed data is updated with full coverage from 2017-07-27 to 2020-03-16. The dataset archive files related to this revision are 2020_03_raw.tar and 2020_03_processed.tar.
Dataset Authors, Copyright and License
Authors: Lukas Sigrist, Andres Gomez, and Lothar Thiele
Contact: Lukas Sigrist (lukas.sigrist@tik.ee.ethz.ch)
Copyright: (c) 2017-2019, ETH Zurich, Computer Engineering Group
License: Creative Commons Attribution 4.0 International License (https://creativecommons.org/licenses/by/4.0/)
References
[1] L. Sigrist, A. Gomez, R. Lim, S. Lippuner, M. Leubin, and L. Thiele. Measurement and validation of energy harvesting IoT devices. In Design, Automation & Test in Europe Conference & Exhibition (DATE), 2017.
[2] ETH Zurich, Computer Engineering Group. RocketLogger Project Website, https://rocketlogger.ethz.ch/.
[3] L. Sigrist. Solar Harvesting and Ambient Tracing Platform, 2019. https://gitlab.ethz.ch/tec/public/employees/sigristl/harvesting_tracing
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data has been used to carry out one of the experiments presented in Lorenzo Nespoli and Vasco Medici (2020), "Multivariate Boosted Trees and Applications to Forecasting and Control", arXiv.
The MBTR python library is accessible here, while the repository containing all the code for the experiments carried out in the paper (including the one generating the figures) is accessible here.
This dataset contains 41 days of simulated P, Q and voltage data for a 3-phase low voltage grid located in Switzerland. The grid topology, along with parameters for the grid's cables, was retrieved from the local DSO. Power profiles of uncontrollable loads were generated with the LoadProfileGenerator; power profiles of photovoltaic roof-mounted power plants were obtained through the PVlib python library, while the electrical loads due to heat pumps were retrieved by simulating domestic heating systems and buildings' thermal dynamics, modelled starting from the buildings' metadata. The grid was then simulated with KrangPower, an OpenDSS python wrapper, and the 3-phase voltages, powers and currents were retrieved for all the QP nodes of the grid, with a 1 minute sampling time.
The data can be imported in python with:
import pickle as pk
with open('vsc_data.pk', 'rb') as f:
    data = pk.load(f)
This project is carried out within the frame of the Swiss Centre for Competence in Energy Research on the Future Swiss Electrical Infrastructure (SCCER-FURIES) with the financial support of the Swiss Innovation Agency (Innosuisse - SCCER program) and of the Swiss Federal Office of Energy with the project SI/501523.