100+ datasets found
  1. CrossDomainTypes4Py: A Python Dataset for Cross-Domain Evaluation of Type...

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Jan 28, 2022
    Cite
    Bernd Gruner; Thomas Heinze; Clemens-Alexander Brust (2022). CrossDomainTypes4Py: A Python Dataset for Cross-Domain Evaluation of Type Inference Systems [Dataset]. http://doi.org/10.5281/zenodo.5747024
    Explore at:
    Available download formats: bin
    Dataset updated
    Jan 28, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Bernd Gruner; Thomas Heinze; Clemens-Alexander Brust
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains Python repositories mined from GitHub on January 20, 2021. It enables cross-domain evaluation of type inference systems. For this purpose, it consists of two sub-datasets, each containing only projects from the web domain or the scientific computing domain, respectively. To build these, we searched for projects that depend on either Flask or NumPy. Furthermore, only projects that also depend on mypy were considered, because this should ensure that at least parts of each project have type annotations; these can later be used as ground truth. Further details about the dataset will be described in an upcoming paper; as soon as it is published, it will be linked here.
    The dataset consists of two files, one per sub-dataset. The web domain file lists 3129 repositories and the scientific computing file lists 4783 repositories. Each file has two columns: the URL of the GitHub repository and the commit hash used. The dataset can therefore be downloaded with shell or Python scripts; for example, the pipeline provided by ManyTypes4Py can be used.
    If repositories no longer exist or have become private, you can contact us at the following email address: bernd.gruner@dlr.de. We have a backup of all repositories and will be happy to help you.
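
    A minimal sketch of downloading the listed repositories with Python, assuming a two-column, comma-separated file (repository URL, commit hash) as described above; the file name and helper name are hypothetical:

    import csv
    import subprocess
    from pathlib import Path

    def clone_at_commit(list_file: str, target_dir: str = "repos") -> None:
        """Clone each repository listed in list_file and check out the recorded commit."""
        Path(target_dir).mkdir(exist_ok=True)
        with open(list_file, newline="") as f:
            for url, commit in csv.reader(f):
                name = url.rstrip("/").rsplit("/", 1)[-1].removesuffix(".git")
                dest = Path(target_dir) / name
                if not dest.exists():
                    subprocess.run(["git", "clone", url, str(dest)], check=True)
                # Pin the working tree to the commit recorded in the dataset.
                subprocess.run(["git", "-C", str(dest), "checkout", commit], check=True)

    # Example (file name is hypothetical):
    # clone_at_commit("web_domain.csv")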

  2. Data from: Domain-adaptive Data Synthesis for Large-scale Supermarket...

    • data.niaid.nih.gov
    Updated Apr 5, 2024
    Cite
    Kampel, Martin (2024). Domain-adaptive Data Synthesis for Large-scale Supermarket Product Recognition [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7750241
    Explore at:
    Dataset updated
    Apr 5, 2024
    Dataset provided by
    Strohmayer, Julian
    Kampel, Martin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Domain-Adaptive Data Synthesis for Large-Scale Supermarket Product Recognition

    This repository contains the data synthesis pipeline and synthetic product recognition datasets proposed in [1].

    Data Synthesis Pipeline:

    We provide the Blender 3.1 project files and Python source code of our data synthesis pipeline (pipeline.zip), accompanied by the FastCUT models used for synthetic-to-real domain translation (models.zip). For the synthesis of new shelf images, a product assortment list and product images must be provided in the corresponding directories products/assortment/ and products/img/. The pipeline expects product images to follow the naming convention c.png, with c corresponding to a GTIN or generic class label (e.g., 9120050882171.png). The assortment list, assortment.csv, is expected to use the sample format [c, w, d, h], with c being the class label and w, d, and h being the packaging dimensions of the given product in mm (e.g., [4004218143128, 140, 70, 160]). The assortment list to use and the number of images to generate can be specified in generateImages.py (see comments). The rendering process is initiated by executing load.py either from within Blender or from a command-line terminal as a background process.
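
    A minimal sketch, under the naming and format conventions above, of checking that every product in an assortment list has a matching image; the directory layout follows the description, while the bracket handling and helper name are assumptions:

    import csv
    from pathlib import Path

    def check_assortment(assortment_csv: str = "products/assortment/assortment.csv",
                         img_dir: str = "products/img") -> list[str]:
        """Return class labels from the assortment list that have no matching c.png image."""
        missing = []
        with open(assortment_csv, newline="") as f:
            for row in csv.reader(f):
                if not row:
                    continue
                # Expected row format: c, w, d, h (class label, packaging dimensions in mm).
                c = row[0].strip().lstrip("[")
                if not (Path(img_dir) / f"{c}.png").exists():
                    missing.append(c)
        return missing

    # print(check_assortment())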

    Datasets:

    SG3k - Synthetic GroZi-3.2k (SG3k) dataset, consisting of 10,000 synthetic shelf images with 851,801 instances of 3,234 GroZi-3.2k products. Instance-level bounding boxes and generic class labels are provided for all product instances.

    SG3kt - Domain-translated version of SG3k, utilizing GroZi-3.2k as the target domain. Instance-level bounding boxes and generic class labels are provided for all product instances.

    SGI3k - Synthetic GroZi-3.2k (SGI3k) dataset, consisting of 10,000 synthetic shelf images with 838,696 instances of 1,063 GroZi-3.2k products. Instance-level bounding boxes and generic class labels are provided for all product instances.

    SGI3kt - Domain-translated version of SGI3k, utilizing GroZi-3.2k as the target domain. Instance-level bounding boxes and generic class labels are provided for all product instances.

    SPS8k - Synthetic Product Shelves 8k (SPS8k) dataset, comprised of 16,224 synthetic shelf images with 1,981,967 instances of 8,112 supermarket products. Instance-level bounding boxes and GTIN class labels are provided for all product instances.

    SPS8kt - Domain-translated version of SPS8k, utilizing SKU110k as the target domain. Instance-level bounding boxes and GTIN class labels for all product instances.

    Table 1: Dataset characteristics.

    Dataset  Images  Products  Instances  Labels                           Translation
    SG3k     10,000  3,234     851,801    bounding box & generic class¹    none
    SG3kt    10,000  3,234     851,801    bounding box & generic class¹    GroZi-3.2k
    SGI3k    10,000  1,063     838,696    bounding box & generic class²    none
    SGI3kt   10,000  1,063     838,696    bounding box & generic class²    GroZi-3.2k
    SPS8k    16,224  8,112     1,981,967  bounding box & GTIN              none
    SPS8kt   16,224  8,112     1,981,967  bounding box & GTIN              SKU110k

    Sample Format

    A sample consists of an RGB image (i.png) and an accompanying label file (i.txt), which contains the labels for all product instances present in the image. Labels use the YOLO format [c, x, y, w, h].

    ¹SG3k and SG3kt use generic pseudo-GTIN class labels, created by combining the GroZi-3.2k food product category number i (1-27) with the product image index j (j.jpg), following the convention i0000j (e.g., 13000097).

    ²SGI3k and SGI3kt use the generic GroZi-3.2k class labels from https://arxiv.org/abs/2003.06800.
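
    A minimal sketch of reading one sample's label file in the YOLO format described above; whether the coordinates are normalized to the image size is an assumption (as is usual for YOLO labels):

    def read_labels(txt_path: str) -> list[tuple[str, float, float, float, float]]:
        """Parse an i.txt label file with one 'c x y w h' entry per line."""
        instances = []
        with open(txt_path) as f:
            for line in f:
                if not line.strip():
                    continue
                c, x, y, w, h = line.split()
                instances.append((c, float(x), float(y), float(w), float(h)))
        return instances

    # boxes = read_labels("0.txt")  # pairs with the image 0.png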

    Download and Use

    This data may be used for non-commercial research purposes only. If you publish material based on this data, we request that you include a reference to our paper [1].

    [1] Strohmayer, Julian, and Martin Kampel. "Domain-Adaptive Data Synthesis for Large-Scale Supermarket Product Recognition." International Conference on Computer Analysis of Images and Patterns. Cham: Springer Nature Switzerland, 2023.

    BibTeX citation:

    @inproceedings{strohmayer2023domain,
      title={Domain-Adaptive Data Synthesis for Large-Scale Supermarket Product Recognition},
      author={Strohmayer, Julian and Kampel, Martin},
      booktitle={International Conference on Computer Analysis of Images and Patterns},
      pages={239--250},
      year={2023},
      organization={Springer}
    }

  3. visual_domain_decathlon

    • tensorflow.org
    Updated Aug 28, 2017
    Cite
    (2017). visual_domain_decathlon [Dataset]. https://www.tensorflow.org/datasets/catalog/visual_domain_decathlon
    Explore at:
    Dataset updated
    Aug 28, 2017
    Description

    This contains the 10 datasets used in the Visual Domain Decathlon, part of the PASCAL in Detail Workshop Challenge (CVPR 2017). The goal of this challenge is to solve simultaneously ten image classification problems representative of very different visual domains.

    Some of the datasets included here are also available as separate datasets in TFDS. However, notice that images were preprocessed for the Visual Domain Decathlon (resized isotropically to have a shorter side of 72 pixels) and might have different train/validation/test splits. Here we use the official splits for the competition.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('visual_domain_decathlon', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

    Visualization: https://storage.googleapis.com/tfds-data/visualization/fig/visual_domain_decathlon-aircraft-1.2.0.png

  4. The dataset for the study of code change patterns in Python

    • data.niaid.nih.gov
    Updated Oct 19, 2021
    Cite
    Anonymous (2021). The dataset for the study of code change patterns in Python [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4004117
    Explore at:
    Dataset updated
    Oct 19, 2021
    Dataset authored and provided by
    Anonymous
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset of Python projects used for the study of code change patterns and their automation. The dataset lists 120 projects, divided into four domains — Web, Media, Data, and ML+DL.

  5. Cybersec-Mutli-domain

    • huggingface.co
    Cite
    Zain Nadeem, Cybersec-Mutli-domain [Dataset]. https://huggingface.co/datasets/ZainNadeem7/Cybersec-Mutli-domain
    Explore at:
    Authors
    Zain Nadeem
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Creator: Zain Nadeem
    Role: Python Django Developer | Software Engineer | Prompt Engineer | Ethical Hacker
    License: CC BY 4.0
    Records: ~220,000
    Format: JSONL
    Language: English

      📌 Overview
    

    The CyberSec Multi-Domain Dataset is a structured collection of synthetic and open-source cybersecurity data across five important domains. It is designed for building, testing, and benchmarking machine learning models in cybersecurity, threat intelligence, and automation systems. This dataset helps… See the full description on the dataset page: https://huggingface.co/datasets/ZainNadeem7/Cybersec-Mutli-domain.
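
    A minimal sketch of iterating over the records, assuming only the JSONL format stated above; the file name is hypothetical and the field names depend on the domain:

    import json

    def read_jsonl(path: str):
        """Yield one record (a dict) per line from a JSON Lines file."""
        with open(path, encoding="utf-8") as f:
            for line in f:
                if line.strip():
                    yield json.loads(line)

    # for record in read_jsonl("cybersec_multi_domain.jsonl"):
    #     print(record.keys())
    #     break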

  6. Airlines Flights Data

    • kaggle.com
    Updated Jul 29, 2025
    Cite
    Data Science Lovers (2025). Airlines Flights Data [Dataset]. https://www.kaggle.com/datasets/rohitgrewal/airlines-flights-data
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 29, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Data Science Lovers
    License

    Database Contents License (DbCL) v1.0: http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    📹Project Video available on YouTube - https://youtu.be/gu3Ot78j_Gc

    Airlines Flights Dataset for Different Cities

    The flight booking dataset for various airlines was scraped date-wise from a well-known travel website in a structured format. It contains records of flights between cities in India, with features such as source and destination city, arrival and departure time, duration, and price.

    The data is available as a CSV file, which we analyze using pandas DataFrames.

    This analysis will be helpful for those working in the airline and travel domains.

    Using this dataset, we answered the following questions with Python in our project (a pandas sketch follows the feature list below):

    Q.1. What are the airlines in the dataset, accompanied by their frequencies?

    Q.2. Show Bar Graphs representing the Departure Time & Arrival Time.

    Q.3. Show Bar Graphs representing the Source City & Destination City.

    Q.4. Does the price vary with the airline?

    Q.5. Does ticket price change based on the departure time and arrival time?

    Q.6. How does the price change with the source and destination city?

    Q.7. How is the price affected when tickets are bought only 1 or 2 days before departure?

    Q.8. How does the ticket price vary between Economy and Business class?

    Q.9. What is the average price of a Vistara flight from Delhi to Hyderabad in Business class?

    These are the main Features/Columns available in the dataset :

    1) Airline: The name of the airline company is stored in the airline column. It is a categorical feature having 6 different airlines.

    2) Flight: Flight stores information regarding the plane's flight code. It is a categorical feature.

    3) Source City: City from which the flight takes off. It is a categorical feature having 6 unique cities.

    4) Departure Time: This is a derived categorical feature created by grouping time periods into bins. It stores information about the departure time and has 6 unique time labels.

    5) Stops: A categorical feature with 3 distinct values that stores the number of stops between the source and destination cities.

    6) Arrival Time: This is a derived categorical feature created by grouping time intervals into bins. It has six distinct time labels and keeps information about the arrival time.

    7) Destination City: City where the flight will land. It is a categorical feature having 6 unique cities.

    8) Class: A categorical feature that contains information on seat class; it has two distinct values: Business and Economy.

    9) Duration: A continuous feature that displays the overall amount of time it takes to travel between cities in hours.

    10) Days Left: This is a derived feature calculated by subtracting the booking date from the trip date.

    11) Price: The target variable; it stores the ticket price.
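
    A minimal pandas sketch answering a few of the questions above; the CSV file name and the exact column spellings are assumptions based on the feature list:

    import pandas as pd

    # Column names (airline, class, price, ...) are assumed from the feature list above.
    df = pd.read_csv("airlines_flights_data.csv")

    # Q.1: airlines and their frequencies
    print(df["airline"].value_counts())

    # Q.4: does the price vary with the airline?
    print(df.groupby("airline")["price"].mean().sort_values())

    # Q.8: ticket price by class (Economy vs. Business)
    print(df.groupby("class")["price"].mean())

    # Q.9: average Business-class price for Vistara from Delhi to Hyderabad
    mask = (
        (df["airline"] == "Vistara")
        & (df["source_city"] == "Delhi")
        & (df["destination_city"] == "Hyderabad")
        & (df["class"] == "Business")
    )
    print(df.loc[mask, "price"].mean())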

  7. Accompanying Dataset and Python Code for Reproducibility and Implementation...

    • entrepot.recherche.data.gouv.fr
    bin, hdf +2
    Updated Jan 23, 2025
    + more versions
    Cite
    Rémi Roncen (2025). Accompanying Dataset and Python Code for Reproducibility and Implementation of the IR-TDIBC Method [Dataset]. http://doi.org/10.57745/TPWI2R
    Explore at:
    Available download formats: bin(1165), hdf(2970976), txt(2698577), text/x-python(11551), txt(2696692), text/x-python(6179), txt(2699255), bin(17228), text/x-python(7182), text/x-python(3087), txt(2698327), bin(3077), bin(1002), text/x-python(7583)
    Dataset updated
    Jan 23, 2025
    Dataset provided by
    Recherche Data Gouv
    Authors
    Rémi Roncen
    License

    Etalab Open License 2.0 (etalab-2.0): https://spdx.org/licenses/etalab-2.0.html

    Dataset funded by
    ERC
    Description

    Supporting Dataset and Python Code for TDIBC Method

    This repository provides the dataset and Python codes necessary to regenerate the figures presented in the manuscript "Revisiting Nonlinear Impedance in Acoustic Liners" (https://hal.science/hal-04810729v1) and to facilitate the proper implementation of the Impulse-Response Time-Domain Impedance Boundary Condition (IR-TDIBC) method. The materials aim to promote transparency, reproducibility, and accessibility for researchers working with nonlinear impedance models and acoustic liners in the time domain.

    Contents

    • Dataset: Includes data used in the manuscript, covering experimental measurements obtained in an impedance tube and flow noise obtained in the B2A bench at ONERA.
    • Python Scripts: Scripts designed to recreate the figures from the paper and demonstrate the IR-TDIBC implementation step by step.

    Features

    • Scripts for generating plots and verifying results from the manuscript.
    • Clear examples to help users adapt the IR-TDIBC method to their specific setups.
    • Annotations and explanations within the code for ease of understanding and modification.

  8. web_graph

    • tensorflow.org
    Updated Nov 23, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2022). web_graph [Dataset]. http://identifiers.org/arxiv:2112.02194
    Explore at:
    Dataset updated
    Nov 23, 2022
    Description

    This dataset contains a sparse graph representing web link structure for a small subset of the Web.

    It is a processed version of a single crawl performed by CommonCrawl in 2021, where we strip everything and keep only the link->outlinks structure. The final dataset is essentially in int -> List[int] format, with each integer id representing a URL.

    Also, in order to increase the value of this resource, we created 6 different versions of WebGraph, each varying in sparsity pattern and locale. We took the following processing steps, in order:

    • We started with WAT files from the June 2021 crawl.
    • Since the outlinks in HTTP-Response-Metadata are stored as relative paths, we convert them to absolute paths using urllib after validating each link.
    • To study locale-specific graphs, we further filter based on 2 top-level domains, ‘de’ and ‘in’, each producing a graph with an order of magnitude fewer nodes.
    • These graphs can still have arbitrary sparsity patterns and dangling links. Thus we further filter the nodes in each graph to have a minimum of K ∈ [10, 50] inlinks and outlinks. Note that we only do this processing once, so this is still an approximation, i.e., the resulting graph might have nodes with fewer than K links.
    • Using both locale and count filters, we finalize 6 versions of the WebGraph dataset, summarized in the following table.
    Version    Top-level domain  Min count  Num nodes  Num edges
    sparse     (all)             10         365.4M     30B
    dense      (all)             50         136.5M     22B
    de-sparse  de                10         19.7M      1.19B
    de-dense   de                50         5.7M       0.82B
    in-sparse  in                10         1.5M       0.14B
    in-dense   in                50         0.5M       0.12B

    All versions of the dataset have following features:

    • "row_tag": a unique identifier of the row (source link).
    • "col_tag": a list of unique identifiers of non-zero columns (dest outlinks).
    • "gt_tag": a list of unique identifiers of non-zero columns used as ground truth (dest outlinks), empty for train/train_t splits.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('web_graph', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

  9. Dataset for interactive course on BioImage Analysis with Python (BIAPy)

    • explore.openaire.eu
    Updated May 5, 2020
    Cite
    Guillaume Witz (2020). Dataset for interactive course on BioImage Analysis with Python (BIAPy) [Dataset]. http://doi.org/10.5281/zenodo.3786306
    Explore at:
    Dataset updated
    May 5, 2020
    Authors
    Guillaume Witz
    Description

    This dataset can be used to run the course on image processing with Python available at https://github.com/guiwitz/neubias_academy_biapy. It combines microscopy images from different publicly available sources. All files are either in the Public Domain (PD) or released under a CC-BY license. The list of the original locations of the data as well as their licenses can be found in the LICENSE file.

  10. Python IFC Escape Route Model Generator - Dataset - B2FIND

    • b2find.eudat.eu
    Updated Oct 27, 2024
    + more versions
    Cite
    (2024). Python IFC Escape Route Model Generator - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/60ea743b-3117-5ede-9401-8eb55028de31
    Explore at:
    Dataset updated
    Oct 27, 2024
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Context

    This research data contains the Python project to generate escape route domain models in the IFC format, created by researchers from the TU Wien Research Unit Digital Building Process. It is linked to the research paper: "Fischer, S., Urban, H., Schranz, C., Haselberger, M., & Schnabel, F. (2024). Generation of new BIM domain models from escape route analysis results. Developments in the Built Environment, 19, 100499. https://doi.org/10.1016/j.dibe.2024.100499"

    The research paper describes different ways of storing the results of escape route analysis in IFC models. Five different variants have been evaluated. This Python project contains the code to generate the most promising variant, "Routes group Segments -- Group". The generated IFC models for all variants for a custom test model and a real-world model are also published, as well as the two initial models:

    • Custom test model for escape route analysis in IFC format: https://doi.org/10.48436/hx8gz-zw339
    • Real-world test model for escape route analysis in IFC format: https://doi.org/10.48436/fnmrh-crh59
    • Custom escape route models in IFC format: https://doi.org/10.48436/dpwd5-33k50
    • Real-world escape route models in IFC format: https://doi.org/10.48436/rrd14-t1108

    Technical details

    The project uses the programming language Python and was successfully executed with Python 3.10, 3.11, and 3.12. The most important library is IfcOpenShell (tested for versions 0.7.0 to 0.7.11). Instructions for downloading and installing IfcOpenShell can be found here: https://docs.ifcopenshell.org/ifcopenshell-python/installation.html. It is important to install the correct version compatible with the installed Python version. The input data is provided by JSON files containing the escape route data of the two initial IFC models. Instructions on how to use the code are included in the README.md file in the zip folder. All data files are licensed under CC BY 4.0; all software files are licensed under the MIT License. The IFC Escape Route Model Generator is also available for the JavaScript programming language: https://doi.org/10.48436/c35ty-ky950
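
    A minimal sketch of inspecting a generated IFC model with IfcOpenShell, the library named above; the file name is hypothetical, and querying IfcGroup entities is an assumption based on the "Routes group Segments -- Group" variant name:

    import ifcopenshell

    # Open a generated escape route model (file name is hypothetical).
    model = ifcopenshell.open("escape_route_model.ifc")
    print(model.schema)  # IFC schema version of the file

    # Inspect the groups that may represent routes/segments in this variant.
    groups = model.by_type("IfcGroup")
    print(f"{len(groups)} IfcGroup entities found")
    for group in groups[:5]:
        print(group.is_a(), getattr(group, "Name", None))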

  11. clustered_tulu_3_16

    • huggingface.co
    Updated Jul 28, 2025
    Cite
    Malikeh Ehghaghi (2025). clustered_tulu_3_16 [Dataset]. https://huggingface.co/datasets/Malikeh1375/clustered_tulu_3_16
    Explore at:
    Dataset updated
    Jul 28, 2025
    Authors
    Malikeh Ehghaghi
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Clustered_Tulu_3_16 Multi-Domain Dataset

    This dataset contains high-quality examples across 16 specialized domains, automatically extracted and curated from the Tulu-3 SFT mixture using advanced clustering techniques.

      🎯 Multi-Domain Structure
    

    This repository provides 16 domain-specific configurations, each optimized for different types of tasks:

    Configuration Domain Train Test Total

    python_string_and_list_processing Python String & List Processing 43,564 10… See the full description on the dataset page: https://huggingface.co/datasets/Malikeh1375/clustered_tulu_3_16.
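
    A minimal sketch of loading one of the 16 domain-specific configurations with the Hugging Face datasets library; the configuration name is taken from the table above, while the split name and field layout are assumptions:

    from datasets import load_dataset

    ds = load_dataset(
        "Malikeh1375/clustered_tulu_3_16",
        "python_string_and_list_processing",
        split="train",  # split name is an assumption
    )
    print(ds)
    print(ds[0])  # field names follow the Tulu-3 SFT mixture format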

  12. Data associated with manuscript "Spatial Frequency domain Mueller matrix...

    • catalog.data.gov
    • +2more
    Updated Jan 7, 2023
    Cite
    National Institute of Standards and Technology (2023). Data associated with manuscript "Spatial Frequency domain Mueller matrix imaging" [Dataset]. https://catalog.data.gov/dataset/data-associated-with-manuscript-spatial-frequency-domain-mueller-matrix-imaging-43bb2
    Explore at:
    Dataset updated
    Jan 7, 2023
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    This archive contains the spatial frequency domain Mueller matrix data associated with the paper J. Chue-Sang, M. Litorja, A. M. Goldfain, and T. A. Germer, "Spatial Frequency domain Mueller matrix imaging," J. Biomedical Optics 27(12), 126003 (2022). The paper shows a subset of the data included in this archive. A Python script, analyze.py, is provided to assist the user in reading the data. The script can be run without any arguments from the top folder and will generate all the figures included in this archive. The script requires Python 3.6 and Matplotlib 3.4. A MATLAB script, analyze.m, is also provided.

  13. staqc

    • huggingface.co
    Updated Mar 31, 2023
    Cite
    Charles Koutcheme (2023). staqc [Dataset]. https://huggingface.co/datasets/koutch/staqc
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 31, 2023
    Authors
    Charles Koutcheme
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    StaQC (Stack Overflow Question-Code pairs) is a dataset of around 148K Python and 120K SQL domain question-code pairs, which are automatically mined from Stack Overflow using a Bi-View Hierarchical Neural Network, as described in the paper "StaQC: A Systematically Mined Question-Code Dataset from Stack Overflow" (WWW'18).

  14. rlu_control_suite

    • tensorflow.org
    Updated Nov 23, 2022
    + more versions
    Cite
    (2022). rlu_control_suite [Dataset]. https://www.tensorflow.org/datasets/catalog/rlu_control_suite
    Explore at:
    Dataset updated
    Nov 23, 2022
    Description

    RL Unplugged is a suite of benchmarks for offline reinforcement learning. RL Unplugged is designed around the following considerations: to facilitate ease of use, we provide the datasets with a unified API, which makes it easy for the practitioner to work with all data in the suite once a general pipeline has been established.

    The datasets follow the RLDS format to represent steps and episodes.

    The DeepMind Control Suite (Tassa et al., 2018) is a set of control tasks implemented in MuJoCo (Todorov et al., 2012). We consider a subset of the tasks provided in the suite that cover a wide range of difficulties.

    Most of the datasets in this domain are generated using D4PG. For the environments Manipulator insert ball and Manipulator insert peg we use V-MPO Song et al., 2020 to generate the data as D4PG is unable to solve these tasks. We release datasets for 9 control suite tasks. For details on how the dataset was generated, please refer to the paper.

    DeepMind Control Suite is a traditional continuous action RL benchmark. In particular, we recommend you test your approach in DeepMind Control Suite if you are interested in comparing against other state of the art offline RL methods.

    To use this dataset:

    import tensorflow_datasets as tfds
    
    ds = tfds.load('rlu_control_suite', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

  15. Dataset related to process monitoring and condition monitoring of a...

    • b2find.eudat.eu
    Updated Oct 10, 2024
    + more versions
    Cite
    (2024). Dataset relaterat till processövervakning och tillståndsövervakning av en lagerringsslipmaskin - Dataset for the Implementation of Condition-based Maintenance and Maintenance Decision-making of a Bearing Ring Grinder - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/aa8255c2-9170-51a8-9dd7-00e1514510dc
    Explore at:
    Dataset updated
    Oct 10, 2024
    Description

    In the article (Ahmer, M., Sandin, F., Marklund, P. et al., 2022), we have investigated the effective use of sensors in a bearing ring grinder for failure classification in the condition-based maintenance context. The proposed methodology combines domain knowledge of process monitoring and condition monitoring to successfully achieve failure mode prediction with high accuracy using only a few key sensors. This enables manufacturing equipment to take advantage of advanced data processing and machine learning techniques. The grinding machine is of type SGB55 from Lidköping Machine Tools and is used to produce the functional raceway surface of inner rings of type SKF-6210 deep groove ball bearings. Additional sensors for vibration, acoustic emission, force, and temperature are installed to monitor the machine condition while producing bearing components under different operating conditions. Data is sampled from the sensors as well as the machine's numerical controller during operation. Selected parts are measured for the produced quality.

    Ahmer, M., Sandin, F., Marklund, P., Gustafsson, M., & Berglund, K. (2022). Failure mode classification for condition-based maintenance in a bearing ring grinding machine. In The International Journal of Advanced Manufacturing Technology (Vol. 122, pp. 1479–1495). https://doi.org/10.1007/s00170-022-09930-6

    The files are of three categories and are grouped in zipped folders. The PDF file named "readme_data_description.pdf" describes the content of the files in the folders. The "lib" folder includes information on libraries to read the .tdms data files in Matlab or Python (see the sketch below). The raw time-domain sensor signal data are grouped in seven main folders named after each test run, e.g. "test_1" ... "test_7". Each test includes seven dressing cycles named e.g. "dresscyc_1" ... "dresscyc_7". Each dressing cycle includes .tdms files for fifteen rings, one per individual grinding cycle. The column descriptions for both "Analogue" and "Digital" channels are given in "readme_data_description.pdf".

    The machine and process parameters used for the tests, as sampled from the machine's control system (numerical controller), are compiled for all test runs in a single file, "process_data.csv", in the folder "proc_param". The column description is available in "readme_data_description.pdf" under "Process Parameters". The measured quality data (nine quality parameters, normalized) of the selected produced parts are recorded in the file "measured_quality_param.csv" under the folder "quality". The description of the quality parameters is available in "readme_data_description.pdf". The quality parameter disposition based on their actual acceptance tolerances for the process step is presented in the file "quality_disposition.csv" under the folder "quality".

    Raw time series data collected from the machine and sensors during production of bearing rings, together with bearing ring quality measurement data.
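
    A minimal sketch of reading one of the .tdms files in Python; the "lib" folder documents the intended libraries, and npTDMS is used here as one common choice (an assumption), with a hypothetical file path:

    from nptdms import TdmsFile  # pip install npTDMS

    # Read one grinding-cycle file (the path is hypothetical).
    tdms = TdmsFile.read("test_1/dresscyc_1/ring_01.tdms")

    # List channel groups (e.g. "Analogue"/"Digital") and their channels.
    for group in tdms.groups():
        for channel in group.channels():
            data = channel[:]  # channel samples as a NumPy array
            print(group.name, channel.name, len(data))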

  16. Using Python Packages and HydroShare to Advance Open Data Science and...

    • beta.hydroshare.org
    • hydroshare.org
    zip
    Updated Sep 28, 2023
    Cite
    Jeffery S. Horsburgh; Amber Spackman Jones; Anthony M. Castronova; Scott Black (2023). Using Python Packages and HydroShare to Advance Open Data Science and Analytics for Water [Dataset]. https://beta.hydroshare.org/resource/4f4acbab5a8c4c55aa06c52a62a1d1fb/
    Explore at:
    Available download formats: zip (31.0 MB)
    Dataset updated
    Sep 28, 2023
    Dataset provided by
    HydroShare
    Authors
    Jeffery S. Horsburgh; Amber Spackman Jones; Anthony M. Castronova; Scott Black
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Scientific and management challenges in the water domain require synthesis of diverse data. Many data analysis tasks are difficult because datasets are large and complex; standard data formats are not always agreed upon or mapped to efficient structures for analysis; scientists may lack training for tackling large and complex datasets; and it can be difficult to share, collaborate around, and reproduce scientific work. Overcoming barriers to accessing, organizing, and preparing datasets for analyses can transform the way water scientists work. Building on the HydroShare repository’s cyberinfrastructure, we have advanced two Python packages that make data loading, organization, and curation for analysis easier, reducing time spent in choosing appropriate data structures and writing code to ingest data. These packages enable automated retrieval of data from HydroShare and the USGS’s National Water Information System (NWIS) (i.e., a Python equivalent of USGS’ R dataRetrieval package), loading data into performant structures that integrate with existing visualization, analysis, and data science capabilities available in Python, and writing analysis results back to HydroShare for sharing and publication. While these Python packages can be installed for use within any Python environment, we will demonstrate how the technical burden for scientists associated with creating a computational environment for executing analyses can be reduced and how sharing and reproducibility of analyses can be enhanced through the use of these packages within CUAHSI’s HydroShare-linked JupyterHub server.
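
    As a concrete illustration of the kind of automated retrieval described above, the sketch below pulls daily values from NWIS with the USGS dataretrieval Python package; the package choice, site number, and date range are assumptions and not part of this resource:

    import dataretrieval.nwis as nwis  # pip install dataretrieval

    # Daily values (service="dv") for an example USGS gauge and period.
    df = nwis.get_record(
        sites="03339000",
        service="dv",
        start="2023-01-01",
        end="2023-01-31",
    )
    print(df.head())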

    This HydroShare resource includes all of the materials presented in a workshop at the 2023 CUAHSI Biennial Colloquium.

  17. OpenMIIR - a public domain dataset of EEG recordings for music imagery...

    • figshare.com
    zip
    Updated Jun 1, 2023
    + more versions
    Cite
    Sebastian Stober; Avital Sternin; Adrian M. Owen; Jessica A. Grahn (2023). OpenMIIR - a public domain dataset of EEG recordings for music imagery information retrieval [Dataset]. http://doi.org/10.6084/m9.figshare.1541151.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    figshare
    Authors
    Sebastian Stober; Avital Sternin; Adrian M. Owen; Jessica A. Grahn
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Music imagery information retrieval (MIIR) systems may one day be able to recognize a song just as we think of it. As a step towards such technology, we are presenting a public domain dataset of electroencephalography (EEG) recordings taken during music perception and imagination. We acquired this data during an ongoing study that so far comprised 10 subjects listening to and imagining 12 short music fragments - each 7s-16s long - taken from well-known pieces. These stimuli were selected from different genres and systematically span several musical dimensions such as meter, tempo and the presence of lyrics. This way, various retrieval and classification scenarios can be addressed. The dataset is primarily aimed to enable music information retrieval researchers interested in these new MIIR challenges to easily test and adapt their existing approaches for music analysis like fingerprinting, beat tracking or tempo estimation on this new kind of data. We also hope that the OpenMIIR dataset will facilitate a stronger interdisciplinary collaboration between music information retrieval researchers and neuroscientists.

  18. excersice 4 python

    • kaggle.com
    Updated Jun 14, 2018
    Cite
    Anthi.Mastrogiannaki (2018). excersice 4 python [Dataset]. https://www.kaggle.com/anthi1984/excersice-4-python/tasks
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jun 14, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Anthi.Mastrogiannaki
    License

    CC0 1.0 Universal (Public Domain Dedication): https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Anthi.Mastrogiannaki

    Released under CC0: Public Domain


  19. ESA CCI SM GAPFILLED Long-term Climate Data Record of Surface Soil Moisture...

    • researchdata.tuwien.ac.at
    • b2find.eudat.eu
    zip
    Updated Jun 6, 2025
    + more versions
    Cite
    Wolfgang Preimesberger; Pietro Stradiotti; Wouter Arnoud Dorigo (2025). ESA CCI SM GAPFILLED Long-term Climate Data Record of Surface Soil Moisture from merged multi-satellite observations [Dataset]. http://doi.org/10.48436/3fcxr-cde10
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 6, 2025
    Dataset provided by
    TU Wien
    Authors
    Wolfgang Preimesberger; Pietro Stradiotti; Wouter Arnoud Dorigo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description
    This dataset was produced with funding from the European Space Agency (ESA) Climate Change Initiative (CCI) Plus Soil Moisture Project (CCN 3 to ESRIN Contract No: 4000126684/19/I-NB "ESA CCI+ Phase 1 New R&D on CCI ECVS Soil Moisture"). Project website: https://climate.esa.int/en/projects/soil-moisture/

    This dataset contains information on the Surface Soil Moisture (SM) content derived from satellite observations in the microwave domain.

    Dataset paper (public preprint)

    A description of this dataset, including the methodology and validation results, is available at:

    Preimesberger, W., Stradiotti, P., and Dorigo, W.: ESA CCI Soil Moisture GAPFILLED: An independent global gap-free satellite climate data record with uncertainty estimates, Earth Syst. Sci. Data Discuss. [preprint], https://doi.org/10.5194/essd-2024-610, in review, 2025.

    Abstract

    ESA CCI Soil Moisture is a multi-satellite climate data record that consists of harmonized, daily observations coming from 19 satellites (as of v09.1) operating in the microwave domain. The wealth of satellite information, particularly over the last decade, facilitates the creation of a data record with the highest possible data consistency and coverage.
    However, data gaps are still found in the record. This is particularly notable in earlier periods when a limited number of satellites were in operation, but can also arise from various retrieval issues, such as frozen soils, dense vegetation, and radio frequency interference (RFI). These data gaps present a challenge for many users, as they have the potential to obscure relevant events within a study area or are incompatible with (machine learning) software that often relies on gap-free inputs.
    Since the requirement of a gap-free ESA CCI SM product was identified, various studies have demonstrated the suitability of different statistical methods to achieve this goal. A fundamental feature of such a gap-filling method is to rely only on the original observational record, without the need for ancillary variables or model-based information. Due to the intrinsic challenge, no global, long-term, univariate gap-filled product has been available until now. In this version of the record, data gaps due to missing satellite overpasses and invalid measurements are filled using the Discrete Cosine Transform (DCT) Penalized Least Squares (PLS) algorithm (Garcia, 2010). A linear interpolation is applied over periods of (potentially) frozen soils with little to no variability in (frozen) soil moisture content. Uncertainty estimates are based on models calibrated in experiments to fill satellite-like gaps introduced to GLDAS Noah reanalysis soil moisture (Rodell et al., 2004), and consider the gap size and local vegetation conditions as parameters that affect the gap-filling performance.

    Summary

    • Gap-filled global estimates of volumetric surface soil moisture from 1991-2023 at 0.25° sampling
    • Fields of application (partial): climate variability and change, land-atmosphere interactions, global biogeochemical cycles and ecology, hydrological and land surface modelling, drought applications, and meteorology
    • Method: Modified version of DCT-PLS (Garcia, 2010) interpolation/smoothing algorithm, linear interpolation over periods of frozen soils. Uncertainty estimates are provided for all data points.
    • More information: See Preimesberger et al. (2025) and the ESA CCI SM Algorithm Theoretical Baseline Document [Chapter 7.2.9] (Dorigo et al., 2023), https://doi.org/10.5281/zenodo.8320869

    Programmatic Download

    You can use command line tools such as wget or curl to download (and extract) data for multiple years. The following command will download and extract the complete data set to the local directory ~/Downloads on Linux or macOS systems.

    #!/bin/bash

    # Set download directory
    DOWNLOAD_DIR=~/Downloads

    base_url="https://researchdata.tuwien.at/records/3fcxr-cde10/files"

    # Loop through years 1991 to 2023 and download & extract data
    for year in {1991..2023}; do
      echo "Downloading $year.zip..."
      wget -q -P "$DOWNLOAD_DIR" "$base_url/$year.zip"
      unzip -o "$DOWNLOAD_DIR/$year.zip" -d "$DOWNLOAD_DIR"
      rm "$DOWNLOAD_DIR/$year.zip"
    done

    Data details

    The dataset provides global daily estimates for the 1991-2023 period at 0.25° (~25 km) horizontal grid resolution. Daily images are grouped by year (YYYY), each subdirectory containing one netCDF image file for a specific day (DD), month (MM) in a 2-dimensional (longitude, latitude) grid system (CRS: WGS84). The file name has the following convention:

    ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-YYYYMMDD000000-fv09.1r1.nc

    Data Variables

    Each netCDF file contains 3 coordinate variables (WGS84 longitude, latitude and time stamp), as well as the following data variables:

    • sm: (float) The Soil Moisture variable reflects estimates of daily average volumetric soil moisture content (m3/m3) in the soil surface layer (~0-5 cm) over a whole grid cell (0.25 degree).
    • sm_uncertainty: (float) The Soil Moisture Uncertainty variable reflects the uncertainty (random error) of the original satellite observations and of the predictions used to fill observation data gaps.
    • sm_anomaly: Soil moisture anomalies (reference period 1991-2020) derived from the gap-filled values (`sm`).
    • sm_smoothed: Contains DCT-PLS predictions used to fill data gaps in the original soil moisture field. These values are also provided for cases where an observation was initially available (compare `gapmask`); in that case, they provide a smoothed version of the original data.
    • gapmask: (0 | 1) Indicates grid cells where a satellite observation is available (1), and where the interpolated (smoothed) values are used instead (0) in the 'sm' field.
    • frozenmask: (0 | 1) Indicates grid cells where ERA5 soil temperature is <0 °C. In this case, a linear interpolation over time is applied.

    Additional information for each variable is given in the netCDF attributes.
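
    A minimal sketch of opening one daily file and reading the variables listed above with xarray; the library choice and the date in the file name are assumptions:

    import xarray as xr

    # One daily 0.25° global image (the date in the file name is an example).
    ds = xr.open_dataset(
        "ESACCI-SOILMOISTURE-L3S-SSMV-COMBINED_GAPFILLED-19910805000000-fv09.1r1.nc"
    )

    sm = ds["sm"]                  # volumetric soil moisture (m3/m3)
    unc = ds["sm_uncertainty"]     # random-error estimate of sm
    observed = ds["gapmask"] == 1  # True where an actual satellite observation exists

    # Mean soil moisture over observed grid cells only.
    print(float(sm.where(observed).mean()))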

    Version Changelog

    Changes in v9.1r1 (previous version was v09.1):

    • This version uses a novel uncertainty estimation scheme as described in Preimesberger et al. (2025).

    Software to open netCDF files

    These data can be read by any software that supports Climate and Forecast (CF) conform metadata standards for netCDF files, such as:

    References

    • Preimesberger, W., Stradiotti, P., and Dorigo, W.: ESA CCI Soil Moisture GAPFILLED: An independent global gap-free satellite climate data record with uncertainty estimates, Earth Syst. Sci. Data Discuss. [preprint], https://doi.org/10.5194/essd-2024-610, in review, 2025.
    • Dorigo, W., Preimesberger, W., Stradiotti, P., Kidd, R., van der Schalie, R., van der Vliet, M., Rodriguez-Fernandez, N., Madelon, R., & Baghdadi, N. (2023). ESA Climate Change Initiative Plus - Soil Moisture Algorithm Theoretical Baseline Document (ATBD) Supporting Product Version 08.1 (version 1.1). Zenodo. https://doi.org/10.5281/zenodo.8320869
    • Garcia, D., 2010. Robust smoothing of gridded data in one and higher dimensions with missing values. Computational Statistics & Data Analysis, 54(4), pp.1167-1178. Available at: https://doi.org/10.1016/j.csda.2009.09.020
    • Rodell, M., Houser, P. R., Jambor, U., Gottschalck, J., Mitchell, K., Meng, C.-J., Arsenault, K., Cosgrove, B., Radakovich, J., Bosilovich, M., Entin, J. K., Walker, J. P., Lohmann, D., and Toll, D.: The Global Land Data Assimilation System, Bulletin of the American Meteorological Society, 85, 381 – 394, https://doi.org/10.1175/BAMS-85-3-381, 2004.

    Related Records

    The following records are all part of the Soil Moisture Climate Data Records from satellites community

    • ESA CCI SM MODELFREE Surface Soil Moisture Record: https://doi.org/10.48436/svr1r-27j77

  20. MatSeg: Material State Segmentation Dataset and Benchmark

    • zenodo.org
    zip
    Updated May 22, 2025
    Cite
    Zenodo (2025). MatSeg: Material State Segmentation Dataset and Benchmark [Dataset]. http://doi.org/10.5281/zenodo.11331618
    Explore at:
    Available download formats: zip
    Dataset updated
    May 22, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    MatSeg Dataset and benchmark for zero-shot material state segmentation.

    The MatSeg benchmark, containing 1220 real-world images and their annotations, is available in MatSeg_Benchmark.zip; the file contains documentation and Python readers.

    The MatSeg dataset, containing synthetic images infused with natural image patterns, is available in MatSeg3D_part_*.zip and MatSeg2D_part_*.zip (* stands for a number).

    MatSeg3D_part_*.zip: contains synthetic 3D scenes

    MatSeg2D_part_*.zip: contains synthetic 2D scenes

    Readers and documentation for the synthetic data are available at: Dataset_Documentation_And_Readers.zip

    Readers and documentation for the real-images benchmark are available at: MatSeg_Benchmark.zip

    The Code used to generate the MatSeg Dataset is available at: https://zenodo.org/records/11401072

    Additional permanent sources for downloading the dataset and metadata: 1, 2

    Evaluation scripts for the Benchmark are now available at:

    https://zenodo.org/records/13402003 and https://e.pcloud.link/publink/show?code=XZsP8PZbT7AJzG98tV1gnVoEsxKRbBl8awX

    Description

    Materials and their states form a vast array of patterns and textures that define the physical and visual world. Minerals in rocks, sediment in soil, dust on surfaces, infection on leaves, stains on fruits, and foam in liquids are some of these almost infinite numbers of states and patterns.

    Image segmentation of materials and their states is fundamental to the understanding of the world and is essential for a wide range of tasks, from cooking and cleaning to construction, agriculture, and chemistry laboratory work.

    The MatSeg dataset focuses on zero-shot segmentation of materials and their states, meaning identifying the region of an image belonging to a specific material type of state, without previous knowledge or training of the material type, states, or environment.

    The dataset contains a large set of (100k) synthetic images and benchmarks of 1220 real-world images for testing.

    Benchmark

    The benchmark contains 1220 real-world images with a wide range of material states and settings, for example food states (cooked/burned), plants (infected/dry), rocks/soil (minerals/sediment), construction materials/metals (rusted, worn), and liquids (foam/sediment), among many other states, without being limited to a fixed set of classes or environments. The goal is to evaluate the segmentation of materials without knowledge of, or pretraining on, the material or setting. The focus is on materials with complex scattered boundaries and gradual transitions (like the level of wetness of a surface).

    Evaluation scripts for the Benchmark are now available at: 1 and 2.

    Synthetic Dataset

    The synthetic dataset is composed of synthetic scenes rendered in 2D and 3D using Blender. The synthetic data is infused with patterns, materials, and textures automatically extracted from real images, allowing it to capture the complexity and diversity of the real world while maintaining the precision and scale of synthetic data. 100k images and their annotations are available to download.

    License

    This dataset, including all its components, is released under the CC0 1.0 Universal (CC0 1.0) Public Domain Dedication. To the extent possible under law, the authors have dedicated all copyright and related and neighboring rights to this dataset to the public domain worldwide. This dedication applies to the dataset and all derivative works.

    The MatSeg 2D and 3D synthetic scenes were generated using the Open Images dataset, which is licensed under the Apache License 2.0 (https://www.apache.org/licenses/LICENSE-2.0). For these components, you must comply with the terms of the Apache License. In addition, the MatSeg3D dataset uses ShapeNet 3D assets under a GNU license.

    Example Usage:

    An example of training and evaluation code for a network trained on the dataset and evaluated on the benchmark is given at these URLs: 1, 2

    This includes an evaluation script for the MatSeg benchmark, a training script using the MatSeg dataset, and the weights of a trained model.

    Paper:

    More detail on the work can be found in the paper "Infusing Synthetic Data with Real-World Patterns for Zero-Shot Material State Segmentation".

    Croissant metadata and additional sources for downloading the dataset are available at 1,2
