100+ datasets found
  1. Type, posture, and the number of observations in the IMU Posture dataset.

    • plos.figshare.com
    xls
    Updated Jun 21, 2023
    Cite
    Marinara Marcato; Salvatore Tedesco; Conor O’Mahony; Brendan O’Flynn; Paul Galvin (2023). Type, posture, and the number of observations in the IMU Posture dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0286311.t004
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Marinara Marcato; Salvatore Tedesco; Conor O’Mahony; Brendan O’Flynn; Paul Galvin
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically


  2. The pseudocode of the isolated Forests.

    • plos.figshare.com
    xls
    Updated Mar 10, 2025
    Cite
    Zhibo Xie; Heng Long; Chengyi Ling; Yingjun Zhou; Yan Luo (2025). The pseudocode of the isolated Forests. [Dataset]. http://doi.org/10.1371/journal.pone.0315322.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Mar 10, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Zhibo Xie; Heng Long; Chengyi Ling; Yingjun Zhou; Yan Luo
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Anomaly detection is widely used in cold chain logistics (CCL). However, because of high cost and technical problems, anomaly detection performance is poor and anomalies cannot be detected in time, which affects the quality of goods. To solve these problems, the paper presents a new anomaly detection scheme for CCL. First, the characteristics of the data collected in CCL are analyzed, a mathematical model of the data flow is established, and the sliding window and correlation coefficient are defined. Then the abnormal events in CCL are summarized, and three types of abnormality judgment conditions based on the correlation coefficient ρjk are derived. A measurement anomaly detection algorithm based on an improved isolation forest algorithm is proposed. Subsampling and a cross factor are designed and used to overcome the shortcomings of the isolation forest algorithm (iForest). Experiments show that as the dimensionality of the data increases, the performance indicators of the new scheme, such as P (precision), R (recall), F1 score, and AUC (area under the curve), become increasingly superior to those of the commonly used support vector machine (SVM), local outlier factor (LOF), and iForest methods. Its average P is 0.8784, average R is 0.8731, average F1 score is 0.8639, and average AUC is 0.9064. However, the execution time of the improved algorithm is slightly longer than that of iForest.
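
    For readers unfamiliar with the reported indicators, the following is a minimal sketch of how precision, recall, and F1 are computed from detection counts. The counts are hypothetical and are not taken from the paper; the function name is likewise an illustration only.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts: 80 anomalies correctly flagged, 20 false alarms, 20 missed.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=20)
print(f"P={p:.4f} R={r:.4f} F1={f1:.4f}")
```

    Note that an average F1 reported over several experiments need not equal the harmonic mean of the average P and average R, which is consistent with the figures quoted above.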

  3. coldChainDataA.

    • figshare.com
    bin
    Updated Mar 10, 2025
    + more versions
    Cite
    Zhibo Xie; Heng Long; Chengyi Ling; Yingjun Zhou; Yan Luo (2025). coldChainDataA. [Dataset]. http://doi.org/10.1371/journal.pone.0315322.s001
    Explore at:
    Available download formats: bin
    Dataset updated
    Mar 10, 2025
    Dataset provided by
    PLOS ONE
    Authors
    Zhibo Xie; Heng Long; Chengyi Ling; Yingjun Zhou; Yan Luo
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Anomaly detection is widely used in cold chain logistics (CCL). However, because of high cost and technical problems, anomaly detection performance is poor and anomalies cannot be detected in time, which affects the quality of goods. To solve these problems, the paper presents a new anomaly detection scheme for CCL. First, the characteristics of the data collected in CCL are analyzed, a mathematical model of the data flow is established, and the sliding window and correlation coefficient are defined. Then the abnormal events in CCL are summarized, and three types of abnormality judgment conditions based on the correlation coefficient ρjk are derived. A measurement anomaly detection algorithm based on an improved isolation forest algorithm is proposed. Subsampling and a cross factor are designed and used to overcome the shortcomings of the isolation forest algorithm (iForest). Experiments show that as the dimensionality of the data increases, the performance indicators of the new scheme, such as P (precision), R (recall), F1 score, and AUC (area under the curve), become increasingly superior to those of the commonly used support vector machine (SVM), local outlier factor (LOF), and iForest methods. Its average P is 0.8784, average R is 0.8731, average F1 score is 0.8639, and average AUC is 0.9064. However, the execution time of the improved algorithm is slightly longer than that of iForest.

  4. Data, R Scripts and Random Forest Models for Winter Catch Crop Monitoring...

    • data.mendeley.com
    Updated Dec 16, 2020
    Cite
    Christian Schulz (2020). Data, R Scripts and Random Forest Models for Winter Catch Crop Monitoring from Sentinel-2 NDVI Time Series in Germany [Dataset]. http://doi.org/10.17632/78g2r5dp3k.1
    Explore at:
    Dataset updated
    Dec 16, 2020
    Authors
    Christian Schulz
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the supplementary data to the article "Large-scale winter catch crop monitoring with Sentinel-2 time series and machine learning–An alternative to on-site controls?" in Computers and Electronics in Agriculture by C. Schulz, A.-K. Holtgrave, and B. Kleinschmit 2021 (https://doi.org/xxxxxxxx). The data contains a zip-file with the following folders: data (parcels, filled and unfilled time series tables, feature extraction results and prediction results) (csv, shp), model (random forest models for catch crop prediction) (rds), and R (R script files for Random Forest model training and prediction with RStudio) (r).

    The algorithms and RF models developed for this study were implemented via virtual Docker containers into the timeStamp software prototype which allows for large-scale automatized catch crop analysis on the parcel-level (timestamp.lup-umwelt.de). This software saves the raster data from the GTS² archive as parcel-wise clipped image time series into a PostGIS database. All further processing steps were performed with the statistical computing language R (RStudio Team, 2020). For raster data manipulation within the PostGIS database and downloading NDVI time series, we used the packages rpostgis (Bucklin and Basille, 2019) and RPostgreSQL (Conway et al., 2017). For time series filling and calculation of the predictors, we used the packages zoo (Zeileis et al., 2020), hydroGOF (Zambrano-Bigiarini, 2020), tsoutliers (de Lacalle, 2019), and changepoint (Killick et al., 2016). For RF modelling, we used the package caret (Kuhn et al., 2020).

  5. Forest Type NFI

    • envidat.ch
    jpeg, not available +1
    Updated Jun 5, 2025
    Cite
    Lars Waser; Christian Ginzler; Achilleas Psomas; Marius Rüetschi; Nataliia Rehush (2025). Forest Type NFI [Dataset]. http://doi.org/10.16904/1000001.7
    Explore at:
    Available download formats: not available, jpeg, tiff
    Dataset updated
    Jun 5, 2025
    Dataset provided by
    Swiss Federal Institute for Forest, Snow and Landscape Research
    Authors
    Lars Waser; Christian Ginzler; Achilleas Psomas; Marius Rüetschi; Nataliia Rehush
    Dataset funded by
    Federal Office for the Environment (FOEN).
    Description

    A series of Forest Type NFI datasets covering Switzerland has been produced and is currently available for the years 2023, 2018, and 2016. These datasets provide the probability (0-100%) of the class broadleaf at the pixel level. Pixel values with low probabilities correspond to the class coniferous, while pixel values with high probabilities correspond to the class broadleaf. Pixels with values around 50% cannot be unambiguously assigned to one of the two classes and might belong to mixed stands. The three datasets are based on different remote sensing data from different time spans and with different spatial resolutions:

    • Forest Type NFI 2023: Sentinel-1/-2, 2021-2023, 10 m (new dataset!)
    • Forest Type NFI 2018: Sentinel-1/-2, 2016-2018, 10 m
    • Forest Type NFI 2016: Aerial orthoimages, 2010-2015, 3 m

    Important note: None of the datasets is suitable for analysis at the individual tree crown level because pixel-level probabilities are not allocated to individual trees. Analysis is recommended only for areas larger than 3x3 pixels, i.e. by calculating the mean values, except in rare cases of homogeneous forest stands (either broadleaf or coniferous).

    User feedback: We are very motivated to improve the dataset further and welcome any feedback or suggestions for corrections. Feel free to contact lars.waser@wsl.ch directly.

    Detailed description

    Forest Type NFI datasets, versions 2023 and 2018: Both datasets use a remote sensing-based approach for a countrywide mapping of the Dominant Leaf Type (DLT) in Switzerland, classifying areas as either broadleaf or coniferous. These datasets have a spatial resolution of 10 m and provide the probability (0-100%) of the class broadleaf at the pixel level (covering all areas with vegetation height - 5m version 2018, and -3 m version 2023). The classification approach is based on a Random Forest (RF) classifier that combines predictors derived from multi-temporal Sentinel-1 and Sentinel-2 data with the SwissAlti3D terrain model. In addition to the original Sentinel-2 spectral bands, vegetation indices such as GEMI, MSAVI2, NDVI, CLRE and CCCI were used in the final model. The classification models were tested, trained and validated using up to 400,000 labels representing the two classes broadleaf and coniferous, derived from aerial image delineations. For the 2023 dataset, the previously used labels were quality-checked for reliability and temporal consistency with the image data. Additionally, more labels were collected in the regions where the previous version of the dataset performed less accurately, as reported by users. Similarly high model performance was achieved for both datasets: overall accuracy = 0.96, kappa = 0.94, F1-score = 0.96, precision = 0.97, recall = 0.94 (2023), 0.95 (2018). Independent validation and plausibility checks included the comparison of the predicted results with aerial image interpretations of the National Forest Inventory (NFI). For both datasets, deviations occurred particularly in mixed forest stands: overall accuracy = 0.81 (2018, 2023), kappa = 0.62 (2018), 0.59 (2023). For more details, see Waser et al. (2021).

    The Forest Type NFI dataset, version 2016: This dataset presents a countrywide map with the two classes broadleaf and coniferous in Switzerland based on digital aerial imagery. The spatial resolution of the dataset is 3 m. The pixel values correspond to the probabilities (0-100%) of the class broadleaf. The classification approach incorporates an RF classifier, predictors from multispectral aerial imagery (ADS80) and the SwissAlti3D terrain model. The model was tested, trained and validated using 90,000 digitized polygons and achieved an overall accuracy of 0.99 and a kappa of 0.98. Independent validation and plausibility checks included the comparison of the predicted results with aerial image interpretations of the NFI. Significant deviations were observed, primarily due to an underestimation of broadleaved trees (median underestimation of 3.17%), especially in mixed forest stands. For more details, see Waser et al. (2017; https://doi.org/10.3390/rs9080766).
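
    The recommendation to analyse areas of at least 3x3 pixels by their mean value can be sketched as follows. This is only an illustration in plain Python: the sample grid, the function names, and the 40%/60% cut-offs for "around 50%" are assumptions and are not part of the dataset documentation.

```python
def block_mean(grid: list[list[float]], size: int = 3) -> list[list[float]]:
    """Average broadleaf probabilities (0-100) over non-overlapping size x size blocks."""
    rows, cols = len(grid), len(grid[0])
    out = []
    for r0 in range(0, rows - size + 1, size):
        out_row = []
        for c0 in range(0, cols - size + 1, size):
            vals = [grid[r][c]
                    for r in range(r0, r0 + size)
                    for c in range(c0, c0 + size)]
            out_row.append(sum(vals) / len(vals))
        out.append(out_row)
    return out


def leaf_type(prob: float, low: float = 40.0, high: float = 60.0) -> str:
    """Map a mean probability to a class; the low/high cut-offs are hypothetical."""
    if prob < low:
        return "coniferous"
    if prob > high:
        return "broadleaf"
    return "ambiguous/mixed"


# Hypothetical 3x3 window of pixel-level broadleaf probabilities.
window = [[90, 95, 85],
          [80, 100, 90],
          [88, 92, 90]]
means = block_mean(window)
print(means, leaf_type(means[0][0]))
```

    Working on block means rather than single pixels follows the note above that pixel-level probabilities are not allocated to individual trees.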

  6. Systematic review of validation of supervised machine learning models in...

    • search.dataone.org
    Updated Jun 25, 2025
    Cite
    Oakleigh Wilson (2025). Systematic review of validation of supervised machine learning models in accelerometer-based animal behaviour classification literature [Dataset]. http://doi.org/10.5061/dryad.fxpnvx14d
    Explore at:
    Dataset updated
    Jun 25, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Oakleigh Wilson
    Description

    Supervised machine learning has been used to detect fine-scale animal behaviour from accelerometer data, but a standardised protocol for implementing this workflow is currently lacking. As the application of machine learning to ecological problems expands, it is essential to establish technical protocols and validation standards that align with those in other "big data" fields. Overfitting is a prevalent and often misunderstood challenge in machine learning. Overfit models overly adapt to the training data to memorise specific instances rather than to discern the underlying signal. Associated results can indicate high performance on the training set, yet these models are unlikely to generalise to new data. Overfitting can be detected through rigorous validation using independent test sets. Our systematic review of 119 studies using accelerometer-based supervised machine learning to classify animal behaviour reveals that 79% (94 papers) did not validate their models sufficiently wel...

    We defined eligibility criteria as 'peer-reviewed primary research papers published 2013-present that use supervised machine learning to identify specific behaviours from raw, non-livestock animal accelerometer data'. We elected to ignore analysis of livestock behaviour as agricultural methods often operate within different constraints to the analyses conducted on wild animals and this body of literature has mostly developed in isolation from wild animal research. Our search was conducted on 27/09/2024. Initial keyword search across 3 databases (Google Scholar, PubMed, and Scopus) yielded 249 unique papers. Papers outside of the search criteria were excluded, including hardware and software advances, non-ML analysis, insufficient accelerometry application (e.g., research focused on other sensors with accelerometry providing minimal support), unsupervised methods, and research limited to activity intensity or active and inactive states. This resulted in 119 papers.

    Systematic review of validation of supervised machine learning models in accelerometer-based animal behaviour classification literature

    https://doi.org/10.5061/dryad.fxpnvx14d

    Description of the data and file structure

    Files and variables

    File: Systematic_Review_Supplementary.xlsx

    Description: Methods information from animal accelerometer-based behaviour classification literature utilising supervised machine learning techniques.

    Variables

    • Citation: Citation information for paper
    • Title: Extracted title from citation information
    • Year: Year of publication
    • ModelCategory: General category of the supervised machine learning model used (e.g., all Support Vector Machines are listed as SVM)
      • DT — Decision Tree
      • EM — Expectation Maximisation
      • Ensemble — Ensemble methods (e.g., boosting, bagging)
      • HMM — Hidden Markov Model
      • Isolation Forest — Anomaly detection using Isolation Forest ...,
  7. A Random Forest Algorithm for Learning and Updating Fuel Types for Fire...

    • borealisdata.ca
    Updated Apr 17, 2021
    Cite
    Shuojie Li (2021). A Random Forest Algorithm for Learning and Updating Fuel Types for Fire Research [Dataset]. http://doi.org/10.5683/SP2/N45050
    Explore at:
    Dataset updated
    Apr 17, 2021
    Dataset provided by
    Borealis
    Authors
    Shuojie Li
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Elephant Hill, British Columbia, Canada
    Description

    Wildfire drives a tremendous amount of forest and land cover change in the central interior of British Columbia, Canada. Fuel type maps are acknowledged as critical references for landscape-level fire simulations and fire behavior predictions. Nonetheless, the current thematic maps are not updated on an annual basis and cannot be easily produced at a certain scale and speed. The objective of this research was to test the hypothesis that a machine learning algorithm can augment the current manual wildfire fuel type identification system and help update fuel types on an annual basis, while meeting the accuracy standards of the Ministry of Forests, Lands, Natural Resource Operations and Rural Development (BC FLNRO). The random forest algorithm was applied over a 40 000-km² landscape in central interior British Columbia that burned in a megafire in 2017. Fuel maps were obtained for the years 2013-2017; the cross-validation overall accuracy reached 98.57%, and the overall accuracy of the confusion matrix on the validation set reached 92.35%. Various recommendations are given for future research using machine learning algorithms for fuel mapping, such as ensuring that pre-processing follows careful standards, modifying the machine learning algorithm, and adopting other sources of remotely sensed data.

  8. Data from: Hyperspectral Remote Sensing Benchmark Database for Oil Spill...

    • ieee-dataport.org
    Updated Oct 31, 2023
    Cite
    Ashwini pal (2023). Hyperspectral Remote Sensing Benchmark Database for Oil Spill Detection with an Isolation Forest-Guided Unsupervised Detector [Dataset]. https://ieee-dataport.org/documents/hyperspectral-remote-sensing-benchmark-database-oil-spill-detection-isolation-forest
    Explore at:
    Dataset updated
    Oct 31, 2023
    Authors
    Ashwini pal
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    2010

  9. Data from: Evaluating the use of lidar to discern snag characteristics...

    • zenodo.org
    • data.niaid.nih.gov
    • +2 more
    bin, csv
    Updated Jun 5, 2022
    Cite
    Jessica M. Stitt; Andrew T. Hudak; Carlos A. Silva; Lee A. Vierling; Kerri T. Vierling (2022). Data from: Evaluating the use of lidar to discern snag characteristics important for wildlife [Dataset]. http://doi.org/10.5061/dryad.pnvx0k6nr
    Explore at:
    Available download formats: csv, bin
    Dataset updated
    Jun 5, 2022
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Jessica M. Stitt; Andrew T. Hudak; Carlos A. Silva; Lee A. Vierling; Kerri T. Vierling
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Standing dead trees (known as snags) are historically difficult to map and model using airborne laser scanning (ALS), or lidar. Specific snag characteristics are important for wildlife; for instance, a larger snag with a broken top can serve as a nesting platform for raptors. The objective of this study was to evaluate whether characteristics such as top intactness could be inferred from discrete-return ALS data. We collected structural information for 198 snags in closed-canopy conifer forest plots in Idaho. We selected 13 lidar metrics within 5 m diameter point clouds to serve as predictor variables in random forest (RF) models to classify snags into four groups by size (small [<40 cm diameter] or large [≥40 cm diameter]) and intactness (intact or broken top) across multiple iterations. We ran these models first with all snags combined, and then ran the same models with only small or large snags. Overall accuracies were highest in RF models with large snags only (77%), but kappa statistics for all models were low (0.29–0.49). ALS data alone were not sufficient to identify top intactness for large snags; future studies combining ALS data with other remotely sensed data to improve classification of snag characteristics important for wildlife are encouraged.
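
    The contrast between a fairly high overall accuracy and a low kappa can be illustrated with a small sketch of Cohen's kappa computed from a confusion matrix. The matrix below is hypothetical and is not taken from the study; it only shows how class imbalance can produce this pattern.

```python
def accuracy_and_kappa(matrix: list[list[int]]) -> tuple[float, float]:
    """Return overall accuracy and Cohen's kappa for a square confusion matrix."""
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    # Undefined when expected == 1 (all samples in one class for both raters).
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical 2x2 confusion matrix (rows: reference, columns: predicted).
confusion = [[70, 10],
             [13, 7]]
accuracy, kappa = accuracy_and_kappa(confusion)
print(f"accuracy={accuracy:.2f} kappa={kappa:.2f}")
```

    In this made-up example most samples fall in one class, so much of the observed agreement is also expected by chance, and kappa stays low despite the accuracy.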

  10. Agriculture_Forest_Hyperspectral_Data

    • data.mendeley.com
    Updated Jun 6, 2023
    Cite
    Rajesh C B (2023). Agriculture_Forest_Hyperspectral_Data [Dataset]. http://doi.org/10.17632/p7n6ktjdx7.1
    Explore at:
    Dataset updated
    Jun 6, 2023
    Authors
    Rajesh C B
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Tropical forests maintain rich biodiversity and serve as a habitat and resource for humans, making them one of the most important global biomes. These forest ecosystems have been experiencing a host of unregulated anthropogenic activities, including illegal tourism and shifting cultivation. The presence of human habitats in the restricted zones of forest ecosystems is a direct indicator of human activities that may drive the future deterioration of forest quality by area and tree species composition. Remote sensing data have been used extensively for mapping forest types and for biophysical characterization at various spatial scales. Several remote sensing datasets from multispectral, hyperspectral and LIDAR sensors acquired from airborne and satellite platforms are available for developing and validating a host of methodologies for remote sensing applications in forestry. However, quantifying the quality of forest stands and detecting potential threats from sporadic and small-scale human activities requires sub-pixel level remote sensing data analysis methods such as spectral mixture modelling. Generally, most studies employ pixel-level supervised learning-based analysis techniques to detect infrastructure and settlements. However, if the settlements are smaller than the ground sampling distance and are under the canopy, pixel-based techniques are not suitable. Reinvigorated by the progressive availability of hyperspectral imagery, spectral mixture modelling based sub-pixel image analysis is gaining prominence in contemporary remote sensing application development. However, there is a paucity of high-resolution hyperspectral imagery and associated ground truth spectral measurements for assessing various methodological approaches in studies related to anthropogenic activities and forest disturbance. Most studies have relied upon simulating and synthesising the hyperspectral imagery and its associated ground truth spectra for the implementation of methods and algorithms. This article presents a distinct dataset of high-resolution hyperspectral imagery and associated ground truth spectra of various vegetable crops acquired over a tropical forest ecosystem. The dataset is valuable for research on developing new discrimination models of forest and cultivated vegetation, classification methods, spectral matching analysis techniques and sub-pixel target detection methods.

  11. TreeSatAI Benchmark Archive for Deep Learning in Forest Applications

    • zenodo.org
    • data.niaid.nih.gov
    bin, pdf, zip
    Updated Jul 16, 2024
    Cite
    Christian Schulz; Steve Ahlswede; Christiano Gava; Patrick Helber; Benjamin Bischke; Florencia Arias; Michael Förster; Jörn Hees; Begüm Demir; Birgit Kleinschmit (2024). TreeSatAI Benchmark Archive for Deep Learning in Forest Applications [Dataset]. http://doi.org/10.5281/zenodo.6598391
    Explore at:
    Available download formats: pdf, zip, bin
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Christian Schulz; Steve Ahlswede; Christiano Gava; Patrick Helber; Benjamin Bischke; Florencia Arias; Michael Förster; Jörn Hees; Begüm Demir; Birgit Kleinschmit
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Context and Aim

    Deep learning in Earth Observation requires large image archives with highly reliable labels for model training and testing. However, a preferable quality standard for forest applications in Europe has not yet been determined. The TreeSatAI consortium investigated numerous sources for annotated datasets as an alternative to manually labeled training datasets.

    We found the federal forest inventory of Lower Saxony, Germany represents an unseen treasure of annotated samples for training data generation. The respective 20-cm Color-infrared (CIR) imagery, which is used for forestry management through visual interpretation, constitutes an excellent baseline for deep learning tasks such as image segmentation and classification.

    Description

    The data archive is highly suitable for benchmarking as it represents the real-world data situation of many German forest management services. On the one hand, it has a high number of samples which are supported by the high-resolution aerial imagery. On the other hand, this data archive presents challenges, including class label imbalances between the different forest stand types.

    The TreeSatAI Benchmark Archive contains:

    • 50,381 image triplets (aerial, Sentinel-1, Sentinel-2)

    • synchronized time steps and locations

    • all original spectral bands/polarizations from the sensors

    • 20 species classes (single labels)

    • 12 age classes (single labels)

    • 15 genus classes (multi labels)

    • 60 m and 200 m patches

    • fixed split for train (90%) and test (10%) data

    • additional single labels such as English species name, genus, forest stand type, foliage type, land cover

    The geoTIFF and GeoJSON files are readable in any GIS software, such as QGIS. For further information, we refer to the PDF document in the archive and publications in the reference section.

    Version history

    v1.0.0 - First release

    Citation

    Ahlswede et al. (in prep.)

    GitHub

    Full code examples and pre-trained models from the dataset article (Ahlswede et al. 2022) using the TreeSatAI Benchmark Archive are published on the GitHub repositories of the Remote Sensing Image Analysis (RSiM) Group (https://git.tu-berlin.de/rsim/treesat_benchmark). Code examples for the sampling strategy can be made available by Christian Schulz via email request.

    Folder structure

    We refer to the proposed folder structure in the PDF file.

    • Folder “aerial” contains the aerial imagery patches derived from summertime orthophotos of the years 2011 to 2020. Patches are available in 60 x 60 m (304 x 304 pixels). Band order is near-infrared, red, green, and blue. Spatial resolution is 20 cm.

    • Folder “s1” contains the Sentinel-1 imagery patches derived from summertime mosaics of the years 2015 to 2020. Patches are available in 60 x 60 m (6 x 6 pixels) and 200 x 200 m (20 x 20 pixels). Band order is VV, VH, and VV/VH ratio. Spatial resolution is 10 m.

    • Folder “s2” contains the Sentinel-2 imagery patches derived from summertime mosaics of the years 2015 to 2020. Patches are available in 60 x 60 m (6 x 6 pixels) and 200 x 200 m (20 x 20 pixels). Band order is B02, B03, B04, B08, B05, B06, B07, B8A, B11, B12, B01, and B09. Spatial resolution is 10 m.

    • The folder “labels” contains a JSON string which was used for multi-labeling of the training patches. Code example of an image sample with respective proportions of 94% for Abies and 6% for Larix is: "Abies_alba_3_834_WEFL_NLF.tif": [["Abies", 0.93771], ["Larix", 0.06229]]

    • The two files “test_filesnames.lst” and “train_filenames.lst” define the filenames used for train (90%) and test (10%) split. We refer to this fixed split for better reproducibility and comparability.

    • The folder “geojson” contains geoJSON files with all the samples chosen for the derivation of training patch generation (point, 60 m bounding box, 200 m bounding box).
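
    As a minimal sketch of how the multi-label entries described for the “labels” folder might be read, the snippet below parses the example entry quoted above with Python's standard json module. Wrapping the entry in a single JSON object that maps file names to label lists is an assumption about the file layout, and the helper names and the 0.07 threshold are hypothetical.

```python
import json

# The example entry from the description, wrapped in an object (assumed layout).
labels_json = (
    '{"Abies_alba_3_834_WEFL_NLF.tif": '
    '[["Abies", 0.93771], ["Larix", 0.06229]]}'
)
labels = json.loads(labels_json)


def dominant_genus(entries: list[list]) -> str:
    """Return the genus with the largest area proportion in one patch."""
    return max(entries, key=lambda pair: pair[1])[0]


def genera_above(entries: list[list], threshold: float) -> list[str]:
    """Return all genera whose proportion reaches the (hypothetical) threshold."""
    return [genus for genus, share in entries if share >= threshold]


patch = labels["Abies_alba_3_834_WEFL_NLF.tif"]
print(dominant_genus(patch), genera_above(patch, 0.07))
```

    A proportion threshold like this is one possible way to turn the stored proportions into a multi-label target; the PDF in the archive should be consulted for the convention actually used.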

    CAUTION: As we could not upload the aerial patches as a single zip file on Zenodo, you need to download the 20 single species files (aerial_60m_…zip) separately. Then, unzip them into a folder named “aerial” with a subfolder named “60m”. This structure is recommended for better reproducibility and comparability to the experimental results of Ahlswede et al. (2022).

    Join the archive

    Model training, benchmarking, algorithm development… many applications are possible! Feel free to add samples from other regions in Europe or even worldwide. Additional remote sensing data from Lidar, UAVs or aerial imagery from different time steps are very welcome. This helps the research community in development of better deep learning and machine learning models for forest applications. You might have questions or want to share code/results/publications using that archive? Feel free to contact the authors.

    Project description

    This work was part of the project TreeSatAI (Artificial Intelligence with Satellite data and Multi-Source Geodata for Monitoring of Trees at Infrastructures, Nature Conservation Sites and Forests). Its overall aim is the development of AI methods for the monitoring of forests and woody features on a local, regional and global scale. Based on freely available geodata from different sources (e.g., remote sensing, administration maps, and social media), prototypes will be developed for the deep learning-based extraction and classification of tree- and tree stand features. These prototypes deal with real cases from the monitoring of managed forests, nature conservation and infrastructures. The development of the resulting services by three enterprises (liveEO, Vision Impulse and LUP Potsdam) will be supported by three research institutes (German Research Center for Artificial Intelligence, TU Remote Sensing Image Analysis Group, TUB Geoinformation in Environmental Planning Lab).

    Publications

    Ahlswede et al. (2022, in prep.): TreeSatAI Dataset Publication

    Ahlswede S., Nimisha, T.M., and Demir, B. (2022, in revision): Embedded Self-Enhancement Maps for Weakly Supervised Tree Species Mapping in Remote Sensing Images. IEEE Trans Geosci Remote Sens

    Schulz et al. (2022, in prep.): Phenoprofiling

    Conference contributions

    S. Ahlswede, N. T. Madam, C. Schulz, B. Kleinschmit and B. Demir, "Weakly Supervised Semantic Segmentation of Remote Sensing Images for Tree Species Classification Based on Explanation Methods", IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 2022.

    C. Schulz, M. Förster, S. Vulova, T. Gränzig and B. Kleinschmit, “Exploring the temporal fingerprints of mid-European forest types from Sentinel-1 RVI and Sentinel-2 NDVI time series”, IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 2022.

    C. Schulz, M. Förster, S. Vulova and B. Kleinschmit, “The temporal fingerprints of common European forest types from SAR and optical remote sensing data”, AGU Fall Meeting, New Orleans, USA, 2021.

    B. Kleinschmit, M. Förster, C. Schulz, F. Arias, B. Demir, S. Ahlswede, A. K. Aksoy, T. Ha Minh, J. Hees, C. Gava, P. Helber, B. Bischke, P. Habelitz, A. Frick, R. Klinke, S. Gey, D. Seidel, S. Przywarra, R. Zondag and B. Odermatt, “Artificial Intelligence with Satellite data and Multi-Source Geodata for Monitoring of Trees and Forests”, Living Planet Symposium, Bonn, Germany, 2022.

    C. Schulz, M. Förster, S. Vulova, T. Gränzig and B. Kleinschmit, (2022, submitted): “Exploring the temporal fingerprints of sixteen mid-European forest types from Sentinel-1 and Sentinel-2 time series”, ForestSAT, Berlin, Germany, 2022.

  12. Forest disturbance detection by using remote sensing and artificial...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Apr 13, 2025
    Cite
    Kamila Pawłuszek-Filipiak; Dorota Włodarczyk; Karolina Stuchły; Femi Ikuemonisan (2025). Forest disturbance detection by using remote sensing and artificial intelligence in Africa [Dataset]. http://doi.org/10.5281/zenodo.10882149
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 13, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Kamila Pawłuszek-Filipiak; Dorota Włodarczyk; Karolina Stuchły; Femi Ikuemonisan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Africa
    Description

    The dataset arises from the "Forest Disturbance Detection Using Remote Sensing and Artificial Intelligence in Africa" (EO4Forest) project, a collaboration funded by the European Space Agency and conducted by Wrocław University of Environmental and Life Sciences (Poland) and Lagos State University (Nigeria). Designed for forest monitoring in the Ogun and Lagos States, the dataset includes detailed land cover classification maps for the years 2015, 2019, 2022, and 2023, all at a 20-meter spatial resolution to ensure accurate representation of land cover. A legend file accompanies the maps, clarifying six defined land cover classes: water, urbanized areas, soil, cropland, grasslands, and forest.

    The dataset includes over 112,000 training and 11,900 validation samples, which are essential for the accurate development and evaluation of the Random Forest classification models used. Special emphasis was placed on the forest class, ensuring a diverse representation of forest types, including tropical humid forests, mangroves, and dry woodlands.

    In addition to land cover maps, the dataset also contains forest gain and loss maps, with a particular focus on recent updates for selected subareas in 2022 and 2023, available in the NTR_ForestUpdate folder.

    Based on optical satellite imagery from Sentinel-2 and Landsat-8, the dataset leverages spectral indices and the Random Forest algorithm to classify land cover types. It provides valuable insights for environmental research related to deforestation, reforestation, and afforestation. Forest change maps are included to highlight areas of forest loss and gain, capturing the dynamic shifts in Nigeria's forest cover and offering detailed geographic context.

    The methodology, including data acquisition, preprocessing, feature extraction (spectral index calculation), classification, and accuracy assessment, is fully documented in Python scripts available in the associated GitHub repository. This ensures transparency and reproducibility, offering users both the processed outputs and the tools necessary for custom analyses and advanced forest monitoring.
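The description names spectral index calculation as the feature extraction step but does not list the indices used. As an illustration only, the sketch below computes NDVI, a widely used index defined as (NIR - Red) / (NIR + Red). The choice of NDVI, the function name, and the sample reflectance values are assumptions and are not taken from the project's scripts.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel's reflectances."""
    denom = nir + red
    if denom == 0:
        # The index is undefined here; returning 0.0 is a convention
        # chosen for this sketch, not a property of the dataset.
        return 0.0
    return (nir - red) / denom


# Hypothetical (NIR, red) reflectance pairs for three pixels.
pixels = [(0.45, 0.05), (0.30, 0.20), (0.02, 0.03)]
features = [ndvi(nir, red) for nir, red in pixels]
```

Values of this kind would then serve as input features to a classifier such as the Random Forest models mentioned in the description.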

  13. Harmonic Baseline Experiments for Landsat-Based Forest Condition Monitoring...

    • dataone.org
    • portal.edirepository.org
    Updated Dec 12, 2023
    + more versions
    Cite
    Valerie Pasquarella (2023). Harmonic Baseline Experiments for Landsat-Based Forest Condition Monitoring in Southern New England 2017 [Dataset]. https://dataone.org/datasets/https%3A%2F%2Fpasta.lternet.edu%2Fpackage%2Fmetadata%2Feml%2Fknb-lter-hfr%2F374%2F4
    Explore at:
    Dataset updated
    Dec 12, 2023
    Dataset provided by
    Long Term Ecological Research Network (http://www.lternet.edu/)
    Authors
    Valerie Pasquarella
    Time period covered
    Jan 1, 2017
    Area covered
    Variables measured
    dir, doi, filename, description, filesize_mb
    Description

    This dataset was developed as part of a study of harmonic baseline model parameterization for forest condition monitoring using Landsat time series. We implemented a previously published harmonic modeling approach for forest condition monitoring in Google Earth Engine and systematically assessed the relative ability of condition change products generated using various model parameterizations for predicting pest abundances and defoliation during the 2016-2018 Lymantria dispar outbreak in southern New England. We ran a series of 32 experiments that considered a variety of parameter choices for establishing multi-year “baseline” models representing relatively stable forest conditions for each Landsat pixel in our study area. We tested a full set of factors including (a) spectral vegetation index used for model fitting, (b) baseline-modeling period, (c) frequencies of harmonic regression terms, and (d) differences in Landsat time series input imagery. We generated average condition score estimates for each of these 32 baseline parameterizations for a May 1 to September 30, 2017 monitoring period, then used Generalized Linear Mixed Models to test the relationships between ground-based observations of defoliation and defoliator abundance (larva and egg masses). This archived dataset includes the full set of experimental raster results, as well as a “reanalysis” product from a previous implementation of our condition monitoring workflow. More information on model parameterization rankings can be found in the associated publication (Pasquarella et al. 2021).
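In general terms, a harmonic baseline model regresses a vegetation index on an intercept, a linear trend, and sine and cosine terms at annual frequencies. The sketch below fits such a model by ordinary least squares in plain Python and compares a new observation with the fitted baseline. It does not reproduce the Google Earth Engine workflow or its condition score; the function names, the synthetic series, and the use of a raw residual are assumptions made for illustration.

```python
import math


def design_row(t, n_harmonics):
    """Regressors at time t (in years): intercept, trend, cos/sin pairs."""
    row = [1.0, t]
    for k in range(1, n_harmonics + 1):
        row.append(math.cos(2 * math.pi * k * t))
        row.append(math.sin(2 * math.pi * k * t))
    return row


def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_harmonic(times, values, n_harmonics=1):
    """Least-squares fit via the normal equations (X^T X) beta = X^T y."""
    rows = [design_row(t, n_harmonics) for t in times]
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(p)]
    return solve(xtx, xty)


def predict(beta, t):
    """Baseline value at time t for coefficients returned by fit_harmonic."""
    n_harmonics = (len(beta) - 2) // 2
    return sum(b * x for b, x in zip(beta, design_row(t, n_harmonics)))


# Synthetic three-year baseline: a stable seasonal cycle around 0.6.
times = [i / 20 for i in range(60)]
values = [0.6 + 0.2 * math.cos(2 * math.pi * t) for t in times]
beta = fit_harmonic(times, values, n_harmonics=1)

# A hypothetical monitoring-period observation that falls below the baseline.
residual = 0.55 - predict(beta, 3.0)
```

A negative residual means the observed index is lower than the baseline predicts for that date, which is the kind of departure a condition monitoring product is designed to surface.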

  14. sdaas - a Python tool computing an amplitude anomaly score of seismic data...

    • b2find.eudat.eu
    Updated Jun 29, 2007
    Cite
    (2007). sdaas - a Python tool computing an amplitude anomaly score of seismic data and metadata using simple machine learning algorithm - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/b0ff5f26-69b6-597b-a879-299e3c5118f1
    Explore at:
    Dataset updated
    Jun 29, 2007
    Description

    The increasingly high number of big data applications in seismology has made quality control tools to filter, discard, or rank data of extreme importance. In this framework, machine learning algorithms, already established in several seismic applications, are good candidates to perform the task flexibly and efficiently. sdaas (seismic data/metadata amplitude anomaly score) is a Python library and command line tool for detecting a wide range of amplitude anomalies on any seismic waveform segment, such as recording artifacts (e.g., anomalous noise, peaks, gaps, spikes), sensor problems (e.g., digitizer noise), and metadata field errors (e.g., wrong stage gain in StationXML). The underlying machine learning model, based on the isolation forest algorithm, has been trained and tested on a broad variety of seismic waveforms of different length, from local to teleseismic earthquakes to noise recordings from both broadband and accelerometers. For this reason, the software assures a high degree of flexibility and ease of use: from any given input (waveform in miniSEED format and its metadata as StationXML, either given as file path or FDSN URLs), the computed anomaly score is a probability-like numeric value in [0, 1] indicating the degree of belief that the analyzed waveform represents an anomaly (or outlier), where scores ≤0.5 indicate no distinct anomaly. sdaas can be employed for filtering malformed data in a pre-process routine, assigning robustness weights, or checking metadata by computing scores for randomly selected segments from a given station/channel: in this case, a persistent sequence of high scores clearly indicates problems in the metadata.
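The scoring convention above (values in [0, 1], with scores of 0.5 or less indicating no distinct anomaly) lends itself to simple post-processing. The sketch below works on a plain list of scores and does not call sdaas itself. The function names are hypothetical, and the run length used to decide that high scores are "persistent" is an assumption, since the description gives no number.

```python
def flag_anomalies(scores, threshold=0.5):
    """Mark each score as anomalous when it exceeds the threshold."""
    return [score > threshold for score in scores]


def longest_high_run(scores, threshold=0.5):
    """Length of the longest consecutive run of scores above the threshold."""
    longest = current = 0
    for score in scores:
        current = current + 1 if score > threshold else 0
        longest = max(longest, current)
    return longest


def suspect_metadata(scores, threshold=0.5, min_run=5):
    """Heuristic: a long unbroken run of high scores hints at a metadata issue.

    min_run is a hypothetical tuning parameter, not a value from sdaas.
    """
    return longest_high_run(scores, threshold) >= min_run
```

Isolated high scores would then be treated as segment-level artifacts, while a long run for the same station or channel would prompt a check of the StationXML.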

  15. Data from: Tree mortality in an agricultural landscape of Southwestern...

    • search.dataone.org
    • datadryad.org
    Updated Apr 4, 2025
    Cite
    Cristina Barber; Jennyffer Cruz; Sarah Graves; Stephanie Bohlman; Pieter Zuidema; Gregory Asner; Aaron Carignan; Vicente Vasquez; Jodi Brandt; T. Trevor Caughlin (2025). Tree mortality in an agricultural landscape of Southwestern Panama assessed using remote sensing and field data [Dataset]. http://doi.org/10.5061/dryad.gxd2547xt
    Explore at:
    Dataset updated
    Apr 4, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Cristina Barber; Jennyffer Cruz; Sarah Graves; Stephanie Bohlman; Pieter Zuidema; Gregory Asner; Aaron Carignan; Vicente Vasquez; Jodi Brandt; T. Trevor Caughlin
    Area covered
    Panama
    Description

    Agricultural tree cover is declining globally, including the loss of large, scattered trees that function as keystone structures. Understanding the drivers of agricultural tree loss could help prevent further declines. However, the drivers of agricultural tree mortality vary across scales, from individual trees to landscapes, complicating efforts to quantify mortality risk. We applied high-resolution remote sensing and multi-method occupancy models to test hypotheses of drivers of tree mortality in a pastoral landscape of Southwestern Panama. Our approach enabled us to identify individual tree mortality across a >20,000 ha area, encompassing a wide range of land use intensity. Neighboring tree cover was the strongest predictor of mortality, with a higher probability of death for isolated trees relative to trees with many neighbors. Landscape-level covariates also predicted mortality risk, including higher mortality closer to roads and in parcels with larger areas. These results impli...

    The study was conducted in Southwestern Panama, covering a 23,000-hectare area in Los Santos province. This region experiences a dry season from December to March, with most of its 1,700 mm of annual rainfall occurring between April and November. Historically, the landscape was dominated by dry tropical forests, but extensive cattle ranching has reduced forest cover to a small fraction of its original extent. The current landscape consists of active pastures, riparian corridors, and second-growth forests, with ongoing agricultural de-intensification and reforestation efforts. To assess tree mortality between 2012 and 2019, data were integrated from three sources: field measurements of individual trees, aerial hyperspectral-lidar imagery, and high-resolution satellite imagery. Field data were collected during initial surveys in 2012 and 2013, when individual trees were identified, georeferenced, and classified by species. These trees were revisited in 2019 to determine survival, with dea...

    Tree mortality in an agricultural landscape of Southwestern Panama assessed using remote sensing and field data

    https://doi.org/10.5061/dryad.gxd2547xt

    Description of the data and file structure

    These data represent field and remotely sensed measurements of tree mortality, along with landscape- and individual-level covariates for tree mortality. The file titled "y.csv" contains two columns. The first column represents the 269 trees whose mortality was measured directly in the field. The second column represents mortality estimated from remotely sensed data for 6,154 trees. Each row corresponds to a single tree. A value of 1 indicates tree survival, while a value of 0 indicates tree death. Values of -1 indicate trees that were measured using remotely sensed data but not field data.
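The coding of "y.csv" (1 for survival, 0 for death, -1 for trees without a field measurement) can be tallied with a few lines of Python. This is a minimal sketch that reads an in-memory sample rather than the real file. The header names `field` and `remote`, the sample rows, and the function names are hypothetical; only the column order and the meaning of the codes come from the description.

```python
import csv
import io
from collections import Counter

# Meaning of the codes as given in the file description.
LABELS = {1: "survived", 0: "died", -1: "no field measurement"}


def decode(value):
    """Translate a numeric code into a readable label."""
    return LABELS.get(value, "unknown code")


def summarize(csv_text):
    """Count outcomes per column: first column field, second remote sensing."""
    field_counts, remote_counts = Counter(), Counter()
    for row in csv.reader(io.StringIO(csv_text)):
        try:
            field_val, remote_val = int(row[0]), int(row[1])
        except (ValueError, IndexError):
            continue  # skip a header line or a malformed row
        field_counts[decode(field_val)] += 1
        remote_counts[decode(remote_val)] += 1
    return field_counts, remote_counts


# Hypothetical sample with the same two-column layout.
sample = "field,remote\n1,1\n0,0\n-1,1\n-1,0\n"
field_counts, remote_counts = summarize(sample)
```

Filtering out the -1 rows before computing field-based mortality avoids counting trees that were never visited on the ground.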

    The file titled "field.csv" includes a single column with an index indicating which trees were measured in the field (out of 6,154 total trees). Simi...,

  16. Global drivers of forest loss at 1 km resolution

    • zenodo.org
    bin, csv, tiff
    Updated Jun 18, 2025
    Cite
    Michelle Sims; Radost Stanimirova; Anton Raichuk; Maxim Neumann; Jessica Richter; Forrest Follett; James MacCarthy; Kristine Lister; Christopher Randle; Lindsey Sloat; Elena Esipova; Jaelah Jupiter; Charlotte Stanton; Dan Morris; Christy Melhart Slay; Drew Purves; Nancy Harris (2025). Global drivers of forest loss at 1 km resolution [Dataset]. http://doi.org/10.5281/zenodo.14163025
    Explore at:
    Available download formats: tiff, csv, bin
    Dataset updated
    Jun 18, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Michelle Sims; Radost Stanimirova; Anton Raichuk; Maxim Neumann; Jessica Richter; Forrest Follett; James MacCarthy; Kristine Lister; Christopher Randle; Lindsey Sloat; Elena Esipova; Jaelah Jupiter; Charlotte Stanton; Dan Morris; Christy Melhart Slay; Drew Purves; Nancy Harris
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Nov 14, 2024
    Description

    The World Resources Institute and Google DeepMind created a global map of the dominant drivers of tree cover loss from 2001 to 2022 at 1 km spatial resolution. We used a deep learning model to classify seven driver classes: permanent agriculture, hard commodities, shifting cultivation, forest management, wildfires, settlements & infrastructure, and other natural disturbances. As part of the study, we collected a set of training samples through interpretation of very high resolution satellite imagery and developed a single world-wide customized residual convolutional neural network model (ResNet) using satellite data (Landsat and Sentinel-2) and ancillary biophysical and population data. In addition, we collected a stratified random sample of validation plots through interpretation of very high resolution satellite imagery to estimate the accuracy of the final classification map.

    In this repository, we make the following available:

    • The training and validation data, available in two separate files. Both datasets were collected by a team of image interpreters and assessed for quality by two additional interpreters. Note that while for the validation data the quality of the primary driver was rigorously assessed, the secondary driver wasn’t subject to the same level of quality control.

    • The global drivers of forest loss raster (drivers_forest_loss_1km.tif), including the discrete classification and probabilities for each class.

    Creator & Contact

    Created by World Resources Institute and Google DeepMind

    Contact: Michelle Sims: Michelle.Sims@wri.org

    Definitions

    A driver is defined as the direct cause of tree cover loss, and can include both temporary disturbances (natural or anthropogenic) and permanent loss of tree cover due to a change to a non-forest land use (e.g., deforestation). The dominant driver is defined as the direct driver that caused the majority of tree cover loss within each 1 km cell over the time period.
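The dominant-driver definition can be made concrete with a small example. The sketch below picks, for one cell, the driver with the largest loss area. The function name, the per-driver areas, and the handling of cells without loss are assumptions for illustration; the published map was produced by a deep learning model, not by this rule. Note also that the sketch returns the driver with the largest share, which may be less than half of the total loss.

```python
def dominant_driver(loss_by_driver):
    """Return the driver with the largest loss area in a cell.

    loss_by_driver maps a driver name to its loss area (any consistent unit).
    Returns None when the cell has no recorded loss.
    """
    if sum(loss_by_driver.values()) == 0:
        return None
    return max(loss_by_driver, key=loss_by_driver.get)


# Hypothetical loss areas (in hectares) inside a single 1 km cell.
cell = {
    "Permanent agriculture": 12.0,
    "Wildfire": 30.5,
    "Forest management": 4.0,
}
driver = dominant_driver(cell)
```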

    Classes are defined as follows:

    Driver: Definition

    Permanent agriculture: Long-term, permanent tree cover loss for small- to large-scale agriculture. This includes perennial tree crops such as oil palm, cacao, orchards, nut trees, and rubber, as well as pasture and seasonal crops and cropping systems, which may include a fallow period. Agricultural activities are considered "permanent" if there is visible evidence that they persist following the tree cover loss event and are not a part of a temporary cultivation cycle. Clearing land for agricultural activities may involve use of fire.

    Hard commodities: Tree cover loss due to the establishment or expansion of mining or energy infrastructure. Mining activities range from small-scale and artisanal mining to large-scale mining. Energy infrastructure includes power lines, power plants, oil drilling and refineries, wind and solar farms, flooding due to the construction of hydroelectric dams, and other types of energy infrastructure.

    Shifting cultivation: Tree cover loss due to small- to medium-scale clearing for temporary cultivation that is later abandoned and followed by subsequent regrowth of secondary forest or vegetation. Clearing land for temporary cultivation may involve use of fire.

    Forest management: Forest management and logging activities occurring within managed, natural or semi-natural forests and plantations, often with evidence of forest regrowth or planting in subsequent years. This includes harvesting in wood-fiber plantations, clear-cut and selective logging, establishment of logging roads, and other forest management activities such as forest thinning and salvage or sanitation logging.

    Wildfire: Tree cover loss due to fire with no visible human conversion or agricultural activity afterward. Fires may be started by natural causes (e.g. lightning) or may be related to human activities (accidental or deliberate).

    Settlements and infrastructure: Tree cover loss due to expansion and intensification of roads, settlements, urban areas, or built infrastructure (not associated with other classes).

    Other natural disturbances: Tree cover loss due to other non-fire natural disturbances, including storms, flooding, landslides, drought, windthrow, lava flows, sediment flow or meandering rivers, natural flooding, insect outbreaks, etc. If tree cover loss due to natural causes is followed by salvage or sanitation logging, it is classified as forest management.

  17. Data from: Isolation drives species gains and losses of insect...

    • data.niaid.nih.gov
    • zenodo.org
    • +1more
    zip
    Updated Sep 6, 2023
    Cite
    Pedro Giovâni da Silva; Marina do Vale Beirão; Flávio Siqueira de Castro; Lucas Neves Perillo; Flávio Camarota; Ricardo Solar; Geraldo Fernandes; Frederico Neves (2023). Isolation drives species gains and losses of insect metacommunities over time in a mountaintop forest archipelago [Dataset]. http://doi.org/10.5061/dryad.sxksn038c
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 6, 2023
    Dataset provided by
    Universidade Federal de Minas Gerais
    Authors
    Pedro Giovâni da Silva; Marina do Vale Beirão; Flávio Siqueira de Castro; Lucas Neves Perillo; Flávio Camarota; Ricardo Solar; Geraldo Fernandes; Frederico Neves
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Aim: We evaluated the effects of forest island size, isolation, and area in the landscape driving temporal changes of insect biodiversity in a mountaintop forest archipelago. We tested the following hypotheses: (i) changes in insect species composition are higher and dominated by species gains through time in smaller but less isolated forest islands; (ii) larger amounts of forest in the landscape result in higher gains of more vagile species over time, regardless of their size and isolation; (iii) heterogenization processes will prevail in less vagile groups, while homogenization processes will dominate in highly vagile groups due to differences in dispersal capability.

    Location: Espinhaço Range Biosphere Reserve, Brazil.

    Taxon: Insects.

    Methods: We used ants, dung beetles, bees, wasps, and butterflies as study models to represent a gradient of low-to-high dispersal capability. We evaluated the colonization- and extirpation-resultant components of temporal β-diversity using area- and isolation-related variables as predictors.

    Results: We found distinct colonization- and extirpation-resultant homogenization and heterogenization processes acting according to each insect group, likely due to different dispersal capabilities. Ants were dominated by species losses, with widespread and rare species being lost. Butterflies gained species, represented mainly by widespread species, leading to an increased colonization-resultant homogenization. Distance to neighboring forest islands was the underlying predictor affecting the temporal β-diversity of insect groups and species gains and losses but differed according to the survey period. Effects of the forest amount in the landscape increased the temporal β-diversity of bees and butterflies but decreased that of ants, dung beetles, and wasps.

    Main Conclusions: We recommend conserving the forest amount in the landscape and keeping forest connectivity among forest islands since the temporal dynamics of local colonization and extirpation can depend on the organisms’ dispersal capability. Fragmentation and habitat loss-related hypotheses will benefit from incorporating colonization-extirpation processes into their predictions.
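To make the idea of gain and loss components of temporal β-diversity concrete, the sketch below compares the species lists of one site from two surveys. It uses a Jaccard-style dissimilarity split into a gain part and a loss part; the abstract does not state which index the authors used, so this choice, the function name, and the species labels are assumptions for illustration.

```python
def temporal_beta(survey_1, survey_2):
    """Split Jaccard-style temporal dissimilarity into gains and losses.

    Returns (gain_component, loss_component, total), where total is the
    share of all recorded species that changed between the two surveys.
    """
    first, second = set(survey_1), set(survey_2)
    pooled = first | second
    if not pooled:
        return 0.0, 0.0, 0.0
    gains = second - first   # species recorded only in the later survey
    losses = first - second  # species recorded only in the earlier survey
    gain_component = len(gains) / len(pooled)
    loss_component = len(losses) / len(pooled)
    return gain_component, loss_component, gain_component + loss_component


# Hypothetical species lists for one forest island.
gain, loss, total = temporal_beta({"a", "b", "c"}, {"b", "c", "d", "e"})
```

In this example two species are gained and one is lost out of five recorded overall, so change at the site is dominated by colonization.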

  18. Data, R Scripts and Random Forest Models for Winter Catch Crop Monitoring...

    • data.mendeley.com
    Updated Dec 18, 2020
    Cite
    Christian Schulz (2020). Data, R Scripts and Random Forest Models for Winter Catch Crop Monitoring from Sentinel-2 NDVI Time Series in Germany [Dataset]. http://doi.org/10.17632/78g2r5dp3k.2
    Explore at:
    Dataset updated
    Dec 18, 2020
    Authors
    Christian Schulz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Germany
    Description

    The data contains a zip-file with the following folders: a) data (agricultural parcels, filled and unfilled NDVI time series tables, feature extraction tables and prediction results) (csv, shp), b) model (random forest models for catch crop prediction) (rds), and c) R (R script files for Random Forest model training and prediction with RStudio) (r).

    The algorithms and models developed for this study were implemented via virtual Docker containers into the timeStamp software prototype which allows for large-scale automatized catch crop analysis on the parcel-level (www.timestamp.lup-umwelt.de). timeStamp saves the raster data from the GTS² archive as parcel-wise clipped image time series into a PostGIS database. All further processing steps were performed with the statistical computing language R (RStudio Team, 2020). For raster data manipulation within the PostGIS database and downloading NDVI time series, we used the packages rpostgis (Bucklin and Basille, 2019) and RPostgreSQL (Conway et al., 2017). For time series filling and predictors calculation, we used the packages zoo (Zeileis et al., 2020), hydroGOF (Zambrano-Bigiarini, 2020), tsoutliers (de Lacalle, 2019), and changepoint (Killick et al., 2016). For RF modelling, we used the package caret (Kuhn et al., 2020).

    The original data for NDVI time series calculation is from the GFZ Time Series System for Sentinel-2 by the German Research Centre for Geosciences, 2020 (https://gitext.gfz-potsdam.de/gts2). The predictors for Random Forest modelling calculated from the NDVI time series are described in the article in the reference section.

    For further information, we refer to the article mentioned in the references.

  19. Data from: Modelling riparian forest distribution and composition to entire...

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Jan 24, 2020
    Cite
    Pérez-Silos (2020). Modelling riparian forest distribution and composition to entire river networks [Dataset]. http://doi.org/10.5281/zenodo.3444285
    Explore at:
    Available download formats: bin
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Pérez-Silos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Aim: Developing a methodology to map the distribution of riparian forests to entire river networks and determining the main environmental factors controlling their spatial patterns.

    Location: Cantabrian region, northern Spain.

    Methods: We mapped the riparian forests at physiognomic and phytosociological levels by delimiting riparian zones and generating vegetation distribution models based on remote sensing data (Landsat 8 OLI and LiDAR PNOA). We built virtual watersheds to define a spatial framework where the catchment environmental information can be routed to each river reach, jointly with the vegetation map. To determine the drivers playing a significant role in the observed spatial patterns in the riparian forest, we modelled interactions between these datasets of environmental information and riparian vegetation using the Random Forest algorithm.

    Results: The models reliably reproduced the variation of riparian forest structure and composition across Cantabrian watersheds. The produced maps were highly accurate, with more than 70% overall accuracy for forest occurrence. A clear differentiation between Eurosiberian (91E0 and 9160 habitats) and Mediterranean (92E0) riparian forests was shown on both sides of the mountain range. Topography and land use were the main drivers defining the distribution of riparian forest as a physiognomic unit. In turn, altitude, climate and percentage of pasture were the most relevant factors determining their composition (phytosociological approach).

    Conclusions: Our study confirms that the anthropic control ultimately defines the distribution of the vegetation in the riparian area at a regional to local scale. Human disturbances constrain the extension of forest patches across their potential distribution defined by topoclimatic boundaries, which establish a clear limit between Mediterranean and Eurosiberian biogeographical regions.

  20. Intelligent Monitor

    • kaggle.com
    Updated Apr 12, 2024
    Cite
    ptdevsecops (2024). Intelligent Monitor [Dataset]. http://doi.org/10.34740/kaggle/ds/4383210
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 12, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    ptdevsecops
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    IntelligentMonitor: Empowering DevOps Environments With Advanced Monitoring and Observability aims to improve monitoring and observability in complex, distributed DevOps environments by leveraging machine learning and data analytics. This repository contains a sample implementation of the IntelligentMonitor system proposed in the research paper, presented and published as part of the 11th International Conference on Information Technology (ICIT 2023).

    If you use this dataset and code, or any modified part of it, in a publication, please cite this paper:

    P. Thantharate, "IntelligentMonitor: Empowering DevOps Environments with Advanced Monitoring and Observability," 2023 International Conference on Information Technology (ICIT), Amman, Jordan, 2023, pp. 800-805, doi: 10.1109/ICIT58056.2023.10226123.

    For any questions and research queries - please reach out via Email.

    Abstract - In the dynamic field of software development, DevOps has become a critical tool for enhancing collaboration, streamlining processes, and accelerating delivery. However, monitoring and observability within DevOps environments pose significant challenges, often leading to delayed issue detection, inefficient troubleshooting, and compromised service quality. These issues stem from DevOps environments' complex and ever-changing nature, where traditional monitoring tools often fall short, creating blind spots that can conceal performance issues or system failures. This research addresses these challenges by proposing an innovative approach to improve monitoring and observability in DevOps environments. Our solution, IntelligentMonitor, leverages real-time data collection, intelligent analytics, and automated anomaly detection powered by advanced technologies such as machine learning and artificial intelligence. The experimental results demonstrate that IntelligentMonitor effectively manages data overload, reduces alert fatigue, and improves system visibility, thereby enhancing performance and reliability. For instance, the average CPU usage across all components showed a decrease of 9.10%, indicating improved CPU efficiency. Similarly, memory utilization and network traffic showed an average increase of 7.33% and 0.49%, respectively, suggesting more efficient use of resources. By providing deep insights into system performance and facilitating rapid issue resolution, this research contributes to the DevOps community by offering a comprehensive solution to one of its most pressing challenges. This fosters more efficient, reliable, and resilient software development and delivery processes.

    Components

    The key components that would need to be implemented are:

    • Data Collection - Collect performance metrics and log data from the distributed system components. This could use a technology such as Kafka or telemetry libraries.
    • Data Processing - Preprocess and aggregate the collected data into an analyzable format. This could use Spark for distributed data processing.
    • Anomaly Detection - Apply machine learning algorithms to detect anomalies in the performance metrics. This could use isolation forest or LSTM models.
    • Alerting - Generate alerts when anomalies are detected. This could integrate with tools such as PagerDuty.
    • Visualization - Create dashboards to visualize system health and key metrics. This could use Grafana or Kibana.
    • Data Storage - Store the collected metrics and log data. This could use Elasticsearch or InfluxDB.
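
As an illustration of the Anomaly Detection component, the following is a minimal sketch of an isolation forest written with the Python standard library only. It is not the implementation from the paper: the function names, the synthetic (CPU %, memory %) readings, and the parameter defaults are all assumptions made for this example, and a production system would more likely rely on an established library implementation.

```python
import math
import random


def c_factor(n):
    """Average path length of an unsuccessful search in a binary search tree of n points."""
    if n <= 1:
        return 0.0
    if n == 2:
        return 1.0
    return 2.0 * (math.log(n - 1) + 0.5772156649) - 2.0 * (n - 1) / n


def build_tree(points, depth, max_depth, rng):
    # Stop when the node is isolated or the depth limit is reached.
    if depth >= max_depth or len(points) <= 1:
        return {"size": len(points)}
    dim = rng.randrange(len(points[0]))
    lo = min(p[dim] for p in points)
    hi = max(p[dim] for p in points)
    if lo == hi:
        return {"size": len(points)}
    split = rng.uniform(lo, hi)
    left = [p for p in points if p[dim] < split]
    right = [p for p in points if p[dim] >= split]
    return {
        "dim": dim,
        "split": split,
        "left": build_tree(left, depth + 1, max_depth, rng),
        "right": build_tree(right, depth + 1, max_depth, rng),
    }


def path_length(tree, point, depth=0):
    # Leaves add the expected depth of the unsplit remainder.
    if "size" in tree:
        return depth + c_factor(tree["size"])
    branch = "left" if point[tree["dim"]] < tree["split"] else "right"
    return path_length(tree[branch], point, depth + 1)


def fit_forest(data, n_trees=100, sample_size=64, seed=0):
    rng = random.Random(seed)
    sample_size = min(sample_size, len(data))
    max_depth = math.ceil(math.log2(sample_size))
    trees = [
        build_tree(rng.sample(data, sample_size), 0, max_depth, rng)
        for _ in range(n_trees)
    ]
    return trees, sample_size


def anomaly_score(forest, point):
    """Score in (0, 1); values closer to 1 mean the point is easier to isolate."""
    trees, sample_size = forest
    mean_depth = sum(path_length(t, point) for t in trees) / len(trees)
    return 2.0 ** (-mean_depth / c_factor(sample_size))


# Hypothetical historical readings: (CPU %, memory %) clustered around (40, 55).
data_rng = random.Random(42)
history = [(data_rng.gauss(40, 5), data_rng.gauss(55, 5)) for _ in range(256)]

forest = fit_forest(history)
normal_score = anomaly_score(forest, (40.0, 55.0))
spike_score = anomaly_score(forest, (95.0, 98.0))
print(f"normal={normal_score:.3f} spike={spike_score:.3f}")
```

The spike reading is isolated in fewer random splits than the typical reading, so it receives the higher score. Where to place the alerting cut-off on that score is a tuning decision that the source does not specify.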

    Implementation Details

    The core of the implementation would involve the following:

    • Setting up the data collection pipelines.
    • Building and training anomaly detection ML models on historical data.
    • Developing a real-time data processing pipeline.
    • Creating an alerting framework that ties into the ML models.
    • Building visualizations and dashboards.
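
To make the real-time processing and alerting steps above more concrete, here is a small sketch of a streaming detector wired to an alert callback. Note the substitution: a rolling z-score over a sliding window stands in for the trained ML model, purely to keep the example self-contained. The class name `StreamingDetector`, its parameters, and the sample readings are hypothetical.

```python
from collections import deque
from statistics import mean, pstdev


class StreamingDetector:
    """Flags readings that deviate strongly from a sliding window of recent values."""

    def __init__(self, window=30, threshold=3.0, min_history=5, on_alert=print):
        self.window = deque(maxlen=window)  # oldest readings drop off automatically
        self.threshold = threshold          # z-score above which a reading is flagged
        self.min_history = min_history      # readings required before scoring starts
        self.on_alert = on_alert            # callback, e.g. a pager or chat integration

    def observe(self, value):
        """Score a reading against the current window, then add it to the window."""
        is_anomaly = False
        if len(self.window) >= self.min_history:
            mu = mean(self.window)
            sigma = pstdev(self.window)
            if sigma > 0 and abs(value - mu) / sigma > self.threshold:
                is_anomaly = True
                self.on_alert(f"anomaly: value={value:.1f}, window mean={mu:.1f}")
        self.window.append(value)
        return is_anomaly


# Hypothetical CPU readings with one spike.
alerts = []
detector = StreamingDetector(window=20, threshold=3.0, on_alert=alerts.append)
readings = [50, 51, 49, 50, 52, 48, 50, 51, 49, 50, 95, 50, 51]
flags = [detector.observe(r) for r in readings]
print(flags)
print(alerts)
```

Passing the alert handler in as a callback keeps detection separate from delivery, so the same detector could feed a log line during testing and an incident tool in production.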

    The code would need to handle scaled-out, distributed execution for production environments.

    Proper code documentation, logging, and testing would be added throughout the implementation.

    Usage Examples

    Usage examples could include:

    • Running the data collection agents on each system component.
    • Visualizing system metrics through Grafana dashboards.
    • Investigating anomalies detected by the ML models.
    • Tuning the alerting rules to minimize false positives.
    • Correlating metrics with log data to troubleshoot issues.
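
One possible way to tune alerting rules against false positives, as mentioned in the list above, is to fire an alert only after several consecutive anomalous readings. The sketch below assumes per-reading anomaly flags are already available; the function name `debounce` and the `min_consecutive` parameter are illustrative and do not come from the paper.

```python
def debounce(flags, min_consecutive=3):
    """Return the indices at which a run of anomalous flags first reaches min_consecutive."""
    fired = []
    run = 0
    for i, flag in enumerate(flags):
        run = run + 1 if flag else 0
        if run == min_consecutive:
            fired.append(i)  # fire once per run, not on every later reading
    return fired


# Hypothetical per-reading anomaly flags from a detector.
flags = [False, True, False, False, True, True, True, True, False, True]

strict = debounce(flags, min_consecutive=3)
eager = debounce(flags, min_consecutive=1)
print(strict)  # [6]
print(eager)   # [1, 4, 9]
```

Raising `min_consecutive` suppresses isolated blips at the cost of delaying the alert by that many readings, so the right value depends on the sampling interval and on how quickly a given failure needs to be noticed.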

    References

    The implementation would follow the details provided in the original research paper: P. Thantharate, "IntelligentMonitor: Empowering DevOps Environments with Advanced Monitoring and Observability," 2023 International Conference on Information Technology (ICIT), Amman, Jordan, 2023, pp. 800-805, doi: 10.1109/ICIT58056.2023.10226123.

    Any additional external libraries or sources used would be properly cited.

    Tags - DevOps, Software Development, Collaboration, Streamlini...
