5 datasets found
  1. TreeSatAI Benchmark Archive for Deep Learning in Forest Applications

    • zenodo.org
    • data.niaid.nih.gov
    bin, pdf, zip
    Updated Jul 16, 2024
    Cite
    Christian Schulz; Steve Ahlswede; Christiano Gava; Patrick Helber; Benjamin Bischke; Florencia Arias; Michael Förster; Jörn Hees; Begüm Demir; Birgit Kleinschmit (2024). TreeSatAI Benchmark Archive for Deep Learning in Forest Applications [Dataset]. http://doi.org/10.5281/zenodo.6598391
    Explore at:
    Available download formats: pdf, zip, bin
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Christian Schulz; Steve Ahlswede; Christiano Gava; Patrick Helber; Benjamin Bischke; Florencia Arias; Michael Förster; Jörn Hees; Begüm Demir; Birgit Kleinschmit
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Context and Aim

    Deep learning in Earth Observation requires large image archives with highly reliable labels for model training and testing. However, a preferable quality standard for forest applications in Europe has not yet been determined. The TreeSatAI consortium investigated numerous sources for annotated datasets as an alternative to manually labeled training datasets.

    We found that the federal forest inventory of Lower Saxony, Germany, represents an untapped treasure of annotated samples for training data generation. The respective 20-cm color-infrared (CIR) imagery, which is used for forestry management through visual interpretation, constitutes an excellent baseline for deep learning tasks such as image segmentation and classification.

    Description

    The data archive is highly suitable for benchmarking as it represents the real-world data situation of many German forest management services. On the one hand, it has a high number of samples, supported by the high-resolution aerial imagery. On the other hand, this data archive presents challenges, including class label imbalances between the different forest stand types.

    The TreeSatAI Benchmark Archive contains:

    • 50,381 image triplets (aerial, Sentinel-1, Sentinel-2)

    • synchronized time steps and locations

    • all original spectral bands/polarizations from the sensors

    • 20 species classes (single labels)

    • 12 age classes (single labels)

    • 15 genus classes (multi labels)

    • 60 m and 200 m patches

    • fixed split for train (90%) and test (10%) data

    • additional single labels such as English species name, genus, forest stand type, foliage type, land cover

    The GeoTIFF and GeoJSON files are readable in any GIS software, such as QGIS. For further information, we refer to the PDF document in the archive and to the publications in the reference section.

    Version history

    v1.0.0 - First release

    Citation

    Ahlswede et al. (in prep.)

    GitHub

    Full code examples and pre-trained models from the dataset article (Ahlswede et al. 2022) using the TreeSatAI Benchmark Archive are published on the GitHub repositories of the Remote Sensing Image Analysis (RSiM) Group (https://git.tu-berlin.de/rsim/treesat_benchmark). Code examples for the sampling strategy can be made available by Christian Schulz via email request.

    Folder structure

    We refer to the proposed folder structure in the PDF file.

    • Folder “aerial” contains the aerial imagery patches derived from summertime orthophotos of the years 2011 to 2020. Patches are available in 60 x 60 m (304 x 304 pixels). Band order is near-infrared, red, green, and blue. Spatial resolution is 20 cm.

    • Folder “s1” contains the Sentinel-1 imagery patches derived from summertime mosaics of the years 2015 to 2020. Patches are available in 60 x 60 m (6 x 6 pixels) and 200 x 200 m (20 x 20 pixels). Band order is VV, VH, and VV/VH ratio. Spatial resolution is 10 m.

    • Folder “s2” contains the Sentinel-2 imagery patches derived from summertime mosaics of the years 2015 to 2020. Patches are available in 60 x 60 m (6 x 6 pixels) and 200 x 200 m (20 x 20 pixels). Band order is B02, B03, B04, B08, B05, B06, B07, B8A, B11, B12, B01, and B09. Spatial resolution is 10 m.

    • The folder “labels” contains a JSON string which was used for multi-labeling of the training patches. An example entry for an image sample with proportions of approximately 94% Abies and 6% Larix is: "Abies_alba_3_834_WEFL_NLF.tif": [["Abies", 0.93771], ["Larix", 0.06229]]

    • The two files “test_filenames.lst” and “train_filenames.lst” define the filenames used for the train (90%) and test (10%) split. We refer to this fixed split for better reproducibility and comparability.

    • The folder “geojson” contains GeoJSON files with all the samples used for training patch generation (point, 60 m bounding box, 200 m bounding box).

    CAUTION: As we could not upload the aerial patches as a single zip file on Zenodo, you need to download the 20 single-species files (aerial_60m_…zip) separately. Then, unzip them into a folder named “aerial” with a subfolder named “60m”. This structure is recommended for better reproducibility and comparability to the experimental results of Ahlswede et al. (2022).
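The multi-label JSON described above can be parsed with a few lines of Python. A minimal sketch, using the example entry from the description; the 0.05 threshold is an illustrative choice of ours, not part of the dataset specification:

```python
import json

# Example entry copied from the description above; the real labels file
# contains one such entry per training patch.
labels_json = '''
{"Abies_alba_3_834_WEFL_NLF.tif": [["Abies", 0.93771], ["Larix", 0.06229]]}
'''

labels = json.loads(labels_json)

def multilabels(entry, threshold=0.05):
    """Return the genus names whose area share meets the threshold."""
    return [genus for genus, share in entry if share >= threshold]

for filename, genus_shares in labels.items():
    print(filename, "->", multilabels(genus_shares))
    # → Abies_alba_3_834_WEFL_NLF.tif -> ['Abies', 'Larix']
```

Raising the threshold (e.g. to 0.1) turns the same entry into a single-label sample, which is how one might derive single-label targets from the multi-label annotations.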

    Join the archive

    Model training, benchmarking, algorithm development… many applications are possible! Feel free to add samples from other regions in Europe or even worldwide. Additional remote sensing data from lidar, UAVs, or aerial imagery from different time steps are very welcome. This helps the research community develop better deep learning and machine learning models for forest applications. If you have questions or want to share code, results, or publications using the archive, feel free to contact the authors.

    Project description

    This work was part of the project TreeSatAI (Artificial Intelligence with Satellite data and Multi-Source Geodata for Monitoring of Trees at Infrastructures, Nature Conservation Sites and Forests). Its overall aim is the development of AI methods for the monitoring of forests and woody features on a local, regional, and global scale. Based on freely available geodata from different sources (e.g., remote sensing, administration maps, and social media), prototypes will be developed for the deep learning-based extraction and classification of tree and tree-stand features. These prototypes deal with real cases from the monitoring of managed forests, nature conservation sites, and infrastructures. The development of the resulting services by three enterprises (liveEO, Vision Impulse, and LUP Potsdam) will be supported by three research institutes (German Research Center for Artificial Intelligence, TU Remote Sensing Image Analysis Group, TUB Geoinformation in Environmental Planning Lab).

    Publications

    Ahlswede et al. (2022, in prep.): TreeSatAI Dataset Publication

    Ahlswede S., Nimisha, T.M., and Demir, B. (2022, in revision): Embedded Self-Enhancement Maps for Weakly Supervised Tree Species Mapping in Remote Sensing Images. IEEE Trans Geosci Remote Sens

    Schulz et al. (2022, in prep.): Phenoprofiling

    Conference contributions

    S. Ahlswede, N. T. Madam, C. Schulz, B. Kleinschmit and B. Demir, "Weakly Supervised Semantic Segmentation of Remote Sensing Images for Tree Species Classification Based on Explanation Methods", IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 2022.

    C. Schulz, M. Förster, S. Vulova, T. Gränzig and B. Kleinschmit, “Exploring the temporal fingerprints of mid-European forest types from Sentinel-1 RVI and Sentinel-2 NDVI time series”, IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 2022.

    C. Schulz, M. Förster, S. Vulova and B. Kleinschmit, “The temporal fingerprints of common European forest types from SAR and optical remote sensing data”, AGU Fall Meeting, New Orleans, USA, 2021.

    B. Kleinschmit, M. Förster, C. Schulz, F. Arias, B. Demir, S. Ahlswede, A. K. Aksoy, T. Ha Minh, J. Hees, C. Gava, P. Helber, B. Bischke, P. Habelitz, A. Frick, R. Klinke, S. Gey, D. Seidel, S. Przywarra, R. Zondag and B. Odermatt, “Artificial Intelligence with Satellite data and Multi-Source Geodata for Monitoring of Trees and Forests”, Living Planet Symposium, Bonn, Germany, 2022.

    C. Schulz, M. Förster, S. Vulova, T. Gränzig and B. Kleinschmit (2022, submitted): “Exploring the temporal fingerprints of sixteen mid-European forest types from Sentinel-1 and Sentinel-2 time series”, ForestSAT, Berlin, Germany, 2022.

  2. Bank Account Fraud Dataset Suite (NeurIPS 2022)

    • kaggle.com
    Updated Nov 29, 2023
    Cite
    Sérgio Jesus (2023). Bank Account Fraud Dataset Suite (NeurIPS 2022) [Dataset]. https://www.kaggle.com/datasets/sgpjesus/bank-account-fraud-dataset-neurips-2022
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 29, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sérgio Jesus
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    The Bank Account Fraud (BAF) suite of datasets was published at NeurIPS 2022 and comprises six different synthetic bank account fraud tabular datasets. BAF is a realistic, complete, and robust test bed to evaluate novel and existing methods in ML and fair ML, and the first of its kind!

    This suite of datasets is:

    • Realistic, based on a present-day real-world dataset for fraud detection;
    • Biased, each dataset has distinct controlled types of bias;
    • Imbalanced, this setting presents an extremely low prevalence of the positive class;
    • Dynamic, with temporal data and observed distribution shifts;
    • Privacy-preserving, to protect the identity of potential applicants we have applied differential privacy techniques (noise addition), feature encoding, and trained a generative model (CTGAN).


    Each dataset is composed of:

    • 1 million instances;
    • 30 realistic features used in the fraud detection use-case;
    • a “month” column, providing temporal information about the dataset;
    • protected attributes (age group, employment status, and % income).
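Because every row carries a “month” value, the datasets support temporal evaluation. A minimal sketch of a month-based train/test split in plain Python; the 6/2 month cut below is an illustrative assumption, not the suite's official protocol:

```python
# Split rows (dicts with a 'month' key) into train/test by month,
# so that the model is evaluated on later, unseen months.
def temporal_split(rows, last_train_month=5):
    train = [r for r in rows if r["month"] <= last_train_month]
    test = [r for r in rows if r["month"] > last_train_month]
    return train, test

# Tiny synthetic stand-in for one BAF variant (1M rows in the real data);
# 'fraud_bool' here is just a dummy label for illustration.
rows = [{"month": m, "fraud_bool": m % 2} for m in range(8)]
train, test = temporal_split(rows)
print(len(train), len(test))  # → 6 2
```

A temporal split like this exposes the distribution shifts the suite was designed to contain, which a random split would hide.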

    Detailed information (datasheet) on the suite: https://github.com/feedzai/bank-account-fraud/blob/main/documents/datasheet.pdf

    Check out the github repository for more resources and some example notebooks: https://github.com/feedzai/bank-account-fraud

    Read the NeurIPS 2022 paper here: https://arxiv.org/abs/2211.13358

    Learn more about Feedzai Research here: https://research.feedzai.com/

    Please use the following citation for the BAF dataset suite:

    @article{jesusTurningTablesBiased2022,
      title={Turning the {{Tables}}: {{Biased}}, {{Imbalanced}}, {{Dynamic Tabular Datasets}} for {{ML Evaluation}}},
      author={Jesus, S{\'e}rgio and Pombal, Jos{\'e} and Alves, Duarte and Cruz, Andr{\'e} and Saleiro, Pedro and Ribeiro, Rita P. and Gama, Jo{\~a}o and Bizarro, Pedro},
      journal={Advances in Neural Information Processing Systems},
      year={2022}
    }

  3. Artificial Intelligence In Biotechnology Market Analysis North America,...

    • technavio.com
    pdf
    Updated Dec 25, 2024
    Cite
    Technavio (2024). Artificial Intelligence In Biotechnology Market Analysis North America, Europe, APAC, South America, Middle East and Africa - US, China, Germany, UK, Switzerland, The Netherlands, Japan, South Korea, India, Brazil - Size and Forecast 2025-2029 [Dataset]. https://www.technavio.com/report/artificial-intelligence-in-biotechnology-market-industry-analysis
    Explore at:
    Available download formats: pdf
    Dataset updated
    Dec 25, 2024
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2025 - 2029
    Area covered
    United States
    Description

    Snapshot img

    What is the Artificial Intelligence In Biotechnology Market Size?

    The artificial intelligence in biotechnology market size is forecast to increase by USD 4.46 billion, at a CAGR of 19% between 2024 and 2029. Artificial Intelligence (AI) is revolutionizing the biotechnology industry by enhancing research and development processes, enabling accurate diagnoses, and improving productivity. Key growth factors fueling the market include substantial investments in biotechnology advancements and strategic collaborations between industry players and tech companies. However, the high initial cost of implementing AI solutions remains a challenge for smaller organizations. The market is expected to witness significant growth due to the increasing adoption of AI in areas such as drug discovery, genetic research, and agricultural technology. Furthermore, advancements in machine learning algorithms and natural language processing are enabling more precise and efficient data analysis, leading to new discoveries and innovations. Overall, the integration of AI in biotechnology is transforming the industry and offering numerous opportunities for growth.


    Market Segmentation

    The market research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD billion' for the period 2025-2029, as well as historical data from 2019 - 2023 for the following segments.

    Application

      • Drug discovery and development
      • Clinical trials and optimization
      • Medical imaging
      • Diagnostics
      • Others

    End-user

      • Pharmaceutical companies
      • Biotechnology companies
      • Contract research organizations (CRO)
      • Healthcare providers
      • Others

    Geography

      • North America (US)
      • Europe (Germany, UK)
      • APAC (China, India, Japan, South Korea)
      • South America (Brazil)
      • Middle East and Africa

    Which is the largest segment driving market growth?

    The drug discovery and development segment is estimated to witness significant growth during the forecast period, as AI technologies transform the drug discovery process by increasing the accuracy and efficiency of identifying potential drug candidates.


    The drug discovery and development segment was valued at USD 522.60 million in 2019. AI applications in biotechnology extend beyond drug discovery to include compound screening, personalized medicine, and environmental factor analysis. This growth is driven by the increasing demand for personalized treatments, the need for faster drug development, and the potential for AI to revolutionize various applications in biotech and pharmaceuticals.

    Which region is leading the market?


    North America is estimated to contribute 40% to the growth of the global market during the forecast period. Technavio's analysts have elaborately explained the regional trends and drivers that shape the market during the forecast period. The North American region leads the global artificial intelligence (AI) market in biotechnology due to substantial investments, strategic collaborations, and technological advancements. With robust infrastructure and a strong focus on innovation, the region is at the forefront of adopting and developing AI-driven biotechnological solutions. For example, in March 2023, Predictive Oncology partnered with Integra Therapeutics to enhance gene editing capabilities for cancer therapies. This collaboration leverages Predictive Oncology's expertise in protein expression to advance gene editing techniques, aiming to develop more effective cancer treatments. The partnership in Minnesota highlights the region's commitment to pioneering cancer research and therapeutic development through AI technology. In the life sciences sector, AI is utilized to analyze large datasets of genetic information, improve treatment outcomes, and increase productivity. Key players in the market include leading research institutions and biotechnology companies.

    How do company ranking index and market positioning come to your aid?

    Companies are implementing various strategies, such as strategic alliances, partnerships, mergers and acquisitions, geographical expansion, and product/service launches, to enhance their presence in the market.

    Abbott Laboratories - The company offers artificial intelligence in biotechnology solutions that include AI-driven medical imaging and predictive analytics for identifying individuals at risk of heart attacks.

    Technavio provides the ranking index for the top 20 companies, along with insights on the market.

  4. VasTexture: Vast repository of textures and PBR Materials extracted from...

    • zenodo.org
    jpeg, zip
    Updated Apr 26, 2025
    Cite
    Zenodo (2025). VasTexture: Vast repository of textures and PBR Materials extracted from images using unsupervised approach [Dataset]. http://doi.org/10.5281/zenodo.11391127
    Explore at:
    Available download formats: zip, jpeg
    Dataset updated
    Apr 26, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    VasTexture: Vast repository of textures and SVBRDF/PBR Materials extracted from images using an unsupervised approach.

    This is an old version. For the latest version, see: https://zenodo.org/records/12629301

    This dataset contains hundreds of thousands (hopefully millions soon) of textures and PBR/SV-BRDF materials extracted from real-world natural images.

    The repository is composed of texture images (each RGB image is one uniform texture) and folders of PBR/SVBRDF materials given as sets of property maps (base color, roughness, metallic, etc.).

    Visualisation of sampled PBRs and Textures can be seen in: PBR_examples.jpg and Textures_Examples.jpg

    Link to the main project page

    Link to paper

    File structure

    Texture images are given in the Extracted_textures_*.zip files.

    Each image in this zip file is a single texture, the textures were extracted and cropped from the open images dataset.

    PBR materials are available in the PBR_*.zip files. These PBRs were generated from the texture images in an unsupervised way (with no human intervention). Each subfolder in these files contains the property maps of one PBR (roughness, metallic, etc., suitable for Blender/Unreal Engine). A visualization of the rendered material appears in the Material_View.jpg file in each PBR folder.

    PBR materials that were generated by mixing other PBR materials are available in files with the names PBR_mix*.zip

    Samples for each case can be found in files named: Sample_*.zip

    Documented code used to extract the textures and generate the PBRs is available at:

    Texture_And_Material_ExtractionCode_And_Documentation.zip
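The PBR_*.zip layout described above (one subfolder per material, holding property maps plus a Material_View.jpg) can be walked with Python's standard zipfile module. The folder and file names in this sketch are illustrative assumptions based on the description, not the archive's exact naming; check the supplied documentation for the real convention:

```python
import io
import zipfile

# Build a tiny in-memory zip mimicking the described PBR_*.zip layout.
# Folder and file names here are illustrative, not the archive's own.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    for name in ("base_color.png", "roughness.png",
                 "metallic.png", "Material_View.jpg"):
        zf.writestr(f"material_0001/{name}", b"")

def list_materials(zip_source):
    """Group zip entries into {material_folder: [contained files]}."""
    materials = {}
    with zipfile.ZipFile(zip_source) as zf:
        for entry in zf.namelist():
            folder, _, fname = entry.partition("/")
            materials.setdefault(folder, []).append(fname)
    return materials

print(list_materials(buf))  # one material folder, four files
```

The same `list_materials` call works on a downloaded PBR_*.zip path, which makes it easy to index the repository before extracting anything.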

    Details:

    The materials and textures were extracted from real-world images using an unsupervised extraction method (code supplied). As such, they are far more diverse and wider in scope than existing repositories; at the same time, they are noisier and contain more outliers. This repository is therefore most useful for applications that demand large-scale, highly diverse data and can tolerate noise and lower quality than professional repositories of manually made assets such as ambientCG. It can be very useful for creating machine learning datasets or for large-scale procedural generation; it is less suitable for areas that demand precise, clean, and categorized PBRs, such as CGI art and graphic design. For a preview, we recommend looking at PBR_examples.jpg and Textures_Examples.jpg, or downloading the Sample files and viewing the Material_View.jpg files to assess the quality of the materials.

    Scale:

    Currently, there are a few hundred thousand PBR materials and textures, but the goal is to grow this to over a million in the near future.

    Data generation code:

    The Python scripts used to extract these assets are supplied at:

    Texture_And_Material_ExtractionCode_And_Documentation.zip

    The code can be run on any folder of random images; it extracts regions with uniform textures and turns them into PBR materials.

    Alternative download sources:

    https://sites.google.com/view/infinitexture/home

    https://e.pcloud.link/publink/show?code=kZON5TZtxLfdvKrVCzn12NADBFRNuCKHm70

    https://icedrive.net/s/jfY1xSDNkVwtYDYD4FN5wha2A8Pz

    Paper

    This work was done as part of the paper "Learning Zero-Shot Material States Segmentation, by Implanting Natural Image Patterns in Synthetic Data".

    @article{eppel2024learning,
      title={Learning Zero-Shot Material States Segmentation, by Implanting Natural Image Patterns in Synthetic Data},
      author={Eppel, Sagi and Li, Jolina and Drehwald, Manuel and Aspuru-Guzik, Alan},
      journal={arXiv preprint arXiv:2403.03309},
      year={2024}
    }

    License:

    All the code and repository contents are released under the CC0 license (free to use).

    The textures were extracted from the Open Images dataset, which is released under an Apache license.

  5. Ai Edge Computing Market Analysis North America, Europe, APAC, South...

    • technavio.com
    pdf
    Updated Nov 21, 2024
    Cite
    Technavio (2024). Ai Edge Computing Market Analysis North America, Europe, APAC, South America, Middle East and Africa - US, China, Canada, UK, Germany, France, Brazil, Japan, Saudi Arabia, India - Size and Forecast 2024-2028 [Dataset]. https://www.technavio.com/report/ai-edge-computing-market-industry-analysis
    Explore at:
    Available download formats: pdf
    Dataset updated
    Nov 21, 2024
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2024 - 2028
    Area covered
    China, Saudi Arabia, Brazil, Germany, United States, Canada, France, United Kingdom, Japan
    Description

    Snapshot img

    AI Edge Computing Market Size and Trends

    The AI edge computing market size is forecast to increase by USD 69.72 billion at a CAGR of 38.6% between 2023 and 2028. The market is experiencing significant growth due to the increasing deployment of IoT sensors and programmable application-specific integrated circuits (ASICs). Edge AI technology is being adopted in various industries, including smart homes, autonomous vehicles, and manufacturing, for real-time applications of computer vision, object detection, and quality inspection. Edge AI enables data processing at the source, reducing latency and bandwidth requirements. However, security concerns related to edge AI devices remain a challenge, necessitating strong encryption and access-control mechanisms. Additionally, the integration of edge AI in commercial drones and in remote monitoring through augmented reality is a developing trend. Overall, the edge AI market is poised for expansion, driven by the need for real-time analytics and the proliferation of connected devices.


    AI edge computing refers to the practice of processing artificial intelligence (AI) algorithms at the edge of a network, closer to where data is generated, rather than relying on cloud computing for processing. This approach offers several advantages, including reduced latency, increased data security, and improved energy efficiency. The market in North America is witnessing significant growth due to the increasing adoption of connected devices and the need for real-time data processing. Data security is a major concern for organizations, and AI edge computing addresses this issue by keeping sensitive data local and encrypted. Network connectivity is also essential for edge computing, and the rollout of 5G technology is expected to accelerate the adoption of AI edge computing solutions. Image recognition and computer vision are two key applications of AI edge computing. These technologies are used in various industries, including retail and manufacturing, for tasks such as quality control and inventory management.

    Market Segmentation

    The market research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD billion' for the period 2024-2028, as well as historical data from 2018 - 2022 for the following segments.

    Type

      • Hardware
      • Software and services

    Geography

      • North America (Canada, US)
      • Europe (Germany, UK)
      • APAC (China)
      • South America
      • Middle East and Africa
    

    By Type Insights

    The hardware segment is estimated to witness significant growth during the forecast period. AI edge computing refers to the implementation of artificial intelligence and machine learning algorithms to process data from IoT sensors and other hardware devices at the local level, enabling real-time decision-making. The hardware component of the global market comprises processors and devices that necessitate cognitive computing.


    The hardware segment was the largest segment and was valued at USD 3.87 billion in 2018. The physical edge AI computing components consist of processors and sensors, while the devices encompass smartphones, laptops, smart speakers, drones, and surveillance cameras. Various processors, such as central processing units (CPUs), graphics processing units (GPUs), field-programmable gate arrays (FPGAs), and application-specific integrated circuits (ASICs), are utilized in edge AI devices. The expansion of interconnected systems and devices, including smartphones, laptops, and smart speakers, has fueled the growth of this market segment during the forecast period.

    Regional Analysis


    North America is estimated to contribute 79% to the growth of the global market during the forecast period. Technavio's analysts have elaborately explained the regional trends and drivers that shape the market during the forecast period. In the North American market, AI edge computing is experiencing significant growth due to various factors. One key driver is the increasing use of edge AI devices, such as surveillance cameras and IoT sensors. The adoption of these technologies is particularly high in the US, where government initiatives mandate their installation in public places. Additionally, the region's advanced IT and telecom infrastructure, including high-speed networks, supports the efficient implementation of AI algorithms in industries like automotive, robotics, and healthcare. The automotive sector, in particular, is witnessing an increase in AI edge computing applications, with the development of electroceuticals and autonomous vehicles requiring real-time decision-making capabilities. In healthcare, AI edge computing enables the processing of large health data sets at

  6. Not seeing a result you expected?
    Learn how you can add new datasets to our index.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Christian Schulz; Christian Schulz; Steve Ahlswede; Steve Ahlswede; Christiano Gava; Patrick Helber; Patrick Helber; Benjamin Bischke; Benjamin Bischke; Florencia Arias; Michael Förster; Michael Förster; Jörn Hees; Jörn Hees; Begüm Demir; Begüm Demir; Birgit Kleinschmit; Birgit Kleinschmit; Christiano Gava; Florencia Arias (2024). TreeSatAI Benchmark Archive for Deep Learning in Forest Applications [Dataset]. http://doi.org/10.5281/zenodo.6598391
Organization logo

TreeSatAI Benchmark Archive for Deep Learning in Forest Applications

Explore at:
pdf, zip, binAvailable download formats
Dataset updated
Jul 16, 2024
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Christian Schulz; Christian Schulz; Steve Ahlswede; Steve Ahlswede; Christiano Gava; Patrick Helber; Patrick Helber; Benjamin Bischke; Benjamin Bischke; Florencia Arias; Michael Förster; Michael Förster; Jörn Hees; Jörn Hees; Begüm Demir; Begüm Demir; Birgit Kleinschmit; Birgit Kleinschmit; Christiano Gava; Florencia Arias
License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Context and Aim

Deep learning in Earth Observation requires large image archives with highly reliable labels for model training and testing. However, a preferable quality standard for forest applications in Europe has not yet been determined. The TreeSatAI consortium investigated numerous sources for annotated datasets as an alternative to manually labeled training datasets.

We found the federal forest inventory of Lower Saxony, Germany represents an unseen treasure of annotated samples for training data generation. The respective 20-cm Color-infrared (CIR) imagery, which is used for forestry management through visual interpretation, constitutes an excellent baseline for deep learning tasks such as image segmentation and classification.

Description

The data archive is highly suitable for benchmarking as it represents the real-world data situation of many German forest management services. One the one hand, it has a high number of samples which are supported by the high-resolution aerial imagery. On the other hand, this data archive presents challenges, including class label imbalances between the different forest stand types.

The TreeSatAI Benchmark Archive contains:

  • 50,381 image triplets (aerial, Sentinel-1, Sentinel-2)

  • synchronized time steps and locations

  • all original spectral bands/polarizations from the sensors

  • 20 species classes (single labels)

  • 12 age classes (single labels)

  • 15 genus classes (multi labels)

  • 60 m and 200 m patches

  • fixed split for train (90%) and test (10%) data

  • additional single labels such as English species name, genus, forest stand type, foliage type, land cover

The GeoTIFF and GeoJSON files are readable in any GIS software, such as QGIS. For further information, we refer to the PDF document in the archive and to the publications in the reference section.
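Since GeoJSON is plain JSON, the sample geometries can also be inspected without GIS software. A minimal sketch in Python; the feature below only illustrates the GeoJSON structure, and the actual property names in the archive's geojson folder may differ:

```python
import json

# Illustrative FeatureCollection; coordinates and property names are
# placeholders, not values copied from the archive.
sample = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "geometry": {"type": "Point", "coordinates": [9.73, 52.37]},
     "properties": {"filename": "Abies_alba_3_834_WEFL_NLF.tif"}}
  ]
}
"""

collection = json.loads(sample)
for feature in collection["features"]:
    lon, lat = feature["geometry"]["coordinates"]
    name = feature["properties"]["filename"]
    print(name, lon, lat)
```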

Version history

v1.0.0 - First release

Citation

Ahlswede et al. (in prep.)

GitHub

Full code examples and pre-trained models from the dataset article (Ahlswede et al. 2022) using the TreeSatAI Benchmark Archive are published in the repository of the Remote Sensing Image Analysis (RSiM) Group (https://git.tu-berlin.de/rsim/treesat_benchmark). Code examples for the sampling strategy can be made available by Christian Schulz via email request.

Folder structure

We refer to the proposed folder structure in the PDF file.

  • Folder “aerial” contains the aerial imagery patches derived from summertime orthophotos of the years 2011 to 2020. Patches are available in 60 x 60 m (304 x 304 pixels). Band order is near-infrared, red, green, and blue. Spatial resolution is 20 cm.

  • Folder “s1” contains the Sentinel-1 imagery patches derived from summertime mosaics of the years 2015 to 2020. Patches are available in 60 x 60 m (6 x 6 pixels) and 200 x 200 m (20 x 20 pixels). Band order is VV, VH, and VV/VH ratio. Spatial resolution is 10 m.

  • Folder “s2” contains the Sentinel-2 imagery patches derived from summertime mosaics of the years 2015 to 2020. Patches are available in 60 x 60 m (6 x 6 pixels) and 200 x 200 m (20 x 20 pixels). Band order is B02, B03, B04, B08, B05, B06, B07, B8A, B11, B12, B01, and B09. Spatial resolution is 10 m.

  • The folder “labels” contains a JSON file which maps each training patch to its multi-labels. For example, the entry for an image sample with proportions of about 94% Abies and 6% Larix is: "Abies_alba_3_834_WEFL_NLF.tif": [["Abies", 0.93771], ["Larix", 0.06229]]

  • The two files “test_filenames.lst” and “train_filenames.lst” define the filenames used for the train (90%) and test (10%) split. We refer to this fixed split for better reproducibility and comparability.

  • The folder “geojson” contains GeoJSON files with all samples used for training patch generation (point, 60 m bounding box, 200 m bounding box).
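The label entries above can be turned into multi-hot vectors for model training. A sketch, assuming an illustrative subset of the 15 genus classes (the full list is in the archive's PDF documentation) and a hypothetical proportion threshold:

```python
import json

# One entry from the labels JSON, copied from the example above.
labels_json = '{"Abies_alba_3_834_WEFL_NLF.tif": [["Abies", 0.93771], ["Larix", 0.06229]]}'

# Illustrative subset of the 15 genus classes, not the archive's full list.
GENERA = ["Abies", "Fagus", "Larix", "Picea", "Pinus", "Quercus"]

def multi_hot(entries, threshold=0.05):
    """Turn [genus, proportion] pairs into a binary label vector.
    The threshold (here 5%, an assumption) discards trace proportions."""
    vector = [0] * len(GENERA)
    for genus, proportion in entries:
        if proportion >= threshold:
            vector[GENERA.index(genus)] = 1
    return vector

labels = json.loads(labels_json)
vec = multi_hot(labels["Abies_alba_3_834_WEFL_NLF.tif"])
print(vec)  # Abies and Larix are both above the threshold
```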

CAUTION: As we could not upload the aerial patches as a single zip file to Zenodo, you need to download the 20 per-species files (aerial_60m_…zip) separately. Then, unzip them into a folder named “aerial” with a subfolder named “60m”. This structure is recommended for better reproducibility and comparability to the experimental results of Ahlswede et al. (2022).
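The unzipping step can be scripted. A sketch using only the Python standard library; the zip filename below is a stand-in for one of the downloaded per-species archives, and the demonstration uses a dummy zip rather than real data:

```python
import zipfile
from pathlib import Path
from tempfile import TemporaryDirectory

def unpack_species_zips(zip_paths, target_root):
    """Unzip the per-species archives into aerial/60m, the folder
    layout recommended above."""
    out_dir = Path(target_root) / "aerial" / "60m"
    out_dir.mkdir(parents=True, exist_ok=True)
    for zp in zip_paths:
        with zipfile.ZipFile(zp) as zf:
            zf.extractall(out_dir)
    return out_dir

# Demonstration with a dummy zip standing in for a downloaded archive.
with TemporaryDirectory() as tmp:
    dummy = Path(tmp) / "aerial_60m_dummy.zip"
    with zipfile.ZipFile(dummy, "w") as zf:
        zf.writestr("Abies_alba_3_834_WEFL_NLF.tif", b"not a real tiff")
    out = unpack_species_zips([dummy], tmp)
    extracted = sorted(p.name for p in out.iterdir())
    print(extracted)
```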

Join the archive

Model training, benchmarking, algorithm development… many applications are possible! Feel free to add samples from other regions in Europe or even worldwide. Additional remote sensing data from Lidar, UAVs, or aerial imagery from different time steps are very welcome. This helps the research community develop better deep learning and machine learning models for forest applications. If you have questions or want to share code, results, or publications using the archive, feel free to contact the authors.

Project description

This work was part of the project TreeSatAI (Artificial Intelligence with Satellite data and Multi-Source Geodata for Monitoring of Trees at Infrastructures, Nature Conservation Sites and Forests). Its overall aim is the development of AI methods for the monitoring of forests and woody features on a local, regional and global scale. Based on freely available geodata from different sources (e.g., remote sensing, administration maps, and social media), prototypes will be developed for the deep learning-based extraction and classification of tree- and tree stand features. These prototypes deal with real cases from the monitoring of managed forests, nature conservation and infrastructures. The development of the resulting services by three enterprises (liveEO, Vision Impulse and LUP Potsdam) will be supported by three research institutes (German Research Center for Artificial Intelligence, TU Remote Sensing Image Analysis Group, TUB Geoinformation in Environmental Planning Lab).

Publications

Ahlswede et al. (2022, in prep.): TreeSatAI Dataset Publication

Ahlswede S., Nimisha, T.M., and Demir, B. (2022, in revision): Embedded Self-Enhancement Maps for Weakly Supervised Tree Species Mapping in Remote Sensing Images. IEEE Trans Geosci Remote Sens

Schulz et al. (2022, in prep.): Phenoprofiling

Conference contributions

S. Ahlswede, N. T. Madam, C. Schulz, B. Kleinschmit and B. Demir, "Weakly Supervised Semantic Segmentation of Remote Sensing Images for Tree Species Classification Based on Explanation Methods", IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 2022.

C. Schulz, M. Förster, S. Vulova, T. Gränzig and B. Kleinschmit, “Exploring the temporal fingerprints of mid-European forest types from Sentinel-1 RVI and Sentinel-2 NDVI time series”, IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 2022.

C. Schulz, M. Förster, S. Vulova and B. Kleinschmit, “The temporal fingerprints of common European forest types from SAR and optical remote sensing data”, AGU Fall Meeting, New Orleans, USA, 2021.

B. Kleinschmit, M. Förster, C. Schulz, F. Arias, B. Demir, S. Ahlswede, A. K. Aksoy, T. Ha Minh, J. Hees, C. Gava, P. Helber, B. Bischke, P. Habelitz, A. Frick, R. Klinke, S. Gey, D. Seidel, S. Przywarra, R. Zondag and B. Odermatt, “Artificial Intelligence with Satellite data and Multi-Source Geodata for Monitoring of Trees and Forests”, Living Planet Symposium, Bonn, Germany, 2022.

C. Schulz, M. Förster, S. Vulova, T. Gränzig and B. Kleinschmit, (2022, submitted): “Exploring the temporal fingerprints of sixteen mid-European forest types from Sentinel-1 and Sentinel-2 time series”, ForestSAT, Berlin, Germany, 2022.
