Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains all recorded and hand-annotated data, all synthetically generated data, and representative trained networks used for the semantic and instance segmentation experiments in the manuscript "replicAnt - generating annotated images of animals in complex environments using Unreal Engine". Unless stated otherwise, all 3D animal models used in the synthetically generated data were generated with the open-source photogrammetry platform scAnt (peerj.com/articles/11155/). All synthetic data was generated with the associated replicAnt project, available from https://github.com/evo-biomech/replicAnt.
Abstract:
Deep learning-based computer vision methods are transforming animal behavioural research. Transfer learning has enabled work in non-model species, but still requires hand-annotation of example footage, and is only performant in well-defined conditions. To overcome these limitations, we created replicAnt, a configurable pipeline implemented in Unreal Engine 5 and Python, designed to generate large and variable training datasets on consumer-grade hardware instead. replicAnt places 3D animal models into complex, procedurally generated environments, from which automatically annotated images can be exported. We demonstrate that synthetic data generated with replicAnt can significantly reduce the hand-annotation required to achieve benchmark performance in common applications such as animal detection, tracking, pose-estimation, and semantic segmentation; and that it increases the subject-specificity and domain-invariance of the trained networks, so conferring robustness. In some applications, replicAnt may even remove the need for hand-annotation altogether. It thus represents a significant step towards porting deep learning-based computer vision tools to the field.
Benchmark data
Two pose-estimation datasets were procured. Both datasets used first instar Sungaya inexpectata (Zompro 1996) stick insects as a model species. Recordings from an evenly lit platform served as representative of controlled laboratory conditions; recordings from a hand-held phone camera served as an approximate example of serendipitous recordings in the field.
For the platform experiments, walking S. inexpectata were recorded using a calibrated array of five FLIR Blackfly colour cameras (Blackfly S USB3, Teledyne FLIR LLC, Wilsonville, Oregon, U.S.), each equipped with an 8 mm c-mount lens (M0828-MPW3 8MM 6MP F2.8-16 C-MOUNT, CBC Co., Ltd., Tokyo, Japan). All videos were recorded at 55 fps and at the sensors' native resolution of 2048 px by 1536 px. The cameras were synchronised for simultaneous capture from five perspectives (top, front right and left, back right and left), allowing for time-resolved 3D reconstruction of animal pose.
The handheld footage was recorded in landscape orientation with a Huawei P20 (Huawei Technologies Co., Ltd., Shenzhen, China) in stabilised video mode: S. inexpectata were recorded walking across cluttered environments (hands, lab benches, PhD desks, etc.), resulting in frequent partial occlusions, magnification changes, and uneven lighting, so creating a more varied pose-estimation dataset.
Representative frames were extracted from the videos using DeepLabCut (DLC)-internal k-means clustering. 46 key points were subsequently hand-annotated in 805 frames (platform) and 200 frames (handheld) using the DLC annotation GUI.
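For readers reproducing a similar annotation workflow, a minimal sketch of the DLC calls involved is shown below; the project name, experimenter, and video path are placeholders, and option names may vary slightly between DLC versions.

```python
# Minimal sketch of a DeepLabCut frame-extraction and annotation workflow.
# Project name, experimenter, and video paths are placeholders; consult the
# DLC documentation for the options available in your installed version.
import deeplabcut

# Hypothetical project created for the platform recordings
config_path = deeplabcut.create_new_project(
    "sungaya-platform", "replicAnt-benchmark",
    ["videos/platform_cam1.mp4"], copy_videos=False,
)

# Extract representative frames via DLC's internal k-means clustering
deeplabcut.extract_frames(config_path, mode="automatic", algo="kmeans",
                          userfeedback=False)

# Launch the DLC annotation GUI to hand-annotate the key points
deeplabcut.label_frames(config_path)
```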
Synthetic data
We generated a synthetic dataset of 10,000 images at a resolution of 1500 by 1500 px, based on a 3D model of a first instar S. inexpectata specimen, generated with the scAnt photogrammetry workflow. Generating 10,000 samples took about three hours on a consumer-grade laptop (6 Core 4 GHz CPU, 16 GB RAM, RTX 2070 Super). We applied 70% scale variation, and enforced hue, brightness, contrast, and saturation shifts, to generate 10 separate sub-datasets containing 1000 samples each, which were combined to form the full dataset.
Funding
This study received funding from Imperial College’s President’s PhD Scholarship (to Fabian Plum), and is part of a project that has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant agreement No. 851705, to David Labonte). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
https://dataintelo.com/privacy-and-policy
The global 3D Mapping and 3D Modeling market has been witnessing robust growth, with the market size estimated at USD 6.4 billion in 2023, projected to reach USD 18.2 billion by 2032, growing at a CAGR of 12.3% during the forecast period. The increasing demand for 3D visualization across various sectors and technological advancements are key growth factors driving this market. The integration of 3D technologies in emerging fields such as augmented reality (AR) and virtual reality (VR), coupled with the rising use of 3D mapping in urban planning and smart city projects, is further boosting market expansion.
One of the primary growth factors of the 3D Mapping and 3D Modeling market is the burgeoning demand across multiple sectors such as construction, automotive, and healthcare. In construction and engineering, 3D models are crucial for visualizing complex designs and ensuring precision. The need for detailed mapping and modeling in construction is further propelled by the growth of smart city projects, which rely heavily on these technologies for planning and execution. The automotive industry utilizes 3D modeling for designing and testing vehicle components, which reduces prototyping time and costs. Similarly, in healthcare, 3D technologies are used for creating precise anatomical models, which enhance surgical planning and improve patient outcomes.
Technological advancements, particularly the integration of artificial intelligence (AI) and machine learning (ML) with 3D technologies, are significantly contributing to the market's growth. AI and ML algorithms enhance the capabilities of 3D mapping and modeling systems by providing automated, accurate, and efficient solutions. This integration also facilitates real-time data analysis and visualization, which is crucial for decision-making in various industries. The continuous development of user-friendly software tools and applications has made 3D mapping and modeling more accessible to non-specialist users, thereby expanding its application scope and market reach.
The proliferation of cloud computing is another pivotal factor aiding market growth. Cloud-based 3D mapping and modeling solutions provide scalability, cost-efficiency, and flexibility, making them highly attractive to businesses of all sizes. These solutions allow for the storage and processing of large datasets, essential for creating detailed 3D models, without the need for substantial on-premises infrastructure. The shift towards cloud-based solutions is expected to continue, driven by the need for collaborative platforms that enable real-time data sharing and remote access, which are critical in industries such as construction, logistics, and media.
The advent of 3D technology has revolutionized various industries by providing enhanced visualization and precision. In the realm of urban planning, 3D mapping allows city planners to create detailed models of urban landscapes, facilitating better decision-making and resource allocation. This technology is particularly beneficial in smart city projects, where it aids in the efficient design and management of urban infrastructure. By incorporating real-time data, 3D mapping provides a dynamic view of urban environments, enabling planners to simulate different scenarios and assess the impact of various urban development strategies. This capability is crucial for addressing the challenges of rapid urbanization and ensuring sustainable city growth.
Regionally, North America leads the 3D mapping and 3D modeling market, driven by technological advancements and high adoption rates in the construction and automotive industries. The presence of major technology companies and widespread use of 3D technologies in urban planning and development are also contributing factors. Following North America, the Asia-Pacific region is witnessing rapid growth, with countries like China, Japan, and India investing heavily in smart city projects and infrastructure development, which require 3D mapping and modeling technologies. Europe also shows significant potential due to the increasing adoption of these technologies in sectors like transportation and logistics, driven by regulatory standards that prioritize advanced technological solutions.
The component segment of the 3D Mapping and 3D Modeling market is crucial for understanding the technological and commercial aspects of the industry. In this segment, software plays a pivotal role, accounting for a su
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Algorithms that classify hyper-scale multi-modal datasets, comprising millions of images, into constituent modality types can help researchers quickly retrieve and classify diagnostic imaging data, accelerating clinical outcomes. This research aims to demonstrate that a deep neural network that is trained on a hyper-scale dataset (4.5 million images) composed of heterogeneous multi-modal data can be used to obtain significant modality classification accuracy (96%). By combining 102 medical imaging datasets, a dataset of 4.5 million images was created. A ResNet-50, ResNet-18, and VGG16 were trained to classify these images by the imaging modality used to capture them (Computed Tomography (CT), Magnetic Resonance Imaging (MRI), Positron Emission Tomography (PET), and X-ray) across many body locations. The classification accuracy of the models was then tested on unseen data. The best performing model achieved classification accuracy of 96% on unseen data, which is on-par with, or exceeds, the accuracy of more complex implementations using EfficientNets or Vision Transformers (ViTs). The model achieved a balanced accuracy of 86%. This research shows it is possible to train Deep Learning (DL) Convolutional Neural Networks (CNNs) with hyper-scale multimodal datasets, composed of millions of images. Such models can find use in real-world applications with volumes of image data in the hyper-scale range, such as medical imaging repositories, or national healthcare institutions. Further research can expand this classification capability to include 3D-scans.
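As a hedged sketch of the transfer-learning setup described above (not the authors' code), the snippet below fine-tunes a torchvision ResNet-50 for the four modality classes; it assumes a recent torchvision release and omits data loading.

```python
# Sketch: fine-tuning a ResNet-50 to classify images into four modalities
# (CT, MRI, PET, X-ray) with torchvision. Hyperparameters are placeholders.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 4  # CT, MRI, PET, X-ray

model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)  # replace classifier head

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)

def train_step(images: torch.Tensor, labels: torch.Tensor) -> float:
    """One optimisation step on a batch of (N, 3, H, W) images."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```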
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains three collections of porous media geometries and corresponding flow simulation data designed for training and evaluating Fourier Neural Operators (FNOs) for permeability prediction. The data supports the research presented in "Estimating the Permeability of Porous Media with Fourier Neural Operators" and enables physics-informed machine learning approaches to fluid flow prediction in complex geometries. The corresponding code is available on GitHub (https://github.com/NuxNux7/Permeability-from-FNO).
Dataset Collections
1. Shifted Spheres Dataset (3D)
- 280 samples of structured sphere geometries in 256×128×128 voxel domains
- Spheres with diameters ranging from 15-30 lattice units
- Random positional shifts applied to increase geometric complexity
- Achieves Reynolds numbers around 0.05 for creeping flow conditions
- Serves as a controlled benchmark for model validation
- Includes challenging cases with narrow connecting channels
2. Sandstone Dataset (2D)
- ~1000 smooth sandstone images with curved Voronoi polygon structures
- ~500 rough sandstone samples with increased surface complexity
- ~200 rock images for additional geometric diversity
- 512² resolution with 64-cell buffer zones for boundary conditions
- Sourced from: Geng, S., Zhai, S., Li, C., 2024. Swin transformer based transfer learning model for predicting porous media permeability from 2D images. Computers and Geotechnics 168, 106177. https://doi.org/10.1016/j.compgeo.2024.106177
Data Format and Structure
All datasets are provided in HDF5 format for efficient storage and access. Each file contains:
- Scaled geometry data: smoothed porous media structures (input)
- Scaled pressure fields: complete 2D/3D pressure distributions from lattice Boltzmann simulations (target)
- Boundaries: used for scaling [offset, scale, pow, min_pow, max_pow]
- Names
Splitting into training and validation sets
- Because of a special sorting that avoids rotated geometries and front-to-back inverted simulations appearing in both sets, only the sorted datasets are provided.
- Feel free to get in touch if you are interested in the unaltered simulation data.
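As a hedged illustration of how one of these HDF5 files might be inspected with h5py, see the sketch below; the file and dataset key names are assumptions based on the field list above and may not match the released files exactly.

```python
# Sketch of reading one of the HDF5 files with h5py. The dataset keys used
# here ("geometry", "pressure", "boundaries") are assumptions based on the
# field list above; inspect the real keys before relying on them.
import h5py

with h5py.File("shifted_spheres.h5", "r") as f:    # hypothetical file name
    print(list(f.keys()))                          # inspect the actual keys first
    geometry = f["geometry"][...]                  # scaled porous geometry (input)
    pressure = f["pressure"][...]                  # scaled pressure field (target)
    bounds = f["boundaries"][...]                  # [offset, scale, pow, min_pow, max_pow]

print(geometry.shape, pressure.shape, bounds)
```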
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This New Zealand Point Cloud Classification Deep Learning Package will classify point clouds into tree and background classes. This model is optimized to work with New Zealand aerial LiDAR data. The classification of point cloud datasets to identify trees is useful in applications such as high-quality 3D basemap creation, urban planning, forestry workflows, and planning climate change response. Trees can have a complex, irregular geometrical structure that is hard to capture using traditional means. Deep learning models are highly capable of learning these complex structures and giving superior results. This model is designed to extract trees in both urban and rural areas in New Zealand. The training/testing/validation datasets were taken within New Zealand, giving the model high reliability in recognizing the patterns common to NZ data.
Licensing requirements
ArcGIS Desktop - ArcGIS 3D Analyst extension for ArcGIS Pro
Using the model
The model can be used in ArcGIS Pro's Classify Point Cloud Using Trained Model tool. Before using this model, ensure that the supported deep learning frameworks libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.
Note: Deep learning is computationally intensive, and a powerful GPU is recommended to process large datasets.
Input
The model is trained with classified LiDAR that follows the LINZ base specification. The input data should be similar to this specification.
Note: The model is dependent on additional attributes such as Intensity, Number of Returns, etc., similar to the LINZ base specification. This model is trained to work on classified and unclassified point clouds that are in a projected coordinate system, in which the units of X, Y, and Z are based on the metric system of measurement. If the dataset is in degrees or feet, it needs to be re-projected accordingly. The model was trained using a training dataset with the full set of points. Therefore, it is important to make the full set of points available to the neural network while predicting, allowing it to better discriminate points of the class of interest versus background points. It is recommended to use the selective/target classification and class preservation functionalities during prediction to have better control over the classification and over scenarios with false positives.
The model was trained on airborne lidar datasets and is expected to perform best with similar datasets. Classification of terrestrial point cloud datasets may work but has not been validated. For such cases, this pre-trained model may be fine-tuned to save on cost, time, and compute resources while improving accuracy. Another example where fine-tuning this model can be useful is when the object of interest is tram wires, railway wires, etc., which are geometrically similar to electricity wires. When fine-tuning this model, the target training data characteristics such as class structure, maximum number of points per block, and extra attributes should match those of the data originally used for training this model (see Training data section below).
Output
The model will classify the point cloud into the following classes with their meaning as defined by the American Society for Photogrammetry and Remote Sensing (ASPRS): 0 Background, 5 Trees / High-vegetation
Applicable geographies
The model is expected to work well in New Zealand. It has been seen to produce favorable results in many regions.
However, results can vary for datasets that are statistically dissimilar to the training data.
Training dataset - Wellington City
Testing dataset - Tawa City
Validation/Evaluation dataset - Christchurch City
Model architecture
This model uses the PointCNN model architecture implemented in ArcGIS API for Python.
Accuracy metrics
The table below summarizes the accuracy of the predictions on the validation dataset.
Class             Precision   Recall     F1-score
Never Classified  0.991200    0.975404   0.983239
High Vegetation   0.933569    0.975559   0.954102
Training data
This model is trained on a classified dataset originally provided by OpenTopography, with < 1% manual labelling and correction.
Train-test split percentage: {Train: 80%, Test: 20%}. This ratio was chosen based on the analysis of previous epoch statistics, which showed a decent improvement.
The training data used has the following characteristics:
X, Y, and Z linear unit: meter
Z range: -121.69 m to 26.84 m
Number of Returns: 1 to 5
Intensity: 16 to 65520
Point spacing: 0.2 ± 0.1
Scan angle: -15 to +15
Maximum points per block: 8192
Block size: 20 meters
Class structure: [0, 5]
Sample results
The model was used to classify the Christchurch city dataset with a density of 5 pts/m. The model's performance is directly proportional to the dataset's point density and benefits from point clouds with noise excluded.
To learn how to use this model, see this story.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This New Zealand Point Cloud Classification Deep Learning Package will classify point clouds into building and background classes. This model is optimized to work with New Zealand aerial LiDAR data. The classification of point cloud datasets to identify buildings is useful in applications such as high-quality 3D basemap creation, urban planning, and planning climate change response. Buildings can have a complex, irregular geometrical structure that is hard to capture using traditional means. Deep learning models are highly capable of learning these complex structures and giving superior results. This model is designed to extract buildings in both urban and rural areas in New Zealand. The training/testing/validation datasets were taken within New Zealand, giving the model high reliability in recognizing the patterns of common NZ building architecture.
Licensing requirements
ArcGIS Desktop - ArcGIS 3D Analyst extension for ArcGIS Pro
Using the model
The model can be used in ArcGIS Pro's Classify Point Cloud Using Trained Model tool. Before using this model, ensure that the supported deep learning frameworks libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.
Note: Deep learning is computationally intensive, and a powerful GPU is recommended to process large datasets.
The model is trained with classified LiDAR that follows the LINZ base specification. The model was trained using a training dataset with the full set of points. Therefore, it is important to make the full set of points available to the neural network while predicting, allowing it to better discriminate points of the class of interest versus background points. It is recommended to use the selective/target classification and class preservation functionalities during prediction to have better control over the classification and over scenarios with false positives.
The model was trained on airborne lidar datasets and is expected to perform best with similar datasets. Classification of terrestrial point cloud datasets may work but has not been validated. For such cases, this pre-trained model may be fine-tuned to save on cost, time, and compute resources while improving accuracy. Another example where fine-tuning this model can be useful is when the object of interest is tram wires, railway wires, etc., which are geometrically similar to electricity wires. When fine-tuning this model, the target training data characteristics such as class structure, maximum number of points per block, and extra attributes should match those of the data originally used for training this model (see Training data section below).
Output
The model will classify the point cloud into the following classes with their meaning as defined by the American Society for Photogrammetry and Remote Sensing (ASPRS): 0 Background, 6 Building
Applicable geographies
The model is expected to work well in New Zealand. It has been seen to produce favorable results in many regions. However, results can vary for datasets that are statistically dissimilar to the training data.
Training dataset - Auckland, Christchurch, Kapiti, Wellington
Testing dataset - Auckland, Wellington
Validation/Evaluation dataset - Hutt City
Model architecture
This model uses the SemanticQueryNetwork model architecture implemented in ArcGIS Pro.
Accuracy metrics
The table below summarizes the accuracy of the predictions on the validation dataset.
Class             Precision   Recall     F1-score
Never Classified  0.984921    0.975853   0.979762
Building          0.951285    0.967563   0.9584
Training data
This model is trained on a classified dataset originally provided by OpenTopography, with < 1% manual labelling and correction.
Train-test split percentage: {Train: 75%, Test: 25%}. This ratio was chosen based on the analysis of previous epoch statistics, which showed a decent improvement.
The training data used has the following characteristics:
X, Y, and Z linear unit: meter
Z range: -137.74 m to 410.50 m
Number of Returns: 1 to 5
Intensity: 16 to 65520
Point spacing: 0.2 ± 0.1
Scan angle: -17 to +17
Maximum points per block: 8192
Block size: 50 meters
Class structure: [0, 6]
Sample results
The model was used to classify the Wellington city dataset with a density of 23 pts/m. The model's performance is directly proportional to the dataset's point density and benefits from point clouds with noise excluded.
To learn how to use this model, see this story.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
As combination therapy becomes more common in clinical applications, predicting adverse effects of combination medications is a challenging task. However, there are three limitations of the existing prediction models. First, they rely on a single view of the drug and cannot fully utilize multiview information, resulting in limited performance when capturing complex structures. Second, they ignore subgraph information at different scales, which limits the ability to model interactions between subgraphs. Third, there has been limited research on effectively integrating multiview features of molecules. Therefore, we propose ComNet, a deep learning model that improves the accuracy of side effect prediction by integrating multiview features of drugs. First, to capture diverse features of drugs, a multiview feature extraction module is proposed, which not only uses molecular fingerprints but also extracts semantic information on SMILES and spatial information on 3D conformations. Second, to enhance the modeling ability of complex structures, a multiscale subgraph fusion mechanism is proposed, which can fuse local and global graph structures of drugs. Finally, a multiview feature fusion mechanism is proposed, which uses an attention mechanism to adaptively adjust the weights of different views to achieve multiview data fusion. Experiments on several publicly available data sets show that ComNet performs better than existing methods in various complex scenarios, especially in cold-start scenarios. Ablation experiments show that each core structure in ComNet contributes to the overall performance. Further analysis shows that ComNet not only converges rapidly and has good generalization ability but also identifies different substructures in the molecule. Finally, a case study on a self-collected data set validates the superior performance of ComNet in practical applications.
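ComNet's implementation is not reproduced here; as a hedged sketch of the attention-based multiview fusion idea described above (adaptively weighting, e.g., fingerprint, SMILES, and 3D-conformation embeddings), a generic PyTorch module could look as follows.

```python
# Generic sketch of attention-weighted fusion of multiview drug embeddings.
# This illustrates the idea only; it is not ComNet's actual architecture.
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # one scalar score per view

    def forward(self, views: torch.Tensor) -> torch.Tensor:
        # views: (batch, n_views, dim)
        weights = torch.softmax(self.score(views), dim=1)  # (batch, n_views, 1)
        return (weights * views).sum(dim=1)                # fused (batch, dim)

# Three hypothetical views per drug, each embedded into 256 dimensions
fused = AttentionFusion(dim=256)(torch.randn(8, 3, 256))
print(fused.shape)  # torch.Size([8, 256])
```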
Background and Objective: Many biomedical, clinical, and industrial applications may benefit from musculoskeletal simulations. Three-dimensional macroscopic muscle models (3D models) can more accurately represent muscle architecture than their 1D (line-segment) counterparts. Nevertheless, 3D models remain underutilised in academic, clinical, and commercial environments. Among the reasons for this is a lack of modelling and simulation standardisation, verification, and validation. Here, we strive towards a solution by providing an open-access, characterised, constitutive relation for 3D musculotendon models. Methods: The musculotendon complex is modelled following the state-of-the-art active stress approach and is treated as hyperelastic, transversely isotropic, and nearly incompressible. Furthermore, force-length and -velocity relationships are incorporated, and muscle activation is derived from motor-unit information. The constitutive relation was implemented within the commercial finite-element software package Abaqus as a user-subroutine. A masticatory system model with left and right masseters was used to demonstrate active and passive movement. Results: The constitutive relation was characterised by various experimental data sets and was able to capture a wide variety of passive and active behaviours. Furthermore, the masticatory simulations revealed that joint movement was sensitive to the muscle’s in-fibre passive response. Conclusions: This user-material provides a “plug and play” template for 3D neuro-musculoskeletal finite-element modelling. We hope that this reduces modelling effort, fosters exchange, and contributes to the standardisation of such models.
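The Abaqus user-subroutine itself is not included in this summary; as a rough, hedged illustration of the kind of force-length and force-velocity scaling mentioned above, the sketch below uses generic Hill-type relations with placeholder parameter values (none of them taken from the paper).

```python
# Rough illustration of generic Hill-type force-length and force-velocity
# scaling factors. Functional forms and parameter values are textbook-style
# placeholders, not the characterised constitutive relation from the paper.
import numpy as np

def force_length(stretch, width=0.45):
    """Gaussian-like active force-length factor (1.0 at optimal fibre stretch)."""
    return np.exp(-((stretch - 1.0) / width) ** 2)

def force_velocity(v_norm, k=0.25):
    """Hill-type hyperbola for concentric contraction; v_norm = v / v_max in [0, 1]."""
    v_norm = np.clip(v_norm, 0.0, 1.0)
    return (1.0 - v_norm) / (1.0 + v_norm / k)

# Active stress = max isometric stress * activation * f_l * f_v (all placeholders)
sigma_max, activation = 0.3e6, 0.8
active_stress = sigma_max * activation * force_length(1.05) * force_velocity(0.1)
print(f"active stress ~ {active_stress:.0f} Pa")
```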
Accurate biomass estimates are key to understanding a wide variety of ecological functions. In marine systems, epibenthic biomass estimates have traditionally relied on either destructive/extractive methods, which are limited to horizontal soft-sediment environments, or simplistic geometry-based biomass conversions, which are unsuitable for more complex morphologies. Consequently, there is a requirement for non-destructive, higher-accuracy methods that can be used in an array of environments, targeting more morphologically diverse taxa, and at ecologically relevant scales. We used a combination of 3D photogrammetry, convolutional-neural-network (CNN) automated taxonomic identification, and taxa-specific biovolume:biomass calibrations to test the viability of estimating biomass of three species of morphologically complex epibenthic taxa from in situ stereo 2D source imagery. Our trained CNN produced accurate and reliable annotations of our target taxa across a wide range of conditions. When ...
Biomass regressions
Biomass regressions for target taxa were conducted using in-situ and ex-situ photogrammetry to estimate biovolume and subsequent weighing using dry- or wet-weight methods, depending upon the taxa.
Field biomass validation
Five underwater transects were conducted using photogrammetric video surveys. On each transect, three 0.5 quadrats were placed and the biovolume of the target taxa was measured from the photogrammetric 3D model. All target taxa within each of the quadrats were subsequently collected and retained for biomass measurements. Validation of field biomass estimates was achieved by comparing the "true" biomass of the target taxa (measured from weighing subsampled quadrats) with that predicted from our biovolume conversions.
Machine-learning 3D annotation validation
To test the accuracy of the automated model annotation, we compared the biovolume from manually annotated meshes with the biovolume from meshes created using machine-learning annotated dense c...
3D photogrammetry and deep-learning deliver accurate estimates of epibenthic biomass
https://doi.org/10.5061/dryad.1rn8pk11z
A combination of biomass and biovolume data for NE Atlantic Epibenthic Species and a machine-learning code for automated identification of these species.
Biomass data are recorded in kg or g, as dry mass or wet mass. Biovolume is estimated using photogrammetric approaches and measured in cm^3 or m^3. These data are in Excel format and can be used to generate density regressions.
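As an illustrative sketch (not the authors' analysis code) of the kind of biovolume:biomass regression these data support, the snippet below fits a simple linear calibration with pandas and NumPy; the file and column names are placeholders.

```python
# Example of fitting a simple biomass ~ biovolume regression from the Excel
# data. File and column names are placeholders; adapt them to the actual
# spreadsheet layout.
import numpy as np
import pandas as pd

df = pd.read_excel("biomass_biovolume.xlsx")   # hypothetical file name
x = df["biovolume_cm3"].to_numpy()             # photogrammetric biovolume
y = df["dry_mass_g"].to_numpy()                # measured biomass

slope, intercept = np.polyfit(x, y, deg=1)     # taxa-specific calibration
predicted = slope * x + intercept
r2 = 1 - np.sum((y - predicted) ** 2) / np.sum((y - np.mean(y)) ** 2)
print(f"biomass ~ {slope:.3f} * biovolume + {intercept:.3f}  (R2 = {r2:.2f})")
```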
This data will also be made available as part of a submission to the NERC British Oceanographic Data Centre.
All training data/code for the machine learning semantic segmentation is contained in a folder called SemanticSegmentation. The training image set is in JPEG format partitioned into train/validate/test sets, and contained in folders with associa...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This paper introduces an innovative multi-view stereo matching network, the Multi-Step Depth Enhancement Refine Network (MSDER-MVS), aimed at improving the accuracy and computational efficiency of high-resolution 3D reconstruction. The MSDER-MVS network leverages the potent capabilities of modern deep learning in conjunction with the geometric intuition of traditional 3D reconstruction techniques, with a particular focus on optimizing the quality of the depth map and the efficiency of the reconstruction process.
Our key innovations include a dual-branch fusion structure and a Feature Pyramid Network (FPN) to effectively extract and integrate multi-scale features. With this approach, we construct depth maps progressively from coarse to fine, continuously improving depth prediction accuracy at each refinement stage. For cost volume construction, we employ a variance-based metric to integrate information from multiple perspectives, optimizing the consistency of the estimates. Moreover, we introduce a differentiable depth optimization process that iteratively enhances the quality of depth estimation using residuals and the Jacobian matrix, without the need for additional learnable parameters. This innovation significantly increases the network's convergence rate and the fineness of depth prediction.
Extensive experiments on the standard DTU dataset (Aanæs et al., 2016) show that MSDER-MVS surpasses current advanced methods in accuracy, completeness, and overall performance metrics. Particularly in scenarios rich in detail, our method more precisely recovers surface details and textures, demonstrating its effectiveness and superiority for practical applications.
Overall, the MSDER-MVS network offers a robust solution for precise and efficient 3D scene reconstruction. Looking forward, we aim to extend this approach to more complex environments and larger-scale datasets, further enhancing the model's generalization and real-time processing capabilities, and promoting the widespread deployment of multi-view stereo matching technology in practical applications.
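The paper's differentiable depth optimization is not reproduced here; the sketch below only illustrates the general idea of a parameter-free, residual- and Jacobian-driven (Gauss-Newton-style) depth update, with tensor shapes chosen for illustration.

```python
# Generic sketch of a per-pixel Gauss-Newton depth update driven by residuals
# r(d) and their Jacobian J = dr/dd. It illustrates parameter-free, residual-
# based refinement; it is not the MSDER-MVS implementation.
import torch

def gauss_newton_depth_step(depth, residuals, jacobian, damping=1e-6):
    """
    depth:     (B, 1, H, W) current depth estimate
    residuals: (B, V, H, W) residuals against V source views
    jacobian:  (B, V, H, W) derivative of each residual w.r.t. depth
    """
    jtj = (jacobian * jacobian).sum(dim=1, keepdim=True) + damping
    jtr = (jacobian * residuals).sum(dim=1, keepdim=True)
    return depth - jtr / jtj  # one Gauss-Newton step, no learnable parameters

refined = gauss_newton_depth_step(
    torch.rand(2, 1, 64, 80), torch.randn(2, 4, 64, 80), torch.randn(2, 4, 64, 80)
)
print(refined.shape)  # torch.Size([2, 1, 64, 80])
```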
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
readme_dataSet1dAnd3d.txt
## Description ##
This readme describes the folder and file structure of the data sets published along with contribution [1] via "DataSet1dAnd3d.zip".
## Context ##
[1] L. Laubert, F. Weber, F. Detrez, and S. Pfaller, "Approaching and overcoming the limitations of the multiscale Capriccio method for simulating the mechanical behavior of amorphous materials", International Journal of Engineering Science, vol. 217, p. 104317, 2025. https://doi.org/10.1016/j.ijengsci.2025.104317
[2] L. Laubert, "One-dimensional framework imitating the Capriccio method for coupling the finite element method with particle-based techniques", Software, Zenodo, 2024. https://doi.org/10.5281/zenodo.14796809
[3] L. Laubert, "Framework for projecting displacements of particles and nodes resulting from Capriccio method coupled deformation simulations to a one-dimensional representation", Software, Zenodo, 2025. https://doi.org/10.5281/zenodo.14796824
[4] S. Pfaller, M. Ries, W. Zhao, C. Bauer, F. Weber, L. Laubert, "CAPRICCIO - Tool to run concurrent Finite Element-Molecular Dynamics Simulations", Software, Zenodo, 2025. https://doi.org/10.5281/zenodo.12606758
[5] L. Laubert, "Establishing a framework for conducting comparative one- and multidimensional studies on the coupling of the finite element method with particle-based techniques", Project Thesis, Friedrich-Alexander-Universität Erlangen-Nürnberg (FAU), 2023. https://doi.org/10.5281/zenodo.7924368
[6] W. Felix, M. Vassaux, L. Laubert, and S. Pfaller, "Measuring the fracture properties of brittle materials using molecular simulations: Application to mode I and mode III in silica glass", in preparation, 2025.
## Abstract (from [1]) ##
The Capriccio method is a computational technique for coupling finite element (FE) and molecular dynamics (MD) domains to bridge their length scales and to provide boundary conditions typically employed in large-scale engineering applications. Earlier studies showed that strain inconsistencies between the coupled domains are caused by the coupling region’s (bridging domain, BD) resistance to spatial motion. Thus, this work examines influences of coupling parameters on strain convergence in Capriccio-coupled setups to study the mechanical behavior of solid amorphous materials. To this end, we employ a linear elastic 1D setup, imitating essential features of the Capriccio method, including force-transmitting anchor points (AP), which couple the domains via linear elastic springs. To assess the effect of more complex interactions in 3D models versus 1D results, we use an interdimensional mapping scheme, allowing qualitative and quantitative comparisons. For validation, we employ both an inelastic polystyrene MD model and a predominantly elastic silica glass MD model, each coupled to a corresponding FE material description. Our 1D results demonstrate that decreasing the conventionally high AP stiffness, along with other less significant measures, diminishes this motion resistance, revealing an optimal ratio between the material stiffness of the coupled domains and the cumulative AP stiffness. The 3D silica setup confirms that these measures ensure decent domain adherence and sufficiently low strain incompatibilities to study the mechanical behavior of elastic models. However, these measures turn out limited and may not ensure sufficient accuracy for studying the deformation and fracture behavior of Capriccio-coupled inelastic models. To overcome this, we employ a modified coupling approach, revising the Capriccio method’s AP concept by introducing a much lower so-called molecular statics stiffness during the FE calculation and a higher AP stiffness during only the MD calculation. Initial results on the 1D setup indicate that essential coupling limitations can be overcome, albeit with the risk of oscillatory strain amplifications depending on the BD’s design. This novel approach may enable a more accurate analysis of the mechanical behavior of coupled inelastic amorphous materials. We recommend evaluating its performance in 3D alongside additional methodological extensions. Overall, our results outline the current limitations of the Capriccio method and lay the groundwork for its targeted extension to study the mechanical behavior and, in particular, fracture phenomena in inelastic amorphous materials.
## Contact ##
Lukas Laubert
Institute of Applied Mechanics
Friedrich-Alexander-Universität Erlangen-Nürnberg
Egerlandstraße 5
91058 Erlangen
## License ##
Creative Commons Attribution Non Commercial 4.0 International
## Folder structure, files, and notation ##
The data sets in "DataSet1dAnd3d" are split across two main folders:
a) "1dDataSet" contains all simulation and postprocessing data as well as a documentation of all studies conducted using the one-dimensional framework [2]:
- The spreadsheet "Documentation_data1d-OverCapLimit.xlsx" comprises the system and parameter settings for all conducted 1D simulations and resulting error values; it relates the folder and file names of related simulations as well as postprocessing data using the identifiers and structure from [1] with indication of the associated tables and figures.
- The folders comprise visualizations for all studies documented in "Documentation_data1d-OverCapLimit.xlsx", also covering all simulations referred to in [1]:
[file types]
* "*.mat" files contain the full simulation data, which can be used to reconstruct every diagram.
* ".fig" files contain visualizations of data that can be handled interactively using MATLAB.
* ".pdf" files are PDF exports of the ".fig" files that can be opened using any PDF reader.
* ".mp4" files are compressed video files, also indicated through "vid" after the second underscore, that contain videos of displacement of or position plots:
' "position" indicates a video showing all system parts their current position.
' "displacement" indicates a video showing the current displacement of all system parts over their initial position before the first load step.
* ".txt" files contain text output from the simulation or postprocessing, including error values, as also indicated by "output" in the file names.
* ".m" files contain text that were used for system definitions in [2]
[file notation]
* The abbreviation in the beginning of each file name before the first underscore, e.g., "com1BD", refers to the substudy name, which is listed in "Documentation_data1d-OverCapLimit.xlsx".
* "disp_over_initPos" refers to a "displacement over initial position" diagram
* "energy_over_LS" refers to a "total system energy over load step number" diagram
* "energy_over_strain" refers to a "total system energy over target strain" diagram
* "energy_over_totIS" refers to a "total system energy over the total number of iteration steps" diagram
* "ls_over_currPos" shows the system state of all system parts at certain load step numbers by their current position along the deformation axis
* "pos_zero" shows a system in its inital state by the spatial positions of all system parts
* "srain_over_LS" refers to a "actual strain over load step number" diagram
* "srain_over_strain" refers to a "actual strain over target strain" diagram
* "srain_over_totIS" refers to a "actual strain over the total number of iteration steps" diagram
* "forces" indicates a video or diagram that contains the resulting nodal (blue), particle (red), and anchor point (green) forces as well as the Lagrange Multiplier values (yellow); more information is provided in [5]
* "monolithic" refers to monolithically solved load setups
* "staggered" refers to staggered solved load setups
b) "3dDataSet" contains 3D simulation and preparational data using [4] and a documentation:
- The spreadsheet "Documentation_data3d_OverCapLimit.xlsx" comprises the system and parameter settings for all conducted 3D simulations, parameter values and procedure settings from the system preparation, as well as resulting error values; also stating corresponding information and results from 1D imitations; it relates the folder and file names of related simulations as well as postprocessing data using the identifiers and structure from [1] with indication of the associated tables and figures.
- "3dSandwich_PS_sysPrepData" contains input, output, and postprocessing files from the 3D system setups related to system data from before and after the equilibration process for the polystyrene MD model, split across the folders "beforeEquilibration" and "afterEquilibration", respectively. The (folder) names assigned to the respective preparational setups used in this study can be found in “Documentation_data3d-OverCapLimit.xlsx”.
* the respective folders contain files similar to those defined under [simulation input files] above, which are related to the system state before or after the equilibration, respectively
* "Capriccio.prm" contains Parameter values used for the simulation in LAMMPS and the FEM package from [4]
* "bounds.dat" states the x, y, and z dimensions of the MD setups
- "3dSandwich_PS_resultData" (inlcuding subpath "ACK_lenBD") comprises results from three-dimensional Capriccio-based coupled simulations using the polystyrene (PS) MD model as specified in "Documentation_data3d-OverCapLimit.xlsx":
* "input_files" contains the simulation input files and potential tables:
[potential tables]
Classifying trees from point cloud data is useful in applications such as high-quality 3D basemap creation, urban planning, and forestry workflows. Trees have a complex geometrical structure that is hard to capture using traditional means. Deep learning models are highly capable of learning these complex structures and giving superior results.
Using the model
Follow the guide to use the model. The model can be used with the 3D Basemaps solution and ArcGIS Pro's Classify Point Cloud Using Trained Model tool. Before using this model, ensure that the supported deep learning frameworks libraries are installed. For more details, check Deep Learning Libraries Installer for ArcGIS.
Input
The model accepts unclassified point clouds with the attributes: X, Y, Z, and Number of Returns.
Note: This model is trained to work on unclassified point clouds that are in a projected coordinate system, where the units of X, Y, and Z are based on the metric system of measurement. If the dataset is in degrees or feet, it needs to be re-projected accordingly. The provided deep learning model was trained using a training dataset with the full set of points. Therefore, it is important to make the full set of points available to the neural network while predicting, allowing it to better discriminate points of the class of interest versus background points. It is recommended to use the selective/target classification and class preservation functionalities during prediction to have better control over the classification.
This model was trained on airborne lidar datasets and is expected to perform best with similar datasets. Classification of terrestrial point cloud datasets may work but has not been validated. For such cases, this pre-trained model may be fine-tuned to save on cost, time and compute resources while improving accuracy. When fine-tuning this model, the target training data characteristics such as class structure, maximum number of points per block, and extra attributes should match those of the data originally used for training this model (see Training data section below).
Output
The model will classify the point cloud into the following 2 classes with their meaning as defined by the American Society for Photogrammetry and Remote Sensing (ASPRS): 0 Background, 5 Trees / High-vegetation
Applicable geographies
This model is expected to work well in all regions globally, with the exception of mountainous regions. However, results can vary for datasets that are statistically dissimilar to the training data.
Model architecture
This model uses the PointCNN model architecture implemented in ArcGIS API for Python.
Accuracy metrics
The table below summarizes the accuracy of the predictions on the validation dataset.
Class                        Precision   Recall     F1-score
Trees / High-vegetation (5)  0.975374    0.965929   0.970628
Training data
This model is trained on a subset of the UK Environment Agency's open dataset. The training data used has the following characteristics:
X, Y, and Z linear unit: meter
Z range: -19.29 m to 314.23 m
Number of Returns: 1 to 5
Intensity: 1 to 4092
Point spacing: 0.6 ± 0.3
Scan angle: -23 to +23
Maximum points per block: 8192
Extra attributes: Number of Returns
Class structure: [0, 5]
Sample results
Here are a few results from the model.
https://dataintelo.com/privacy-and-policy
The global augmented reality smart glasses market size was valued at USD 8.5 billion in 2023 and is projected to reach USD 52.3 billion by 2032, growing at a compound annual growth rate (CAGR) of 22.5% from 2024 to 2032. This robust growth can be attributed to advancements in AR technologies and increasing adoption in various industries such as healthcare, gaming, and retail. The market's growth is further fueled by the rise in demand for immersive experiences and enhanced user interaction offered by augmented reality smart glasses.
One of the primary growth factors for the augmented reality smart glasses market is the increasing adoption of AR in the healthcare sector. AR smart glasses are revolutionizing medical training, surgery, and diagnostics by providing real-time data and hands-free access to critical information. Surgeons can use AR glasses to view patient data and 3D models of organs during operations, enhancing precision and reducing risks. Similarly, medical trainees can practice complex procedures in a simulated environment, improving their skills and confidence. This widespread application in healthcare is expected to drive market growth significantly.
Another significant growth factor is the rising popularity of AR in the gaming industry. AR smart glasses are transforming gaming experiences by overlaying digital content onto the real world, creating an immersive and interactive environment. Gamers can engage with virtual characters and objects in their physical surroundings, enhancing the overall gameplay. With the increasing demand for immersive gaming experiences, major gaming companies are investing heavily in AR technology, further propelling the market's growth. The gaming sector's continuous innovation and expansion are anticipated to contribute substantially to the augmented reality smart glasses market.
The industrial sector is also a key contributor to the market's growth. AR smart glasses are being widely adopted in manufacturing, maintenance, and logistics to improve operational efficiency and safety. Workers can access real-time instructions, schematics, and data overlays directly in their field of view, reducing errors and downtime. Moreover, AR glasses facilitate remote assistance and collaboration, allowing experts to guide on-site workers from different locations. The industrial sector's increasing focus on digitization and automation is expected to boost the demand for AR smart glasses, driving market growth.
In the realm of business-to-business interactions, Smart Augmented Reality Glasses for B2B are emerging as a transformative tool. These glasses are designed to enhance communication and collaboration among businesses by providing real-time data visualization and interactive capabilities. For instance, in a corporate setting, these glasses can facilitate virtual meetings where participants can share and manipulate 3D models or data sets in real-time, regardless of their physical location. This not only improves decision-making processes but also reduces the need for travel, thus saving time and resources. As businesses increasingly seek innovative solutions to improve efficiency and productivity, the adoption of smart AR glasses in B2B environments is expected to grow, driving further advancements in the technology.
Regionally, North America holds a dominant position in the augmented reality smart glasses market, driven by the presence of major technology companies and high consumer adoption. The region's advanced infrastructure and strong emphasis on technology innovation are key factors supporting market growth. Additionally, Asia Pacific is emerging as a significant market due to the growing adoption of AR technology in countries like China, Japan, and South Korea. The region's expanding gaming industry and increasing investments in AR research and development are expected to drive market growth during the forecast period.
In the augmented reality smart glasses market, product types are categorized into standalone, tethered, and hybrid. Standalone AR smart glasses operate independently without the need for an external device, offering users complete mobility and convenience. These glasses are equipped with built-in processors, displays, and sensors, making them ideal for various applications, particularly in industrial and healthcare sectors. The demand for standalone AR glasses i
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This project was funded by the University of Sheffield Digital Humanities Exploration Fund in 2015.
The project integrated computer science and archaeological approaches in an investigation of the subterranean medieval charnel chapel of Holy Trinity church in Rothwell (Northamptonshire), which houses one of only two remaining in situ medieval ossuaries in England. The chapel, which was constructed during the 13th century, houses disinterred human skeletal remains radiocarbon dated to the 13th-15th and 18th-19th centuries. While medieval charnelling was a European-wide phenomenon, evidence has largely been lost in England following the early 16th-century Reformation, and Rothwell is the most complete surviving example of a charnel chapel with in situ medieval remains. Recent research within the Department of Archaeology has suggested that these charnel structures served a much more complex liturgical role than merely permitting the clearance of overly-full graveyards (which has long been presumed to be their prosaic role); they also provided places of pilgrimage and were the focus of intercessory devotion, where the faithful could pray for the souls of the departed whilst in the physical presence of their corporeal remains. Rothwell charnel chapel is, hence, a site of major international significance, but analysis of the site is hampered by issues of access and preservation. The proposed project has four principal aims:
to develop analysis of the hitherto largely unstudied medieval charnel chapel by collecting digital records of the charnel deposit and their environment;
to enhance interpretation of the manner in which the ossuary was utilized in the medieval period, through digital capturing of the spatial arrangements within the chapel, and the range of medieval vantage points into the chapel;
to present this fragile, and largely inaccessible (due to narrow stair access, now blocked medieval windows and cramped internal space), heritage resource to the public in a sustainable manner; and
to facilitate preservation of the ossuary, which is in a fragile state, in the form of digital preservation in situ.
A Leica ScanStation P20 3D scanner was used to capture a 3D point cloud of the charnel chapel. Seventeen scans were taken at different locations and registered (using Leica Cyclone) to produce a model containing 60 million points. This data set is supplied in the following formats:
E57 file format – Oss_E57.e57
.ptx file format – Ossuary_PTX.ptx
Initial work was done (see publications) to convert the point cloud into a 3D virtual reality model of the space. A simplified (decimated) mesh containing approx. 3.5 million faces is available in .obj format as mesh.zip (which contains mesh.obj, mesh.mtl, and eight supporting texture files: tex_0.jpg to tex_7.jpg).
Publications
Elizabeth Craig-Atkins, Jennifer Crangle and Dawn Hadley. Rothwell Charnel Chapel – The nameless dead. Current Archaeology magazine issue 321, November 2016.
Jenny Crangle, Elizabeth Craig-Atkins, Dawn Hadley, Peter Heywood, Tom Hodgson, Steve Maddock, Robin Scott, Adam Wiles. The Digital Ossuary: Rothwell (Northamptonshire, UK). Proc. CAA2016, the 44th Annual Conference on Computer Applications and Quantitative Methods in Archaeology, Oslo, 29 March – 2 April, Session 06: Computer tools for depicting shape and detail in 3D archaeological models.
Wuyang Shui, Steve Maddock, Peter Heywood, Elizabeth Craig-Atkins, Jennifer Crangle, Dawn Hadley and Rab Scott. Using semi-automatic 3D scene reconstruction to create a digital medieval charnel chapel. Proc. CGVC2016, 15-16 September 2016, Bournemouth University, United Kingdom.
Wuyang Shui, Jin Liu, Pu Ren, Steve Maddock and Mingquan Zhou. Automatic planar shape segmentation from indoor point clouds. Proc. VRCAI2016, 3-4 December 2016, Zhuhai, China.
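For readers who want to inspect the decimated mesh, a minimal sketch using Open3D is shown below; Open3D is just one possible viewer and is not prescribed by the dataset, and the path assumes mesh.zip has been extracted locally.

```python
# Minimal sketch of loading and inspecting the decimated mesh (mesh.obj) with
# Open3D. The local path is a placeholder for wherever mesh.zip was extracted.
import open3d as o3d

mesh = o3d.io.read_triangle_mesh("mesh/mesh.obj", enable_post_processing=True)
mesh.compute_vertex_normals()
print(f"{len(mesh.vertices)} vertices, {len(mesh.triangles)} triangles")

# Optional: decimate further for lightweight web/VR display
simplified = mesh.simplify_quadric_decimation(target_number_of_triangles=500_000)
o3d.visualization.draw_geometries([simplified])
```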
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Although modern fluorescence microscopy produces detailed three-dimensional (3D) datasets, colocalization analysis and region of interest (ROI) selection is most commonly performed two-dimensionally (2D) using maximum intensity projections (MIP). However, these 2D projections exclude much of the available data. Furthermore, 2D ROI selections cannot adequately select complex 3D structures which may inadvertently lead to either the exclusion of relevant or the inclusion of irrelevant data points, consequently affecting the accuracy of the colocalization analysis. Using a virtual reality (VR) enabled system, we demonstrate that 3D visualization, sample interrogation and analysis can be achieved in a highly controlled and precise manner. We calculate several key colocalization metrics using both 2D and 3D derived super-resolved structured illumination-based data sets. Using a neuronal injury model, we investigate the change in colocalization between Tau and acetylated α-tubulin at control conditions, after 6 hours and again after 24 hours. We demonstrate that performing colocalization analysis in 3D enhances its sensitivity, leading to a greater number of statistically significant differences than could be established when using 2D methods. Moreover, by carefully delimiting the 3D structures under analysis using the 3D VR system, we were able to reveal a time dependent loss in colocalization between the Tau and microtubule network as an early event in neuronal injury. This behavior could not be reliably detected using a 2D based projection. We conclude that, using 3D colocalization analysis, biologically relevant samples can be interrogated and assessed with greater precision, thereby better exploiting the potential of fluorescence-based image analysis in biomedical research.
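As a hedged illustration (not the authors' VR pipeline) of how a colocalization metric can differ between a full 3D stack and its 2D maximum-intensity projection, the sketch below computes a simple Manders-style coefficient on synthetic data with a fixed threshold.

```python
# Illustration of a Manders-style colocalization coefficient computed on a
# full 3D stack versus its 2D maximum-intensity projection (MIP). Data and
# thresholds are synthetic placeholders for demonstration only.
import numpy as np

def manders_m1(ch1, ch2, thresh2):
    """Fraction of channel-1 intensity located where channel 2 exceeds a threshold."""
    mask = ch2 > thresh2
    return ch1[mask].sum() / ch1.sum()

rng = np.random.default_rng(0)
ch1 = rng.random((32, 256, 256))   # e.g. Tau channel, z-y-x stack
ch2 = rng.random((32, 256, 256))   # e.g. acetylated alpha-tubulin channel

m1_3d = manders_m1(ch1, ch2, thresh2=0.8)                           # full 3D analysis
m1_mip = manders_m1(ch1.max(axis=0), ch2.max(axis=0), thresh2=0.8)  # 2D MIP analysis
print(f"M1 (3D) = {m1_3d:.3f}, M1 (2D MIP) = {m1_mip:.3f}")
```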
https://ega-archive.org/dacs/EGAC00001000205
There is currently a drive to establish cell based assay systems of greater human biological and disease relevance through the use of well characterised transformed cell lines, primary cells and complex cellular models (e.g. co-culture, 3D models). However, although the field is gaining valuable experience in running more non-standard & complex cell assays for target validation and compound pharmacology studies, there is a lack of a systematic approach to determine if this expansion in cell assay models is reflected in increased human biological and disease relevance. The increasing wealth of publicly available transcriptomic, and epigenome (ENCODE and Epigenome Roadmap) data represents an ideal reference mechanism for determining the relationship between cell types used for target & compound studies to primary human cells and tissues from both healthy volunteers & patients. The CTTV020 epigenomes of cell line project aims to generate epigenetic and transcriptomic profiles of cell lines and compare these with existing and newly generated reference data sets from human tissue and cell types. The aim is to identify assay systems which will provide greater confidence in translating target biology and compound pharmacology to patients. Multiple cell types commonly used within research have been grouped according to biology. Examples include erythroid, lung epithelial, hepatocyte cell types and immortalised models of monocyte / macrophage biology. This data is part of a pre-publication release. For information on the proper use of pre-publication data shared by the Wellcome Trust Sanger Institute (including details of any publication moratoria), please see http://www.sanger.ac.uk/datasharing/ . This dataset contains all the data available for this study on 2018-10-23.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the second See & Grasp data set, introduced in Yildirim & Jacobs (2013). It is a data set containing both the visual and haptic features for "Fribbles". The set of "Fribbles" is larger than the original See & Grasp dataset, which is available here.
Fribbles are complex, 3-D objects with multiple parts and spatial relations among the parts. Moreover, Fribbles have a categorical structure—that is, each Fribble is an exemplar from a category formed by perturbing a category prototype. There are 4 Fribble families, each with 4 species. Each Fribble has 4 slots, and one of 3 possible parts are attached to each slot, leading to (4 × 3⁴) = 324 total Fribbles in each family. The unmodified 3-D object files for the whole set of Fribbles can be found on Mike Tarr's (Department of Psychology, Carnegie Mellon University) web pages (TarrLab webpage).
The See & Grasp data set contains 891 items corresponding to 891 Fribbles (324 each for the A and C families, and 243 for family B, because the 3D model files for one of the 4 species in family B appear to be corrupt). There are 3 entries associated with each item. One entry is the 3-D object model for a Fribble. The second entry is an image of a Fribble rendered from a canonical viewpoint so that the Fribble's parts and the spatial relations among the parts are clearly visible. (Using the 3-D object model, users can easily generate new images of a Fribble from any desired viewpoint.) The third entry is a representation of a Fribble's haptic features: a set of joint angles obtained from a grasp simulator known as GraspIt! (Miller & Allen, 2004). GraspIt! contains a simulator of a human hand. When forming the representation of a Fribble's haptic features, the input to GraspIt! was the 3-D object model for the Fribble; its output was a set of 16 joint angles of the fingers of a simulated human hand, obtained when the simulated hand "grasped" the Fribble. Grasps, i.e. closings of the fingers around a Fribble, were performed using GraspIt!'s AutoGrasp function. Each Fribble was grasped 24 times, with the Fribble rotated 8 times (in 45° steps) around each axis. To ensure that Fribbles fit inside GraspIt!'s hand, their sizes were reduced by 29%. Please see the figure for sample images from the dataset and an illustration of the grasping procedure.
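As an illustration of the grasping protocol described above (a sketch only, not the authors' GraspIt! driver code), the following Python snippet enumerates the 24 object orientations (8 rotations in 45° steps around each of the three axes) and applies the 29% size reduction to a mesh's vertices before they would be handed to the simulator.

import numpy as np
from scipy.spatial.transform import Rotation as R

def grasp_orientations():
    # 8 rotations in 45-degree steps around each of the x, y and z axes
    # gives the 24 grasp orientations used per Fribble.
    return [R.from_euler(axis, 45 * step, degrees=True)
            for axis in ("x", "y", "z") for step in range(8)]

def shrink_vertices(vertices, reduction=0.29):
    # Scale the mesh down by 29% so the Fribble fits inside GraspIt!'s hand.
    return np.asarray(vertices, dtype=float) * (1.0 - reduction)

orientations = grasp_orientations()
assert len(orientations) == 24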
The archive contains three folders and a CSV file. The folders “Fa”, “Fb”, and “Fc” contain the files for each family of Fribble we examined (A, B, and C). Each of these contains the following folders:
“renders” contains the images of each Fribble in the species.
“obj” contains the MTL and OBJ files for each Fribble in the species.
“parts” contains the original Fribble parts from Mike Tarr's web pages which are composited into the individual Fribbles.
“vrml” contains the VRML file for each Fribble in the species.
The files are named according to the parts that were combined to make that Fribble. For example, 4_a1b3c2d3 was made by combining 4_body.obj, 4_a1.obj, 4_b3.obj, 4_c2.obj, and 4_d3.obj.
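For illustration, the snippet below (a sketch; the slot letters and species numbering follow the naming example above) enumerates the identifiers expected for one family, confirming the 4 × 3⁴ = 324 Fribbles per family.

from itertools import product

def fribble_names(species_ids=(1, 2, 3, 4), slots="abcd", parts_per_slot=3):
    # One species prefix plus one of three parts for each of the four slots,
    # e.g. "4_a1b3c2d3".
    return [f"{sp}_" + "".join(f"{s}{p}" for s, p in zip(slots, combo))
            for sp in species_ids
            for combo in product(range(1, parts_per_slot + 1), repeat=len(slots))]

names = fribble_names()
assert len(names) == 4 * 3 ** 4   # 324 Fribbles per family
assert "4_a1b3c2d3" in names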
Finally, “haptics.csv” contains the haptic features for each of the Fribbles. It is a CSV file with 892 rows (including one header row) and 255 columns. The first column gives the Fribble's species and name; the next 254 columns hold the joint angles from GraspIt!: one grasp was performed for each of 24 different orientations (rotating in 45° steps around each axis), and each grasp has 16 DOF values associated with it, representing the positions of the hand's joints.
The remaining columns contain the data for the contact points between the Fribble and the hand in GraspIt!. The first number for each grasp (there are 24 grasps for each Fribble) is the number of contact points for that grasp. For each contact point, the first six numbers relate to the contact's wrench, the next three numbers specify the contact's location in body coordinates, and the last number specifies the scalar constraint error for that contact.
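The snippet below is a minimal sketch of how the haptic features might be read. The per-grasp block of 24 × 16 joint angles is assumed from the grasp description above; the exact column counts should be checked against the file's header, and the contact-point fields vary in length per row.

import csv

with open("haptics.csv", newline="") as f:
    reader = csv.reader(f)
    header = next(reader)                     # single header row
    for row in reader:
        fribble_id = row[0]                   # species and name of the Fribble
        joint_block = [float(x) for x in row[1:1 + 24 * 16]]   # assumed layout
        grasps = [joint_block[i * 16:(i + 1) * 16] for i in range(24)]
        # The remaining fields hold the per-grasp contact-point records.
        print(fribble_id, len(grasps), "grasps loaded")
        break                                 # inspect the first Fribble only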
Sample code for rendering objects visually with VTK and haptically with GraspIt! can be found in the SampleCode folder. Please read the GraspItUpdate.pdf file to see how to update GraspIt!'s TCP server code for haptic rendering. The SampleCode folder also contains aomr.xml, the world file (i.e., the setup of the human hand) used for haptic rendering in GraspIt!.
Please cite the following paper in relation to the See & Grasp 2 data set.
Yildirim, I. & Jacobs, R. A. (2013). Transfer of object category knowledge across visual and haptic modalities: Experimental and computational studies. Cognition, 126, 135-148.
Citation for GraspIt!: Miller, A., & Allen, P. K. (2004). GraspIt!: A versatile simulator for robotic grasping. IEEE Robotics and Automation Magazine, 11, 110-122.
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Project Description
Drug discovery pipelines nowadays rely on machine learning models to explore and evaluate large chemical spaces. While the inclusion of 3D complex information is considered to be beneficial, structural ML for affinity prediction suffers from data scarcity.
We provide kinodata-3D, a dataset of ~138 000 docked complexes to enable more robust training of 3D-based ML models for kinase activity prediction (see github.com/volkamerlab/kinodata-3D-affinity-prediction).
This data set consists of three-dimensional protein-ligand complexes that were generated by computational docking with the OpenEye toolkit. The modeled proteins cover the kinase family, for which a fair amount of structural data, i.e. co-crystallized protein-ligand complexes in the PDB enriched through KLIFS annotations, is available. This enables us to use template docking (OpenEye's POSIT functionality), in which the ligand placement is guided by a similar co-crystallized ligand pose. The kinase-ligand pairs to dock are sourced from binding assay data via the public ChEMBL archive, version 33. In particular, we use kinase activity data as curated through the OpenKinome kinodata project. The final protein-ligand complexes are annotated with a predicted RMSD of the docked poses. The RMSD model is a simple neural network trained on a kinase-docking benchmark data set using ligand (fingerprint) similarity, docking score (Chemgauss4), and POSIT probability (see the kinodata-3D repository).
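As an illustration of the kind of three-feature RMSD regressor described above (a sketch with placeholder values only; the actual model and training data live in the kinodata-3D repository), one could fit a small feed-forward network on ligand fingerprint similarity, Chemgauss4 score, and POSIT probability:

import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Placeholder rows: [fingerprint similarity, Chemgauss4 score, POSIT probability]
X = np.array([[0.85, -9.1, 0.72],
              [0.40, -5.3, 0.31],
              [0.67, -7.8, 0.55]])
y = np.array([1.2, 4.8, 2.5])    # pose RMSD in Angstrom (placeholder)

# A small feed-forward network as a stand-in for the "simple neural network".
rmsd_model = make_pipeline(StandardScaler(),
                           MLPRegressor(hidden_layer_sizes=(32, 32),
                                        max_iter=2000, random_state=0))
rmsd_model.fit(X, y)
print(rmsd_model.predict([[0.75, -8.2, 0.60]]))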
The final data set contains a total of 138 286 deduplicated kinase-ligand pairs, covering ~98 000 distinct compounds and ~271 distinct kinase structures.
The archive kinodata_3d.zip uses the following file structure:
data/raw
| kinodata_docked_with_rmsd.sdf.gz
| pocket_sequences.csv
| mol2/pocket
|   | 1_pocket.mol2
|   | ...
The file kinodata_docked_with_rmsd.sdf.gz contains the docked ligand poses and the protein-ligand pair information inherited from kinodata. The protein pockets located in mol2/pocket are stored in the MOL2 file format.
The pocket structures were sourced from KLIFS (klifs.net) and complement the ligand poses in the aforementioned SDF file. The files are named {klifs_structure_id}_pocket.mol2. The structure ID is given in the SDF file along with the ligand poses.
The file pocket_sequences.csv contains all KLIFS pocket sequences relevant to the kinodata-3D dataset.
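A minimal sketch of how poses and pockets can be paired, using RDKit and assuming the SDF property holding the KLIFS structure ID is named klifs_structure_id (check the actual tag names with GetPropNames()):

import gzip
from pathlib import Path
from rdkit import Chem

root = Path("data/raw")
with gzip.open(root / "kinodata_docked_with_rmsd.sdf.gz", "rb") as fh:
    for mol in Chem.ForwardSDMolSupplier(fh):
        if mol is None:
            continue                                      # skip unparsable records
        structure_id = mol.GetProp("klifs_structure_id")  # assumed property name
        pocket_file = root / "mol2" / "pocket" / f"{structure_id}_pocket.mol2"
        print(mol.GetProp("_Name"), "->", pocket_file, pocket_file.exists())
        break                                             # inspect the first pose only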
The code used to create the poses can be found in the kinodata-3D repository. The docking pipeline makes heavy use of the kinoml framework, which in turn uses OpenEye's POSIT template docking implementation. The details of the original pipeline can also be found in the manuscript by Schaller et al. (2023), "Benchmarking Cross-Docking Strategies for Structure-Informed Machine Learning in Kinase Drug Discovery", bioRxiv.
https://www.bibliotek.dtu.dk/en/RDMvariouslicenses
SportsPose is a large-scale 3D human pose dataset consisting of highly dynamic sports movements. With more than 176,000 3D poses from 24 different subjects performing 5 different sports activities filmed from 7 different points of view, SportsPose provides a diverse and comprehensive set of 3D poses that reflect the complex and dynamic nature of sports movements. The precision of SportsPose has been quantitatively evaluated by comparing our poses with a commercial marker-based system, achieving a mean error of 34.5 mm across all evaluation sequences. We hope that SportsPose will allow researchers and practitioners to develop and evaluate more effective models for the analysis of sports performance and injury prevention, and for advancing the state of the art in pose estimation in sports. For additional information regarding the dataset and accompanying code, kindly refer to the project webpage (https://christianingwersen.github.io/SportsPose/). The dataset is intended for academic use only, and access is granted by request. The data is licensed under a custom academic license. For more information, see the 'data' page on the project webpage.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset supports the study titled "Modeling geologically complex volcanic watersheds for integrated water resources management," which presents a transferable and reproducible workflow for developing high-resolution 3D geological models for use in integrated surface–subsurface hydrological simulations. The workflow is demonstrated using the Mt. Fuji catchment in Japan, a region characterized by complex volcanic geology, active tectonics, and increasing stress on groundwater resources. The dataset includes all relevant model input files, exported hydrofacies surfaces, meshing outputs, and scripts used in the study. Specifically, the repository provides: (i) 3D hydrofacies bounding surfaces in raster and DXF format, (ii) the 2D numerical mesh, and (iii) all the input configuration files for the HydroGeoSphere model used to run the simulations. These resources are intended to enable the reuse, adaptation, or extension of the modeling framework in similar volcanic or geologically complex regions. The dataset facilitates further work in groundwater resource evaluation, model calibration, scenario analysis, and environmental risk and hazard assessment.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains all recorded and hand-annotated data, all synthetically generated data, and representative trained networks used for semantic and instance segmentation experiments in the replicAnt - generating annotated images of animals in complex environments using Unreal Engine manuscript. Unless stated otherwise, all 3D animal models used in the synthetically generated data have been generated with the open-source photogrammetry platform scAnt (peerj.com/articles/11155/). All synthetic data has been generated with the associated replicAnt project available from https://github.com/evo-biomech/replicAnt.
Abstract:
Deep learning-based computer vision methods are transforming animal behavioural research. Transfer learning has enabled work in non-model species, but still requires hand-annotation of example footage, and is only performant in well-defined conditions. To overcome these limitations, we created replicAnt, a configurable pipeline implemented in Unreal Engine 5 and Python, designed to generate large and variable training datasets on consumer-grade hardware instead. replicAnt places 3D animal models into complex, procedurally generated environments, from which automatically annotated images can be exported. We demonstrate that synthetic data generated with replicAnt can significantly reduce the hand-annotation required to achieve benchmark performance in common applications such as animal detection, tracking, pose-estimation, and semantic segmentation; and that it increases the subject-specificity and domain-invariance of the trained networks, so conferring robustness. In some applications, replicAnt may even remove the need for hand-annotation altogether. It thus represents a significant step towards porting deep learning-based computer vision tools to the field.
Benchmark data
Two pose-estimation datasets were procured. Both datasets used first instar Sungaya inexpectata (Zompro, 1996) stick insects as a model species. Recordings from an evenly lit platform served as representative of controlled laboratory conditions; recordings from a hand-held phone camera served as an approximate example of serendipitous recordings in the field.
For the platform experiments, walking S. inexpectata were recorded using a calibrated array of five FLIR Blackfly colour cameras (Blackfly S USB3, Teledyne FLIR LLC, Wilsonville, Oregon, U.S.), each equipped with 8 mm c-mount lenses (M0828-MPW3 8MM 6MP F2.8-16 C-MOUNT, CBC Co., Ltd., Tokyo, Japan). All videos were recorded at 55 fps and at the sensors’ native resolution of 2048 px by 1536 px. The cameras were synchronised for simultaneous capture from five perspectives (top, front right and left, back right and left), allowing for time-resolved, 3D reconstruction of animal pose.
The handheld footage was recorded in landscape orientation with a Huawei P20 (Huawei Technologies Co., Ltd., Shenzhen, China) in stabilised video mode: S. inexpectata were recorded walking across cluttered environments (hands, lab benches, PhD desks etc), resulting in frequent partial occlusions, magnification changes, and uneven lighting, so creating a more varied pose-estimation dataset.
Representative frames were extracted from the videos using DeepLabCut (DLC)-internal k-means clustering. A total of 46 key points were subsequently hand-annotated in 805 and 200 frames for the platform and handheld case, respectively, using the DLC annotation GUI.
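For reference, a minimal sketch of this extraction and labelling workflow using DeepLabCut's public Python API (project name, experimenter, and video paths are placeholders):

import deeplabcut

# Create a project around the recorded videos (placeholder paths).
config_path = deeplabcut.create_new_project(
    "sungaya-platform", "annotator",
    ["videos/platform_cam1.mp4"], copy_videos=False)

# k-means based selection of representative frames, as used for both datasets.
deeplabcut.extract_frames(config_path, mode="automatic", algo="kmeans",
                          userfeedback=False)

# The 46 key points are then hand-annotated in the DLC labelling GUI.
deeplabcut.label_frames(config_path)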
Synthetic data
We generated a synthetic dataset of 10,000 images at a resolution of 1500 by 1500 px, based on a 3D model of a first instar S. inexpectata specimen generated with the scAnt photogrammetry workflow. Generating 10,000 samples took about three hours on a consumer-grade laptop (6-core 4 GHz CPU, 16 GB RAM, RTX 2070 Super). We applied 70% scale variation and enforced hue, brightness, contrast, and saturation shifts to generate 10 separate sub-datasets containing 1000 samples each, which were combined to form the full dataset.
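As a small illustration of how the ten 1000-sample sub-datasets can be combined into the full dataset (a sketch only; the folder names and image format are hypothetical, and the actual replicAnt export layout is defined by the project's own parsers):

import csv
from pathlib import Path

root = Path("synthetic_sungaya")               # hypothetical export directory
subsets = sorted(root.glob("subset_*"))        # e.g. subset_00 ... subset_09

with open(root / "combined_index.csv", "w", newline="") as fh:
    writer = csv.writer(fh)
    writer.writerow(["subset", "image"])
    for subset in subsets:
        for image in sorted(subset.glob("*.png")):   # assumed image format
            writer.writerow([subset.name, image.name])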
Funding
This study received funding from Imperial College’s President’s PhD Scholarship (to Fabian Plum), and is part of a project that has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation program (Grant agreement No. 851705, to David Labonte). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.