Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A brain tumor is a life-threatening neurological condition affecting millions of people worldwide. Early diagnosis and classification of brain tumor types facilitate prompt treatment, thereby increasing the patient’s chances of survival. The advent of deep learning methods has significantly improved the field of medical image classification and aids neurologists in brain tumor diagnosis. However, existing methods using Magnetic Resonance Imaging (MRI) face significant difficulties due to the complexity of brain tumors and the variability of tumor characteristics. Consequently, this research proposes the Inception V3 enabled Bidirectional Long Short-Term Memory Network (IV3TM) for brain tumor classification. In the proposed approach, preprocessing and data augmentation techniques are applied to enhance classification performance. At the preprocessing stage, an iterative weighted-mean filter is used to cope with bias field-effect fluctuations, noise, and blurring in the input images and to enhance edges. A data augmentation strategy then increases the size of the available training data. SqueezeNet is used to segment the images for the subsequent classification operations. The proposed model combines the strengths of Inception V3 and BiLSTM to learn the sequential dependencies that are significant for understanding the intricate structural relationships in brain MRI data. The effectiveness of the proposed method is evaluated using several metrics, including specificity, accuracy, precision, F1-score, and sensitivity; its error is evaluated using root mean square error (RMSE). Experiments on the Brain MRI Images dataset and the Figshare brain tumor dataset have shown encouraging results.
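The abstract does not publish the filter implementation, so the following is only a generic numpy sketch of an edge-preserving iterative weighted-mean filter of the kind described: each pass replaces a pixel with a neighbourhood mean weighted by intensity similarity, so noise is smoothed while edges survive. The window size, weighting kernel, and `sigma` parameter are illustrative assumptions, not the authors' method.

```python
import numpy as np

def iterative_weighted_mean_filter(img, iterations=3, sigma=0.1):
    """Each pass replaces a pixel by a weighted mean of its 3x3 neighbourhood,
    weighting neighbours by intensity similarity so that edges are preserved
    while noise is smoothed (generic sketch, illustrative parameters)."""
    out = img.astype(float)
    for _ in range(iterations):
        padded = np.pad(out, 1, mode="edge")
        num = np.zeros_like(out)
        den = np.zeros_like(out)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                nb = padded[1 + dy : 1 + dy + out.shape[0],
                            1 + dx : 1 + dx + out.shape[1]]
                w = np.exp(-((nb - out) ** 2) / (2 * sigma ** 2))
                num += w * nb
                den += w
        out = num / den
    return out

rng = np.random.default_rng(0)
noisy = np.clip(0.5 + 0.1 * rng.standard_normal((64, 64)), 0, 1)  # toy noisy slice
smoothed = iterative_weighted_mean_filter(noisy)
print(smoothed.shape)  # (64, 64)
```

The similarity weighting is what distinguishes this from a plain mean filter: near an edge, pixels on the far side get near-zero weight, so the edge is not blurred away.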
https://www.law.cornell.edu/uscode/text/17/106
Medical image analysis is critical to biological studies, health research, computer-aided diagnosis, and clinical applications. Recently, deep learning (DL) techniques have achieved remarkable success in medical image analysis applications. However, these techniques typically require large amounts of annotations to achieve satisfactory performance. Therefore, in this dissertation, we seek to address this critical problem: how can we develop efficient and effective DL algorithms for medical image analysis while reducing annotation efforts? To address this problem, we have outlined two specific aims: (A1) utilize existing annotations effectively from advanced models; (A2) extract generic knowledge directly from unannotated images.
To achieve Aim (A1): First, we introduce a new data representation called TopoImages, which encodes the local topology of all the image pixels. TopoImages can be complemented with the original images to improve medical image analysis tasks. Second, we propose a new augmentation method, SAMAug-C, that leverages the Segment Anything Model (SAM) to augment raw image input and enhance medical image classification. Third, we propose two advanced DL architectures, kCBAC-Net and ConvFormer, to enhance the performance of 2D and 3D medical image segmentation. We also present a gate-regularized network training (GrNT) approach to improve multi-scale fusion in medical image segmentation. To achieve Aim (A2), we propose a novel extension of Masked Autoencoders (MAEs) for self pre-training, i.e., models pre-trained on the same target dataset, specifically for 3D medical image segmentation.
Scientific visualization is a powerful approach for understanding and analyzing various physical or natural phenomena, such as climate change or chemical reactions. However, the cost of scientific simulations is high when factors like time, ensemble, and multivariate analyses are involved. Additionally, scientists can only afford to sparsely store the simulation outputs (e.g., scalar field data) or visual representations (e.g., streamlines) or visualization images due to limited I/O bandwidths and storage space. Therefore, in this dissertation, we seek to address this critical problem: How can we develop efficient and effective DL algorithms for scientific data generation and compression while reducing simulation and storage costs?
To tackle this problem: First, we propose a DL framework that generates unsteady vector field data from a set of streamlines. Based on this method, domain scientists only need to store representative streamlines at simulation time and reconstruct vector fields during post-processing. Second, we design a novel DL method that translates scalar fields to vector fields. Using this approach, domain scientists only need to store scalar field data at simulation time and generate vector fields from their scalar field counterparts afterward. Third, we present a new DL approach that compresses a large collection of visualization images generated from time-varying data for communicating volume visualization results.
https://www.technavio.com/content/privacy-notice
US Deep Learning Market Size 2025-2029
The deep learning market size in the US is forecast to increase by USD 5.02 billion at a CAGR of 30.1% between 2024 and 2029.
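As a quick sanity check on these headline figures, the implied 2024 base can be backed out from the compound-growth relation, assuming the USD 5.02 billion figure is the cumulative increase over five annual compounding periods (a reading of the forecast, not something the report states explicitly):

```python
# Back-of-the-envelope check, assuming the USD 5.02 billion increment accrues
# over five compounding periods (2024 -> 2029) at 30.1% per year.
growth = 1.301 ** 5 - 1          # total growth factor minus 1 over five years
implied_base = 5.02 / growth     # implied 2024 market size, in USD billion
print(f"implied 2024 base: USD {implied_base:.2f} billion")
```

Under this reading, the 2024 base works out to roughly USD 1.8 billion, which is consistent with a USD 5.02 billion increment at a 30.1% CAGR.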
The deep learning market is experiencing robust growth, driven by the increasing adoption of artificial intelligence (AI) across industries to build advanced solutions. This trend is fueled by the availability of vast amounts of data, a key requirement for deep learning algorithms to function effectively. Industry-specific solutions are gaining traction as businesses seek to leverage deep learning for specific use cases such as image and speech recognition, fraud detection, and predictive maintenance. At the same time, intuitive data visualization tools are simplifying complex neural network outputs, helping stakeholders understand and validate insights.
However, challenges remain, including the need for powerful computing resources, data privacy concerns, and the high cost of implementing and maintaining deep learning systems. Despite these hurdles, the market's potential for innovation and disruption is immense, making it an exciting space for businesses to explore further. Semi-supervised learning, data labeling, and data cleaning facilitate efficient training of deep learning models. Cloud analytics is another significant trend, as companies seek to leverage cloud computing for cost savings and scalability.
What will be the size of the market during the forecast period?
Deep learning, a subset of machine learning, continues to shape industries by enabling advanced applications such as image and speech recognition, text generation, and pattern recognition. Reinforcement learning, a type of deep learning, gains traction, with deep reinforcement learning leading the charge. Anomaly detection, a crucial application of unsupervised learning, safeguards systems against security vulnerabilities. Ethical implications and fairness considerations are increasingly important in deep learning, with emphasis on explainable AI and model interpretability. Graph neural networks and attention mechanisms enhance data preprocessing for sequential data modeling and object detection. Time series forecasting and dataset creation further expand deep learning's reach, while privacy preservation and bias mitigation ensure responsible use.
In summary, deep learning's market dynamics reflect a constant pursuit of innovation, efficiency, and ethical considerations. The Deep Learning Market in the US is flourishing as organizations embrace intelligent systems powered by supervised learning and emerging self-supervised learning techniques. These methods refine predictive capabilities and reduce reliance on labeled data, boosting scalability. BFSI firms utilize AI image recognition for various applications, including personalizing customer communication, maintaining a competitive edge, and automating repetitive tasks to boost productivity. Sophisticated feature extraction algorithms now enable models to isolate patterns with high precision, particularly in applications such as image classification for healthcare, security, and retail.
How is this market segmented and which is the largest segment?
The market research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.
Application
Image recognition
Voice recognition
Video surveillance and diagnostics
Data mining
Type
Software
Services
Hardware
End-user
Security
Automotive
Healthcare
Retail and commerce
Others
Geography
North America
US
By Application Insights
The Image recognition segment is estimated to witness significant growth during the forecast period. In the realm of artificial intelligence (AI) and machine learning, image recognition, a subset of computer vision, is gaining significant traction. This technology utilizes neural networks, deep learning models, and various machine learning algorithms to decipher visual data from images and videos. Image recognition is instrumental in numerous applications, including visual search, product recommendations, and inventory management. Consumers can take photographs of products to discover similar items, enhancing the online shopping experience. In the automotive sector, image recognition is indispensable for advanced driver assistance systems (ADAS) and autonomous vehicles, enabling the identification of pedestrians, other vehicles, road signs, and lane markings.
Furthermore, image recognition plays a pivotal role in augmented reality (AR) and virtual reality (VR) applications, where it tracks physical objects and overlays digital content onto real-world scenarios. The model training process involves the backpropagation algorithm, which calculates the loss function and propagates its gradients backward to update the network weights.
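As a minimal, generic illustration of the gradient-based training loop mentioned above (not code from any product discussed in this report), the single-layer case of backpropagation reduces to the chain-rule gradient of the loss:

```python
import numpy as np

# Minimal sketch of gradient-based training: logistic regression via the
# chain rule, i.e., the single-layer case of backpropagation.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)   # linearly separable toy labels

w = np.zeros(2)
b = 0.0
for _ in range(500):
    z = X @ w + b
    p = 1.0 / (1.0 + np.exp(-z))            # forward pass (sigmoid)
    grad_z = (p - y) / len(y)               # dLoss/dz for cross-entropy loss
    w -= 0.5 * (X.T @ grad_z)               # backward pass: chain rule to weights
    b -= 0.5 * grad_z.sum()

acc = (((X @ w + b) > 0) == (y == 1)).mean()
print(f"training accuracy: {acc:.2f}")
```

Deep networks repeat exactly this pattern layer by layer, propagating `grad_z` backward through each transformation.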
In plant breeding research, several statistical machine learning methods have been developed and studied for assessing the genomic prediction (GP) accuracy of unobserved phenotypes. To increase the GP accuracy of unobserved phenotypes while simultaneously accounting for the complexity of genotype × environment interaction (GE), deep learning (DL) neural networks have been developed. These analyses can potentially include phenomics data obtained through imaging. The two datasets included in this study contain phenomic, phenotypic, and genotypic data for a set of wheat materials. They have been used to compare a novel DL method with conventional GP models. The results of these analyses are reported in the accompanying journal article.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
This is the training and testing data used to train a Residual Attention UNet for segmentation and detection of road culverts. The data consists of pairs of 256×256-pixel images, where one image is a labeled mask and the other an image with four channels containing the remote sensing data. The remote sensing data is a combination of topographical data extracted from aerial laser scanning and orthophotos from aerial imagery.
An extensive culvert survey was conducted in 25 watersheds in central Sweden by the Swedish Forest Agency during the snow-free periods of 2014–2017. A total of 24,083 culverts were mapped with a handheld GPS with a horizontal accuracy of 0.3 m. Densely populated urban areas with underground drainage systems were excluded from the survey (0.3% of the combined area). The coordinates of both ends of each culvert were measured, and metrics such as diameter, length, material, working condition, and sediment accumulation were collected for most of the culverts. Additional metrics, such as the elevation difference between the outlet and stream water level, were manually measured with a ruler. The inventoried watersheds were split up into training and testing data, where 20 watersheds (23,304 culverts) were used for training, and five watersheds (5,208 culverts) were used for testing.
A compact laser-based system (Leica ALS80-HP-8236) was used to collect the ALS data from an aircraft flying at 2888–3000 m. The ALS point clouds had a point density of 1–2 points/m² and were divided into tiles of 2.5 × 2.5 km each. A DEM with 0.5 m resolution was created from the ALS point clouds using a TIN gridding approach implemented in WhiteboxTools 2.2.0. The topographical index max downslope elevation change was then calculated from the DEM using WhiteboxTools. Max downslope elevation change represents the maximum elevation drop between each grid cell and its neighbouring cells within a DEM. This typically resulted in values between 0 and 10.
Orthophotos from aerial imagery captured at the same time as the lidar data are also included. The orthophotos had three bands (red, green, and blue) in 8-bit color depth and a resolution of 0.5 m. The LiDAR data and orthophotos were downloaded from the Swedish mapping, cadastral and land registration authority.
The topographical data and the orthophotos were merged into 8-bit, four-band images, where the first three bands are red, green, and blue, and the fourth band is max downslope elevation change. The merged images were then split into smaller tiles of 256×256 pixels.
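The merge-and-tile step described above can be sketched in numpy as follows; the array names and the scaling of the elevation band to 8 bits are illustrative assumptions, not the authors' exact code:

```python
import numpy as np

# Stand-ins for the real rasters (names and sizes are illustrative).
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(512, 512, 3), dtype=np.uint8)   # orthophoto bands
elev_change = rng.random((512, 512)).astype(np.float32) * 10     # values in 0-10

# Scale the topographic index to 8-bit and stack it as a fourth channel.
band4 = np.clip(elev_change / 10.0 * 255.0, 0, 255).astype(np.uint8)
merged = np.dstack([rgb, band4])                                  # (512, 512, 4)

# Split the merged raster into 256x256 tiles.
tiles = [merged[y:y + 256, x:x + 256]
         for y in range(0, merged.shape[0], 256)
         for x in range(0, merged.shape[1], 256)]
print(len(tiles), tiles[0].shape)  # 4 (256, 256, 4)
```

Pairing each tile with the corresponding 256×256 crop of the label mask would then yield the training pairs described above.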
The trained model was used to predict culverts in Sweden and the file PredictedCulvertsByIsobasins.zip contains the predicted culverts stored as shapefiles split by the watersheds in the file "isobasins.zip".
Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
This work outlines an efficient deep learning approach for analyzing vascular wall fractures using experimental data, with openly accessible source code (https://doi.org/10.25835/weuhha72) for reproduction. Vascular disease remains the primary cause of death globally to this day. Tissue damage in these vascular disorders is closely tied to how the diseases develop, which requires careful study. Therefore, the scientific community has dedicated significant efforts to capturing the properties of vessel wall fractures. The symmetry-constrained Compact Tension (symconCT) test, combined with Digital Image Correlation (DIC), enabled the study of tissue fracture in various aorta specimens under different conditions. The main purpose of the experiments was to investigate the displacement and strain fields ahead of the crack tip. These experimental data were to support the development and verification of computational models. The FEM model used the DIC information for material parameter identification. Traditionally, the analysis of fracture processes in biological tissues involves extensive computational and experimental efforts due to the complex nature of tissue behavior under stress. These high costs have posed significant challenges, demanding efficient solutions to accelerate research progress and reduce embedded costs. Deep learning techniques have shown promise in overcoming these challenges by learning patterns and relationships between the input and label data. In this study, we integrate deep learning methodologies with the Attention Residual U-Net architecture, enhanced with a Monte Carlo Dropout technique, to predict fracture responses in porcine aorta specimens. By training the network on a sufficient amount of data, the model learns to capture the features influencing fracture progression. These parameterized datasets consist of pictures describing the evolution of the tissue fracture path along with the DIC measurements.
The integration of deep learning should not only enhance the predictive accuracy, but also significantly reduce the computational and experimental burden, thereby enabling a more efficient analysis of fracture response.
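Monte Carlo Dropout, mentioned above, keeps dropout active at inference time and aggregates repeated stochastic forward passes into a mean prediction with an uncertainty estimate. A minimal numpy sketch, with a toy one-layer model standing in for the Attention Residual U-Net (which is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(42)
W = rng.normal(size=(16, 1))                  # toy single-layer "network" weights

def forward(x, drop_rate=0.5):
    # Dropout stays ON at inference: every call samples a fresh mask.
    mask = rng.random(x.shape) > drop_rate
    h = np.maximum(x * mask / (1 - drop_rate), 0.0)   # inverted dropout + ReLU
    return float(h @ W)

x = rng.normal(size=16)                       # one input sample
samples = np.array([forward(x) for _ in range(200)])
mean, std = samples.mean(), samples.std()
print(f"prediction {mean:.3f} +/- {std:.3f}")
```

The spread of the 200 stochastic passes is the uncertainty estimate; in the fracture-prediction setting this flags regions where the model's predicted crack path is unreliable.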
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This dataset contains a collection of ultrafast ultrasound acquisitions from nine volunteers and the CIRS 054G phantom. For a comprehensive understanding of the dataset, please refer to the paper: Viñals, R.; Thiran, J.-P. A KL Divergence-Based Loss for In Vivo Ultrafast Ultrasound Image Enhancement with Deep Learning. J. Imaging 2023, 9, 256. https://doi.org/10.3390/jimaging9120256. Please cite the original paper when using this dataset.
Due to data size restriction, the dataset has been divided into six subdatasets, each one published into a separate entry in Zenodo. This repository contains subdataset 2.
Number of Acquisitions: 20,000
Volunteers: Nine volunteers
File Structure: Each volunteer's data is compressed in a separate zip file.
Regions:
File Naming Convention: Incremental IDs from acquisition_00000 to acquisition_19999.
Two CSV files are provided:
invivo_dataset.csv:
invitro_dataset.csv:
The dataset has been divided into six subdatasets, each one published in a separate entry on Zenodo. The following table indicates, for each file or compressed folder, the Zenodo dataset split where it has been uploaded along with its size. Each dataset split is named "A KL Divergence-Based Loss for In Vivo Ultrafast Ultrasound Image Enhancement with Deep Learning: Dataset (ii/6)", where ii represents the split number. This repository contains the 2nd split.
| File name | Size | Zenodo subdataset number |
| --- | --- | --- |
| invivo_dataset.csv | 995.9 kB | 1 |
| invitro_dataset.csv | 1.1 kB | 1 |
| cirs-phantom.zip | 418.2 MB | 1 |
| volunteer-1-lowerLimbs.zip | 29.7 GB | 1 |
| volunteer-1-carotids.zip | 8.8 GB | 1 |
| volunteer-1-back.zip | 7.1 GB | 1 |
| volunteer-1-abdomen.zip | 34.0 GB | 2 |
| volunteer-1-breast.zip | 15.7 GB | 2 |
| volunteer-1-upperLimbs.zip | 25.0 GB | 3 |
| volunteer-2.zip | 26.5 GB | 4 |
| volunteer-3.zip | 20.3 GB | 3 |
| volunteer-4.zip | 24.1 GB | 5 |
| volunteer-5.zip | 6.5 GB | 5 |
| volunteer-6.zip | 11.5 GB | 5 |
| volunteer-7.zip | 11.1 GB | 6 |
| volunteer-8.zip | 21.2 GB | 6 |
| volunteer-9.zip | 23.2 GB | 4 |
Beamforming:
Depth from 1 mm to 55 mm
Width spanning the probe aperture
Grid: 𝜆/8 × 𝜆/8
Resulting image shape: 1483 × 1189
Two beamformed RF images from each acquisition:
Normalization:
To display the images:
File Format: Saved in npy format, loadable using Python and numpy.load(file).
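Loading an acquisition is then a one-liner with numpy; the snippet below writes a stand-in array with the beamformed image shape given above and reads it back. The file name follows the naming convention above, while the dtype is an assumption:

```python
import numpy as np

# Stand-in for one beamformed RF image (shape from the description above).
arr = np.random.rand(1483, 1189).astype(np.float32)
np.save("acquisition_00000.npy", arr)

# Reading a file from the dataset:
img = np.load("acquisition_00000.npy")
print(img.shape, img.dtype)  # (1483, 1189) float32
```

Note that `np.save` appends the `.npy` extension automatically when it is missing, so the saved and loaded names match.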
For the volunteer-based split used in the paper:
This dataset is licensed under the Creative Commons Attribution 4.0 International License (CC BY 4.0).
Please cite the original paper when using this dataset:
Viñals, R.; Thiran, J.-P. A KL Divergence-Based Loss for In Vivo Ultrafast Ultrasound Image Enhancement with Deep Learning. J. Imaging 2023, 9, 256. DOI: 10.3390/jimaging9120256
For inquiries or issues related to this dataset, please contact:
The source code and audio datasets of my PhD project.
1. https://www.openslr.org/12 LibriSpeech is a corpus of approximately 1000 hours of 16 kHz read English speech, prepared by Vassil Panayotov with the assistance of Daniel Povey. The data is derived from read audiobooks from the LibriVox project and has been carefully segmented and aligned. Acoustic models trained on this dataset are available at kaldi-asr.org, and language models suitable for evaluation can be found at http://www.openslr.org/11/. For more information, see the paper "LibriSpeech: an ASR corpus based on public domain audio books", Vassil Panayotov, Guoguo Chen, Daniel Povey and Sanjeev Khudanpur, ICASSP 2015.
2. https://www.openslr.org/17 MUSAN is a corpus of music, speech, and noise recordings. This work was supported by the National Science Foundation Graduate Research Fellowship under Grant No. 1232825 and by Spoken Communications. You can cite the data using the following BibTeX entry:
@misc{musan2015,
  author = {David Snyder and Guoguo Chen and Daniel Povey},
  title = {{MUSAN}: {A} {M}usic, {S}peech, and {N}oise {C}orpus},
  year = {2015},
  eprint = {1510.08484},
  note = {arXiv:1510.08484v1}
}
3. source_code.zip: the program from parts of my PhD project.
4. SJ_EXP.zip: the program of the subjective experiment corresponding to the last chapter.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This is a companion dataset for the paper titled "Class-specific data augmentation for plant stress classification" by Nasla Saleem, Aditya Balu, Talukder Zaki Jubery, Arti Singh, Asheesh K. Singh, Soumik Sarkar, and Baskar Ganapathysubramanian published in The Plant Phenome Journal, https://doi.org/10.1002/ppj2.20112
Abstract:
Data augmentation is a powerful tool for improving deep learning-based image classifiers for plant stress identification and classification. However, selecting an effective set of augmentations from a large pool of candidates remains a key challenge, particularly in imbalanced and confounding datasets. We propose an approach for automated class-specific data augmentation using a genetic algorithm. We demonstrate the utility of our approach on soybean [Glycine max (L.) Merr] stress classification where symptoms are observed on leaves; a particularly challenging problem due to confounding classes in the dataset. Our approach yields strong performance, achieving a mean-per-class accuracy of 97.61% and an overall accuracy of 98% on the soybean leaf stress dataset. Our method significantly improves the accuracy of the two most challenging classes, with notable improvements from 83.01% to 88.89% and from 85.71% to 94.05%, respectively. A key observation we make in this study is that high-performing augmentation strategies can be identified in a computationally efficient manner. We fine-tune only the linear layer of the baseline model with different augmentations, thereby reducing the computational burden associated with training classifiers from scratch for each augmentation policy while achieving exceptional performance. This research represents an advancement in automated data augmentation strategies for plant stress classification, particularly in the context of confounding datasets. Our findings contribute to the growing body of research in tailored augmentation techniques and their potential impact on disease management strategies, crop yields, and global food security. The proposed approach holds the potential to enhance the accuracy and efficiency of deep learning-based tools for managing plant stresses in agriculture.
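The head-only fine-tuning idea described above can be sketched as follows; the frozen "backbone" is a fixed random projection standing in for the pretrained model, and the data and labels are synthetic, so this illustrates the training recipe rather than the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen "backbone": a fixed random projection + ReLU standing in for the
# pretrained feature extractor; it is never updated during fine-tuning.
W_backbone = rng.normal(size=(32, 16))
def features(x):
    return np.maximum(x @ W_backbone, 0.0)

# Toy data; an augmentation policy would transform X before this step.
X = rng.normal(size=(300, 32))
F = features(X)
v_true = rng.normal(size=16)
s = F @ v_true
y = (s > np.median(s)).astype(float)          # linearly separable in feature space

# Train ONLY the linear head (logistic regression on frozen features).
w = np.zeros(16); b = 0.0
for _ in range(800):
    z = np.clip(F @ w + b, -30, 30)
    p = 1.0 / (1.0 + np.exp(-z))
    g = (p - y) / len(y)
    w -= 1.0 * (F.T @ g)                      # head weights updated
    b -= 1.0 * g.sum()                        # backbone stays fixed throughout

acc = (((F @ w + b) > 0) == (y == 1)).mean()
print(f"head-only accuracy: {acc:.2f}")
```

Because the backbone features are computed once, evaluating many candidate augmentation policies only costs one cheap linear-head fit each, which is the efficiency the genetic search relies on.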
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This repository contains all the necessary code and preprocessing pipelines used in the study: Deep Learning-Driven Diagnosis of Humerus Fractures from Radiographic Data. The project focuses on training convolutional neural networks (CNNs), specifically the Inception V3 model, to classify humerus fractures from radiographic images. Both preprocessing and model training codes are provided to ensure full reproducibility of results.
The project includes:
- Preprocessing of radiographic data
- Region of interest (ROI) extraction and segmentation
- Data augmentation
- InceptionV3-based image classification across different input resolutions
MIT License: https://opensource.org/licenses/MIT
This record contains the data and code for the paper "SCANet: Self-Paced Semi-Curricular Attention Network for Non-Homogeneous Image Dehazing" published in the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).
Requirements: Python 3.7, PyTorch 1.9.1.
Network Architecture
Train
1. Place the training and test image pairs in the data folder.
2. Run data/makedataset.py to generate the NH-Haze20-21-23.h5 file.
3. Run train.py to start training.
Test
1. Place the pre-training weight in the checkpoint folder.
2. Place test hazy images in the input folder.
3. Modify the weight name in test.py: parser.add_argument("--model_name", type=str, default='Gmodel_40', help='model name')
4. Run test.py. The results are saved in the output folder.
Pre-training Weight Download
Gmodel_40.tar: the weight used in the NTIRE2023 challenge, for the NTIRE2023 val/test datasets.
Gmodel_105.tar: for the NTIRE2020/2021/2023 datasets.
Gmodel_120.tar: for the NTIRE2020/2021/2023 datasets (adds the 15 tested images to the training dataset).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Sampling complex free-energy surfaces is one of the main challenges of modern atomistic simulation methods. The presence of kinetic bottlenecks in such surfaces often renders a direct approach useless. A popular strategy is to identify a small number of key collective variables and to introduce a bias potential that is able to favor their fluctuations in order to accelerate sampling. Here, we propose to use machine-learning techniques in conjunction with the recent variationally enhanced sampling method [O. Valsson, M. Parrinello, Phys. Rev. Lett. 113, 090601 (2014)] in order to determine such potential. This is achieved by expressing the bias as a neural network. The parameters are determined in a variational learning scheme aimed at minimizing an appropriate functional. This required the development of a more efficient minimization technique. The expressivity of neural networks allows representing rapidly varying free-energy surfaces, removes boundary effects artifacts, and allows several collective variables to be handled.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
The use of deep learning (DL) is steadily gaining traction in scientific challenges such as cancer research. Advances in enhanced data generation, machine learning algorithms, and compute infrastructure have led to an acceleration in the use of deep learning in various domains of cancer research, such as drug response problems. In our study, we explored tree-based models to improve the accuracy of a single drug response model and demonstrate that tree-based models such as XGBoost (eXtreme Gradient Boosting) have advantages over deep learning models, such as a convolutional neural network (CNN), for single drug response problems. However, comparing models is not a trivial task. To make training and comparing CNNs and XGBoost more accessible to users, we developed an open-source library called UNNT (A novel Utility for comparing Neural Net and Tree-based models). The case studies in this manuscript focus on cancer drug response datasets; however, the application can be used on datasets from other domains, such as chemistry.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
This dataset contains the figure code and data for the paper "Fourier phase retrieval using physics-enhanced deep learning".
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
Computational chemistry combines theoretical chemistry with computer simulations to simulate and calculate molecular properties and chemical reactions. Due to their complex structures, organic molecules are relatively difficult to handle in computational chemistry. As an efficient quantum mechanical method for solving the Schrödinger equation, density functional theory (DFT) describes the system through the electron density, avoiding complex multi-electron wave functions, and thus has advantages in computational efficiency and application scope. However, the core challenge of DFT lies in the uncertainty of the exchange-correlation (XC) functional, meaning that the exact XC potential and its corresponding energy are not yet fully determined. The widely used B3LYP and CCSD methods for approximating the XC energy also have their own limitations and drawbacks. To address these issues, this work presents a comprehensive investigation of machine learning-enhanced density functional theory through systematic construction of a large-scale quantum chemical dataset and neural network based correction methods. Inspired by the Holographic Electron Density Theorem, a comprehensive dataset was constructed encompassing 593 diverse molecular systems from established benchmarks (G2, W4-11, GMTKN55), systematically expanded through geometric perturbations including bond stretching, angular bending, and conformational sampling. The dataset employed rigorous computational protocols using PySCF with cc-pVDZ and cc-pVTZ basis sets, calculating electronic structures across multiple levels of theory. Local electronic environments were standardized as 9×9×9 density cubes centered at atomic positions, with principal axis alignment ensuring rotational invariance and systematic re-gridding producing uniform 5×5×5 representations.
Three machine learning approaches were developed and tested: an MLP Mixer for exchange-correlation potential prediction, a fully connected network for energy corrections, and an ML-PBE method for systematic error reduction. Results demonstrate exceptional performance improvements, with accuracy enhancements spanning multiple orders of magnitude compared to conventional functionals. This approach advances the accurate calculation of exchange-correlation potentials from electron density distributions, enhancing DFT development through systematic dataset construction and machine learning enhancement.
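The cube-standardization step described above (principal-axis alignment followed by 9×9×9 → 5×5×5 re-gridding) might be sketched as follows; the variance-based axis ordering and nearest-index resampling are simplifying assumptions, not the study's exact protocol:

```python
import numpy as np

rng = np.random.default_rng(1)
cube = rng.random((9, 9, 9))    # stand-in for one local density environment

# Axis-alignment sketch: order the axes by the variance of the 1-D density
# profile along each axis (a stand-in for full principal-axis alignment).
profile_var = [cube.sum(axis=tuple({0, 1, 2} - {a})).var() for a in range(3)]
order = np.argsort(profile_var)[::-1]
aligned = np.transpose(cube, order)

# Re-grid 9x9x9 -> 5x5x5 by sampling evenly spaced grid indices.
idx = np.round(np.linspace(0, 8, 5)).astype(int)   # [0, 2, 4, 6, 8]
regridded = aligned[np.ix_(idx, idx, idx)]
print(regridded.shape)  # (5, 5, 5)
```

In practice one would use trilinear interpolation rather than index sampling for the re-gridding, but the shape bookkeeping is the same.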
Training datasets for generalized block diagram models of grid-forming (GFM) inverter-based resources (IBR), particularly with solar photovoltaic (PV) sources. Datasets consist of transient data collected from electromagnetic transient (EMT) simulations or laboratory tests of GFM inverters. See https://pecblocks.readthedocs.io/en/latest/ and https://github.com/pnnl/pecblocks/tree/master/data for details.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This dataset supports the L3DAS22 IEEE ICASSP Grand Challenge. The challenge is supported by a Python API that facilitates dataset download and preprocessing, training and evaluation of the baseline models, and results submission.
The L3DAS22 Challenge aims at encouraging and fostering research on machine learning for 3D audio signal processing. 3D audio has gained increasing interest in the machine learning community in recent years. The range of applications is incredibly wide, extending from virtual and real conferencing to autonomous driving, surveillance, and many more. In these contexts, a fundamental procedure is to properly identify the nature of the events present in a soundscape, their spatial positions, and eventually remove unwanted noises that can interfere with the useful signal. To this end, the L3DAS22 Challenge presents two tasks: 3D Speech Enhancement and 3D Sound Event Localization and Detection, both relying on first-order Ambisonics recordings in reverberant office environments. Each task involves two separate tracks: 1-mic and 2-mic recordings, respectively containing sounds acquired by a single first-order Ambisonics microphone and by an array of two such microphones. The use of two Ambisonics microphones represents one of the main novelties of the L3DAS22 Challenge. We expect higher accuracy/reconstruction quality when taking advantage of the dual spatial perspective of the two microphones. Moreover, we are very interested in identifying other possible advantages of this configuration over standard Ambisonics formats. Interactive demos of our baseline models are available on Replicate. The top 5 ranked teams can submit a regular paper according to the ICASSP guidelines. Prizes will be awarded to the challenge winners thanks to the support of Kuaishou Technology.
Tasks
The tasks we propose are:
* 3D Speech Enhancement: the objective of this task is the enhancement of speech signals immersed in the spatial sound field of a reverberant office environment. The models are expected to extract the monophonic voice signal from a 3D mixture containing various background noises. The evaluation metric for this task is a combination of short-time objective intelligibility (STOI) and word error rate (WER).
* 3D Sound Event Localization and Detection: the aim of this task is to detect the temporal activity of a known set of sound event classes and to localize them in space. The models must predict a list of the active sound events and their respective locations at regular intervals of 100 milliseconds. Performance on this task is evaluated with the location-sensitive detection error, which combines the localization and detection error metrics.
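The Task 1 description names STOI and WER but not how the two are combined. Assuming an equal-weight average of STOI and (1 - WER), which is a plausible form but an assumption here rather than a quote from the challenge rules, a minimal sketch would be:

```python
def task1_metric(stoi, wer):
    """Equal-weight combination of STOI (higher is better) and
    WER (lower is better), giving a score in [0, 1].
    The 50/50 weighting is an assumption, not quoted from the rules."""
    return (stoi + (1.0 - wer)) / 2.0

# Intelligible speech (STOI 0.9) with 10% word errors scores 0.9
score = task1_metric(stoi=0.9, wer=0.1)
```

A single scalar like this lets speech-enhancement submissions be ranked even though the two underlying metrics point in opposite directions.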
The L3DAS22 datasets contain multiple-source and multiple-perspective B-format Ambisonics audio recordings. We sampled the acoustic field of a large office room, placing two first-order Ambisonics microphones in the center of the room and moving a speaker reproducing the analytic signal across 252 fixed spatial positions. Relying on the collected Ambisonics impulse responses (IRs), we augmented existing clean monophonic datasets to obtain synthetic three-dimensional sound sources by convolving the original sounds with our IRs. We extracted speech signals from the Librispeech dataset and office-like background noises from the FSD50K dataset. We aimed to create plausible and varied 3D scenarios that reflect real-life situations in which speech and disparate types of background noise coexist in the same reverberant environment. We provide normalized raw waveforms as predictor data; the target data varies according to the task.
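The augmentation step described above, convolving a clean mono recording with a measured multichannel Ambisonics IR, can be sketched as follows. The array shapes and sample rate here are illustrative placeholders, not the challenge's actual parameters:

```python
import numpy as np

def spatialize(mono, ir):
    """Convolve a mono signal with a multichannel (e.g. 4-channel
    B-format) impulse response, one channel at a time.
    mono: (n,) array; ir: (c, m) array -> (c, n + m - 1) array."""
    return np.stack([np.convolve(mono, ch) for ch in ir])

rng = np.random.default_rng(0)
mono = rng.standard_normal(8000)     # 1 s of "clean" audio at 8 kHz
ir = rng.standard_normal((4, 2400))  # 0.3 s four-channel IR
wet = spatialize(mono, ir)           # wet.shape == (4, 10399)
```

The same source convolved with IRs from different measurement positions yields distinct spatial renderings of one clean signal, which is how a monophonic corpus becomes a 3D one.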
The dataset is divided into two main sections, each dedicated to one of the challenge tasks.
The first section is optimized for 3D Speech Enhancement and contains more than 60,000 virtual 3D audio environments with durations of up to 12 seconds. In each sample, a spoken voice is always present alongside other office-like background noises. As target data for this section we provide the clean monophonic voice signals. For each subset we also provide a CSV file annotating, for each data point, the coordinates and spatial distance of the IR convolved with the target voice signal. This may be useful for estimating the delay caused by the virtual time-of-flight of the target voice signal and for performing a sample-level alignment of the input and ground-truth signals.
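As an illustration of the alignment hint above: given the annotated source distance, the time-of-flight delay in samples follows directly from the speed of sound. The sample rate and distance below are made-up values, not taken from the dataset:

```python
def tof_delay_samples(distance_m, sr=16000, c=343.0):
    """Delay, in samples, introduced by the virtual time-of-flight
    over distance_m metres (c is the speed of sound in m/s)."""
    return round(sr * distance_m / c)

# A source annotated at 3 m arrives about 140 samples late at 16 kHz
delay = tof_delay_samples(3.0)
```

Shifting the ground-truth clean signal by this many samples before computing a loss or metric keeps the comparison sample-aligned with the reverberant input.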
The other section is dedicated to the 3D Sound Event Localization and Detection task and contains 900 30-second-long audio files. Each data point contains a simulated 3D office audio...
https://www.datainsightsmarket.com/privacy-policy
The global Artificial Intelligence (AI) Training Dataset market is experiencing robust growth, driven by the increasing adoption of AI across diverse sectors. The market's expansion is fueled by the burgeoning need for high-quality data to train sophisticated AI algorithms powering applications such as smart campuses, autonomous vehicles, and personalized healthcare. Demand for diverse dataset types, including image classification, voice recognition, natural language processing, and object detection datasets, is a key factor contributing to market growth. While the exact market size in 2025 is unavailable, assuming a conservative estimate of $10 billion for 2025, based on the growth trend and reported sizes of related markets, and a projected compound annual growth rate (CAGR) of 25%, the market is poised for significant expansion in the coming years. Key players in this space are leveraging technological advancements and strategic partnerships to enhance data quality and expand their service offerings. Furthermore, the increasing availability of cloud-based data annotation and processing tools is streamlining operations and making AI training datasets accessible to businesses of all sizes.

Growth is expected to be particularly strong in regions with rapid technological advancement and substantial digital infrastructure, such as North America and Asia Pacific. However, challenges such as data privacy concerns, the high cost of data annotation, and the scarcity of professionals skilled in handling complex datasets remain obstacles to broader market penetration. The ongoing evolution of AI technologies and the expanding applications of AI across multiple sectors will continue to shape demand for AI training datasets, pushing the market toward higher growth trajectories.
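The projection above is plain compounding. Under the summary's own stated assumptions (a roughly $10 billion base in 2025 and a 25% CAGR, both estimates rather than measured figures), it can be reproduced in a few lines:

```python
def project_market(base_billion, cagr, years):
    """Compound a base-year market size forward at a fixed CAGR."""
    return base_billion * (1.0 + cagr) ** years

# $10B base in 2025 at 25% CAGR grows to about $30.5B by 2030
size_2030 = project_market(10.0, 0.25, 5)
```

This is why a seemingly moderate CAGR of 25% roughly triples the market within five years.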
The diversity of applications—from smart homes and medical diagnoses to advanced robotics and autonomous driving—creates significant opportunities for companies specializing in this market. Maintaining data quality, security, and ethical considerations will be crucial for future market leadership.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
CNN model trained with all features on an NVIDIA V100 GPU.