5 datasets found
  1. Solar flare forecasting based on magnetogram sequences learning with MViT...

    • redu.unicamp.br
    • data.niaid.nih.gov
    • +1 more
    Updated Jul 15, 2024
    Cite
    Repositório de Dados de Pesquisa da Unicamp (2024). Solar flare forecasting based on magnetogram sequences learning with MViT and data augmentation [Dataset]. http://doi.org/10.25824/redu/IH0AH0
    Explore at:
    Dataset updated
    Jul 15, 2024
    Dataset provided by
    Repositório de Dados de Pesquisa da Unicamp
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Dataset funded by
    Coordenação de Aperfeiçoamento de Pessoal de Nível Superior
    Description

    Source code and dataset of the research "Solar flare forecasting based on magnetogram sequences learning with MViT and data augmentation". Our work employed PyTorch, a framework for training deep learning models with GPU support and automatic back-propagation, to load MViTv2-S models with Kinetics-400 weights. To simplify the implementation, eliminating the explicit training loop and automating some hyperparameters, we use PyTorch Lightning. The inputs were batches of 10 samples, each a sequence of 16 three-channel images resized to 224 × 224 pixels and normalized to the range 0 to 1. Most of the papers in our literature survey split the original dataset chronologically, and some authors also apply k-fold cross-validation to emphasize the evaluation of model stability. We adopt a hybrid split instead, taking the first 50,000 samples for 5-fold cross-validation between the training and validation sets (known data), with 40,000 samples for training and 10,000 for validation. Thus, we can evaluate performance and stability by analyzing the mean and standard deviation of all trained models on the test set, composed of the last 9,834 samples in chronological order (simulating unknown data). We developed three distinct models to evaluate the impact of oversampling magnetogram sequences throughout the dataset. The first model, Solar Flare MViT (SF MViT), was trained only with the original data from our base dataset, without oversampling. In the second model, Solar Flare MViT over Train (SF MViT oT), we apply oversampling only to the training data, maintaining the original validation set. In the third model, Solar Flare MViT over Train and Validation (SF MViT oTV), we apply oversampling to both training and validation sets. We also trained a model oversampling the entire dataset, called "SF_MViT_oTV Test", to verify how resampling or adopting a test set with unreal data may positively bias the results.

    GitHub version: The .zip hosted here contains all files from the project, including the checkpoint and output files generated by the code. We have a clean version hosted on GitHub (https://github.com/lfgrim/SFF_MagSeq_MViTs), without the magnetogram_jpg folder (which can be downloaded directly from https://tianchi-competition.oss-cn-hangzhou.aliyuncs.com/531804/dataset_ss2sff.zip) and without the output and checkpoint files. Most code files hosted here also contain comments in Portuguese, which are being updated to English in the GitHub version.

    Folder structure: In the root directory of the project, we have two folders. magnetogram_jpg holds the source images provided by the Space Environment Artificial Intelligence Early Warning Innovation Workshop through the link https://tianchi-competition.oss-cn-hangzhou.aliyuncs.com/531804/dataset_ss2sff.zip. It comprises 73,810 samples of high-quality magnetograms captured by HMI/SDO from 2010 May 4 to 2019 January 26. The HMI instrument provides these data (stored in the hmi.sharp_720s dataset), making new samples available every 12 minutes; however, the images in this dataset were collected every 96 minutes. Each image has an associated magnetogram comprising a ready-made snippet of one or more solar active regions (ARs). Note that the magnetograms cropped by SHARP can contain one or more solar ARs classified by the National Oceanic and Atmospheric Administration (NOAA). Seq_Magnetogram contains the references to the source images with the corresponding labels for the next 24 h and 48 h, in the M24 and M48 sub-folders respectively. M24/M48 both present the following sub-folder structure: Seqs16; SF_MViT; SF_MViT_oT; SF_MViT_oTV; SF_MViT_oTV_Test. There are also two files in the root: inst_packages.sh installs the packages and dependencies to run the models, and download_MViTS.py downloads the pre-trained MViTv2_S from PyTorch and stores it in the cache. The M24 and M48 folders hold reference text files (flare_Mclass...) linking the images in the magnetogram_jpg folder, or the sequences (Seq16_flare_Mclass...) in the Seqs16 folders, with their respective labels. They also hold "cria_seqs.py", which creates the sequences, and "test_pandas.py", which verifies the head info and checks the number of samples per label in the text files. All text files with the prefix "Seq16" inside the Seqs16 folder were created by the "cria_seqs.py" code based on the corresponding "flare_Mclass"-prefixed text files. The Seqs16 folder holds reference text files in which each file lists a sequence of images pointing to the magnetogram_jpg folder. All SF_MViT... folders hold the model training code itself (SF_MViT...py) and the corresponding job submission (jobMViT...), temporary input (Seq16_flare...), output (saida_MViT... and MViT_S...), error (err_MViT...) and checkpoint files (sample-FLARE...ckpt). Executed model training codes generate the output, error, and checkpoint files. There is also a folder called "lightning_logs" that stores the logs of trained models.

    Naming patterns: magnetogram_jpg files follow the format "hmi.sharp_720s...magnetogram.fits.jpg" and Seqs16 files follow the format "hmi.sharp_720s...to.", where hmi is the instrument that captured the image; sharp_720s is the database source of SDO/HMI; the SHARP region identifier can contain one or more solar ARs classified by NOAA; and the capture date-time is in the format yyyymmdd_hhnnss_TAI (y: year, m: month, d: day, h: hours, n: minutes, s: seconds), with the sequence start and end date-times following the same format. Reference text files in M24 and M48 or inside the SF_MViT... folders follow the format "flare_Mclass_.txt", where the prefix is Seq16 if it refers to a sequence, or void if it refers directly to images; the horizon is "24h" or "48h"; the set is "TrainVal" or "Test" (TrainVal refers to the Train/Val split); and "_over" after the extension (...txt_over) means a temporary input reference that was over-sampled by a training model. In all SF_MViT... folders: model training codes follow "SF_MViT_M+_", where the suffix is void, "oT" (over Train), "oTV" (over Train and Val) or "oTV_Test" (over Train, Val and Test); the horizon is "24h" or "48h"; "oneSplit" runs a specific split and "allSplits" runs all splits; void (default) runs on 1 GPU and "2gpu" runs on 2-GPU systems. Job submission files follow "jobMViT_", where the suffix points to the queue in the Lovelace environment hosted at CENAPAD-SP (https://www.cenapad.unicamp.br/parque/jobsLovelace). Temporary inputs follow "Seq16_flare_Mclass_.txt", where the set is train or val, and "_over" after the extension (...txt_over) means a temporary input reference that was over-sampled by a training model. Outputs follow "saida_MViT_Adam_10-7", where k0 to k4 indicates the corresponding split, or void if the output is from all splits. Error files follow "err_MViT_Adam_10-7", with the same k0 to k4 convention. Checkpoint files follow "sample-FLARE_MViT_S_10-7-epoch=-valid_loss=-Wloss_k=.ckpt", where the fields are the epoch number of the checkpoint, the corresponding validation loss, and the split index k (0 to 4).
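    A minimal sketch (not the authors' code) of how such a model can be assembled: it loads torchvision's MViTv2-S backbone with Kinetics-400 weights, replaces the 400-class Kinetics head with a two-class flare/no-flare head, and wraps it in a PyTorch Lightning module. The class name, the head replacement, and the Adam learning rate of 1e-7 (inferred from the "saida_MViT_Adam_10-7" output-file naming) are illustrative assumptions.

      import torch
      import torch.nn as nn
      import pytorch_lightning as pl
      from torchvision.models.video import mvit_v2_s, MViT_V2_S_Weights

      class SFMViTSketch(pl.LightningModule):
          def __init__(self, lr=1e-7, num_classes=2):
              super().__init__()
              # Pre-trained MViTv2-S backbone with Kinetics-400 weights.
              self.backbone = mvit_v2_s(weights=MViT_V2_S_Weights.KINETICS400_V1)
              # Swap the 400-class Kinetics head for a flare / no-flare head.
              in_features = self.backbone.head[-1].in_features
              self.backbone.head[-1] = nn.Linear(in_features, num_classes)
              self.criterion = nn.CrossEntropyLoss()
              self.lr = lr

          def forward(self, x):
              # x: (batch, channels, frames, height, width) = (10, 3, 16, 224, 224),
              # pixel values already normalized to the range [0, 1].
              return self.backbone(x)

          def training_step(self, batch, batch_idx):
              sequences, labels = batch
              loss = self.criterion(self(sequences), labels)
              self.log("train_loss", loss)
              return loss

          def configure_optimizers(self):
              return torch.optim.Adam(self.parameters(), lr=self.lr)

    A pl.Trainer would then run the 5-fold training described above, one fold per run, without an explicit training loop.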

  2. Aluminum alloy industrial materials defect

    • figshare.com
    zip
    Updated Dec 3, 2024
    Cite
    Ying Han; Yugang Wang (2024). Aluminum alloy industrial materials defect [Dataset]. http://doi.org/10.6084/m9.figshare.27922929.v3
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 3, 2024
    Dataset provided by
    figshare
    Authors
    Ying Han; Yugang Wang
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The dataset used in this study's experiments comes from the preliminary competition dataset of the 2018 Guangdong Industrial Intelligent Manufacturing Big Data Intelligent Algorithm Competition organized by Tianchi Feiyue Cloud (https://tianchi.aliyun.com/competition/entrance/231682/introduction). We curated the dataset, removing images that do not meet the requirements of our experiment, and split all data into training and test sets. The image resolution is 2560×1960 pixels. Before training, all defects are labeled with labelimg and saved as json files; the json files are then converted to txt files; finally, detection and classification are run on the organized defect dataset.

    Description of the data and file structure: This is a project based on an enhanced YOLOv8 algorithm for aluminum defect classification and detection tasks. All code has been tested on Windows computers with Anaconda and CUDA-enabled GPUs. The following instructions allow users to run the code in this repository on a Windows + CUDA GPU system.

    Files and variables. File: defeat_dataset.zip.

    Setup. Please follow the steps below to set up the project:
    1. Download the project repository defeat_dataset.zip from the following location. Unzip it and navigate to the project folder; it should contain a subfolder: quexian_dataset.
    2. Download the data (defeat_dataset.zip), unzip it, and move the 'defeat_dataset' folder into the project's main folder.
    3. Make sure that your defeat_dataset folder now contains a subfolder: quexian_dataset.
    4. Within the folder you should find various subfolders such as addquexian-13, quexian_dataset, new_dataset-13, etc.

    Software. Set up the Python environment:
    1. Download and install Anaconda.
    2. Once Anaconda is installed, open the Anaconda Prompt. On Windows, click Start, search for Anaconda Prompt, and open it.
    3. Create a new conda environment with Python 3.8. You can name it whatever you like, for example yolov8: conda create -n yolov8 python=3.8
    4. Activate the created environment, e.g.: conda activate yolov8
    5. Download and install Visual Studio Code.
    6. Install PyTorch based on your system. For Windows/Linux users with a CUDA GPU: conda install pytorch==1.10.0 torchvision==0.11.0 torchaudio==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
    7. Install the remaining libraries: conda install -c anaconda scikit-learn=0.24.1; conda install astropy=4.2.1; conda install -c anaconda pandas=1.2.4; conda install -c conda-forge matplotlib=3.5.3; conda install scipy=1.10.1

    Repeatability. For PyTorch, it is a well-known fact that there is no guarantee of fully reproducible results between PyTorch versions, individual commits, or different platforms. In addition, results may not be reproducible between CPU and GPU executions, even if the same seed is used. All results in the Analysis Notebook that involve only model evaluation are fully reproducible. However, when the model is trained on a GPU, the results vary across machines.

    Access information. Other publicly accessible locations of the data: https://tianchi.aliyun.com/dataset/public/. Data was derived from the following source: https://tianchi.aliyun.com/dataset/140666

    Data availability statement. The ten defect types used in this study come from the Guangdong Industrial Wisdom Big Data Innovation Competition - Intelligent Algorithm Competition rematch; the dataset download link is https://tianchi.aliyun.com/competition/entrance/231682/information?lang=en-us. The official release provides 4,356 images, including single-defect, multiple-defect and defect-free images. We selected only the single-defect and multiple-defect images, 3,233 images in total. The ten defects are non-conductive, effacement, miss bottom corner, orange peel, varicolored, jet, lacquer bubble, jump into a pit, divulge the bottom and blotch. Each image contains one or more defects, and the resolution of the defect images is 2560×1920. By surveying the literature, we found that most experiments were done with these 10 defect types, so we chose three additional defect types that differ more from these ten and are more numerous, making them suitable for the experiments. The three newly added defect types come from the preliminary dataset of the Guangdong Industrial Wisdom Big Data Intelligent Algorithm Competition, which can be downloaded from https://tianchi.aliyun.com/dataset/140666. It contains 3,000 images in total, among which 109, 73 and 43 images show the defects bruise, camouflage and coating cracking, respectively. Finally, the 10 defect types from the rematch and the 3 defect types selected from the preliminary round are fused into a new dataset, which is the one examined in this study.

    In processing the dataset, we tried different division ratios, such as 8:2, 7:3 and 7:2:1. After testing, we found that the experimental results did not differ much across division ratios. We therefore divide the dataset at a ratio of 7:2:1: the training set accounts for 70%, the validation set for 20%, and the test set for 10%. The random seed is set to 0 to ensure that the results are consistent every time the model is trained. Finally, the mean Average Precision (mAP) obtained from the experiment was measured on the dataset three times. The results differed very little each time, but for accuracy we took the average of the highest and lowest results: the highest was 71.5% and the lowest 71.1%, giving an average detection accuracy of 71.3% for the final experiment. All data and images utilized in this research are from publicly available sources, and the original creators have given their consent for these materials to be published in open-access formats.

    The settings of the other parameters are as follows: epochs: 200, patience: 50, batch: 16, imgsz: 640, pretrained: true, optimizer: SGD, close_mosaic: 10, iou: 0.7, momentum: 0.937, weight_decay: 0.0005, box: 7.5, cls: 0.5, dfl: 1.5, pose: 12.0, kobj: 1.0, save_dir: runs/train.

    The defeat_dataset (ZIP) is mentioned in the Supporting information section of our manuscript. The underlying data are held at Figshare, DOI: 10.6084/m9.figshare.27922929. The results_images.zip in the system contains the experimental results graphs. The images_1.zip and images_2.zip in the system contain all the images needed to generate the manuscript.tex manuscript.
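    A minimal sketch (assumed, not the authors' script) of how a training run with the listed hyperparameters could look using the Ultralytics YOLOv8 API; the stock yolov8n.pt weights and the "defect.yaml" dataset description file are placeholders, and the enhanced YOLOv8 variant used in the study would replace the stock model.

      from ultralytics import YOLO

      model = YOLO("yolov8n.pt")  # placeholder for the enhanced YOLOv8 model used in the study
      model.train(
          data="defect.yaml",     # placeholder YAML pointing at the 7:2:1 train/val/test split
          epochs=200,
          patience=50,
          batch=16,
          imgsz=640,
          pretrained=True,
          optimizer="SGD",
          close_mosaic=10,
          iou=0.7,
          momentum=0.937,
          weight_decay=0.0005,
          box=7.5,
          cls=0.5,
          dfl=1.5,
          pose=12.0,   # kept for completeness; only used by pose models
          kobj=1.0,    # kept for completeness; only used by pose models
          seed=0,      # random seed fixed to 0, as described above
      )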

  3. Data from: Linear-Transformer

    • explore.openaire.eu
    Updated May 14, 2021
    Cite
    Juncai Guo (2021). Linear-Transformer [Dataset]. http://doi.org/10.5281/zenodo.4761073
    Explore at:
    Dataset updated
    May 14, 2021
    Authors
    Juncai Guo
    Description
    1. Runtime environment: 4 NVIDIA 2080 Ti GPUs, Ubuntu 16.04, CUDA 10.0 (with the corresponding cuDNN version), Python 3.7, PyTorch 1.2.0.
    2. Before the tests, parse the GloVe embeddings: 1) download the embedding data from http://downloads.cs.stanford.edu/nlp/data/glove.840B.300d.zip to the directory "./exp_linear_transformer/data/glove/"; 2) unzip the zip file to obtain "glove.840B.300d.txt"; 3) run the Python file "parse_glove.py" in the directory "./exp_linear_transformer/src_code/parse_glove/".
    3. Then install fastnlp 0.4.1 with pip. If other packages are missing, install them with pip.
    4. Details of the model tests are as follows:
    1) Ex1. Sentiment classification on the SST dataset. a. Download the dataset files from https://nlp.stanford.edu/sentiment/trainDevTestTrees_PTB.zip and https://nlp.stanford.edu/~socherr/stanfordSentimentTreebank.zip, unzip both zip files and copy all the files to the directory "./exp_linear_transformer/data/SST/stanfordSentimentTreebank/". Then go to the directory "./exp_linear_transformer/src_code/test_SST", and b. run the file "make_raw_data.py"; c. run the file "preprocessor.py"; d. run the file "linear_transformer_model.py" to test the model. The result will be printed at the end. We have saved the test log in the directory "./exp_linear_transformer/data/SST/log/".
    2) Ex2. Semantic matching on the STS dataset. a. Download the dataset file from http://ixa2.si.ehu.es/stswiki/images/4/48/Stsbenchmark.tar.gz, unzip it, delete the first 4 columns in the files sts-train.csv, sts-dev.csv and sts-test.csv, rename the 3 files as sts-train.txt, sts-dev.txt and sts-test.txt, and copy them to the directory "./exp_linear_transformer/data/SEMEVAL2017T1/stsbenchmark/". Then go to the directory "./exp_linear_transformer/src_code/test_SEMEVAL2017T1/", and b. run the file "make_raw_data.py"; c. run the file "preprocessor.py"; d. run the files "linear_transformer_model.py", "star_transformer_model.py" and "transformer_model.py" to test the models. The results will be printed at the end. We have saved the test logs in the directory "./exp_linear_transformer/data/SEMEVAL2017T1/log/".
    3) Ex3. Language inference on the SNLI dataset. a. Download the dataset file from https://nlp.stanford.edu/projects/snli/snli_1.0.zip, unzip it and copy the unzipped files to the directory "./exp_linear_transformer/data/SNLI/snli_1.0/". Then go to the directory "./exp_linear_transformer/src_code/test_SNLI", and b. run the file "make_raw_data.py"; c. run the file "preprocessor.py"; d. run the file "linear_transformer_model.py" to test the model. The result will be printed at the end. We have saved the test log in the directory "./exp_linear_transformer/data/SNLI/log/".
    4) Ex4. Computational analysis on the STS dataset. a. Go to the directory "./exp_linear_transformer/src_code/test_SEMEVAL2017T1/" and run the file "time_plot.py" to get the graph of "Training time vs. sequence length". b. Reset "batch_size=16" in the file "config.py", then run the files "linear_transformer_model.py", "star_transformer_model.py" and "transformer_model.py" separately on a single GPU. According to the different "Max Seq Len" values printed, record the GPU memory in the variable "length_memory" in the file "time_plot.py", then run that file to get the graph of "GPU memory vs. sequence length".
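    A minimal sketch (assumed, not the repository's parse_glove.py) of reading the unzipped glove.840B.300d.txt into a word-to-vector dictionary, following the directory layout described above; tokens in this GloVe release can themselves contain spaces, so the last 300 fields of each line are taken as the vector.

      import numpy as np

      GLOVE_PATH = "./exp_linear_transformer/data/glove/glove.840B.300d.txt"

      def load_glove(path, dim=300):
          embeddings = {}
          with open(path, encoding="utf-8") as fh:
              for line in fh:
                  parts = line.rstrip().split(" ")
                  word = " ".join(parts[:-dim])          # token (may itself contain spaces)
                  embeddings[word] = np.asarray(parts[-dim:], dtype=np.float32)
          return embeddings

      # vectors = load_glove(GLOVE_PATH)
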
  4. Data of "Self-consistency Reinforced minimal Gated Recurrent Unit for...

    • zenodo.org
    bin, zip
    Updated Mar 25, 2024
    Cite
    Ling Wu; Ling Wu; Ludovic Noels; Ludovic Noels (2024). Data of "Self-consistency Reinforced minimal Gated Recurrent Unit for surrogate modeling of history-dependent non-linear problems: application to history-dependent homogenized response of heterogeneous materials" [Dataset]. http://doi.org/10.5281/zenodo.10551272
    Explore at:
    Available download formats: zip, bin
    Dataset updated
    Mar 25, 2024
    Dataset provided by
    Zenodo
    Authors
    Ling Wu; Ling Wu; Ludovic Noels; Ludovic Noels
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Development of the Self-Consistency reinforced Minimum Recurrent Unit (SC-MRU)

    This directory contains the data and algorithms generated in publication[^1]

    Table of Contents

    1. Dependencies and Prerequisites
    2. Structure of Repository
    3. Part 1: Data preparation
    4. Part 2: RNN training
    5. Part 3: Multiscale analysis
    6. Part 4: Reproduce paper[^1] figures

    Dependencies and Prerequisites

    • Python, pandas, matplotlib, texttable and latextable are prerequisites for visualizing and navigating the data.

    • For generating meshes and for visualization, gmsh (www.gmsh.info) is required.

    • For running simulations, cm3Libraries (http://www.ltas-cm3.ulg.ac.be/openSource.htm) is required.

    Instructions using apt & pip3 package manager

    Instructions for Debian/Ubuntu based workstations are as follows.

    python, pandas and dependencies

     sudo apt install python3 python3-scipy libpython3-dev python3-numpy python3-pandas

    matplotlib, texttable and latextable

     pip3 install matplotlib texttable latextable

    PyTorch (only for running with cm3Libraries)

    • Without GPU
     pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
    • With GPU
     pip3 install torch torchvision torchaudio

    Libtorch (for compiling the cells)

    • Without GPU: In a local directory (e.g. ~/local with export TORCHDIR=$HOME/local/libtorch)
    • With GPU: In a local directory (e.g. ~/local with export TORCHDIR=$HOME/local/libtorch)

    Structure of Repository

    Part 1: Data preparation

    Generate the loading paths

    • TrainingPaths/testGenerationData.py is used to generate random walk paths (a generic illustrative sketch of such a generator appears after these lists), with the options
      • Rmax = 0.11 # bound on the final Green Lagrange strain
      • TimeStep = 1. # in seconds
      • EvalStep = [1e-4,5e-3] # bounds on the Green Lagrange increments
      • Nmax = 2500 # maximum length of the sequence
      • k = 4000 # number of paths to generate
      • The paths are stored by default in ConstRVE/Paths/. The directory has to exist before launching the script. You can change the name in line 123: saveDir = '../ConstRVE'+'/Paths/'.
      • Examples of generated paths can be found in ConstRVE/PathsExamples/
      • The command to be run from the directory TrainingPaths is
    (mkdir ../ConstRVE/Paths) #if needed
    python3 testGenerationData.py
    • TrainingPaths/generationData_Cyclic.py is used to generate random cyclic paths, with the options
      • Rmax = [np.random.uniform(0.,0.04),np.random.uniform(0.,0.06),np.random.uniform(0.0,0.09),0.12] # the bound on the final Green Lagrange strain is random
      • TimeStep = 1. # in seconds
      • EvalStep = [1e-4,5e-3] # bounds on the Green Lagrange increments
      • Nmax = 2500 # maximum length of the sequence
      • k = 2000 # number of paths to generate
      • The paths are stored by default in ConstRVE/Paths/. You can change the name in line 123: saveDir = '../ConstRVE'+'/Paths/'.
      • The command to be run from the directory TrainingPaths is
    (mkdir ../ConstRVE/Paths) #if needed
    python3 generationData_Cyclic.py
    • TrainingPaths/countPathLength.py gives the average, minimum and maximum lengths of the generated paths and the distribution of the \Delta R. By default the paths are read from ConstRVE/Paths/, but the directory can be given as an argument. The file can be used to read either the generated loading paths or the stored simulation results, e.g.
     python3 countPathLength.py '../ConstRVE/PathsExamples'
     python3 countPathLength.py '../All_Path_Res/Path_Res9'
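    A minimal, generic sketch of such a random-walk path generator under the options listed above (the actual TrainingPaths/testGenerationData.py may differ); the use of 6 independent Green-Lagrange strain components, the norm-based stopping criterion, and the CSV output format are assumptions for illustration.

      import os
      import numpy as np

      Rmax = 0.11               # bound on the final Green-Lagrange strain
      TimeStep = 1.0            # in seconds
      EvalStep = [1e-4, 5e-3]   # bounds on the Green-Lagrange increments
      Nmax = 2500               # maximum length of a sequence
      k = 4000                  # number of paths to generate
      saveDir = '../ConstRVE/Paths/'

      os.makedirs(saveDir, exist_ok=True)
      rng = np.random.default_rng()

      for p in range(k):
          path = [np.zeros(6)]  # 6 independent strain components (assumption)
          while len(path) < Nmax:
              step = rng.uniform(EvalStep[0], EvalStep[1], size=6) * rng.choice([-1.0, 1.0], size=6)
              nxt = path[-1] + step
              if np.linalg.norm(nxt) >= Rmax:   # stop once the strain magnitude reaches the bound
                  break
              path.append(nxt)
          times = np.arange(len(path)) * TimeStep
          data = np.column_stack([times, np.vstack(path)])
          np.savetxt(os.path.join(saveDir, 'path_%05d.csv' % p), data, delimiter=',')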

    Generate the RVEs direct simulation results

  5. Data of "Stochastic Deep Material Networks as Efficient Surrogates for...

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    zip
    Updated Feb 13, 2025
    Cite
    Ling Wu; Ling Wu; Ludovic Noels; Ludovic Noels (2025). Data of "Stochastic Deep Material Networks as Efficient Surrogates for Stochastic Homogenisation of Non-linear Heterogeneous Materials" [Dataset]. http://doi.org/10.5281/zenodo.14861537
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 13, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ling Wu; Ling Wu; Ludovic Noels; Ludovic Noels
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Stochastic Deep Material Networks as Efficient Surrogates for Stochastic Homogenisation of Non-linear Heterogeneous Materials

    This directory contains the data and algorithms generated in publication[^1]

    Table of Contents

    1. Dependencies and Prerequisites
    2. Structure of Repository
    3. Images/Geometries and IB-DMN training data of the 6 SVEs
    4. Stochastic analysis - Direct numerical simulations of SVEs
    5. Training of the reference IB-DMN
    6. Stochastic analysis - Stochastic IB-DMN
    7. Reproduce paper[^1] figures

    Dependencies and Prerequisites

    • Python, pandas, matplotlib, texttable and latextable are prerequisites for visualizing and navigating the data.

    • For generating meshes and for visualization, gmsh (www.gmsh.info) is required.

    • For running simulations, cm3Libraries (http://www.ltas-cm3.ulg.ac.be/openSource.htm) is required.

    Instructions using apt & pip3 package manager

    Instructions for Debian/Ubuntu based workstations are as follows.

    python, pandas and dependencies

     sudo apt install python3 python3-scipy libpython3-dev python3-numpy python3-pandas

    matplotlib, texttable and latextable

     pip3 install matplotlib texttable latextable

    Pytorch

    • Without GPU
     pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cpu
    • With GPU
     pip3 install torch torchvision torchaudio

    Libtorch (only when using cm3Libraries)

    • Without GPU: In a local directory (e.g. ~/local with export TORCHDIR=$HOME/local/libtorch)
     wget https://download.pytorch.org/libtorch/cpu/libtorch-shared-with-deps-2.3.0%2Bcpu.zip
     unzip libtorch-shared-with-deps-2.3.0+cpu.zip
    • With GPU: In a local directory (e.g. ~/local with export TORCHDIR=$HOME/local/libtorch)

    Structure of Repository

    Images/Geometries and IB-DMN training data of the 6 SVEs: 6SVE_Example

    1. 6SVE_Example/6SVE_Data: Images/Geometries and IB-DMN training data of the 6 SVEs
    2. 6SVE_Example/6SVE_DNS:
    1. 6SVE_Example/6SVE_DMN:
