Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the organized set of Python functions for the methods proposed in Yanwen Wang's PhD research. Researchers can use these functions directly to conduct spatial+ cross-validation (SP-CV), the adversarial-validation-based dissimilarity quantification method (AVD), and dissimilarity-adaptive cross-validation (DA-CV). Instructions for running the code are in Readme.txt, and descriptions of the functions are in functions.docx.
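For orientation, the AVD method builds on the general idea of adversarial validation: train a classifier to distinguish training from test samples and read its AUC as a dissimilarity score (0.5 means the two sets are indistinguishable). A minimal scikit-learn sketch of that generic idea, using hypothetical feature arrays X_train and X_test; this is not the AVD implementation shipped with these functions:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def adversarial_validation_auc(X_train, X_test, n_splits=5, seed=0):
    """AUC of a classifier trained to tell training rows from test rows.

    An AUC near 0.5 suggests the two sets come from similar distributions;
    an AUC near 1.0 signals strong train/test dissimilarity.
    """
    X = np.vstack([X_train, X_test])
    y = np.concatenate([np.zeros(len(X_train)), np.ones(len(X_test))])
    clf = RandomForestClassifier(n_estimators=200, random_state=seed)
    scores = cross_val_score(clf, X, y, cv=n_splits, scoring="roc_auc")
    return scores.mean()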
We have created three new Reading Comprehension datasets constructed using an adversarial model-in-the-loop procedure.
We use three different models, BiDAF (Seo et al., 2016), BERT-Large (Devlin et al., 2018), and RoBERTa-Large (Liu et al., 2019), in the annotation loop and construct three datasets, D(BiDAF), D(BERT), and D(RoBERTa), each with 10,000 training, 1,000 validation, and 1,000 test examples.
The adversarial human annotation paradigm ensures that these datasets consist of questions that current state-of-the-art models (at least those used as adversaries in the annotation loop) find challenging. The three AdversarialQA round 1 datasets provide a training and evaluation resource for reading comprehension methods.
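A minimal loading sketch with the Hugging Face datasets library, assuming these rounds are published on the Hub under the adversarial_qa name with config names dbidaf, dbert, and droberta (an assumption to verify against the official dataset card):

from datasets import load_dataset

# Assumed Hub name and config names; verify against the dataset card.
d_bidaf = load_dataset("adversarial_qa", "dbidaf")
print(d_bidaf)                        # expected splits: train / validation / test
print(d_bidaf["train"][0]["question"])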
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
TCAB is a large collection of successful adversarial attacks on state-of-the-art text classification models trained on multiple sentiment and abuse domain datasets.
The dataset is broken up into 2 files: train.csv and val.csv. The training set contains 1,448,751 instances (552,364 are "clean" unperturbed instances) and the validation set contains 482,914 instances (178,607 are "clean"). Each instance contains the following attributes:
scenario: Domain, either abuse or sentiment.
target_model_dataset: Dataset being attacked.
target_model_train_dataset: Dataset the target model trained on.
target_model: Type of victim model (e.g., bert, roberta, xlnet).
attack_toolchain: Open-source attack toolchain, either TextAttack or OpenAttack.
attack_name: Name of the attack method.
original_text: Original input text.
original_output: Prediction probabilities of the target model on the original text.
ground_truth: Encoded label for the original task of the domain dataset. For abuse datasets, 1 and 0 mean toxic and non-toxic, respectively. For sentiment datasets, 1 and 0 mean positive and negative sentiment; if there is a neutral sentiment, then 2, 1, and 0 mean positive, neutral, and negative sentiment.
status: Unperturbed example if "clean"; successful adversarial attack if "success".
perturbed_text: Text after it has been perturbed by an attack.
perturbed_output: Prediction probabilities of the target model on the perturbed text.
attack_time: Time taken to execute the attack.
num_queries: Number of queries performed while attacking.
frac_words_changed: Fraction of words changed due to an attack.
test_index: Index of each unique source example (original instance) (LEGACY - necessary for backwards compatibility).
original_text_identifier: Index of each unique source example (original instance).
unique_src_instance_identifier: Primary key that uniquely identifies every source instance; composed of (target_model_dataset, test_index, original_text_identifier).
pk: Primary key that uniquely identifies every attack instance; composed of (attack_name, attack_toolchain, original_text_identifier, scenario, target_model, target_model_dataset, test_index).
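As a quick orientation to the schema above, here is a minimal pandas sketch that loads the two CSV files and inspects a few of the listed columns (file names as given in the description; paths are placeholders):

import pandas as pd

# train.csv and val.csv as described above.
train = pd.read_csv("train.csv")
val = pd.read_csv("val.csv")

# Clean vs. successfully attacked instances per split.
print(train["status"].value_counts())
print(val["status"].value_counts())

# Restrict to successful attacks and compare original vs. perturbed text.
attacks = train[train["status"] == "success"]
print(attacks[["attack_name", "original_text", "perturbed_text",
               "frac_words_changed"]].head())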
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Pearson correlation coefficient values of Davies–Bouldin Metric (DBM) and Amalgam Metric (AM) with Mean L2 Score of adversarial attacks for each vanilla classifier and attack pair.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Average training times of DSAN-MMD on CSI image dataset across all training percentages.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Batik is one of the Indonesian cultural heritages, and every motif carries a noble cultural and philosophical meaning. This article introduces a dataset from Yogyakarta, Indonesia, "Batik Nitik 960". The dataset was captured from a piece of fabric consisting of 60 Nitik motifs. The fabric was provided by Paguyuban Pecinta Batik Indonesia (PPBI) Sekar Jagad Yogyakarta from the collection of Winotosasto Batik, and the data were captured at the APIPS Gallery. The dataset is divided into 60 categories with a total of 960 images; each class has 16 photos. The images were captured with a Sony Alpha a6400 under Godox SK II 400 lighting and saved in JPG format. Each category has four motifs and is augmented by rotation at 90, 180, and 270 degrees. Each class has a philosophical meaning that describes the history of the motif. This dataset enables the training and validation of machine learning models to classify, retrieve, or generate new batik motifs using a generative adversarial network. To our knowledge, this is the first publicly available Batik Nitik dataset with philosophical meanings. The data are provided by a collaboration of PPBI Sekar Jagad Yogyakarta, Universitas Muhammadiyah Malang, and Universitas Gadjah Mada. We hope that "Batik Nitik 960" can support batik research, and we are open to research collaboration.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Imbalance combined with concept drift is a challenge in many real-world data streams and one of the most critical problems in anomaly detection. Learning from nonstationary data streams for anomaly detection has been well studied in recent years, but most of this research assumes that the classes in a data stream are relatively balanced; only a few approaches tackle the joint issue of imbalance and concept drift. To overcome this joint issue, we propose an ensemble learning method with generative adversarial network-based sampling and consistency check (EGSCC) in this paper. First, we design a comprehensive anomaly detection framework that includes an oversampling module based on a generative adversarial network, an ensemble classifier, and a consistency check module. Next, we introduce double encoders into the GAN to better capture the distribution characteristics of imbalanced data for oversampling. Then, we apply stacking ensemble learning to deal with concept drift: four base classifiers (SVM, KNN, DT, and RF) are used in the first layer, and LR is used as the meta classifier in the second layer. Last but not least, we perform a consistency check between each incremental instance and a check set to determine whether it is anomalous, using statistical learning instead of a threshold-based method, and the validation set is dynamically updated according to the consistency check result. Finally, three artificial data sets obtained from the Massive Online Analysis platform and two real data sets are used to verify the performance of the proposed method from four aspects: detection performance, parameter sensitivity, algorithm cost, and anti-noise ability. Experimental results show that the proposed method has significant advantages for anomaly detection in imbalanced data streams with concept drift.
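The stacking layer described in the abstract (SVM, KNN, DT, and RF as base learners, LR as the meta learner) can be illustrated with a short scikit-learn sketch; this is a generic stacking setup under those assumptions, not the EGSCC code itself:

from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression

# First layer: the four base classifiers named in the abstract.
base_learners = [
    ("svm", SVC(probability=True)),
    ("knn", KNeighborsClassifier()),
    ("dt", DecisionTreeClassifier()),
    ("rf", RandomForestClassifier()),
]

# Second layer: logistic regression as the meta classifier.
stack = StackingClassifier(estimators=base_learners,
                           final_estimator=LogisticRegression())
# stack.fit(X_train, y_train); stack.predict(X_new)   # hypothetical data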
The Adversarial Natural Language Inference (ANLI) dataset (round 3, English).
Source: https://huggingface.co/datasets/anli
Num examples: 100,459 (train), 1,200 (validation), 1,200 (test)
Language: English
from datasets import load_dataset
load_dataset("vietgpt/anli_r3_en")
Format for NLI task
def preprocess(sample):
    premise = sample['premise']
    hypothesis = sample['hypothesis']
    label = sample['label']
    if label == 0:
        label = "entailment"
    elif label == 1:
        label = …

See the full description on the dataset page: https://huggingface.co/datasets/vietgpt/anli_r3_en.
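The snippet above is truncated by the catalog preview. A completed sketch of the same mapping, assuming the standard ANLI label convention (0 = entailment, 1 = neutral, 2 = contradiction), could look like this:

def preprocess(sample):
    premise = sample['premise']
    hypothesis = sample['hypothesis']
    label = sample['label']
    # Assumed mapping: 0 -> entailment, 1 -> neutral, 2 -> contradiction.
    if label == 0:
        label = "entailment"
    elif label == 1:
        label = "neutral"
    elif label == 2:
        label = "contradiction"
    return {"premise": premise, "hypothesis": hypothesis, "label": label}

# e.g. dataset = load_dataset("vietgpt/anli_r3_en").map(preprocess)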
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Effect of confidence level on detection accuracy rate (HDFS).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Average micro-F1 scores of DASAN-LMMD on CSI image dataset across all training percentages.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Average macro-F1 scores of DASAN-MMD on CSI image dataset across all training percentages.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Research has demonstrated that microorganisms are indispensable for nutrient transport, growth, and development in the human body, and that disorder and imbalance of the microbiota may lead to the occurrence of diseases. Therefore, it is crucial to study the relationships between microbes and diseases. In this manuscript, we propose a novel prediction model named MADGAN to infer potential microbe-disease associations by combining biological information of microbes and diseases with generative adversarial networks. To our knowledge, it is the first attempt to use a generative adversarial network for this important task. In MADGAN, we first constructed different features for microbes and diseases based on multiple similarity metrics, and we then adopted graph convolutional networks (GCN) to derive features for microbes and diseases automatically. Finally, we trained MADGAN to identify latent microbe-disease associations through games between the generation network and the decision network. In particular, to prevent over-smoothing during model training, we introduced a cross-level weight distribution structure to increase the depth of the network, based on the idea of residual networks. Moreover, to validate the performance of MADGAN, we conducted comprehensive experiments and case studies on the HMDAD and Disbiome databases, and the experimental results demonstrated that MADGAN not only achieved satisfactory prediction performance but also outperformed existing state-of-the-art prediction models.
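The GCN step mentioned above follows the standard graph-convolution propagation rule H' = sigma(D^{-1/2}(A + I)D^{-1/2} H W). A minimal NumPy sketch of one such layer (a generic illustration, not the MADGAN implementation):

import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: ReLU(D^-1/2 (A + I) D^-1/2 H W).

    A: (n, n) adjacency matrix, H: (n, d_in) node features,
    W: (d_in, d_out) weight matrix (a plain array in this sketch).
    """
    A_hat = A + np.eye(A.shape[0])           # add self-loops
    d = A_hat.sum(axis=1)                    # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))   # D^-1/2
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt
    return np.maximum(A_norm @ H @ W, 0.0)   # ReLU activation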
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Accuracy rate under different noise rates on RBF and Waveform datasets.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Accuracy rate with the number of instances processed.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A model checkpoint was used to save the model at each iteration in which the validation accuracy improved.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparison of running time consumption of each algorithm (seconds).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Site differences, or systematic differences in feature distributions across multiple data-acquisition sites, are a known source of heterogeneity that may adversely affect large-scale meta- and mega-analyses of independently collected neuroimaging data. They influence nearly all multi-site imaging modalities and biomarkers, and methods to compensate for them can improve reliability and generalizability in the analysis of genetics, omics, and clinical data. The origins of statistical site effects are complex and involve both technical differences (scanner vendor, head coil, acquisition parameters, image processing) and differences in sample characteristics (inclusion/exclusion criteria, sample size, ancestry) between sites. In an age of expanding international consortium research, there is a growing need to disentangle technical site effects from sample characteristics of interest. Numerous statistical and machine learning methods have been developed to control for, model, or attenuate site effects; yet to date, no comprehensive review has discussed the benefits and drawbacks of each for different use cases. Here, we provide an overview of the different existing statistical and machine learning methods developed to remove unwanted site effects from independently collected neuroimaging samples. We focus on linear mixed effect models, the ComBat technique and its variants, adjustments based on image quality metrics, normative modeling, and deep learning approaches such as generative adversarial networks. For each method, we outline the statistical foundation and summarize strengths and weaknesses, including their assumptions and conditions of use. We provide information on software availability and comment on the ease of use and the applicability of these methods to different types of data. We discuss validation and comparative reports, mention caveats, and provide guidance on when to use each method, depending on context and specific research questions.
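As one concrete example of the first family mentioned above, a linear mixed effects model can absorb site as a random effect while keeping covariates of interest as fixed effects. A minimal statsmodels sketch with hypothetical column names (thickness, age, diagnosis, site) and a placeholder file path:

import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical table with one row per subject.
df = pd.read_csv("neuroimaging_features.csv")   # placeholder path

# Fixed effects: age and diagnosis; random intercept per acquisition site.
model = smf.mixedlm("thickness ~ age + diagnosis", df, groups=df["site"])
result = model.fit()
print(result.summary())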
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Synthetic clinical images could augment real medical image datasets, a novel approach in otolaryngology–head and neck surgery (OHNS). Our objective was to develop a generative adversarial network (GAN) for tympanic membrane images and to validate the quality of synthetic images with human reviewers. Our model was developed using a state-of-the-art GAN architecture, StyleGAN2-ADA. The network was trained on intraoperative high-definition (HD) endoscopic images of tympanic membranes collected from pediatric patients undergoing myringotomy with possible tympanostomy tube placement. A human validation survey was administered to a cohort of OHNS and pediatrics trainees at our institution. The primary measure of model quality was the Frechet Inception Distance (FID), a metric comparing the distribution of generated images with the distribution of real images. The measures used for human reviewer validation were the sensitivity, specificity, and area under the curve (AUC) for humans’ ability to discern synthetic from real images. Our dataset comprised 202 images. The best GAN was trained at 512x512 image resolution with a FID of 47.0. The progression of images through training showed stepwise “learning” of the anatomic features of a tympanic membrane. The validation survey was taken by 65 persons who reviewed 925 images. Human reviewers demonstrated a sensitivity of 66%, specificity of 73%, and AUC of 0.69 for the detection of synthetic images. In summary, we successfully developed a GAN to produce synthetic tympanic membrane images and validated this with human reviewers. These images could be used to bolster real datasets with various pathologies and develop more robust deep learning models such as those used for diagnostic predictions from otoscopic images. However, caution should be exercised with the use of synthetic data given issues regarding data diversity and performance validation. Any model trained using synthetic data will require robust external validation to ensure validity and generalizability.
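For reference, the FID reported above compares Gaussian fits to real and generated image embeddings: FID = ||mu_r - mu_g||^2 + Tr(Sigma_r + Sigma_g - 2(Sigma_r Sigma_g)^{1/2}). A small NumPy/SciPy sketch of that formula applied to hypothetical Inception feature arrays, not the study's evaluation pipeline:

import numpy as np
from scipy.linalg import sqrtm

def frechet_inception_distance(feats_real, feats_gen):
    """FID between two feature sets of shape (n_samples, n_features)."""
    mu_r, mu_g = feats_real.mean(axis=0), feats_gen.mean(axis=0)
    sigma_r = np.cov(feats_real, rowvar=False)
    sigma_g = np.cov(feats_gen, rowvar=False)
    covmean = sqrtm(sigma_r @ sigma_g)
    if np.iscomplexobj(covmean):             # drop tiny imaginary parts
        covmean = covmean.real
    diff = mu_r - mu_g
    return diff @ diff + np.trace(sigma_r + sigma_g - 2.0 * covmean)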
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Identifying small molecules that bind strongly to target proteins in rational molecular design is crucial. Machine learning techniques, such as generative adversarial networks (GAN), are now essential tools for generating such molecules. In this study, we present an enhanced method for molecule generation using objective-reinforced GANs. Specifically, we introduce BEGAN (Boltzmann-enhanced GAN), a novel approach that adjusts molecule occurrence frequencies during training based on the Boltzmann distribution exp(−ΔU/τ), where ΔU represents the estimated binding free energy derived from docking algorithms and τ is a temperature-related scaling hyperparameter. This Boltzmann reweighting process shifts the generation process toward molecules with higher binding affinities, allowing the GAN to explore molecular spaces with superior binding properties. The reweighting process can also be refined through multiple iterations without altering the overall distribution shape. To validate our approach, we apply it to the design of sex pheromone analogs targeting Spodoptera frugiperda pheromone receptor SfruOR16, illustrating that the Boltzmann reweighting significantly increases the likelihood of generating promising sex pheromone analogs with improved binding affinities to SfruOR16, further supported by atomistic molecular dynamics simulations. Furthermore, we conduct a comprehensive investigation into parameter dependencies and propose a reasonable range for the hyperparameter τ. Our method offers a promising approach for optimizing molecular generation for enhanced protein binding, potentially increasing the efficiency of molecular discovery pipelines.
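The reweighting described above amounts to assigning each candidate molecule a sampling weight proportional to exp(−ΔU/τ) and renormalizing. A minimal NumPy sketch under that reading (a generic illustration, not the BEGAN training code):

import numpy as np

def boltzmann_weights(delta_u, tau):
    """Sampling weights proportional to exp(-dU / tau).

    delta_u: estimated binding free energies from docking (array-like).
    tau: temperature-related scaling hyperparameter (> 0).
    """
    logits = -np.asarray(delta_u, dtype=float) / tau
    logits -= logits.max()                   # stabilize the exponentials
    w = np.exp(logits)
    return w / w.sum()

# Lower (more negative) binding free energy -> higher occurrence frequency.
print(boltzmann_weights([-9.1, -7.4, -5.0], tau=1.0))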
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Purpose: To propose a novel deep-learning-based auto-segmentation model for CTV delineation in cervical cancer and to evaluate whether it can perform comparably to manual delineation using a three-stage multicenter evaluation framework.
Methods: An adversarial deep-learning-based auto-segmentation model was trained and configured for cervical cancer CTV contouring using CT data from 237 patients. CT scans of an additional 20 consecutive patients with locally advanced cervical cancer were then collected for a three-stage multicenter randomized controlled evaluation involving nine oncologists from six medical centers. This evaluation system combines objective performance metrics, radiation oncologist assessment, and finally a head-to-head Turing imitation test. Accuracy and effectiveness were evaluated step by step, and the intra-observer consistency of each oncologist was also tested.
Results: In the stage-1 evaluation, the mean DSC and 95HD of the proposed model were 0.88 and 3.46 mm, respectively. In stage 2, the oncologist grading evaluation showed that the majority of AI contours were comparable to the GT contours. The average CTV scores for AI and GT were 2.68 vs. 2.71 in week 0 (P = .206) and 2.62 vs. 2.63 in week 2 (P = .552), with no statistically significant differences. In stage 3, the Turing imitation test showed that the percentage of AI contours judged to be better than GT contours by ≥5 oncologists was 60.0% in week 0 and 42.5% in week 2. Most oncologists demonstrated good consistency between the 2 weeks (P > 0.05).
Conclusions: The tested AI model was demonstrated to be accurate and comparable to manual CTV segmentation in cervical cancer patients when assessed by our three-stage evaluation framework.
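For reference, the DSC reported in stage 1 is the Dice similarity coefficient, DSC = 2|A ∩ B| / (|A| + |B|). A minimal NumPy sketch for binary segmentation masks (illustrative only, not the study's evaluation code):

import numpy as np

def dice(mask_a, mask_b):
    """Dice similarity coefficient between two binary masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    intersection = np.logical_and(a, b).sum()
    return 2.0 * intersection / (a.sum() + b.sum())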