Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Deep learning is an effective method for predicting drug-target binding affinity, but low accuracy remains an obstacle. We therefore propose a novel predictor for drug-target binding affinity based on dipeptide frequency of word frequency encoding and a hybrid graph convolutional network. Word frequency characteristics of natural language are used to improve the frequency characteristics of the peptides that represent target proteins. For each drug molecule, five different features of the drug atoms and the atomic bond relationships are expressed as a graph. The resulting protein features and graph structures are used as the input to a convolutional neural network and a graph convolutional neural network, respectively. A prediction model is established to predict drug affinity by calculating the hidden relationship. In the KIBA data set test experiment, the consistency coefficient of the model is 0.901, which is 0.01 higher than the existing model, and the MSE (mean squared error) of the model is 0.126, which is 5% lower than the existing model. In the Davis data set test experiment, the consistency coefficient of the model is 0.895, which is 0.006 higher than the existing model, and the MSE of the model is 0.220, which is 4% lower than the existing model. These results show that our proposed method not only predicts affinity better than existing models but also outperforms unitary deep learning approaches.
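As a rough illustration of the protein side of this description, the sketch below computes a plain dipeptide frequency vector for an amino acid sequence. It is a minimal example under stated assumptions: the description does not specify how the word frequency weighting is applied, so only the unweighted dipeptide frequencies are shown, and the function name `dipeptide_frequencies` and the 20-letter alphabet constant are illustrative choices rather than the authors' code.

```python
from collections import Counter
from itertools import product

# Assumption: the 20 standard amino acids, giving 20 * 20 = 400 dipeptides.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
DIPEPTIDES = ["".join(pair) for pair in product(AMINO_ACIDS, repeat=2)]
VALID_DIPEPTIDES = set(DIPEPTIDES)


def dipeptide_frequencies(sequence):
    """Return a 400-dimensional vector of relative dipeptide frequencies.

    Overlapping pairs of adjacent residues are counted; pairs containing
    a non-standard letter are skipped. This is the unweighted frequency
    only, not the word-frequency-based encoding of the paper.
    """
    seq = sequence.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    counts = Counter(p for p in pairs if p in VALID_DIPEPTIDES)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(DIPEPTIDES)
    return [counts[d] / total for d in DIPEPTIDES]


if __name__ == "__main__":
    vector = dipeptide_frequencies("ACAC")  # pairs: AC, CA, AC
    print(len(vector), vector[DIPEPTIDES.index("AC")])
```

A vector like this could serve as the fixed-length protein input to a convolutional network, while the drug molecule is handled separately as a graph.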
Word Cloud
Frequency of Words
This graph suggests that the content is mostly about Eindhoven, more specifically its housing situation, social environments, industry and tech, among other topics.
Word Embeddings Plot
This graph shows us how related words are to each other. The closer one word is to another, the more they are related.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this paper we analyse the word frequency profiles of a set of works from the Shakespearean era to uncover patterns of relationship between them, highlighting the connections within authorial canons. We used a text corpus comprising 256 plays and poems from the 16th and 17th centuries, with 17 works of uncertain authorship. Our clustering approach is based on the Jensen-Shannon divergence and a graph partitioning algorithm, and our results show that authors' characteristic styles are very powerful factors in explaining the variation of word use, frequently transcending cross-cutting factors like the differences between tragedy and comedy, early and late works, and plays and poems. Our method also provides an empirical guide to the authorship of plays and poems where this is unknown or disputed.
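To make the distance measure concrete, here is a minimal sketch of the Jensen-Shannon divergence between two word frequency profiles. The whitespace tokenisation, the base-2 logarithm, and the helper names `word_profile` and `jensen_shannon_divergence` are assumptions made for illustration; the paper's own preprocessing and its graph partitioning step are not reproduced here.

```python
import math
from collections import Counter


def word_profile(text):
    """Relative word frequencies of a text (naive whitespace tokenisation)."""
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two frequency profiles.

    The result is 0 for identical profiles and 1 for profiles that
    share no words.
    """
    divergence = 0.0
    for word in set(p) | set(q):
        pw = p.get(word, 0.0)
        qw = q.get(word, 0.0)
        mw = 0.5 * (pw + qw)  # mixture distribution
        if pw > 0:
            divergence += 0.5 * pw * math.log2(pw / mw)
        if qw > 0:
            divergence += 0.5 * qw * math.log2(qw / mw)
    return divergence


if __name__ == "__main__":
    a = word_profile("to be or not to be")
    b = word_profile("to sleep perchance to dream")
    print(jensen_shannon_divergence(a, b))
```

Pairwise divergences computed this way could be assembled into a distance matrix over all works, which is the kind of input a clustering or graph partitioning method would then consume.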
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Scoring rule for the tokens in graph node labels.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Significance of authorial affinity observed on clusters obtained with different distance metrics.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Information Theory based kNN classification of the works of uncertain authorship.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
High contributor singleton vertices in the text graph and their frequencies.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Word frequency statistics and graphs from the paper "Gender Differences in Covid-19 Tweeting in English", based on tweets in English from March 10-23, 2020 matching the queries: coronavirus; “corona virus”; COVID-19; COVID19.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The first column lists the minimum support, or, in other words, the frequency cut-off value; there are two values, 0.8 and 0.9, i.e., 80% and 90%. The second column denotes the right, left or both hippocampi; the abbreviation HPC stands for the word “hippocampus”. In the third column the sex is given; the next four columns contain the number of frequent neighbor-sets of size 1, 2, 3 and 4 of the hippocampus considered. The next column gives the number of neighbor-sets that have significantly different frequencies (p = 0.001) in male and female connectomes. The last, ninth column gives the number of neighbor-sets that are significantly more frequent in male or in female connectomes: the sum of the two numbers in the ninth column is equal to the number in the eighth column. For example, in the first row, we can see that in males, the left hippocampus has 45 frequent 1-element neighbor sets, 844 frequent 2-element neighbor sets, 9102 frequent 3-element neighbor sets and 65150 frequent 4-element neighbor sets, where the frequency cut-off is 0.8. Moreover, one can see that there are 15732 sets that differ significantly in frequency between males and females; the last column says that of these 15732 sets, 15497 are present in the braingraphs of males and only 235 in the braingraphs of females.
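The notion of a frequent neighbor-set under a minimum support can be sketched as follows. This is a brute-force illustration on made-up data, assuming each subject is represented by the set of regions adjacent to the hippocampus in that subject's braingraph; the function name `frequent_neighbor_sets` and the region labels are hypothetical, and the significance test between sexes is not covered.

```python
from itertools import combinations


def frequent_neighbor_sets(subject_neighbors, size, min_support):
    """Find neighbor-sets of a given size that meet the minimum support.

    subject_neighbors: one set of neighbor labels per subject.
    Returns a dict mapping each frequent set (as a sorted tuple) to the
    fraction of subjects whose neighbors contain all of its elements.
    """
    n_subjects = len(subject_neighbors)
    if n_subjects == 0:
        return {}
    counts = {}
    for neighbors in subject_neighbors:
        # Every size-k subset of this subject's neighbors is supported once.
        for subset in combinations(sorted(neighbors), size):
            counts[subset] = counts.get(subset, 0) + 1
    return {
        subset: count / n_subjects
        for subset, count in counts.items()
        if count / n_subjects >= min_support
    }


if __name__ == "__main__":
    # Hypothetical neighbor sets for five subjects.
    subjects = [
        {"A", "B", "C"},
        {"A", "B"},
        {"A", "B", "C"},
        {"A", "C"},
        {"A", "B"},
    ]
    print(frequent_neighbor_sets(subjects, size=2, min_support=0.8))
```

Enumerating all subsets grows quickly with set size, which is consistent with the counts in the table rising steeply from 1-element to 4-element neighbor-sets.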
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This describes the ontology embeddings of HPO, ORDO, and HOOM for training Onto-CGAN (Paper: Generating Unseen Diseases Patient Data Using Ontology-enhanced Generative Adversarial Networks). We first combined the ORDO, HPO, and HOOM ontologies to create a unified resource capturing relationships between diseases, phenotypes, and other biomedical concepts. The combined ontology is processed by OWL2Vec* and transformed into graph-based representations, where nodes represent concepts (e.g., diseases, phenotypes) and edges represent their relationships. We used a random walk method with a depth of 3 to explore the graph structures and capture semantic relationships. Textual annotations (e.g., definitions) are tokenized and processed through a word2vec model. The word2vec model was trained for 10 iterations with a window size of 5 words and a minimum word frequency of 1. The resulting embeddings are 100-dimensional vectors that integrate hierarchical relationships, logical axioms, and textual information from the combined ontology.
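The random walk step can be illustrated with a small sketch. It is not OWL2Vec* itself: the toy graph, the concept labels, the `walks_per_node` parameter, and the reading of "depth" as the number of steps taken from the start node are all assumptions made for illustration.

```python
import random


def random_walks(graph, depth, walks_per_node=2, seed=0):
    """Generate random walks over an adjacency-list graph.

    graph: dict mapping each node to a list of neighboring nodes.
    depth: maximum number of steps per walk (assumed meaning of "depth").
    Each walk is a list of node labels and can be treated as a
    "sentence" for a word-embedding model.
    """
    rng = random.Random(seed)  # fixed seed for reproducible walks
    walks = []
    for start in sorted(graph):
        for _ in range(walks_per_node):
            walk = [start]
            for _ in range(depth):
                neighbors = graph.get(walk[-1], [])
                if not neighbors:
                    break  # dead end: stop this walk early
                walk.append(rng.choice(neighbors))
            walks.append(walk)
    return walks


if __name__ == "__main__":
    # Hypothetical toy ontology graph (labels are illustrative only).
    toy_graph = {
        "disease_1": ["phenotype_a", "phenotype_b"],
        "phenotype_a": ["disease_1", "parent_phenotype"],
        "phenotype_b": ["disease_1"],
        "parent_phenotype": ["phenotype_a"],
    }
    for walk in random_walks(toy_graph, depth=3):
        print(" ".join(walk))
```

Walks produced this way, together with the tokenized textual annotations, would form the corpus on which a word2vec model with the reported settings (window of 5, minimum frequency of 1, 10 iterations, 100 dimensions) is trained.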