In our changing world, it is critical to understand and predict plant community responses to global change drivers. Plant functional traits promise to be a key predictive tool for many ecosystems, including grasslands; however, their use requires both complete plant community data and complete functional trait data. Yet representation of these data in global databases is sparse, particularly beyond a handful of the most commonly used traits and the most common species. Here we present the CoRRE Trait Database, spanning 17 traits (9 categorical, 8 continuous) anticipated to predict species’ responses to global change for 4,079 vascular plant species across 173 plant families present in 390 grassland experiments from around the world. The database contains complete categorical trait records for all 4,079 plant species, obtained from a comprehensive literature search. Additionally, the database contains nearly complete coverage (99.97%) of species mean values for continuous traits for a subset of 2,927 plant species, predicted from observed trait data drawn from TRY and a variety of other plant trait databases using Bayesian Hierarchical Probabilistic Matrix Factorization (BHPMF) and multivariate imputation by chained equations (MICE). These data will shed light on mechanisms underlying population, community, and ecosystem responses to global change in grasslands worldwide.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
This data repository provides the Food and Agriculture Biomass Input Output (FABIO) database, a global set of multi-regional physical supply-use and input-output tables covering global agriculture and forestry.
The work is based mostly on freely available data from FAOSTAT, IEA, EIA, and UN Comtrade/BACI. FABIO currently covers 191 countries + RoW, 118 processes and 125 commodities (raw and processed agricultural and food products) for 1986-2013. All R code and auxiliary data are available on GitHub. For more information please refer to https://fabio.fineprint.global.
The database consists of the following main components, in compressed .rds format:
A description of the included countries and commodities (i.e. the rows and columns of the Z matrix) can be found in the auxiliary file io_codes.csv. Separate lists of the country sample (including ISO3 codes and continental grouping) and commodities (including moisture content) are given in the files regions.csv and items.csv, respectively. For information on the individual processes, see auxiliary file su_codes.csv. RDS files can be opened in R. Information on how to read these files can be obtained here: https://www.rdocumentation.org/packages/base/versions/3.6.2/topics/readRDS
Except for X.rds, which contains a matrix, all variables are organized as lists, where each element contains a sparse matrix. Please note that values are always given in physical units, i.e. tonnes or head, as specified in items.csv. The suffixes value and mass only indicate the form of allocation chosen for the construction of the symmetric IO tables (for more details see Bruckner et al. 2019). Product, process and country classifications can be found in the file fabio_classifications.xlsx.
Footprint results are not contained in the database but can be calculated, e.g. by using this script: https://github.com/martinbruckner/fabio_comparison/blob/master/R/fabio_footprints.R
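For orientation, the following is a minimal sketch of a standard demand-driven (Leontief) footprint calculation. It is not the linked script and may differ from it in detail. It assumes a small dense inter-industry matrix Z, an output vector x, a final demand vector y, and a vector E of direct environmental pressures per sector. All names and numbers are illustrative and are not taken from FABIO, and the real tables are large sparse matrices that would call for a sparse solver.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * v = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                for k in range(col, n + 1):
                    aug[row][k] -= factor * aug[col][k]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def footprint_multipliers(Z, x, e):
    """Return m = e (I - A)^-1, where A[i][j] = Z[i][j] / x[j]."""
    n = len(x)
    # Technical coefficients; sectors with zero output get zero coefficients.
    A = [[Z[i][j] / x[j] if x[j] else 0.0 for j in range(n)] for i in range(n)]
    # m (I - A) = e is equivalent to (I - A)^T m = e.
    lhs = [[(1.0 if i == j else 0.0) - A[j][i] for j in range(n)] for i in range(n)]
    return solve_linear(lhs, e)


# Illustrative two-sector example (hypothetical numbers, not FABIO data).
Z = [[10.0, 20.0], [30.0, 10.0]]   # inter-industry flows
x = [100.0, 200.0]                 # total output per sector
y = [x[i] - sum(Z[i]) for i in range(2)]  # final demand closes the balance
E = [200.0, 100.0]                 # direct pressure per sector (e.g. land use)

e = [E[j] / x[j] for j in range(2)]            # pressure per unit of output
multipliers = footprint_multipliers(Z, x, e)   # pressure per unit of final demand
footprints = [multipliers[j] * y[j] for j in range(2)]
print(footprints, sum(footprints))
```

In a balanced system the footprints summed over all final demand equal the total direct pressure, which gives a quick consistency check on any implementation.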
How to cite:
To cite FABIO work please refer to this paper:
Bruckner, M., Wood, R., Moran, D., Kuschnig, N., Wieland, H., Maus, V., Börner, J. 2019. FABIO – The Construction of the Food and Agriculture Input–Output Model. Environmental Science & Technology 53(19), 11302–11312. DOI: 10.1021/acs.est.9b03554
License:
This data repository is distributed under the CC BY-NC-SA 4.0 License. You are free to share and adapt the material for non-commercial purposes using proper citation. If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original. In case you are interested in a collaboration, I am happy to receive enquiries at martin.bruckner@wu.ac.at.
Known issues:
The underlying FAO data have been manipulated to the minimum extent necessary. Data filling and supply-use balancing, however, required some adaptations. These are documented in the code and are also reflected in the balancing item in the final demand matrices. For proper use of the database, I recommend distributing the balancing item proportionally over all other uses and running analyses both with and without balancing to illustrate uncertainties.
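The proportional distribution recommended above can be sketched as follows for the uses of a single commodity. This is a minimal illustration, not code from the FABIO repository. The use names (`food`, `other`, `stock_addition`, `balancing`) and the numbers are hypothetical, and the sketch assumes the other uses are non-negative.

```python
def distribute_balancing(uses, balancing_key="balancing"):
    """Spread the balancing item over all other uses in proportion to their size.

    `uses` maps a use category to a quantity for one commodity. The total over
    all categories is preserved. If the other uses sum to zero there is nothing
    to scale against, so the input is returned unchanged.
    """
    balancing = uses.get(balancing_key, 0.0)
    others = {k: v for k, v in uses.items() if k != balancing_key}
    total_others = sum(others.values())
    if total_others == 0:
        return dict(uses)
    adjusted = {k: v + balancing * v / total_others for k, v in others.items()}
    adjusted[balancing_key] = 0.0
    return adjusted


# Hypothetical use vector for one commodity, in tonnes.
uses = {"food": 60.0, "other": 30.0, "stock_addition": 10.0, "balancing": 20.0}
adjusted = distribute_balancing(uses)
print(adjusted)
```

Running an analysis once on `uses` and once on `adjusted` is one way to compare results with and without the balancing item.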
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
As microRNAs (miRNAs) have been reported to be novel, high-value targets for small molecule (SM) drugs in disease treatment, many researchers are exploring new SM–miRNA associations. However, the high cost of traditional biological experiments limits how efficiently new associations between SMs and miRNAs can be discovered. Reliable computational models are therefore an important auxiliary tool for revealing SM–miRNA associations. In this article, we developed a computational model of sparse learning and heterogeneous graph inference for small molecule–miRNA association prediction (SLHGISMMA). First, the sparse learning method (SLM) was used to decompose the SM–miRNA adjacency matrix. Then, we integrated the reacquired association information, together with the similarity information of SMs and miRNAs, into a heterogeneous graph to infer potential SM–miRNA associations. The main innovation of SLHGISMMA lies in the introduction of SLM to reduce the noise in the original adjacency matrix, which plays an important role in improving performance. To assess SLHGISMMA's performance, four different kinds of cross-validation were performed on two datasets. On dataset 1 (dataset 2), SLHGISMMA achieved areas under the curve (AUCs) of 0.9273 (0.7774), 0.9365 (0.7973), 0.7703 (0.6556), and 0.9241 ± 0.0052 (0.7724 ± 0.0032) in global leave-one-out cross-validation (LOOCV), miRNA-fixed local LOOCV, SM-fixed local LOOCV, and 5-fold cross-validation, respectively. Moreover, in a case study on three important SMs in which their known associations were removed, most of the top 50 predicted miRNAs were confirmed by the database SM2miR v1.0 or by the experimental literature.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Recently, a growing body of biological research and experimental evidence has demonstrated that microRNAs (miRNAs) affect the development of complex human diseases. Discovering miRNA-disease associations plays an increasingly vital role in devising diagnostic and therapeutic tools for diseases. However, since uncovering associations via experimental methods is expensive and time-consuming, novel and effective computational methods for association prediction are in demand. In this study, we developed a computational model of Matrix Decomposition and Heterogeneous Graph Inference for miRNA-disease association prediction (MDHGI) to discover new miRNA-disease associations. The model integrates into a heterogeneous network the predicted association probability obtained from matrix decomposition through a sparse learning method, the miRNA functional similarity, the disease semantic similarity, and the Gaussian interaction profile kernel similarity for diseases and miRNAs. Compared with previous computational models based on heterogeneous networks, our model took full advantage of matrix decomposition before the construction of the heterogeneous network, thereby improving the prediction accuracy. MDHGI obtained AUCs of 0.8945 and 0.8240 in global and local leave-one-out cross-validation, respectively. Moreover, the AUC of 0.8794 ± 0.0021 in 5-fold cross-validation confirmed the stability of its predictive performance. To further evaluate the model's accuracy, we applied MDHGI to four important human cancers in three different kinds of case studies. In the first type, 98% (Esophageal Neoplasms) and 98% (Lymphoma) of the top 50 predicted miRNAs were confirmed by at least one of the two databases (dbDEMC and miR2Disease) or by at least one experimental report in PubMed. In the second type of case study, we removed all known associations between the miRNAs and Lung Neoplasms before applying MDHGI to Lung Neoplasms. As a result, 100% (Lung Neoplasms) of the top 50 related miRNAs were indexed by at least one of the three databases (dbDEMC, miR2Disease and HMDD V2.0) or by at least one experimental report in PubMed. Furthermore, we also tested our prediction method on the HMDD V1.0 database to demonstrate the applicability of MDHGI to different datasets. The results showed that 50 of the top 50 miRNAs related to breast neoplasms were validated by at least one of the three databases (HMDD V2.0, dbDEMC, and miR2Disease) or by at least one experimental report.
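One ingredient named in the abstract, the Gaussian interaction profile kernel similarity, has a widely used standard definition: the similarity of two entities is exp(-γ · ||IP_i - IP_j||²), where IP_i is the entity's row (or column) of the binary association matrix and γ is a bandwidth parameter divided by the mean squared norm of all profiles. The sketch below follows that general definition. It is not taken from the MDHGI code, and the function name, the `gamma_prime` default of 1.0, and the toy matrix are assumptions for illustration.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    Each row is one entity's binary interaction profile, for example one
    miRNA's associations with every disease.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm  # normalize bandwidth by mean profile size
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * sq_dist)
    return kernel


# Toy association matrix: 3 miRNAs (rows) by 3 diseases (columns), hypothetical.
associations = [
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
]
similarity = gip_kernel(associations)
print(similarity)
```

Passing the transposed matrix gives the corresponding kernel for the other entity type (here, diseases).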
Library of Wroclaw University of Science and Technology scientific output (DONA database)