39 datasets found
  1. Assessment of data transformations for model-based clustering of RNA-Seq...

    • plos.figshare.com
    xlsx
    Updated May 31, 2023
    Cite
    Janelle R. Noel-MacDonnell; Joseph Usset; Ellen L. Goode; Brooke L. Fridley (2023). Assessment of data transformations for model-based clustering of RNA-Seq data [Dataset]. http://doi.org/10.1371/journal.pone.0191758
    Available download formats: xlsx
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Janelle R. Noel-MacDonnell; Joseph Usset; Ellen L. Goode; Brooke L. Fridley
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Quality control, global biases, normalization, and analysis methods for RNA-Seq data are quite different from those for microarray-based studies. The assumption of normality is reasonable for microarray-based gene expression data; however, RNA-Seq data tend to follow an over-dispersed Poisson or negative binomial distribution. Little research has been done to assess how data transformations impact Gaussian model-based clustering with respect to clustering performance and accuracy in estimating the correct number of clusters in RNA-Seq data. In this article, we investigate Gaussian model-based clustering performance and accuracy in estimating the correct number of clusters by applying four data transformations (i.e., naïve, logarithmic, Blom, and variance stabilizing transformation) to simulated RNA-Seq data. To do so, an extensive simulation study was carried out in which the scenarios varied in terms of how genes were selected to be included in the clustering analyses, the size of the clusters, and the number of clusters. Following the application of the different transformations to the simulated data, Gaussian model-based clustering was carried out. To assess clustering performance for each of the data transformations, the adjusted Rand index, clustering error rate, and concordance index were used. As expected, our results showed that clustering performance improved in scenarios where data transformations were applied to make the data appear “more” Gaussian in distribution.
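As a rough illustration of two of the transformations named in this description (logarithmic and Blom), here is a minimal Python sketch. It is not the authors' code: the function names, the pseudocount of 1, and the toy counts are assumptions made for illustration only. The Blom transform is taken here to be the usual rank-based inverse normal score with the constant 3/8.

```python
import math
from statistics import NormalDist


def log_transform(counts, pseudocount=1.0):
    """Natural log of each count after adding a pseudocount (assumed to be 1)."""
    return [math.log(c + pseudocount) for c in counts]


def average_ranks(values):
    """1-based ranks; tied values share the mean of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def blom_transform(values):
    """Rank-based inverse normal scores: Phi^{-1}((r - 3/8) / (n + 1/4))."""
    n = len(values)
    std_normal = NormalDist()  # standard normal, mean 0 and sd 1
    return [std_normal.inv_cdf((r - 0.375) / (n + 0.25)) for r in average_ranks(values)]


counts = [0, 3, 3, 12, 150, 2048]  # made-up, over-dispersed toy counts
logged = log_transform(counts)
blom = blom_transform(counts)
```

Both functions preserve the ordering of the counts; the Blom scores depend only on ranks, so the extreme value 2048 has no more influence than any other largest observation would.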

  2. Transformations in PubChem - Full Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Mar 14, 2025
    Cite
    Schymanski, Emma; Bolton, Evan; Cheng, Tiejun; Thiessen, Paul; Zhang, Jian (Jeff); Helmus, Rick; Blanke, Gerd (2025). Transformations in PubChem - Full Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5644560
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/)
    StructurePendium Technologies GmbH
    LCSB, Uni Luxembourg
    University of Amsterdam
    Authors
    Schymanski, Emma; Bolton, Evan; Cheng, Tiejun; Thiessen, Paul; Zhang, Jian (Jeff); Helmus, Rick; Blanke, Gerd
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is an archive of the data contained in the "Transformations" section in PubChem for integration into patRoon and other workflows.

    For further details see the ECI GitLab site: README and main "tps" folder.

    Credits:

    Concepts: E Schymanski, E Bolton, J Zhang, T Cheng;

    Code (in R): E Schymanski, R Helmus, P Thiessen

    Transformations: E Schymanski, J Zhang, T Cheng and many contributors to various lists!

    PubChem infrastructure: PubChem team

    Reaction InChI (RInChI) calculations (v1.0): Gerd Blanke (previous versions of these files)

    Acknowledgements: ECI team who contributed to related efforts, especially: J. Krier, A. Lai, M. Narayanan, T. Kondic, P. Chirsir, E. Palm. All contributors to the NORMAN-SLE transformations!

    March 2025 released as v0.2.0 since the dataset grew by >3000 entries! The stats are:

    14 March 2025

    Unique Transformation Entries: 10904
    Unique Reactions by CID: 9152
    Unique Reactions by IK: 9139
    Unique Reactions by IKFB: 8574
    Unique NORMAN-SLE Compounds by CID: 8207
    Unique ChEMBL Compounds by CID: 1419
    Unique Compounds (all) by CID: 9267
    Unique Predecessors (all) by CID: 3724
    Unique Successors (all) by CID: 7331
    Range of XlogP Differences: -9.9,10
    Range of Mass Differences: -957.97490813,820.227106427

  3. Data from: Evaluating Functional Diversity: Missing Trait Data and the...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Feb 17, 2016
    Cite
    Bryndová, Michala; de Bello, Francesco; Lepš, Jan; Sam, Katerina; Weiss, Matthias; Paal, Taavi; Májeková, Maria; Bishop, Tom R.; Kasari, Liis; Luke, Sarah H.; Götzenberger, Lars; Norberg, Anna; Plowman, Nichola S.; Le Bagousse-Pinguet, Yoann (2016). Evaluating Functional Diversity: Missing Trait Data and the Importance of Species Abundance Structure and Data Transformation [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001507382
    Dataset updated
    Feb 17, 2016
    Authors
    Bryndová, Michala; de Bello, Francesco; Lepš, Jan; Sam, Katerina; Weiss, Matthias; Paal, Taavi; Májeková, Maria; Bishop, Tom R.; Kasari, Liis; Luke, Sarah H.; Götzenberger, Lars; Norberg, Anna; Plowman, Nichola S.; Le Bagousse-Pinguet, Yoann
    Description

    Functional diversity (FD) is an important component of biodiversity that quantifies the difference in functional traits between organisms. However, FD studies are often limited by the availability of trait data, and FD indices are sensitive to data gaps. The distribution of species abundance and trait data, and its transformation, may further affect the accuracy of indices when data are incomplete. Using an existing approach, we simulated the effects of missing trait data by gradually removing data from a plant, an ant and a bird community dataset (12, 59, and 8 plots containing 62, 297 and 238 species respectively). We ranked plots by FD values calculated from full datasets and then from our increasingly incomplete datasets, and compared the ranking between the original and virtually reduced datasets to assess the accuracy of FD indices when used on datasets with increasingly missing data. Finally, we tested the accuracy of FD indices with and without data transformation, and the effect of missing trait data per plot or across the whole pool of species. FD indices became less accurate as the amount of missing data increased, with the loss of accuracy depending on the index. However, where transformation improved the normality of the trait data, FD values from incomplete datasets were more accurate than before transformation. The distribution of data and its transformation are therefore as important as data completeness and can even mitigate the effect of missing data. Since the effect of missing trait values pool-wise or plot-wise depends on the data distribution, the method should be decided case by case. Data distribution and data transformation should be given more careful consideration when designing, analysing and interpreting FD studies, especially where trait data are missing. To this end, we provide the R package “traitor” to facilitate assessments of missing trait data.

  4. R-scripts for uncertainty analysis v01

    • gimi9.com
    • researchdata.edu.au
    • +1 more
    Updated Apr 13, 2022
    Cite
    (2022). R-scripts for uncertainty analysis v01 [Dataset]. https://gimi9.com/dataset/au_322c38ef-272f-4e77-964c-a14259abe9cf/
    Dataset updated
    Apr 13, 2022
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    Abstract: This dataset was created within the Bioregional Assessment Programme. Data has not been derived from any source datasets. Metadata has been compiled by the Bioregional Assessment Programme. This dataset contains a set of generic R scripts that are used in the propagation of uncertainty through numerical models.

    Dataset History: The dataset contains a set of R scripts that are loaded as a library. The R scripts are used to carry out the propagation of uncertainty through numerical models. The scripts contain the functions to create the statistical emulators and do the necessary data transformations and backtransformations. The scripts are self-documenting and created by Dan Pagendam (CSIRO) and Warren Jin (CSIRO).

    Dataset Citation: Bioregional Assessment Programme (2016) R-scripts for uncertainty analysis v01. Bioregional Assessment Source Dataset. Viewed 13 March 2019, http://data.bioregionalassessments.gov.au/dataset/322c38ef-272f-4e77-964c-a14259abe9cf.

  5. NOAA R/V Ron Brown Fourier Transform Infrared Spectroscopy (FTIR) Data

    • data.ucar.edu
    ascii
    Updated Oct 7, 2025
    Cite
    Lynn Russell (2025). NOAA R/V Ron Brown Fourier Transform Infrared Spectroscopy (FTIR) Data [Dataset]. http://doi.org/10.26023/87N8-35T6-RE0C
    Available download formats: ascii
    Dataset updated
    Oct 7, 2025
    Authors
    Lynn Russell
    Time period covered
    Oct 21, 2008 - Nov 29, 2008
    Description

    This file contains the Fourier Transform Infrared (FTIR) Spectroscopy data from the NOAA R/V Ronald H. Brown during VOCALS-REx 2008.

  6. Supplement 1. R code for performing nonlinear regression, with data...

    • wiley.figshare.com
    html
    Updated May 30, 2023
    Cite
    E. Carol Adair; Sarah E. Hobbie; Russell K. Hobbie (2023). Supplement 1. R code for performing nonlinear regression, with data (embedded in the R code), and a short description of the program. [Dataset]. http://doi.org/10.6084/m9.figshare.3544664.v1
    Available download formats: html
    Dataset updated
    May 30, 2023
    Dataset provided by
    Wiley (https://www.wiley.com/)
    Authors
    E. Carol Adair; Sarah E. Hobbie; Russell K. Hobbie
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    File List: nonlinear_regression.R

    Description: The "nonlinear_regression.R" program provides a short example (with data) of one way to perform nonlinear regression in R (version 2.8.1). This example is not meant to provide extensive information on or training in programming in R, but rather is meant to serve as a starting point for performing nonlinear regression in R. R is a free statistical computing and graphics program that may be run on UNIX platforms, Windows and MacOS. R may be downloaded here: http://www.r-project.org/.

    There are several good resources for learning how to program and perform extensive statistical analyses in R, including:

    Benjamin M. Bolker. Ecological Models and Data in R. Princeton University Press, 2008. ISBN 978-0-691-12522-0. [ http://www.zoology.ufl.edu/bolker/emdbook/ ]

    Other references are provided at http://www.r-project.org/ under “Documentation” and “Books”.

  7. C2Metadata test files

    • openicpsr.org
    spss, zip
    Updated Aug 16, 2020
    Cite
    George Alter (2020). C2Metadata test files [Dataset]. http://doi.org/10.3886/E120642V1
    Available download formats: spss, zip
    Dataset updated
    Aug 16, 2020
    Dataset provided by
    ICPSR
    Authors
    George Alter
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The C2Metadata (“Continuous Capture of Metadata”) Project automates one of the most burdensome aspects of documenting the provenance of research data: describing data transformations performed by statistical software. Researchers in many fields use statistical software (SPSS, Stata, SAS, R, Python) for data transformation and data management as well as analysis. Scripts used with statistical software are translated into an independent Structured Data Transformation Language (SDTL), which serves as an intermediate language for describing data transformations. SDTL can be used to add variable-level provenance to data catalogs and codebooks and to create “variable lineages” for auditing software operations. This repository provides examples of scripts and metadata for use in testing C2Metadata tools.

  8. Data from: Reproduction Package for the Dissertation on Building...

    • radar.kit.edu
    • radar-service.eu
    tar
    Updated Jun 21, 2023
    + more versions
    Cite
    Heiko Klare (2023). Reproduction Package for the Dissertation on Building Transformation Networks for Consistent Evolution of Interrelated Models [Dataset]. http://doi.org/10.35097/1281
    Available download formats: tar (1534837248 bytes)
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    Karlsruhe Institute of Technology
    Klare, Heiko
    Authors
    Heiko Klare
    Description

    Instructions on how to use the data can be found within the repository.

  9. Data and R code from: Global Phanerozoic biodiversity, can variation be...

    • data.niaid.nih.gov
    zip
    Updated Jul 27, 2024
    Cite
    Daniel Phillipi (2024). Data and R code from: Global Phanerozoic biodiversity, can variation be explained by spatial sampling intensity [Dataset]. http://doi.org/10.5061/dryad.2280gb621
    Available download formats: zip
    Dataset updated
    Jul 27, 2024
    Dataset provided by
    Syracuse University
    Authors
    Daniel Phillipi
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Variation in observed global generic richness over the Phanerozoic must be partly explained by changes in the numbers of fossils and their geographic spread over time. The influence of sampling intensity (i.e., the number of samples) has been well addressed, but the extent to which the geographic distribution of samples might influence recovered biodiversity is comparatively unknown. To investigate this question, we create models of genus richness through time by resampling the same occurrence dataset of modern global biodiversity using spatially explicit sampling intensities defined by the paleo-coordinates of fossil occurrences from successive time intervals. Our steady-state null model explains about half of observed change in uncorrected fossil diversity and a quarter of variation in sampling-standardized diversity estimates. The inclusion in linear models of two additional explanatory variables associated with the spatial array of fossil data (absolute latitudinal range of occurrences, percent of occurrences from shallow environments) and a Cenozoic step increases the accuracy of steady-state models, accounting for 67% of variation in sampling-standardized estimates and more than one third of the variation in first differences. Our results make clear that the spatial distribution of samples is at least as important as numerical sampling intensity in determining the trajectory of recovered fossil biodiversity through time, and caution against the overinterpretation of both the variation and the trend that emerges from analyses of global Phanerozoic diversity.

    Methods: Fossil data were downloaded from the Paleobiology Database and manually cleaned to remove errors (i.e., non-marine organisms being included in the marine dataset). Modern marine invertebrate data were downloaded from the Ocean Biodiversity Information System using the R API. Further data transformations and statistical analyses were performed on the datasets using the R code provided.

  10. Exploring global reef health and human population using correlation and...

    • qubeshub.org
    Updated May 10, 2021
    Cite
    David Sale (2021). Exploring global reef health and human population using correlation and simple linear regression [Dataset]. http://doi.org/10.25334/0FFY-1047
    Dataset updated
    May 10, 2021
    Dataset provided by
    QUBES
    Authors
    David Sale
    Description

    In this lesson, students will explore the relationship between reef cover and human disturbance. Students will manipulate a large dataset and perform normality tests, data transformations, correlations, and a simple linear regression in RStudio.

  11. Data from: Solar self-sufficient households as a driving factor for...

    • data.uni-hannover.de
    .zip, r, rdata +2
    Updated Dec 12, 2024
    + more versions
    Cite
    Institut für Kartographie und Geoinformatik (2024). Solar self-sufficient households as a driving factor for sustainability transformation [Dataset]. https://data.uni-hannover.de/dataset/solar-self-sufficient-households-as-a-driving-factor-for-sustainability-transformation
    Available download formats: r, text/x-sh, rdata, txt, .zip
    Dataset updated
    Dec 12, 2024
    Dataset authored and provided by
    Institut für Kartographie und Geoinformatik
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    To get the consumption model from Section 3.1, one needs to execute the file consumption_data.R. Load the data for the 3 phases (./data/CONSUMPTION/PL1.csv, PL2.csv, PL3.csv), transform the data and build the model (starting at line 225). The final consumption data can be found in one file for each year in ./data/CONSUMPTION/MEGA_CONS_list.Rdata.

    To get the results for the optimization problem, one needs to execute the file analyze_data.R. It provides the functions to compare production and consumption data, and to optimize for the different values (PV, MBC,).

    To reproduce the figures one needs to execute the file visualize_results.R. It provides the functions to reproduce the figures.

    To calculate the solar radiation that is needed in the Section Production Data, follow file calculate_total_radiation.R.

    To reproduce the radiation data from ERA5, which can be found in data.zip, do the following steps:

    1. ERA5: download the reanalysis datasets as a GRIB file. For FDIR select "Total sky direct solar radiation at surface", for GHI select "Surface solar radiation downwards", and for ALBEDO select "Forecast albedo".
    2. Convert GRIB to csv with the file era5toGRID.sh.
    3. Convert the csv file to the data that is used in this paper with the file convert_year_to_grid.R.

  12. Data sets for the Journal of Non-Crystalline Solids X: Article entitled...

    • researchdata.bath.ac.uk
    • search.datacite.org
    c
    Updated May 10, 2019
    Cite
    Philip Salmon; Anita Zeidler (2019). Data sets for the Journal of Non-Crystalline Solids X: Article entitled "Pressure induced structural transformations in amorphous MgSiO_3 and CaSiO_3" [Dataset]. http://doi.org/10.15125/BATH-00601
    Available download formats: c
    Dataset updated
    May 10, 2019
    Dataset provided by
    University of Bath
    Authors
    Philip Salmon; Anita Zeidler
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Dataset funded by
    University of Bath
    Institut Laue-Langevin
    Royal Society
    United States Department of Energy
    Engineering and Physical Sciences Research Council
    Japan Society for the Promotion of Science
    Atomic Weapons Establishment
    Description

    Data sets used to prepare Figures 1-14 in the Journal of Non-Crystalline Solids X article entitled "Pressure induced structural transformations in amorphous MgSiO_3 and CaSiO_3." The files are labelled according to the figure numbers. The data sets were created using the methodology described in the manuscript. Each of the plots was drawn using QtGrace (https://sourceforge.net/projects/qtgrace/). The data set corresponding to a plotted curve within a QtGrace file can be identified by clicking on that curve. The units for each axis are identified on the plots.

    Figure 1 shows the pressure-volume EOS at room temperature for amorphous and crystalline (a) MgSiO_3 and (b) CaSiO_3.

    Figure 2 shows the pressure dependence of the neutron total structure factor S_{N}(k) for amorphous (a) MgSiO_3 and (b) CaSiO_3.

    Figure 3 shows the pressure dependence of the neutron total pair-distribution function G_{N}(r) for amorphous (a) MgSiO_3 and (b) CaSiO_3.

    Figure 4 shows the pressure dependence of several D′_{N}(r) functions for amorphous MgSiO_3 measured using the D4c diffractometer.

    Figure 5 shows the pressure dependence of the Si-O coordination number in amorphous (a) MgSiO_3 and (b) CaSiO_3, the Si-O bond length in amorphous (c) MgSiO_3 and (d) CaSiO_3, and (e) the fraction of n-fold (n = 4, 5, or 6) coordinated Si atoms in these materials.

    Figure 6 shows the pressure dependence of the M-O (a) coordination number and (b) bond length for amorphous MgSiO_3 and CaSiO_3.

    Figure 7 shows the S_{N}(k) or S_{X}(k) functions for (a) MgSiO_3 and (b) CaSiO_3 after recovery from a pressure of 8.2 or 17.5 GPa.

    Figure 8 shows the G_{N}(r) or G_{X}(r) functions for (a) MgSiO_3 and (b) CaSiO_3 after recovery from a pressure of 8.2 or 17.5 GPa.

    Figure 9 shows the pressure dependence of the Q^n speciation for fourfold coordinated Si atoms in amorphous (a) MgSiO_3 and (b) CaSiO_3.

    Figure 10 shows the pressure dependence in amorphous MgSiO_3 and CaSiO_3 of (a) the overall M-O coordination number and its contributions from M-BO and M-NBO connections, (b) the fractions of M-BO and M-NBO bonds, and (c) the associated M-BO and M-NBO bond distances.

    Figure 11 shows the pressure dependence of the fraction of n-fold (n = 4, 5, 6, 7, 8, or 9) coordinated M atoms in amorphous (a) MgSiO_3 and (b) CaSiO_3.

    Figure 12 shows the pressure dependence of the O-Si-O, Si-O-Si, Si-O-M, O-M-O and M-O-M bond angle distributions (M = Mg or Ca) for amorphous MgSiO_3 (left hand column) and CaSiO_3 (right hand column).

    Figure 13 shows the pressure dependence of the q-parameter distributions for n-fold (n = 4, 5, or 6) coordinated Si atoms in amorphous (a) MgSiO_3 and (b) CaSiO_3.

    Figure 14 shows the pressure dependence of the q-parameter distributions for the M atoms in amorphous MgSiO_3 (left hand column) and CaSiO_3 (right hand column).

  13. Supplement 1. R code demonstrating how to fit a logistic regression model,...

    • figshare.com
    • wiley.figshare.com
    html
    Updated Aug 9, 2016
    Cite
    David I. Warton; Francis K. C. Hui (2016). Supplement 1. R code demonstrating how to fit a logistic regression model, with a random intercept term, and how to use resampling-based hypothesis testing for inference. [Dataset]. http://doi.org/10.6084/m9.figshare.3550407.v1
    Available download formats: html
    Dataset updated
    Aug 9, 2016
    Dataset provided by
    Wiley
    Authors
    David I. Warton; Francis K. C. Hui
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    File List:

    glmmeg.R: R code demonstrating how to fit a logistic regression model, with a random intercept term, to randomly generated overdispersed binomial data.

    boot.glmm.R: R code for estimating P-values by applying the bootstrap to a GLMM likelihood ratio statistic.

    Description: glmm.R is example R code that shows how to fit a logistic regression model (with or without a random effects term) and use diagnostic plots to check the fit. The code is run on randomly generated data, which are generated in such a way that overdispersion is evident. This code could be directly applied to your own analyses if you read into R a data.frame called “dataset”, which has columns labelled “success” and “failure” (for number of binomial successes and failures) and “species” (a label for the different rows in the dataset), and where we want to test for the effect of some predictor variable called “location”. In other cases, just change the labels and formula as appropriate. boot.glmm.R extends glmm.R by using bootstrapping to calculate P-values in a way that provides better control of Type I error in small samples. It accepts data in the same form as that generated in glmm.R.

  14. Eximpedia Export Import Trade

    • eximpedia.app
    Updated Oct 13, 2025
    + more versions
    Cite
    Seair Exim (2025). Eximpedia Export Import Trade [Dataset]. https://www.eximpedia.app/
    Available download formats: .bin, .xml, .csv, .xls
    Dataset updated
    Oct 13, 2025
    Dataset provided by
    Eximpedia Export Import Trade Data
    Eximpedia PTE LTD
    Authors
    Seair Exim
    Area covered
    San Marino, Kazakhstan, Tajikistan, Saint Vincent and the Grenadines, Nauru, Iran (Islamic Republic of), Palestine, Finland, Bahrain, Denmark
    Description

    Eximpedia export-import trade data lets you search trade data and active exporters, importers, buyers, suppliers, and manufacturer exporters from over 209 countries.

  15. Data and R-scripts for "Identifying leverage points for a sustainable...

    • zenodo.org
    bin, csv
    Updated Oct 14, 2021
    Cite
    Dominic A. Martin (2021). Data and R-scripts for "Identifying leverage points for a sustainable land-use transformation in a global biodiversity hotspot" (V1) [Dataset]. http://doi.org/10.5281/zenodo.4601600
    Available download formats: bin, csv
    Dataset updated
    Oct 14, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Dominic A. Martin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Transformations towards sustainable land systems require leverage points where land-use policies can benefit people and nature. Here, we present a novel approach that identifies and evaluates leverage points along land-use trajectories, which explicitly incorporate path dependency. We apply the approach in the biodiversity hotspot Madagascar, where smallholder agriculture results in a land-use trajectory reaching from old-growth forests via forest fragments and vanilla agroforests to shifting cultivation. Integrating interdisciplinary empirical data on biodiversity, ecosystem functions and agricultural productivity, we assess trade-offs and co-benefits at three leverage points along the trajectory. We find that leverage points are path-dependent: two leverage points target the transformation of old-growth forests and forest fragments to other land uses and result in considerable trade-offs. In contrast, one leverage point allows for the transformation of land under shifting cultivation into agroforests and offers clear co-benefits. Incorporating path-dependency is essential to identify leverage points for sustainable land-use transformations.

  16. Data applied to automatic method to transform routine otolith images for a...

    • seanoe.org
    image/*
    Updated 2022
    Cite
    Nicolas Andrialovanirina; Alizee Hache; Kelig Mahe; Sébastien Couette; Emilie Poisson Caillault (2022). Data applied to automatic method to transform routine otolith images for a standardized otolith database using R [Dataset]. http://doi.org/10.17882/91023
    Available download formats: image/*
    Dataset updated
    2022
    Dataset provided by
    SEANOE
    Authors
    Nicolas Andrialovanirina; Alizee Hache; Kelig Mahe; Sébastien Couette; Emilie Poisson Caillault
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Fisheries management is generally based on age structure models. Thus, fish ageing data are collected by experts who analyze and interpret calcified structures (scales, vertebrae, fin rays, otoliths, etc.) according to a visual process. The otolith, in the inner ear of the fish, is the most commonly used calcified structure because it is metabolically inert and historically one of the first proxies developed. It contains information throughout the whole life of the fish and provides age structure data for stock assessments of all commercial species. The traditional human reading method to determine age is very time-consuming. Automated image analysis can be a low-cost alternative method; however, the first step is the transformation of routinely taken otolith images into standardized images within a database to apply machine learning techniques on the ageing data. Otolith shape, resulting from the synthesis of genetic heritage and environmental effects, is a useful tool to identify stock units; therefore, a database of standardized images could be used for this aim. Using the routinely measured otolith data of plaice (Pleuronectes platessa; Linnaeus, 1758) and striped red mullet (Mullus surmuletus; Linnaeus, 1758) in the eastern English Channel and north-east Arctic cod (Gadus morhua; Linnaeus, 1758), a greyscale images matrix was generated from the raw images in different formats. Contour detection was then applied to identify broken otoliths, the orientation of each otolith, and the number of otoliths per image. To finalize this standardization process, all images were resized and binarized. Several mathematical morphology tools were developed from these new images to align and to orient the images, placing the otoliths in the same layout for each image. For this study, we used three databases from two different laboratories using three species (cod, plaice and striped red mullet). This method was validated for these three species and could be applied to other species for age determination and stock identification.

  17. Data from: Stepwise Chemical Reduction of [4]Cyclo[4]helicenylene: Stereo...

    • datasetcatalog.nlm.nih.gov
    • acs.figshare.com
    Updated Nov 1, 2024
    Cite
    Yang, Yong; Wei, Zheng; Liang, Jianwei; Sato, Sota; Zhang, Zhenyi; Zhou, Zheng (2024). Stepwise Chemical Reduction of [4]Cyclo[4]helicenylene: Stereo Transformation and Site-Selective Metal Complexation [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001402405
    Explore at:
    Dataset updated
    Nov 1, 2024
    Authors
    Yang, Yong; Wei, Zheng; Liang, Jianwei; Sato, Sota; Zhang, Zhenyi; Zhou, Zheng
    Description

    A highly strained macrocycle comprising four [4]helicene panels, [4]cyclo[4]helicenylene ([4]CH, 1), was synthesized through a one-pot macrocyclization and chemically reduced by alkali metals (Na and K), revealing a four-electron reduction process. The resulting di-, tri-, and tetraanions of compound 1 were isolated and crystallographically characterized by X-ray diffraction. Owing to the four axially chiral bi[4]helicenyl fragments, a reversible stereo transformation of 1 between the (S,R,S,R)- and (S,S,R,R)-configurations was disclosed upon the two-electron uptake, which was rationally understood by theoretical calculations. The (S,S,R,R)-configuration of 1²⁻ was further stabilized in triply reduced and tetra-reduced states, where structural deformation led by charges and metal complexation was observed. This study proposed an approach to alter the configuration of cycloarylenes in addition to thermal treatment.

  18. Experiment files and measurement parameters for Bruker Invenio-R

    • data.mendeley.com
    Updated Feb 6, 2024
    Cite
    Mohd Rashidi Abdull Manap (2024). Experiment files and measurement parameters for Bruker Invenio-R [Dataset]. http://doi.org/10.17632/rp8nthpx4f.1
    Explore at:
    Dataset updated
    Feb 6, 2024
    Authors
    Mohd Rashidi Abdull Manap
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The experiment file (.xpm) stores the settings and values of the advanced parameters (e.g. resolution, sample scan time, background scan time, and the spectral range to be used). Meanwhile, the phase resolution is stored in the FT. The optic parameters are shown in this experimental condition as well.

  19. Australian Bureau of Statistics Labour Force API

    • researchdata.edu.au
    Updated Feb 10, 2021
    Cite
    data.vic.gov.au (2021). Australian Bureau of Statistics Labour Force API [Dataset]. https://researchdata.edu.au/australian-bureau-statistics-force-api/1675365
    Explore at:
    Dataset updated
    Feb 10, 2021
    Dataset provided by
    data.vic.gov.au
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Australia
    Description

    This RESTful API provides Australian Bureau of Statistics (ABS) labour force data such as employment statistics by region, sex, age groups, and labour utilisation using original, seasonally adjusted and trend markers since 1978.

    It connects to an existing ABS API and improves the usability of the information queried from ABS by transforming the SDMX formatted data into a JSON format. This allows developers to consume ABS data easily by using a standard format without requiring time-consuming reformatting and transformation of the data received.

    Version 1.0.0

    An API key will be issued if you wish to explore and understand the way this API operates.

    Access for this API is available via request through developer.vic.gov.au.

  20. Data from: A Graphical Goodness-of-Fit Test for Dependence Models in Higher...

    • tandf.figshare.com
    application/gzip
    Updated May 30, 2023
    Cite
    Marius Hofert; Martin Mächler (2023). A Graphical Goodness-of-Fit Test for Dependence Models in Higher Dimensions [Dataset]. http://doi.org/10.6084/m9.figshare.1067049.v2
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    May 30, 2023
    Dataset provided by
    Taylor & Francis
    Authors
    Marius Hofert; Martin Mächler
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This article introduces a graphical goodness-of-fit test for copulas in more than two dimensions. The test is based on pairs of variables and can thus be interpreted as a first-order approximation of the underlying dependence structure. The idea is to first transform pairs of data columns with the Rosenblatt transform to bivariate standard uniform distributions under the null hypothesis. This hypothesis can be graphically tested with a matrix of bivariate scatterplots, Q-Q plots, or other transformations. Furthermore, additional information can be encoded as background color, such as measures of association or (approximate) p-values of tests of independence. The proposed goodness-of-fit test is designed as a basic graphical tool for detecting deviations from a postulated, possibly high-dimensional, dependence model. Various examples are given and the methodology is applied to a financial dataset. An implementation is provided by the R package copula. Supplementary material for this article is available online, which provides the R package copula and reproduces all the graphical results of this article.
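The pairwise Rosenblatt transform mentioned in this description can be illustrated with a small sketch. It assumes a Clayton copula with a hypothetical parameter `theta = 2.0` purely as an example of a postulated model. It is written in Python and is not the implementation in the R package copula, and the function names are made up for this illustration. For a bivariate copula C, the transform maps (u, v) to (u, ∂C(u, v)/∂u). If the postulated model is correct, the transformed pairs are independent standard uniforms, which is what the scatterplots and Q-Q plots check.

```python
# Illustrative sketch: Rosenblatt transform of one pair of pseudo-observations
# under an assumed Clayton copula. Names and theta are hypothetical.

def clayton_conditional_cdf(u, v, theta):
    """Return C(v | u) = dC(u, v)/du for the Clayton copula, theta > 0."""
    a = u ** (-theta) + v ** (-theta) - 1.0
    return u ** (-theta - 1.0) * a ** (-1.0 / theta - 1.0)


def rosenblatt_pair(u, v, theta):
    """Map (u, v) to (u, C(v | u)); uniform and independent under the model."""
    return u, clayton_conditional_cdf(u, v, theta)


def clayton_conditional_inverse(u, w, theta):
    """Invert C(v | u) = w for v; handy for simulating from the model."""
    inner = (w ** (-theta / (1.0 + theta)) - 1.0) * u ** (-theta) + 1.0
    return inner ** (-1.0 / theta)


if __name__ == "__main__":
    theta = 2.0  # assumed dependence parameter
    u, v = 0.5, 0.5
    u_out, w = rosenblatt_pair(u, v, theta)
    print(u_out, w)
    # Round trip: the inverse recovers v from (u, w).
    print(clayton_conditional_inverse(u, w, theta))
```

Applying `rosenblatt_pair` to every pair of data columns and plotting the results gives the matrix of bivariate plots the description refers to. Visible structure in any panel suggests the postulated dependence model does not fit that pair.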

Janelle R. Noel-MacDonnell; Joseph Usset; Ellen L. Goode; Brooke L. Fridley (2023). Assessment of data transformations for model-based clustering of RNA-Seq data [Dataset]. http://doi.org/10.1371/journal.pone.0191758

Assessment of data transformations for model-based clustering of RNA-Seq data

Explore at:
5 scholarly articles cite this dataset (View in Google Scholar)
