8 datasets found
  1. Proportion of counts assigned to either true or spurious OTUs/ASVs.

    • figshare.com
    xls
    Updated Jun 1, 2023
    Andrei Prodan; Valentina Tremaroli; Harald Brolin; Aeilko H. Zwinderman; Max Nieuwdorp; Evgeni Levin (2023). Proportion of counts assigned to either true or spurious OTUs/ASVs. [Dataset]. http://doi.org/10.1371/journal.pone.0227434.t003
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Andrei Prodan; Valentina Tremaroli; Harald Brolin; Aeilko H. Zwinderman; Max Nieuwdorp; Evgeni Levin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Proportion of counts assigned to either true or spurious OTUs/ASVs.

  2. Data from: Molecular characterisation of faecal bacterial assemblages among...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Sep 27, 2024
    Eero J. Vesterinen; Andre Morrill; Mark Forbes; Manu Tamminen; Ilari Sääksjärvi; Kari Kaunisto (2024). Molecular characterisation of faecal bacterial assemblages among four species of syntopic odonates [Dataset]. http://doi.org/10.5061/dryad.08kprr58q
    Available download formats: zip
    Dataset updated
    Sep 27, 2024
    Dataset provided by
    University of Turku
    Carleton University
    Authors
    Eero J. Vesterinen; Andre Morrill; Mark Forbes; Manu Tamminen; Ilari Sääksjärvi; Kari Kaunisto
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Factors such as host species, phylogeny, diet, timing, and location of sampling are thought to influence the composition of gut-associated bacteria in insects. In this study, we compared the faecal-associated bacterial taxa for three Coenagrion and one Enallagma damselfly species. We expected high overlap in representation of bacterial taxa due to the shared ecology and diet of these species. Using metabarcoding based on the 16S rRNA gene, we identified 1513 sequence variants, representing distinct bacterial ‘taxa’. Intriguingly, the damselfly species showed somewhat different magnitudes of richness of ZOTUs, ranging from 480 to 914 ZOTUs. In total, 921 (or 60.8% of the 1513) distinct ZOTUs were non-shared, each found only in one species, and then most often in only a single individual. There was a surfeit of these non-shared incidental ZOTUs in the Enallagma species accounting for it showing the highest bacterial richness and accounting for a sample-wide pattern of more single-species ZOTUs than expected, based on comparisons to the null model. Future studies should address the extent to which faecal bacteria represent non-incidental gut bacteria and whether abundant and shared taxa are true gut symbionts.

    Methods

    To assess the faecal bacterial assemblages of damselflies, we targeted four predatory odonate species at a freshwater pond of approximately 600 m × 200 m (12 ha), located in Southern Finland (ETRS-TM35FIN N: 67118; E: 2460). On 1–2 June 2016, we collected 185 individuals (20–26 males and females from each species) for faecal DNA analysis. All our focal damselfly species belong to the family Coenagrionidae: Coenagrion lunulatum (Charpentier, 1840), Coenagrion hastulatum (Charpentier, 1825), Coenagrion pulchellum (Vander Linden, 1825), and Enallagma cyathigerum (Charpentier, 1840). Species identification of damselflies was based on current literature, e.g. [27]. These four target species were selected because they were the most common predatory species at the study site, based on pilot surveys (K. Kaunisto, pers. obs.). Only sexually mature individuals with adult colours and hardened wings were included in the study. According to a previous study [28], all four focal species feed mainly on dipteran prey by open foraging flights and by gleaning insects from vegetation.

    Each damselfly was placed into a sterile 10-ml collection tube containing a piece of dampened paper towel to reduce desiccation risk. To allow for defecation, damselflies were kept in the tubes for the next 24 h (sufficient time for defecation to occur, according to [18]). After the live individuals had defecated into the tube, we froze the entire sample without removing the faeces or the damselfly. All faecal material was collected from the tubes with sterile forceps, after which the faeces were frozen in 15-ml Falcon tubes at −64 °C until further processing and analysis.

    Sample Processing and Molecular Analysis

    Total DNA was extracted as described in a previous study [28] using the NucleoSpin Tissue XS Kit (product nr 740901, Macherey-Nagel, Düren, Germany).
To characterize the bacterial assemblages of the focal species, we used established metabarcoding protocols for dragonflies, building on earlier optimization [18, 28]. To amplify the bacterial 16S rRNA gene (hypervariable region V4), we used primers 515F-Parada (also known as 515FB: 5′-GTG YCA GCM GCC GCG GTA A-3′; Parada et al. 2016) and 806R-Apprill (also known as 806RB: 5′-GGA CTA CNV GGG TWT CTA AT-3′; [29]). Each DNA sample was amplified in two separate reactions that were individually tagged and sequenced. The locus-specific PCR setup followed Kankaanpaa et al. [30] and included 5 μl of 2× MyTaq HS Red Mix (Bioline, UK), 2.4 μl of H2O, 150 nM of each primer (two forward and two reverse primer versions; total primer mix concentration 600 nM), and 2 μl of DNA extract per sample in a 10 μl reaction volume. Cycling conditions were 3 min at 95 °C, then 35 cycles of 45 s at 95 °C, 1 min at 50 °C, and 1 min 30 s at 72 °C, ending with 10 min at 72 °C. In the second PCR stage, the first PCR products were modified by attaching Illumina-specific adapters and sample-specific indices. For a reaction volume of 10 μl in the indexing PCR, we mixed 5 μl of MyTaq HS Red Mix, 500 nM of each tagged and indexed primer (i7 and i5), and 3 μl of locus-specific PCR product from the first PCR phase. For this second PCR, we used the following protocol: initial denaturation for 3 min at 98 °C, then 15 cycles of 20 s at 95 °C, 15 s at 60 °C, and 30 s at 72 °C, followed by 3 min at 72 °C. All the indexed reactions were then pooled and purified using magnetic beads [31, 32]. Sequencing was done on an Illumina MiSeq v3 PE 2×300 (Illumina Inc., San Diego, CA, USA) run, including the PhiX control library, by the Turku Centre for Biotechnology, Turku, Finland. After sequencing, the reads were demultiplexed into each original sample and uploaded onto CSC servers (IT Center for Science, https://www.csc.fi/ ) for bioinformatic analysis.
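
As a quick arithmetic check on the reaction setup and cycling programmes described above, the sketch below computes the volume that would be left for the primer mix in the 10 μl locus-specific reaction and the summed hold time of each PCR programme. The function names are illustrative and not part of the original protocol, and ramping between temperatures is ignored, so real run times would be longer.

```python
# Illustrative arithmetic only: function names are hypothetical,
# the numeric values are taken from the protocol text.

def remaining_volume_ul(total_ul, *component_ul):
    """Volume left in a reaction after the listed components are added."""
    return round(total_ul - sum(component_ul), 3)


def program_hold_seconds(initial_s, cycles, step_seconds, final_s):
    """Sum of programmed hold times (thermocycler ramping is ignored)."""
    return initial_s + cycles * sum(step_seconds) + final_s


# Locus-specific PCR: 5 ul MyTaq mix + 2.4 ul water + 2 ul DNA in 10 ul.
primer_mix_ul = remaining_volume_ul(10.0, 5.0, 2.4, 2.0)

# Locus-specific PCR: 3 min, then 35 x (45 s, 1 min, 1 min 30 s), then 10 min.
first_pcr_s = program_hold_seconds(180, 35, (45, 60, 90), 600)

# Indexing PCR: 3 min, then 15 x (20 s, 15 s, 30 s), then 3 min.
index_pcr_s = program_hold_seconds(180, 15, (20, 15, 30), 180)

print(f"primer mix volume: {primer_mix_ul} ul")
print(f"first PCR hold time: {first_pcr_s / 60:.2f} min")
print(f"indexing PCR hold time: {index_pcr_s / 60:.2f} min")
```
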
Paired-end reads (13,027,754) were merged and trimmed for quality using the 64-bit vsearch version 2.14.2 [33] command ‘fastq_mergepairs’ with the default options and ‘fastq_allowmergestagger’. Primers were removed from the merged reads (11,179,018) using cutadapt version 1.14 (Martin 2011) with a 20% mismatch rate, a minimum length of 240 bp, and a truncation length of 270 bp (the excess nucleotides were trimmed from the 3′ end). The trimmed reads (11,050,385) were then collapsed into unique sequences (singletons removed) with the command ‘fastx_uniques’ and the option ‘minuniquesize’ set to 10 (49,832 uniques retrieved). Finally, reads were corrected for point errors to obtain an accurate set of amplicon sequences (=denoised) and filtered of chimeric amplicons (=chimeras were removed), resulting in 3803 ZOTUs (‘ZOTU’, ‘zero-radius OTU’), through the command ‘unoise3’ in USEARCH version 11.0.667 with settings minsize = 8 and unoise_alpha = 2. The median and mean length of ZOTUs was 253 bp (SD ± 2.50 bp). The ZOTUs were then mapped back to the original trimmed reads with the vsearch command ‘usearch_global’ to establish the total number of reads in each sample. We were able to map 10,627,197 of 11,050,385 reads (96.17%) to our original samples. The ZOTUs (sequence variants) were assigned to taxa using the 16S RDP database with the probabilistic SINTAX algorithm (Edgar, 2010) implemented in vsearch. The database ‘16S RDP training set v18’ (21k seqs) was downloaded from the usearch website (https://drive5.com/usearch/manual/sintax_downloads.html; accessed 19th April 2023). For the chosen database, the genus level is the lowest taxonomic level. For any taxonomic level, we only accepted assignations with 100% probability. The data were further filtered to remove artefacts, spurious reads, and non-targets based on information from the numerous control samples, technical replicates, and taxonomy.
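
The rule of accepting only assignations with 100% probability can be illustrated with a small parser. This is a minimal sketch, assuming the SINTAX prediction column has the form `rank:name(confidence)` with elements separated by commas; the function name, the regular expression, and the example string are illustrative and are not taken from the study. The sketch stops at the first rank below the threshold, so deeper ranks are never accepted beneath a rejected one.

```python
import re

# One 'rank:name(confidence)' element; names may be wrapped in double quotes.
# The exact output format is an assumption stated in the lead-in.
_ELEMENT = re.compile(r'([a-z]):"?([^",()]+)"?\(([0-9.]+)\)')


def accepted_lineage(prediction, min_conf=1.0):
    """Return (rank, name) pairs down to the last rank meeting min_conf."""
    lineage = []
    for rank, name, conf in _ELEMENT.findall(prediction):
        if float(conf) < min_conf:
            break  # do not accept anything below a rejected rank
        lineage.append((rank, name))
    return lineage


# Hypothetical prediction string, for illustration only.
example = 'd:Bacteria(1.00),p:"Firmicutes"(1.00),c:Bacilli(0.97),g:Bacillus(0.60)'
print(accepted_lineage(example))
```

With this rule, the hypothetical sequence above would be kept at phylum level only, which is why far fewer ZOTUs reach genus than reach domain in a strict-threshold setup.
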
First, we removed from each sample those ZOTUs that had fewer reads than the extraction or PCR controls (9,833,618 reads retained). Then, we collapsed reads by taxonomy within each sample; that is, all reads assigned to the same taxon in a sample were summed. Out of the 3803 ZOTUs, we identified 983 to genus, 1570 to family, 2002 to order, 3063 to class, 3319 to phylum, and 3482 to domain level. Of the total ~10M reads, we identified 4.0M to genus, 4.4M to family, 8.5M to order, and 9.5M to the higher levels. Next, we removed taxa that were present in only one of the two replicates of a sample and summed the reads of both replicates (9,678,663 reads left). To remove potentially leaked ‘tag-jumped’ reads, we removed from each sample all taxa that accounted for less than 0.05% of that sample's total reads (9,636,233 reads retained). We removed all taxa outside the domains Bacteria and Archaea, as well as the class Chloroplast (9,006,117 reads passed the filtering). The non-targets were mainly plants (~6200 reads) and fungi (~250 reads). Altogether, 284,351 reads could not be assigned with the strict 100% probability threshold. Finally, very rare occurrences (sequence count < 20) were removed (9,004,996 final reads).
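
The per-sample numeric filters described above (replicate concordance, the 0.05% proportion threshold, and the minimum count of 20) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' script: the function name and the taxon labels are hypothetical, and the sample total used for the 0.05% threshold is assumed to be computed after the replicates are summed.

```python
def filter_sample(rep1, rep2, min_fraction=0.0005, min_count=20):
    """Filter one sample given two replicate dicts of taxon -> read count."""
    # Keep only taxa detected in both replicates, then sum the replicates.
    shared = {t for t, n in rep1.items() if n > 0 and rep2.get(t, 0) > 0}
    summed = {t: rep1[t] + rep2[t] for t in shared}

    # Drop taxa below the per-sample proportion threshold (tag-jump guard).
    # Assumption: the total is taken after the replicate filter.
    total = sum(summed.values())
    kept = {t: n for t, n in summed.items() if total and n / total >= min_fraction}

    # Drop very rare occurrences.
    return {t: n for t, n in kept.items() if n >= min_count}


# Hypothetical counts, for illustration only.
rep1 = {"taxon_A": 6000, "taxon_B": 30, "taxon_C": 3, "taxon_D": 500}
rep2 = {"taxon_A": 5000, "taxon_B": 25, "taxon_C": 1}
print(filter_sample(rep1, rep2))
```

In this example taxon_D is dropped because it occurs in only one replicate, and taxon_C is dropped because its 4 reads fall below 0.05% of the 11,059 summed reads.
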

  3. Inferred ratios of 16S rRNA gene variants.

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Andrei Prodan; Valentina Tremaroli; Harald Brolin; Aeilko H. Zwinderman; Max Nieuwdorp; Evgeni Levin (2023). Inferred ratios of 16S rRNA gene variants. [Dataset]. http://doi.org/10.1371/journal.pone.0227434.t002
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Andrei Prodan; Valentina Tremaroli; Harald Brolin; Aeilko H. Zwinderman; Max Nieuwdorp; Evgeni Levin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Expected ratios (based on known copy numbers of the respective 16S rRNA gene variants) are shown in bold. USEARCH-UNOISE3 could not differentiate the two C. beijerinckii variants. Qiime2-Deblur could not differentiate any of the variants.

  4. Data_Sheet_2_A Systematic Phylogenomic Classification of the Multidrug and...

    • frontiersin.figshare.com
    pdf
    Updated Jun 15, 2023
    Manduparambil Subramanian Nimmy; Vinod Kumar; Backiyarani Suthanthiram; Uma Subbaraya; Ramawatar Nagar; Chellapilla Bharadwaj; Pradeep Kumar Jain; Panneerselvam Krishnamurthy (2023). Data_Sheet_2_A Systematic Phylogenomic Classification of the Multidrug and Toxic Compound Extrusion Transporter Gene Family in Plants.pdf [Dataset]. http://doi.org/10.3389/fpls.2022.774885.s002
    Available download formats: pdf
    Dataset updated
    Jun 15, 2023
    Dataset provided by
    Frontiers
    Authors
    Manduparambil Subramanian Nimmy; Vinod Kumar; Backiyarani Suthanthiram; Uma Subbaraya; Ramawatar Nagar; Chellapilla Bharadwaj; Pradeep Kumar Jain; Panneerselvam Krishnamurthy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Multidrug and toxic compound extrusion (MATE) transporters comprise a multigene family that mediates multiple functions in plants through the efflux of diverse substrates including organic molecules, specialized metabolites, hormones, and xenobiotics. MATE classification based on genome-wide studies remains ambiguous, likely due to a lack of large-scale phylogenomic studies and/or reference sequence datasets. To resolve this, we established a phylogeny of the plant MATE gene family using a comprehensive kingdom-wide phylogenomic analysis of 74 diverse plant species. We identified more than 4,000 MATEs, which were classified into 14 subgroups based on a systematic bioinformatics pipeline using USEARCH, blast+ and synteny network tools. Our classification was performed using a four-step process, whereby MATEs sharing ≥ 60% protein sequence identity with a ≤ 1E-05 threshold at different sequence lengths (either full-length, ≥ 60% length, or ≥ 150 amino acids) or retaining in the similar synteny blocks were assigned to the same subgroup. In this way, we assigned subgroups to 95.8% of the identified MATEs, which we substantiated using synteny network clustering analysis. The subgroups were clustered under four major phylogenetic groups and named according to their clockwise appearance within each group. We then generated a reference sequence dataset, the usefulness of which was demonstrated in the classification of MATEs in additional species not included in the original analysis. Approximately 74% of the plant MATEs exhibited synteny relationships with angiosperm-wide or lineage-, order/family-, and species-specific conservation. Most subgroups evolved independently, and their distinct evolutionary trends were likely associated with the development of functional novelties or the maintenance of conserved functions. 
Together with the systematic classification and synteny network profiling analyses, we identified all the major evolutionary events experienced by the MATE gene family in plants. We believe that our findings and the reference dataset provide a valuable resource to guide future functional studies aiming to explore the key roles of MATEs in different aspects of plant physiology. Our classification framework is also readily extendable to other (super)families.
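
The four-step subgroup assignment described in the abstract can be sketched as a simple decision function. This is a hedged illustration, not the authors' pipeline: the `Hit` fields, the order in which the criteria are tried, and the reading of "full-length" and "≥ 60% length" as alignment length relative to query length are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One protein similarity hit (hypothetical structure for illustration)."""
    identity_pct: float
    evalue: float
    aln_len: int          # aligned length in amino acids
    query_len: int        # full query length in amino acids
    same_synteny_block: bool = False


def classification_step(hit, min_identity=60.0, max_evalue=1e-5):
    """Return 1-4 for the first criterion the hit meets, or None."""
    similar = hit.identity_pct >= min_identity and hit.evalue <= max_evalue
    if similar and hit.aln_len >= hit.query_len:
        return 1  # full-length match
    if similar and hit.aln_len >= 0.6 * hit.query_len:
        return 2  # at least 60% of the length
    if similar and hit.aln_len >= 150:
        return 3  # at least 150 amino acids
    if hit.same_synteny_block:
        return 4  # retained in a similar synteny block
    return None


print(classification_step(Hit(65.0, 1e-20, 300, 480)))
```
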

  5. Multiple sequence alignments of sensor histidine kinases and response...

    • zenodo.org
    • data.niaid.nih.gov
    bin
    Updated Jun 21, 2021
    Andreas Möglich; Elina Multamäki; Heikki Takala; Janne Ihalainen (2021). Multiple sequence alignments of sensor histidine kinases and response regulators [Dataset]. http://doi.org/10.5281/zenodo.5005587
    Available download formats: bin
    Dataset updated
    Jun 21, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Andreas Möglich; Elina Multamäki; Heikki Takala; Janne Ihalainen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The two FASTA files contain multiple sequence alignments of sensor histidine kinase and response regulator sequences. The source sequences were obtained by BLAST, clustered with usearch, and aligned with muscle. More details can be found in Multamäki et al. 2021.

  6. Raw sequence data and OTU tables of soil microorganisms obtained across a...

    • datadryad.org
    • data.niaid.nih.gov
    zip
    Updated Feb 10, 2023
    Pieter De Maayer (2023). Raw sequence data and OTU tables of soil microorganisms obtained across a summit in the Lesotho highlands [Dataset]. http://doi.org/10.5061/dryad.v6wwpzh0j
    Available download formats: zip
    Dataset updated
    Feb 10, 2023
    Dataset provided by
    Dryad
    Authors
    Pieter De Maayer
    Time period covered
    2022
    Description

    Raw sequence data can be handled with a fasta editor. OTU tables can be opened using Microsoft Excel.

  7. Svalbard 2010 mesocosm experiment: Protistan diversity - Dataset - B2FIND

    • b2find.dkrz.de
    Updated Nov 2, 2023
    (2023). Svalbard 2010 mesocosm experiment: Protistan diversity - Dataset - B2FIND [Dataset]. https://b2find.dkrz.de/dataset/69c30a60-7896-5e0d-b301-c5ec1daaf810
    Dataset updated
    Nov 2, 2023
    Area covered
    Svalbard
    Description

    Hierarchical clustering. Taxonomic assignment of reads was performed using a preexisting database of SSU rDNA sequences, including XXX reference sequences generated by Sanger sequencing. Experimental amplicons (reads), sorted by abundance, were concatenated with the extracted reference sequences, sorted by decreasing length. All sequences, experimental and reference, were then clustered at 85% identity using the global alignment clustering option of the uclust module from the usearch v4.0 software (Edgar, 2010). Each 85% cluster was then reclustered at a higher stringency level (86%), and so on (87%, 88%, …) in a hierarchical manner up to 100% similarity. Each experimental sequence was then identified by the list of clusters to which it belonged at the 85% to 100% levels. This information can be viewed as a matrix with rows corresponding to sequences and columns corresponding to cluster membership at each clustering level. Taxonomic assignment for a given read was performed by first checking whether reference sequences clustered with the experimental sequence at the 100% clustering level. If so, the last common taxonomic name of the reference sequence(s) within the cluster was used to assign the environmental read. If not, the same procedure was applied to clusters from 99% down to 85% similarity, as necessary, until a cluster was found containing both the experimental read and reference sequence(s), in which case the read was taxonomically assigned as described above.
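
The assignment procedure described above can be sketched in a few lines. This is a minimal illustration, assuming cluster membership is already available as sets of sequence identifiers per identity level and that reference taxonomy is stored as ordered lists of names; the function names, identifiers, and lineages in the example are hypothetical.

```python
def last_common_taxonomy(lineages):
    """Longest shared prefix of several lineages (lists of taxon names)."""
    common = []
    for names in zip(*lineages):
        if len(set(names)) != 1:
            break
        common.append(names[0])
    return common


def assign_read(read_id, clusters_by_level, ref_taxonomy):
    """Walk from the highest identity level down until the read's cluster
    also contains at least one reference sequence."""
    for level in sorted(clusters_by_level, reverse=True):  # e.g. 100 down to 85
        for cluster in clusters_by_level[level]:
            if read_id not in cluster:
                continue
            refs = [ref_taxonomy[s] for s in cluster if s in ref_taxonomy]
            if refs:
                return level, last_common_taxonomy(refs)
    return None, []


# Hypothetical clusters and lineages, for illustration only.
ref_taxonomy = {
    "refA": ["Eukaryota", "Alveolata", "Dinophyceae"],
    "refB": ["Eukaryota", "Alveolata", "Ciliophora"],
}
clusters_by_level = {
    100: [{"read1"}, {"read2"}, {"refA"}, {"refB"}],
    99: [{"read1", "refA"}, {"read2"}, {"refB"}],
    98: [{"read1", "read2", "refA", "refB"}],
}
print(assign_read("read1", clusters_by_level, ref_taxonomy))
print(assign_read("read2", clusters_by_level, ref_taxonomy))
```

A read that only joins reference sequences at a lower identity level tends to receive a coarser name, because more divergent references share fewer leading taxonomic ranks.
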

  8. Data_Sheet_1_A Systematic Phylogenomic Classification of the Multidrug and...

    • figshare.com
    • frontiersin.figshare.com
    txt
    Updated Jun 6, 2023
    Manduparambil Subramanian Nimmy; Vinod Kumar; Backiyarani Suthanthiram; Uma Subbaraya; Ramawatar Nagar; Chellapilla Bharadwaj; Pradeep Kumar Jain; Panneerselvam Krishnamurthy (2023). Data_Sheet_1_A Systematic Phylogenomic Classification of the Multidrug and Toxic Compound Extrusion Transporter Gene Family in Plants.FASTA [Dataset]. http://doi.org/10.3389/fpls.2022.774885.s001
    Available download formats: txt
    Dataset updated
    Jun 6, 2023
    Dataset provided by
    Frontiers
    Authors
    Manduparambil Subramanian Nimmy; Vinod Kumar; Backiyarani Suthanthiram; Uma Subbaraya; Ramawatar Nagar; Chellapilla Bharadwaj; Pradeep Kumar Jain; Panneerselvam Krishnamurthy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Multidrug and toxic compound extrusion (MATE) transporters comprise a multigene family that mediates multiple functions in plants through the efflux of diverse substrates including organic molecules, specialized metabolites, hormones, and xenobiotics. MATE classification based on genome-wide studies remains ambiguous, likely due to a lack of large-scale phylogenomic studies and/or reference sequence datasets. To resolve this, we established a phylogeny of the plant MATE gene family using a comprehensive kingdom-wide phylogenomic analysis of 74 diverse plant species. We identified more than 4,000 MATEs, which were classified into 14 subgroups based on a systematic bioinformatics pipeline using USEARCH, blast+ and synteny network tools. Our classification was performed using a four-step process, whereby MATEs sharing ≥ 60% protein sequence identity with a ≤ 1E-05 threshold at different sequence lengths (either full-length, ≥ 60% length, or ≥ 150 amino acids) or retaining in the similar synteny blocks were assigned to the same subgroup. In this way, we assigned subgroups to 95.8% of the identified MATEs, which we substantiated using synteny network clustering analysis. The subgroups were clustered under four major phylogenetic groups and named according to their clockwise appearance within each group. We then generated a reference sequence dataset, the usefulness of which was demonstrated in the classification of MATEs in additional species not included in the original analysis. Approximately 74% of the plant MATEs exhibited synteny relationships with angiosperm-wide or lineage-, order/family-, and species-specific conservation. Most subgroups evolved independently, and their distinct evolutionary trends were likely associated with the development of functional novelties or the maintenance of conserved functions. 
Together with the systematic classification and synteny network profiling analyses, we identified all the major evolutionary events experienced by the MATE gene family in plants. We believe that our findings and the reference dataset provide a valuable resource to guide future functional studies aiming to explore the key roles of MATEs in different aspects of plant physiology. Our classification framework is also readily extendable to other (super)families.
