100+ datasets found
  1. Barcode sequence matrix

    • figshare.com
    xlsx
    Updated Jun 6, 2019
    Cite
    Laura M Boggs; Melissa KR Scheible; Gustavo Machado; Kelly A Meiklejohn (2019). Barcode sequence matrix [Dataset]. http://doi.org/10.6084/m9.figshare.8218997.v1
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 6, 2019
    Dataset provided by
    figshare
    Authors
    Laura M Boggs; Melissa KR Scheible; Gustavo Machado; Kelly A Meiklejohn
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    List of 572 barcode sequences used in statistical analyses. The region targeted is noted, along with the number of reads recovered for that sequence in each sample (n = 50). Corresponding scientific paper: Boggs, L.M.; Scheible, M.K.; Machado, G.; Meiklejohn, K.A. Single Fragment or Bulk Soil DNA Metabarcoding: Which is Better for Characterizing Biological Taxa Found in Surface Soils for Sample Separation? Genes 2019, 10, 431.

  2. Aligned DNA sequence matrix for phylogenetic analyses of the article "A...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 24, 2020
    Cite
    Santiago Ron (2020). Aligned DNA sequence matrix for phylogenetic analyses of the article "A bizarre new species of Lynchius (Amphibia, Anura, Strabomantidae) from the Andes of Ecuador and first report of Lynchius parkeri in Ecuador" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_1477904
    Explore at:
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Santiago Ron
    María J. Navarrete
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Ecuador, Andes
    Description

    Aligned DNA sequence matrix for phylogenetic analyses of the article "A bizarre new species of Lynchius (Amphibia, Anura, Strabomantidae) from the Andes of Ecuador and first report of Lynchius parkeri in Ecuador"

    The matrix is in NEXUS format. Genes are arranged as follows:

    RAG1: 1-652
    Tyrosinase: 653-1195
    12S RNA: 1196-2242
    tRNA Val: 2243-2313
    16S RNA: 2314-3994
    tRNA Leu: 3995-4065
    ND1: 4066-5026
    tRNA Ile: 5027-5144
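
A minimal sketch of how the gene coordinates above could be used to pull one gene out of an aligned sequence. It assumes the coordinates are 1-based and inclusive (the usual NEXUS convention); the helper names `partition_length` and `slice_partition` are hypothetical and not part of the dataset.

```python
# Partition coordinates copied from the description above (assumed 1-based, inclusive).
PARTITIONS = {
    "RAG1": (1, 652),
    "Tyrosinase": (653, 1195),
    "12S RNA": (1196, 2242),
    "tRNA Val": (2243, 2313),
    "16S RNA": (2314, 3994),
    "tRNA Leu": (3995, 4065),
    "ND1": (4066, 5026),
    "tRNA Ile": (5027, 5144),
}


def partition_length(name):
    """Number of alignment columns in a partition."""
    start, end = PARTITIONS[name]
    return end - start + 1


def slice_partition(aligned_seq, name):
    """Return the columns of one aligned sequence that belong to a partition."""
    start, end = PARTITIONS[name]
    # Convert 1-based inclusive coordinates to a 0-based Python slice.
    return aligned_seq[start - 1:end]


if __name__ == "__main__":
    # Total number of columns covered by the listed partitions.
    print(sum(partition_length(name) for name in PARTITIONS))
```

The partitions are contiguous, so their lengths sum to the last coordinate in the list (5144).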

  3. Pan Matrix data

    • f1000.figshare.com
    txt
    Updated Jan 12, 2016
    Cite
    Lars Snipen; David Ussery (2016). Pan Matrix data [Dataset]. http://doi.org/10.6084/m9.figshare.103707.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jan 12, 2016
    Dataset provided by
    f1000research.com
    Authors
    Lars Snipen; David Ussery
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The file pan_matrix.txt is a large table (tab-separated columns) in which each row corresponds to a genome and each column to a domain sequence family. The rows are named by BIOID code; see map_ecoli.txt to look up the strain names. The columns are named Cluster 1, Cluster 2, etc. The corresponding Pfam-A domain sequence is given in the file cluster_info.txt (see below). Cell (i,j) of the table holds the number of occurrences of domain sequence j in genome i.
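
A short sketch of reading a table with this layout and looking up one cell. The toy data below is invented for illustration; the real pan_matrix.txt may lay out its header differently (for example without a leading empty cell), so the header handling is an assumption, as is the function name `read_pan_matrix`.

```python
import csv
import io

# Toy stand-in for pan_matrix.txt; the BIOID codes and counts are invented.
TOY_PAN_MATRIX = (
    "\tCluster 1\tCluster 2\tCluster 3\n"
    "BIOID_1\t1\t0\t2\n"
    "BIOID_2\t1\t3\t0\n"
)


def read_pan_matrix(handle):
    """Read a tab-separated pan-matrix into {genome: {cluster: count}}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    clusters = header[1:]  # assumes the first header cell labels the row names
    matrix = {}
    for row in reader:
        if not row:
            continue  # skip blank lines
        genome, counts = row[0], row[1:]
        matrix[genome] = {c: int(v) for c, v in zip(clusters, counts)}
    return matrix


pan = read_pan_matrix(io.StringIO(TOY_PAN_MATRIX))
# Occurrences of domain family "Cluster 2" in genome "BIOID_2".
print(pan["BIOID_2"]["Cluster 2"])
```

For the real file, `open("pan_matrix.txt", newline="")` would replace the `io.StringIO` wrapper.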

  4. Monotropoid Ericaceae 102-locus nuclear sequences matrices and plastid locus...

    • search.dataone.org
    • search-sandbox-2.test.dataone.org
    • +2 more
    Updated Apr 26, 2024
    Cite
    John Freudenstein; Michael Broe (2024). Monotropoid Ericaceae 102-locus nuclear sequences matrices and plastid locus sequence matrix [Dataset]. https://search.dataone.org/view/sha256%3Adc460f44d7772808ba49e87d1f4afbe14a4b3ef4094c6ea405546b1d69853828
    Explore at:
    Dataset updated
    Apr 26, 2024
    Dataset provided by
    Dryad Digital Repository
    Authors
    John Freudenstein; Michael Broe
    Time period covered
    Jan 1, 2023
    Description

    Monotropoideae (Ericaceae) is a wholly leafless and holomycotrophic group of primarily temperate herbs with centers of diversity in western North America and east Asia. The eleven genera are structurally diverse and also vegetatively reduced, making relationships difficult to assess based on morphology. Previous molecular analyses have focused primarily on segments of the ribosomal RNA repeat and yielded sometimes conflicting topologies. We employed a genomic sampling approach to obtain 102 nuclear loci and plastid coding loci for nine of the genera, as well as sampling ITS-26S and plastid rps2 for a broader set of accessions via PCR and Sanger sequencing. Data filtering for character completeness had a clear effect on relationships and branch support. Nuclear and plastid loci agree on a topology that resolves Allotropa and Hemitomes as sisters and Monotropsis sister to Eremotropa+Monotropa+Monotropastrum, relationships that were unclear from previous analyses. Hypopitys should be ...

    Data were collected using Illumina sequencing and a low-coverage genome skimming approach.

    # Monotropoid Ericaceae 102-locus nuclear sequences matrices and plastid locus sequence matrix

    https://doi.org/10.5061/dryad.7h44j1017

    These matrices were derived from genome-skimming runs using Illumina sequencing technology. After the pools of reads were obtained, they were mapped to Angiosperm353 target sequences from Monotropa uniflora, which were obtained from the supplementary data in the paper that described that probe set (Johnson et al., 2019, Systematic Biology 68: 594-606). This allowed us to recover sequences from our reads that matched the Angiosperm353 orthologs. 102 of these were assembled into a concatenated dataset for analysis of monotropoid relationships and filtered to different levels of individual base-position completeness: 100% complete, 80% complete, 50% complete, and unfiltered (all data included). These matrices are provided in NEXUS format.

    We also assembled plastid genomes from the skimming reads and de novo mapp...

  5. Aligned DNA sequence matrix for phylogenetic analyses in the article...

    • zenodo.org
    bin
    Updated Mar 21, 2020
    Cite
    Santiago Ron (2020). Aligned DNA sequence matrix for phylogenetic analyses in the article "Description and phylogenetic relationships of a new trans-Andean species of Elachistocleis Parker 1927 (Amphibia, Anura, Microhylidae)" [Dataset]. http://doi.org/10.5281/zenodo.3722518
    Explore at:
    Available download formats: bin
    Dataset updated
    Mar 21, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Santiago Ron
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Aligned DNA sequence matrix for phylogenetic analyses in the article "Description and phylogenetic relationships of a new trans-Andean species of Elachistocleis Parker 1927 (Amphibia, Anura, Microhylidae)"

    Gene partitions are arranged as follows (tRNAs are included as part of larger adjacent genes):

    16S = 1-1165;
    BDNFcodonPos1 = 1166 - 1874\3;
    BDNFcodonPos2 = 1167 - 1875\3;
    BDNFcodonPos3 = 1168 - 1876\3;
    cmyccodonPos1 = 1878 - 2319\3;
    cmyccodonPos2 = 1879 - 2320\3;
    cmyccodonPos3 = 1877 - 2318\3;
    CO1codonPos1 = 2321 - 2981\3;
    CO1codonPos2 = 2322 - 2979\3;
    CO1codonPos3 = 2323 - 2980\3;
    histcodonPos1 = 2983 - 3307\3;
    histcodonPos2 = 2984 - 3308\3;
    histcodonPos3 = 2982 - 3309\3;
    siacodonPos1 = 3311 - 3704\3;
    siacodonPos2 = 3312 - 3705\3;
    siacodonPos3 = 3310 - 3706\3;
    tyrcodonPos1 = 3708 - 4263\3;
    tyrcodonPos2 = 3709 - 4264\3;
    tyrcodonPos3 = 3707 - 4262\3;
    28S = 4265-5084;
    12S = 5085-6171;
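
The `\3` suffix in the partitions above selects every third column, which is how codon positions are addressed. A minimal sketch of expanding such a definition into explicit column numbers follows; the regular expression and the helper names `parse_charset` and `extract_columns` are illustrative assumptions, not taken from the dataset.

```python
import re

# Matches lines such as "BDNFcodonPos1 = 1166 - 1874\3;" or "16S = 1-1165;".
CHARSET_RE = re.compile(
    r"^\s*(?:charset\s+)?(?P<name>\S+)\s*=\s*(?P<start>\d+)\s*-\s*(?P<end>\d+)"
    r"(?:\\(?P<step>\d+))?\s*;"
)


def parse_charset(line):
    """Return (name, range of 1-based columns) for one partition line."""
    match = CHARSET_RE.match(line)
    if match is None:
        raise ValueError(f"not a charset line: {line!r}")
    step = int(match.group("step") or 1)  # no "\N" suffix means every column
    start, end = int(match.group("start")), int(match.group("end"))
    return match.group("name"), range(start, end + 1, step)


def extract_columns(aligned_seq, positions):
    """Pick the given 1-based columns out of one aligned sequence."""
    return "".join(aligned_seq[p - 1] for p in positions)


name, positions = parse_charset(r"BDNFcodonPos1 = 1166 - 1874\3;")
print(name, positions[0], positions[-1], len(positions))

# Toy sequence: first codon positions of "ATG CCG TTA".
_, toy_positions = parse_charset(r"toy = 1 - 9\3;")
print(extract_columns("ATGCCGTTA", toy_positions))
```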

  6. Sequence matrix and tree files for Spatial phylogenetics of the native...

    • zenodo.org
    • datadryad.org
    bin
    Updated Jun 2, 2022
    Cite
    Andrew Thornhill; Bruce Baldwin; William Freyman; Sonia Nosratinia; Matthew Kling; Naia Morueta-Holme; Thomas Madsen; David Ackerly; Brent Mishler (2022). Sequence matrix and tree files for Spatial phylogenetics of the native California flora (Thornhill et al. BMC Biology) [Dataset]. http://doi.org/10.6078/d1vd4p
    Explore at:
    Available download formats: bin
    Dataset updated
    Jun 2, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Andrew Thornhill; Bruce Baldwin; William Freyman; Sonia Nosratinia; Matthew Kling; Naia Morueta-Holme; Thomas Madsen; David Ackerly; Brent Mishler
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    California
    Description

    This is the DNA sequence alignment for the 1083 OTUs used in Thornhill et al. 2017. We defined monophyletic OTUs at the finest scale possible given data availability and current understanding of the evolutionary relationships of Californian plant lineages. Using all 5258 species described in The Jepson eFlora (http://ucjeps.berkeley.edu/eflora/) as a starting point, a thorough literature search was undertaken to find molecular phylogenetic studies that had included California taxa. Genera were split into finer level OTUs if robust evidence existed for monophyly of subclades and representative DNA data either were available in GenBank or could be generated within the scope of the project. Genera were lumped in a few cases if recent evidence showed that one is nested in another. In total, 1083 OTUs were defined to include the 5258 binomials (Table S2 in Thornhill et al. 2017 details the OTU to which each binomial was assigned).

  7. Aligned DNA sequence matrix for phylogenetic analyses in the article "A new...

    • zenodo.org
    bin
    Updated Aug 30, 2024
    Cite
    Santiago Ron (2024). Aligned DNA sequence matrix for phylogenetic analyses in the article "A new glassfrog of the genus Centrolene (Amphibia: Centrolenidae) from the Subandean Kutukú Cordillera, eastern Ecuador" [Dataset]. http://doi.org/10.5281/zenodo.10719694
    Explore at:
    Available download formats: bin
    Dataset updated
    Aug 30, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Santiago Ron
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Feb 2024
    Description

    Aligned DNA sequence matrix for phylogenetic analyses of the article "A new glassfrog of the genus Centrolene (Amphibia: Centrolenidae) from the Subandean Kutukú Cordillera, eastern Ecuador"

    The matrix is in NEXUS format and has 6626 bp and 239 terminals.

    Partitions are as follows:

    charset 12S = 1-967;
    charset 16S = 968-2129;
    charset BNDFcodonPos1 = 2132-2828\3;
    charset BNDFcodonPos2 = 2130-2829\3;
    charset BNDFcodonPos3 = 2131-2827\3;
    charset ND1codonPos1 = 2831-3785\3;
    charset ND1codonPos2 = 2832-3786\3;
    charset ND1codonPos3 = 2830-3787\3;
    charset CXCR4codonPos1 = 3789-4143\3;
    charset CXCR4codonPos2 = 3790-4141\3;
    charset CXCR4codonPos3 = 3788-4142\3;
    charset CMYCcodonPos1 = 4144-4546\3;
    charset CMYCcodonPos2 = 4145-4547\3;
    charset CMYCcodonPos3 = 4146-4548\3;
    charset POMCcodonPos1 = 4550-5159\3;
    charset POMCcodonPos2 = 4551-5160\3;
    charset POMCcodonPos3 = 4549-5161\3;
    charset RAG1codonPos1 = 5162-5615\3;
    charset RAG1codonPos2 = 5163-5616\3;
    charset RAG1codonPos3 = 5164-5617\3;
    charset SLC8A1codonPos1 = 5619-6159\3;
    charset SLC8A1codonPos2 = 5620-6157\3;
    charset SLC8A1codonPos3 = 5618-6158\3;
    charset SLC8A3codonPos1 = 6161-6626\3;
    charset SLC8A3codonPos2 = 6162-6624\3;
    charset SLC8A3codonPos3 = 6160-6625\3;
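
One sanity check that a partition list like the one above invites is whether the charsets cover every alignment column exactly once. The sketch below is an assumption-laden illustration: the function names `charset_positions` and `check_tiling` are hypothetical, and only the first five charsets are embedded to keep it short.

```python
import re

# Captures name, start, end and the optional "\N" step of a NEXUS charset line.
CHARSET_RE = re.compile(
    r"charset\s+(\S+)\s*=\s*(\d+)\s*-\s*(\d+)(?:\\(\d+))?\s*;"
)


def charset_positions(nexus_text):
    """Map each charset name to the set of 1-based columns it selects."""
    sets = {}
    for name, start, end, step in CHARSET_RE.findall(nexus_text):
        sets[name] = set(range(int(start), int(end) + 1, int(step or 1)))
    return sets


def check_tiling(sets, n_columns):
    """Return (columns in no charset, columns in more than one charset)."""
    seen, overlaps = set(), set()
    for cols in sets.values():
        overlaps |= seen & cols
        seen |= cols
    missing = set(range(1, n_columns + 1)) - seen
    return sorted(missing), sorted(overlaps)


# First five charsets from the list above; together they span columns 1-2829.
EXAMPLE = r"""
charset 12S = 1-967;
charset 16S = 968-2129;
charset BNDFcodonPos1 = 2132-2828\3;
charset BNDFcodonPos2 = 2130-2829\3;
charset BNDFcodonPos3 = 2131-2827\3;
"""

missing, overlaps = check_tiling(charset_positions(EXAMPLE), 2829)
print(missing, overlaps)
```

Both lists come back empty for these five charsets. Passing the full list with `n_columns=6626` would run the same check against the stated matrix length.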

  8. Data from: Weighing matrices and sequences

    • ro.uow.edu.au
    Updated Nov 12, 2024
    Cite
    Jennifer Seberry (2024). Weighing matrices and sequences [Dataset]. https://ro.uow.edu.au/articles/dataset/Weighing_matrices_and_sequences/27676044
    Explore at:
    Dataset updated
    Nov 12, 2024
    Dataset provided by
    University of Wollongong
    Authors
    Jennifer Seberry
    License

    https://uow.libguides.com/uow-ro-copyright-all-rights-reserved

    Description

    Hadamard mathematical matrices for weighing.

  9. Carabus RAD matrix

    • figshare.com
    zip
    Updated Jun 5, 2019
    Cite
    Tomochika Fujisawa; Masataka Sasabe; Nobuaki Nagata; Yasuoki Takami; Teiji Sota (2019). Carabus RAD matrix [Dataset]. http://doi.org/10.6084/m9.figshare.7546532.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 5, 2019
    Dataset provided by
    figshare
    Authors
    Tomochika Fujisawa; Masataka Sasabe; Nobuaki Nagata; Yasuoki Takami; Teiji Sota
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    A RAD sequence matrix in phylip format of 17 Carabus species (89 individual samples), assembled with pyRAD. pyRAD parameters: sequence similarity=70% and minimum number of taxa=45.

  10. Additional file 1: of Simple adjustment of the sequence weight algorithm...

    • springernature.figshare.com
    txt
    Updated Jun 3, 2023
    Cite
    Toshiyuki Oda; Kyungtaek Lim; Kentaro Tomii (2023). Additional file 1: of Simple adjustment of the sequence weight algorithm remarkably enhances PSI-BLAST performance [Dataset]. http://doi.org/10.6084/m9.figshare.c.3794830_D1.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    figshare
    Authors
    Toshiyuki Oda; Kyungtaek Lim; Kentaro Tomii
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The ASCII PSSM file made from MSA-A using PSI-BLAST. (ASCII, 5 kb)

  11. Beyond mutations: accounting for selection and self-organization in the...

    • search.dataone.org
    • data.niaid.nih.gov
    • +1 more
    Updated Mar 2, 2024
    Cite
    Georg F. Weber; Xiaoyong Wu; Shesh N. Rai (2024). Beyond mutations: accounting for selection and self-organization in the analysis of protein evolution [Dataset]. https://search.dataone.org/view/sha256%3Ae22f3af1fa00f7a87685c2679111181baf0ada17c717743c46402fa409ccf558
    Explore at:
    Dataset updated
    Mar 2, 2024
    Dataset provided by
    Dryad Digital Repository
    Authors
    Georg F. Weber; Xiaoyong Wu; Shesh N. Rai
    Time period covered
    Jan 1, 2024
    Description

    Molecular phylogenetic research has relied on the analysis of the coding sequences of genes or of the amino acid sequences of the encoded proteins. Enumerating the number of mismatches, as indicators of mutation, has been central to the pertinent algorithms. However, the constraining forces of selection and self-organization have been unaccounted for in conventional approaches, possibly causing available models to fall short of representing the actual evolutionary history. Specific amino acids possess quantifiable characteristics that enable the conversion from “words” (strings of letters denoting amino acids or bases) to “waves” (strings of quantitative values representing the physico-chemical properties) or to matrices (coordinates representing the positions in a comprehensive property space). The application of such numerical representations to evolutionary analysis takes into account not only mutation but also selection/self-organization as influences that drive speciation, because ...

    # Beyond Mutations: Accounting for Selection and Self-Organization in the Analysis of Protein Evolution

    https://doi.org/10.5061/dryad.tht76hf63

    Description of the data and file structure

    Publicly accessible sequences were collected from the NCBI landmark model organisms; we then sought to add representatives of diverse clades from NCBI nucleotide.

    Sharing/Access information

    Data was derived from the following sources:

    • NCBI

    Code/Software

    NA
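
The conversion from letter strings to strings of quantitative values mentioned in the description of this dataset can be illustrated with a very small sketch. The description does not say which physico-chemical properties the authors used; the Kyte-Doolittle hydropathy index below is chosen purely as an example scale, and the function name `words_to_waves` is hypothetical.

```python
# Kyte-Doolittle hydropathy index, used here only as an example property scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def words_to_waves(protein, scale=KYTE_DOOLITTLE):
    """Turn an amino-acid string into a list of numeric property values."""
    return [scale[residue] for residue in protein.upper()]


print(words_to_waves("MKV"))
```

Two such numeric profiles could then be compared position by position, rather than by counting letter mismatches.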

  12. Genetic data for Aurelia systematics

    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Jonathan Lawley; Edgar Gamero-Mora; Maximiliano Maronna; Luciano M. Chiaverano; Sergio N. Stampar; Russell R. Hopcroft; Allen Collins; André Morandini (2023). Genetic data for Aurelia systematics [Dataset]. http://doi.org/10.6084/m9.figshare.14502474.v3
    Explore at:
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jonathan Lawley; Edgar Gamero-Mora; Maximiliano Maronna; Luciano M. Chiaverano; Sergio N. Stampar; Russell R. Hopcroft; Allen Collins; André Morandini
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sequence sets, alignments and tree files associated with Lawley, J.W., Gamero-Mora, E., Maronna, M.M., Chiaverano, L.M., Stampar, S.N., Hopcroft, R.R., Collins, A.G., Morandini, A.C. 2021. The importance of molecular characters when morphological variability hinders diagnosability: systematics of the moon jellyfish genus Aurelia (Cnidaria: Scyphozoa). PeerJ https://doi.org/10.7717/peerj.11954.

    See below a description of the attached files. When applicable, just replace the word (Marker) with 16S, COI, ITS1 or 28S (for COI files, disregard the "-ip_ia1" that appears in parentheses in filenames). For relevant code used for molecular analyses based on these files see github.com/lawleyjw/Aurelia.

    • (Marker)-Aurelia-seqs.fasta - Sequence set used to generate (Marker)-Aurelia(-ip_ia1).fasta. Sequence IDs normally appear as ">(isolate/accession)-(previous species ID)-(sampling locality)".
    • (Marker)-Aurelia-seqs-UPDATED.fasta - Same sequence set as above, but sequence IDs appear with updated species names as ">(accession)-(updated species ID)|(previous sequence ID, as in (Marker)-Aurelia-seqs.fasta above)".
    • (Marker)-Aurelia(-ip_ia1).fasta - Alignment used to generate (Marker)-Aurelia.tre. Sequence IDs normally appear as ">(isolate/accession)-(previous species ID)-(sampling locality)".
    • (Marker)-Aurelia(-ip_ia1)-UPDATED.fasta - Same alignment as above, but sequence IDs appear with updated species names as ">(accession)-(updated species ID)|(previous sequence ID, as in (Marker)-Aurelia(-ip_ia1).fasta above)".
    • (Marker)-Aurelia.tre - Parsimony tree file in Newick format with branch lengths, derived from (Marker)-Aurelia(-ip_ia1).fasta; for Goodman-Bremer support values and bootstrap resampling frequencies see Fig. S4-S7 in Lawley et al. (2021).
    • concat-Aurelia.nex - Concatenated sequence matrix in Nexus file (from Sequence Matrix), including sequences of all markers (based on single-marker alignments) for some representative specimens of each species. This file was used to generate concat-Aurelia.tre and concatML-Aurelia.tre. For details on species composition and species ID see Table S5 in Lawley et al. (2021).
    • concat-Aurelia.tre - Concatenated parsimony tree file in Newick format with branch lengths, derived from concat-Aurelia.nex. For Goodman-Bremer support values and bootstrap resampling frequencies see Fig. 9 in Lawley et al. (2021).
    • concatML-Aurelia.tre - Concatenated maximum likelihood tree file in Nexus format (from FigTree), derived from concat-Aurelia.nex, including SH-aLRT and ultrafast bootstrap values, respectively (see Fig. S3 in Lawley et al., 2021).

  13. Datasets for phylogenetic analyses and phylogenetic trees for: Genetic...

    • zenodo.org
    bin, txt
    Updated Nov 5, 2024
    Cite
    Mariana L. Barone; Jeremy Dean Wilson; Martin J. Ramirez (2024). Datasets for phylogenetic analyses and phylogenetic trees for: Genetic barcodes for species identification and phylogenetic estimation in ghost spiders (Araneae: Anyphaenidae: Amaurobioidinae). Invertebrate Systematics, 2024 [Dataset]. http://doi.org/10.5281/zenodo.14035511
    Explore at:
    Available download formats: bin, txt
    Dataset updated
    Nov 5, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Mariana L. Barone; Jeremy Dean Wilson; Martin J. Ramirez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Nov 2024
    Description

    We combined the COI sequence data with legacy multigene sequence data to create a new, taxon-rich phylogeny for the Amaurobioidinae. We used sequences for four loci that have been used in previous studies on the subfamily: two mitochondrial loci, COI (658bp) and ribosomal subunit 16S (16S, 410bp); and two nuclear loci, Histone H3 (H3, 327bp) and ribosomal subunit 28S (28S, 839bp). We complemented the Amaurobioidinae data with sequences from several non-amaurobioidine anyphaenids and two clubionids as outgroups. Sequence alignment was performed using the MAFFT (ver. 7.308) plugin in Geneious, allowing MAFFT to automatically select an appropriate alignment strategy based on the properties of each locus, or with the online MAFFT server (https://mafft.cbrc.jp), which consistently selected the L-INS-i algorithm. Finally, alignments of the four loci were concatenated to construct a 2234 bp multigene sequence matrix containing 692 taxa, with about 55% missing/gap data (“full” matrix henceforth). To ensure that excessive missing data did not affect the resulting topology, we also constructed a reduced matrix by removing additional COI-only specimens so that each species and morphotype was represented by just one or two specimens for which all loci were available (where possible). After realignment, this reduced matrix was 2235 bp long, included 167 taxa, and had about 22% missing/gap data (“reduced” matrix henceforth). Phylogenetic analyses under maximum likelihood, including model selection, were then conducted with IQ-TREE 2. We performed phylogenetic analyses on both concatenated matrices (the full matrix and the reduced matrix) and on each individual locus. For model selection, we provided an initial scheme that partitioned the matrix by locus, and further partitioned the protein-coding loci (COI and H3) by codon position. We used ModelFinder and searched for the best partition scheme, all in IQ-TREE. 
The best models (partitions) for the full dataset were: GTR+F+I+G4 (16S), GTR+F+I+I+R4 (28S), TVM+F+I+I+R2 (COI-1), TIM2+F+R4 (COI-2), GTR+F+R5 (COI-3), TVMe+G4 (H3-1-H3-2), SYM+G4 (H3-3); and for the reduced dataset: GTR+F+I+G4 (16S), GTR+F+I+G4 (28S), GTR+F+I+G4 (COI-2), GTR+F+I+G4 (COI-3), TVM+F+I+G4 (COI-1, H3-2), GTR+F+I+G4 (H3-1), GTR+F+I+G4 (H3-3). For each dataset, once the best models and partitions were defined, we executed 10 independent replicates of tree calculations followed by 1000 ultrafast bootstrap replicates, and the replicate reaching the maximum likelihood was chosen. Phylogenetic analyses under parsimony were made with TNT, under equal weights, using the “new technology” search with default values, asking for 10 independent hits to the minimal length, and submitting the resulting trees to a round of TBR branch swapping.
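
The concatenation step described in this entry, and the resulting share of missing/gap data, can be sketched as follows. This is not the authors' pipeline: the taxon names and sequences are toy data, the function names are hypothetical, and padding absent taxa with `?` is an assumed (though common) convention.

```python
def concatenate(alignments, missing="?"):
    """Concatenate per-locus alignments ({taxon: aligned sequence}) into one matrix.

    Taxa absent from a locus are padded with the missing-data symbol.
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    matrix = {taxon: "" for taxon in taxa}
    for aln in alignments:
        length = len(next(iter(aln.values())))  # all rows of a locus share one length
        for taxon in taxa:
            matrix[taxon] += aln.get(taxon, missing * length)
    return matrix


def missing_fraction(matrix, symbols="?-"):
    """Fraction of cells in the matrix that are missing data or gaps."""
    cells = "".join(matrix.values())
    return sum(cells.count(s) for s in symbols) / len(cells)


# Toy loci: the second locus lacks sp3, as with COI-only specimens.
coi = {"sp1": "ATGA", "sp2": "ATGC", "sp3": "AT-A"}
h3 = {"sp1": "GG", "sp2": "GC"}

full = concatenate([coi, h3])
print(full["sp3"])
print(missing_fraction(full))
```

Dropping taxa that lack some loci before concatenating is what lowers the missing fraction in a reduced matrix.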

  14. Data from: Integrating UCE phylogenomics with traditional taxonomy reveals a...

    • datadryad.org
    • data.niaid.nih.gov
    • +1 more
    zip
    Updated Dec 29, 2021
    Cite
    Michael G. Branstetter; John T. Longino (2021). Integrating UCE phylogenomics with traditional taxonomy reveals a trove of New World Syscia species (Formicidae, Dorylinae) [Dataset]. http://doi.org/10.5061/dryad.08kprr50s
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 29, 2021
    Dataset provided by
    Dryad
    Authors
    Michael G. Branstetter; John T. Longino
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    2020
    Description

    We have deposited data and results files that support the molecular phylogenetic analyses presented in the study. Raw Illumina reads and contigs representing UCE loci have been deposited at the NCBI Sequence Read Archive and GenBank, respectively (BioProject# PRJNA615631). All newly generated COI sequences have been deposited at GenBank (MT267540-MT267668). Here we have deposited the concatenated UCE matrix, the COI matrix, all Trinity contigs, all tree files, unfiltered alignment files, and additional data analysis files (partitioning schemes, log files). The methods used to generate these data are described below and in the accompanying paper.

    DNA sequence generation: We selected 130 specimens for inclusion in molecular phylogenomic analysis (Table S1): 128 Syscia and two outgroup specimens from the genus Ooceraea. All sequence data were newly generated for this study, except for 5 samples, for which data were extracted from Oxley et al. (2014; Genome), Branstetter et al. (2017), and Borowiec (2019) (see Table S1). Vouchers were designated for each extraction; a voucher may be the extracted specimen itself (non-destructive DNA extraction) or, with varying degrees of subjectivity, a specimen from the same nest, collection series, or, rarely, population. Full voucher specimen details are in Supplementary Material, Table S2.

    To examine species boundaries and phylogenetic relationships among species and populations, we employed the UCE approach to phylogenomics (Faircloth et al. 2012, Faircloth et al. 2015, Branstetter et al. 2017), a method that combines targeted enrichment of ultraconserved elements (UCEs) with multiplexed, next-generation sequencing. All UCE molecular work was performed following the UCE methodology described in Branstetter et al. (2017). Briefly, the process involves DNA extraction, sample QC, DNA fragmentation (400-600 bp), library preparation, library pooling (equimolar pools of 10 or 11 samples), UCE enrichment, qPCR quantification, final pooling (up to 102 samples per sequencing pool), and sequencing. All sequencing was performed on an Illumina HiSeq 2500 instrument (2x125 bp v4 chemistry; Illumina Inc., San Diego, CA) by the University of Utah genomics core facility. To enrich UCE loci, we used an ant-customized bait set (“ant-specific hym-v2”) that includes 9,898 baits (120 mer) targeting 2,524 UCE loci shared across Hymenoptera and a set of legacy markers (data not used) (Branstetter et al. 2017). The ability of this bait set to successfully enrich UCE loci and resolve relationships in ants has been demonstrated in several studies (Branstetter et al. 2017, Pierce et al. 2017, Ward and Branstetter 2017, Blaimer et al. 2018, Branstetter and Longino 2019, Longino and Branstetter 2020).

    UCE matrix assembly: After sequencing, the University of Utah bioinformatics core demultiplexed the data using bcl2fastq v1.8 (Illumina, 2013) and made the data available for download. Once received, the sequence data were cleaned, assembled and aligned using PHYLUCE v1.6 (Faircloth 2016), which includes a set of wrapper scripts that facilitates batch processing of large numbers of samples. Within the PHYLUCE environment, we used the programs ILLUMIPROCESSOR v2.0 (Faircloth 2013), which incorporates TRIMMOMATIC (Bolger et al. 2014), for quality trimming raw reads, TRINITY v2013-02-25 (Grabherr et al. 2011) for de novo assembly of reads into contigs, and LASTZ v1.0 (Harris 2007) for identifying UCE contigs from all contigs. All optional PHYLUCE settings were left at default values for these steps. For the bait sequences file needed to identify and extract UCE contigs, we used the ant-specific hym-v2 bait file. To calculate assembly statistics, including sequencing coverage, we used scripts from the PHYLUCE package (phyluce_assembly_get_trinity_coverage and phyluce_assembly_get_trinity_coverage_for_uce_loci) that call the programs BWA v0.7.7 (Li and Durbin 2010) and GATK v3.8 (McKenna et al. 2010).

    After extracting UCE contigs, we aligned each UCE locus using a stand-alone version of the program MAFFT v7.130b (Katoh and Standley 2013) and the L-INS-i algorithm. We then used a PHYLUCE wrapper to trim flanking regions and poorly aligned internal regions using the program GBLOCKS (Talavera and Castresana 2007). The program was run with reduced stringency parameters (b1:0.5, b2:0.5, b3:12, b4:7). We then used another PHYLUCE script to filter the initial set of alignments so that each alignment was required to include data for ≥ 90% of taxa. This resulted in a final set of 1,388 alignments and 1,035,633 bp of sequence data for analysis. To calculate summary statistics for the final data matrix, we used a script from the PHYLUCE package (phyluce_align_get_align_summary_data). Information related to UCE sequencing and assembly results can be found in Supplemental Material, Table S3. All steps, including the phylogenetic analyses described below, were performed on a multicore Linux workstation (40 CPUs and 512 Gb of memory).
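
The taxon-occupancy filter described above (keep an alignment only if it has data for at least 90% of taxa) can be sketched in a few lines. This is an illustration of the criterion, not the PHYLUCE script itself; the locus names, the toy data and the function name `filter_by_occupancy` are all hypothetical.

```python
def filter_by_occupancy(alignments, n_taxa, min_fraction=0.9):
    """Keep loci whose alignment has data for at least min_fraction of taxa.

    alignments maps locus name -> {taxon: aligned sequence}. A taxon counts as
    present only if its row contains something other than gap/missing symbols.
    """
    kept = {}
    for locus, aln in alignments.items():
        present = sum(1 for seq in aln.values() if set(seq) - set("-?"))
        if present / n_taxa >= min_fraction:
            kept[locus] = aln
    return kept


taxa = [f"taxon{i}" for i in range(10)]
toy = {
    "uce-1": {t: "ACGT" for t in taxa[:9]},  # 9 of 10 taxa: kept
    "uce-2": {t: "ACGT" for t in taxa[:8]},  # 8 of 10 taxa: dropped
}
kept = filter_by_occupancy(toy, n_taxa=10)
print(sorted(kept))
```

Raising `min_fraction` trades fewer loci for a more complete matrix; lowering it does the reverse.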

    Phylogenomic analysis: To partition the UCE data for phylogenetic analysis, we used the Sliding-Window Site Characteristics based on entropy method (SWSC-EN; Tagliacollo and Lanfear 2018), which breaks UCE loci into three regions, corresponding to the right flank, core, and left flank. The theoretical underpinning of the approach comes from the observation that UCE core regions are conserved, while the flanking regions become increasingly more variable (Faircloth et al. 2012). After running the SWSC-EN algorithm, the resulting data subsets were analyzed using PARTITIONFINDER2 (Lanfear et al. 2012, Lanfear et al. 2017). For this analysis we used the rclusterf algorithm, AICc model selection criterion, and the GTR+G model of sequence evolution. The resulting best-fit partitioning scheme included 1,126 data subsets and had a significantly better log likelihood than alternative partitioning schemes (SWSC-EN: -5,608,249.502; By Locus: -5,639,169.680; Unpartitioned: -5,731,679.666).

    Using the SWSC-EN partitioning scheme, we inferred phylogenetic relationships of Syscia with the likelihood-based program IQ-TREE v1.5.5 (Nguyen et al. 2015). For the analysis we selected the “-spp” option for partitioning (linked branch lengths but allowing each partition to have its own evolutionary rate) and the GTR+F+G4 model of sequence evolution. To assess branch support, we performed 1,000 replicates of the ultrafast bootstrap approximation (UFB) (Minh et al. 2013, Hoang et al. 2018) and 1,000 replicates of the branch-based, SH-like approximate likelihood ratio test (Guindon et al. 2010). For these support measures, values ≥ 95% and ≥ 80%, respectively, signal that a clade is supported.
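The two support thresholds above can be applied to branch labels with a short sketch. It assumes each branch label is a string of the form "SH-aLRT/UFBoot" (for example "98.2/100"); check that ordering against the tree file actually produced before relying on it. The function names are hypothetical.

```python
from typing import Dict, Tuple

UFB_THRESHOLD = 95.0      # ultrafast bootstrap, percent
SH_ALRT_THRESHOLD = 80.0  # SH-like approximate likelihood ratio test, percent


def parse_support_label(label: str) -> Tuple[float, float]:
    """Split a label assumed to be 'SH-aLRT/UFBoot' into two floats."""
    sh_alrt_text, ufb_text = label.split("/")
    return float(sh_alrt_text), float(ufb_text)


def assess_support(label: str) -> Dict[str, bool]:
    """Report whether each support measure meets its threshold."""
    sh_alrt, ufb = parse_support_label(label)
    return {
        "sh_alrt": sh_alrt >= SH_ALRT_THRESHOLD,
        "ufb": ufb >= UFB_THRESHOLD,
    }


# Hypothetical branch labels for illustration.
for branch_label in ["98.2/100", "79.9/96", "85.0/94.9"]:
    print(branch_label, assess_support(branch_label))
```

Reporting the two measures separately keeps it visible when a clade passes one threshold but not the other.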

    COI barcode analysis: Due to the high abundance of mitochondrial DNA in samples and the less-than-perfect efficiency of target enrichment methods, Cytochrome Oxidase I (COI) sequence data, and sometimes entire mitochondrial genomes (see Ströher et al. 2016), are often generated as a byproduct of the UCE sequencing process. To provide a separate assessment of species identities, possibly with more samples included, we extracted COI sequences from our UCE-enriched samples and combined them with Syscia COI sequences downloaded from the BOLD database (Ratnasingham and Hebert 2007) (accessed 16 May 2019). To extract COI from UCE data, we downloaded a complete 658 bp barcode sequence of a Costa Rican Syscia specimen from BOLD (Process ID ACGAE095-10, identified by us as S. benevidesae, one of the new species in this work) and used this as the bait input sequence for a PHYLUCE program (phyluce_assembly_match_contigs_to_barcodes) that extracts COI sequence from bulk sets of contigs.

    After extracting COI sequence from UCE sample data, we downloaded accessible barcode sequences from BOLD following a series of steps. First, using the BOLD workbench interface, we searched for all records matching the taxonomy search term “Syscia” or “Cerapachys”. We then copied all of the resulting Barcode Index Numbers (BINs) and performed a second search using these numbers in the identifiers field. This approach recovers taxonomically mislabeled samples because BINs group sequences into units by sequence similarity, not name (Ratnasingham and Hebert 2013). All returned sequences were downloaded, examined, and subsequently filtered to remove Old World specimens and entries with no sequence data. We also removed a misidentified sample from Madagascar and a sequence mined from GenBank that had no accompanying specimen data. Because some of the remaining sequences included private, unpublished data, we contacted data owners for permission to use the private sequences in our analyses.

    We combined the final set of BOLD sequences with the successfully extracted COI sequences from UCE samples and aligned the data using MAFFT. We visually inspected the resulting alignment for signs of pseudogenes/numts (e.g. presence of stop codons, indels, or highly divergent sequence) or other anomalies using MESQUITE v3.51 (Maddison and Maddison 2018). The final matrix was partitioned by codon position and analyzed with IQ-TREE using GTR+F+G4, 1,000 ultrafast bootstrap replicates, and 1,000 SH-like replicates. Following a preliminary analysis of all samples, we discovered that a set of 79 putative “Cerapachys” samples actually belonged to the phylogenetically distinct genus Neocerapachys. Consequently, we removed these samples from our data set and updated determinations in BOLD. Sample information for the final set of 86 BOLD specimens included in our analysis is available in Supplemental Material, Table S4.
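One of the pseudogene/numt checks mentioned above, the presence of stop codons, can be sketched as a simple in-frame scan. This is only an illustration of the idea and not the visual inspection performed in MESQUITE. It assumes the invertebrate mitochondrial genetic code (NCBI translation table 5), in which TAA and TAG are the stop codons, and it assumes the caller already knows the correct reading frame.

```python
from typing import List

# Stop codons under the invertebrate mitochondrial code (NCBI table 5).
# Note that TGA encodes tryptophan under this code and is not a stop.
STOP_CODONS = {"TAA", "TAG"}


def find_stop_codons(ungapped_seq: str, frame: int = 0) -> List[int]:
    """Return 0-based start positions of in-frame stop codons.

    ungapped_seq: nucleotide sequence with alignment gaps already removed.
    frame: 0, 1, or 2; offset of the first complete codon (an assumption
    the caller must verify for their own alignment).
    """
    seq = ungapped_seq.upper()
    positions = []
    for start in range(frame, len(seq) - 2, 3):
        if seq[start:start + 3] in STOP_CODONS:
            positions.append(start)
    return positions


# Hypothetical short sequences for illustration.
print(find_stop_codons("ATGTAAGGA"))           # contains an in-frame TAA
print(find_stop_codons("ATGTAAGGA", frame=1))  # same bases, shifted frame
print(find_stop_codons("ATTTGAGGT"))           # TGA is not a stop here
```

A sequence with in-frame stops is only a candidate for closer inspection; indels and unusually high divergence, the other signs listed above, need separate checks.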

  15. Data from: Predicting bonds between collagens – An alignment approach

    • data.mendeley.com
    Updated Feb 16, 2024
    Cite
    Valentin Wesp (2024). Predicting bonds between collagens – An alignment approach [Dataset]. http://doi.org/10.17632/9pf5n687z4.2
    Explore at:
    Dataset updated
    Feb 16, 2024
    Authors
    Valentin Wesp
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The supplement contains all materials (data, scripts, etc.) used in this publication.

  16. The mean and standard error of the standard scores of assigning sequences to...

    • data.subak.org
    xls
    Updated Feb 16, 2023
    Cite
    Figshare (2023). The mean and standard error of the standard scores of assigning sequences to each protein family based on the emission matrix and similarity emission matrix [Dataset]. http://doi.org/10.1371/journal.pone.0080565.t004
    Explore at:
    Available download formats: xls
    Dataset updated
    Feb 16, 2023
    Dataset provided by
    figshare
    Figshare http://figshare.com/
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The mean and standard error of the standard scores of assigning sequences to each protein family based on the emission matrix and similarity emission matrix.

  17. DNA matrix combined (nuclear and indels coded) datasets for Hyptidinae...

    • data.niaid.nih.gov
    • datadryad.org
    • +1more
    zip
    Updated Jul 12, 2021
    Cite
    José Floriano Pastore (2021). DNA matrix combined (nuclear and indels coded) datasets for Hyptidinae (Lamiaceae) [Dataset]. http://doi.org/10.5061/dryad.1rn8pk0sc
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 12, 2021
    Dataset provided by
    Universidade Federal de Santa Catarina
    Authors
    José Floriano Pastore
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Hyptidinae, ca. 400 species, is an important component of Neotropical vegetation formations. Members of the subtribe possess flowers arranged in variously modified bracteolate cymes and nutlets with an expanded areole and all share a unique explosive mechanism of pollen release, except for Asterohyptis. In a recent phylogenetic study, the group had its generic delimitations rearranged with the recognition of 19 genera in the subtribe. Although the previous phylogenetic analysis covered almost all the higher taxa in the subtribe, it lacked a broader sampling at the species level. Here we present a new expanded phylogenetic analysis for the subtribe comprising 153 accessions of Hyptidinae sequenced for the nuclear nrITS, nrETS, and waxy regions and the plastid markers trnL-F, trnS-G, trnD-T, and matK. Our results widely support the previous phylogenetic results with some changes in the support and relationship between genera. It also uncovers the need for a new combination of Eriope machrisae in Hypenia and the phylogenetic position of Hyptis sect. Rhytidea, which was demonstrated to be part of Mesosphaerum. The generic delimitation in Hyptidinae is discussed, and we recommend that further studies with more markers are needed to confirm the monophyly of Hyptidendron and Mesosphaerum, as well as to support taxonomic changes on the infrageneric delimitation within Hyptis s. s.

    Methods

    DNA Amplification and Sequencing: Total genomic DNA was extracted from fresh or silica-gel-dried leaf material, from sodium chloride/CTAB-preserved material (Chase and Hills 1991), or from herbarium specimens. Fragments sampled from herbarium specimens were taken from the HUEFS and K collections. A rescaled version of the Doyle and Doyle (1987) protocol was used for genomic DNA extractions. We chose the nuclear ribosomal internal transcribed spacer region (nrITS), including both ITS1 and ITS2 and the intervening 5.8S; the nuclear ribosomal external transcribed spacer region (nrETS); and the nuclear low-copy waxy region, granule-bound starch synthase I (GBSSI). The plastid regions used were 3’trnK-matK (including partial trnK intron and matK coding region), the trnL-F region (including the trnL intron and the trnL-trnF intergenic spacer), trnD-T, and the trnS-G region (including trnS-psbZ intergenic spacer, partial sequence; psbZ gene, complete cds; psbZ-trnG intergenic spacer, complete sequence; and tRNA-Gly (trnG) gene). The nrITS region was amplified using the primers 17SE and 26SE of Sun et al. (1994). The 3′ 18S-IGS primer of Baldwin and Markos (1998) and the 5′ primer ETS-B (Beardsley and Olmstead 2002) were used to amplify a portion of the 3′ end of the nrETS. The nuclear waxy region was sequenced between the GBSSI primers bd9f and bd11r, following the methodology of Drew and Sytsma (2013). For the GBSSI gene, we therefore used a nested PCR approach to amplify the region between (and including parts of) exons 7–11. The initial PCR used the primers bd7f and bd12r (Drew and Sytsma 2013). The PCR product from this amplification was then used (after 1:20 dilution) as a template for the second PCR, using the primers bd9f and bd11r. The product of this amplification was then sequenced with the same primers used in the nested PCR. The partial matK/trnK locus was amplified using 390F and 1326R (Cuénoud et al. 2002). The whole trnL-trnF region was amplified using primers “c” and “f” of Taberlet et al. (1991), with the internal primers “d” and “e” used for some problematic samples. For amplifying the trnS-G spacer we used the set of primers described in Shaw et al. (2007). The spacer trnD-T was amplified for most taxa using the primers of Demesure et al. (1995), trnD GUC and trnT GGU. For some samples that could not be amplified using these primers, we used the internal primer trnY GUA (Shaw et al. 2005). The plastid loci were sequenced using the same set of primers used for the amplification, whereas the nuclear nrITS was sequenced using internal primers ITS92 (Desfeux and Lejeune 1996) and ITS4 (White et al. 1990) with the same PCR program.

    All PCR amplifications were performed in a final volume of 10 µL containing 5 µL of TopTaq master mix kit (Qiagen, Valencia, California), 2.25 pmol of each primer, 5–10 ng of genomic DNA, and ultrapure H2O (enough to complete the volume to 10 µL). For the ITS amplification, we added 2% DMSO (dimethyl sulfoxide) and 1 M betaine. All regions were amplified using initial denaturation at 94°C (5 min); 28 (ITS) or 32 (plastid loci) cycles of denaturation at 94°C (1 min), annealing at 52°C (ITS) or 54°C (plastid loci) (1 min), and elongation at 72°C (2 min); and a final elongation of 4 min. Amplified products were purified using precipitation with an 11% solution of polyethylene glycol (PEG) 8000 and ethanol cleaning. Sequencing reactions in both directions were performed using BigDye Terminator 3.1 (Applied Biosystems, Carlsbad, California) chemistry and analyzed on an ABI3130XL sequencer (Applied Biosystems/Life Technologies Corporation, Carlsbad, California) following the manufacturer’s protocol at Universidade Estadual de Feira de Santana, Bahia, Brazil. Some PCR products were sequenced at the Interdisciplinary Center for Biotechnology Research at the University of Florida, Gainesville.

    Sequence Assembly and Alignment: The sequences were edited using Geneious 6.1.8 (https://www.geneious.com) and aligned using the program Clustal X 2 (Larkin et al. 2007); alignments were checked by eye. Gaps were coded according to the "simple coding" criterion of Simmons and Ochoterena (2000) using the software Seqstate v.1.4.1 (Müller 2005).
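The idea behind simple gap coding can be illustrated with a short sketch: every gap with a distinct pair of start and end positions becomes one presence/absence character, and a sequence whose longer gap fully spans the gap being coded is scored as inapplicable. This is a simplified reading of the criterion and not the Seqstate implementation. The function names are hypothetical, and leading or trailing gaps, which are often treated as missing data in practice, get no special handling here.

```python
from typing import Dict, List, Tuple

Gap = Tuple[int, int]  # inclusive (start, end) alignment columns


def find_gaps(seq: str) -> List[Gap]:
    """Return the inclusive column ranges of each run of '-' in seq."""
    gaps: List[Gap] = []
    start = None
    for i, ch in enumerate(seq):
        if ch == "-":
            if start is None:
                start = i
        elif start is not None:
            gaps.append((start, i - 1))
            start = None
    if start is not None:
        gaps.append((start, len(seq) - 1))
    return gaps


def simple_gap_coding(alignment: Dict[str, str]) -> Tuple[List[Gap], Dict[str, str]]:
    """Code gaps as characters: '1' present, '0' absent, '-' inapplicable."""
    gaps_by_seq = {name: find_gaps(seq) for name, seq in alignment.items()}
    characters = sorted({g for gaps in gaps_by_seq.values() for g in gaps})
    matrix = {}
    for name, gaps in gaps_by_seq.items():
        row = []
        for start, end in characters:
            if (start, end) in gaps:
                row.append("1")
            elif any(gs <= start and ge >= end for gs, ge in gaps):
                row.append("-")  # a longer gap spans this one
            else:
                row.append("0")
        matrix[name] = "".join(row)
    return characters, matrix


# Hypothetical toy alignment for illustration.
toy = {
    "A": "ACGTACGT",
    "B": "AC--ACGT",
    "C": "A----CGT",
    "D": "AC--ACGT",
}
gap_characters, gap_matrix = simple_gap_coding(toy)
print(gap_characters)
print(gap_matrix)
```

In the toy alignment, sequence C carries the longer gap, so it is scored as inapplicable for the shorter gap shared by B and D.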

  18. Data from: A framework phylogeny of the American oak clade based on...

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    bin, zip
    Updated May 27, 2022
    Cite
    Andrew L. Hipp; Deren A. R. Eaton; Jeannine Cavender-Bares; Elisabeth Fitzek; Rick Nipper; Paul S. Manos (2022). Data from: A framework phylogeny of the American oak clade based on sequenced RAD data [Dataset]. http://doi.org/10.5061/dryad.ts2hj
    Explore at:
    Available download formats: zip, bin
    Dataset updated
    May 27, 2022
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Andrew L. Hipp; Deren A. R. Eaton; Jeannine Cavender-Bares; Elisabeth Fitzek; Rick Nipper; Paul S. Manos
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Previous phylogenetic studies in oaks (Quercus, Fagaceae) have failed to resolve the backbone topology of the genus with strong support. Here, we utilize next-generation sequencing of restriction-site associated DNA (RAD-Seq) to resolve a framework phylogeny of a predominantly American clade of oaks whose crown age is estimated at 23–33 million years old. Using a recently developed analytical pipeline for RAD-Seq phylogenetics, we created a concatenated matrix of 1.40 × 10^6 aligned nucleotides, constituting 27,727 sequence clusters. RAD-Seq data were readily combined across runs, with no difference in phylogenetic placement between technical replicates, which overlapped by only 43–64% in locus coverage. Of the loci we analyzed, 17% (4,715) could be mapped with high confidence to one or more expressed sequence tags (ESTs) in NCBI GenBank. A concatenated matrix of the loci that BLAST to at least one EST sequence provides approximately half as many variable or parsimony-informative characters as equal-sized datasets from the non-EST loci. The EST-associated matrix is more complete (fewer missing loci) and has slightly lower homoplasy than non-EST subsampled matrices of the same size, but there is no difference in phylogenetic support or relative attribution of base substitutions to internal versus terminal branches of the phylogeny. We introduce a partitioned RAD visualization method (implemented in the R package RADami; http://cran.r-project.org/web/packages/RADami) to investigate the possibility that suboptimal topologies supported by large numbers of loci (due, for example, to reticulate evolution or lineage sorting) are masked by the globally optimal tree. We find no evidence for strongly supported alternative topologies in our study, suggesting that the phylogeny we recover is a robust estimate of large-scale phylogenetic patterns in the American oak clade. Our study is one of the first to demonstrate the utility of RAD-Seq data for inferring phylogeny in a 23–33 million-year-old clade.

  19. UniPROBE

    • neuinfo.org
    • scicrunch.org
    • +2more
    Updated Mar 22, 2025
    Cite
    (2025). UniPROBE [Dataset]. http://identifiers.org/RRID:SCR_005803
    Explore at:
    Dataset updated
    Mar 22, 2025
    Description

    Database that hosts experimental data from universal protein binding microarray (PBM) experiments (Berger et al., 2006) and their accompanying statistical analyses from prokaryotic and eukaryotic organisms, malarial parasites, yeast, worms, mouse, and human. It provides a centralized resource for accessing comprehensive data on the preferences of proteins for all possible sequence variants ("words") of length k ("k-mers"), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. The database's web tools include a text-based search, a function for assessing motif similarity between user-entered data and database PWMs, and a function for locating putative binding sites along user-entered nucleotide sequences.

  20. Total Ortholog Median Matrix (TOMM): an alternative unsupervised approach...

    • datadryad.org
    • explore.openaire.eu
    • +2more
    zip
    Updated Dec 10, 2020
    Cite
    Sandra R. Maruyama; Luana A. Rogerio; Patrícia Domingues Freitas; Marta Maria Teixeira; José Marcos Ribeiro (2020). Total Ortholog Median Matrix (TOMM): an alternative unsupervised approach for phylogenomics based on evolutionary distance between protein coding genes [Dataset]. http://doi.org/10.5061/dryad.b1k526g
    Explore at:
    zipAvailable download formats
    Dataset updated
    Dec 10, 2020
    Dataset provided by
    Dryad
    Authors
    Sandra R. Maruyama; Luana A. Rogerio; Patrícia Domingues Freitas; Marta Maria Teixeira; José Marcos Ribeiro
    Time period covered
    2019
    Description

    Supplemental Figures: S1, S2, S3, S4, S5, S6 and S7

    S1-S6: Phylograms using different percentiles of the ranked ortholog pairs of 46 species of Kinetoplastida protozoa (S1, S2 and S4), different cutoff for E-value (S3) and different methods for orthology inference (S5 and S6).

    S7: Bar graph showing the total number of orthologs identified by the RSD and OrthoMCL algorithms for 78 pairwise species combinations used in the analysis, based on 13 species with sequences retrieved from the TriTryp database, as indicated in Table 1 ("Proteins sequence source" column). Intersections (shared orthologs) and unique orthologs were calculated with gene ID lists as input using the Venn diagram tool (http://bioinformatics.psb.ugent.be/webtools/Venn/).

    File name: Supplemental_Figures_S1_S2_S3_S4_S5_S6_S7.pdf

    Supplemental Table 1: Kinetoplastida pairwise matrices

    Excel spreadsheet containing the resulting tables of pairwise ortholog data (pairwise matrices). Sheet "AA distance": amino acid distance obtain...
