41 datasets found
  1. Data from: Divergent lineages in a semi-arid mallee species, Eucalyptus...

    • explore.openaire.eu
    • data.niaid.nih.gov
    • +2 more
    Updated Aug 28, 2021
    Cite
    Patrick Fahey; Rachael Fowler; Todd McLay; Frank Udovicic; David Cantrill; Michael Bayly (2021). Data from: Divergent lineages in a semi-arid mallee species, Eucalyptus behriana, correspond to a major geographic break in south-eastern Australia [Dataset]. http://doi.org/10.5061/dryad.v9s4mw6sm
    Explore at:
    Dataset updated
    Aug 28, 2021
    Authors
    Patrick Fahey; Rachael Fowler; Todd McLay; Frank Udovicic; David Cantrill; Michael Bayly
    Area covered
    Australia, Eastern states of Australia
    Description

    Aim: To infer relationships between populations of the semi-arid, mallee eucalypt, Eucalyptus behriana, to build hypotheses regarding evolution of major disjunctions in the species’ distribution and to expand understanding of the biogeographical history of south-eastern Australia.

    Location: South-eastern Australia.

    Taxon: Eucalyptus behriana (Myrtaceae, Angiospermae).

    Methods: We developed a large dataset of anonymous genomic loci for 97 samples from 11 populations of E. behriana using double digest restriction site associated DNA sequencing (ddRAD-seq), to determine genetic relationships between the populations. These relationships, along with species distribution models, were used to construct hypotheses regarding environmental processes that have driven fragmentation of the species’ distribution.

    Results: Greatest genetic divergence was between populations on either side of the Lower Murray Basin. Populations west of the Basin showed greater genetic divergence between one another than the eastern populations. The most genetically distinct population in the east (Long Forest) was separated from others by the Great Dividing Range. A close relationship was found between the outlying northernmost population (near West Wyalong) and those in the Victorian Goldfields despite a large disjunction between them.

    Conclusions: Patterns of genetic variation are consistent with a history of vicariant differentiation of disjunct populations. We infer that an early disjunction to develop in the species distribution was that across the Lower Murray Basin, an important biogeographical barrier separating many dry sclerophyll plant taxa in south-eastern Australia. Additionally, our results suggest that the western populations fragmented earlier than the eastern ones, with this fragmentation, both west and east of the Murray Basin, likely tied to climatic changes associated with glacial-interglacial cycles, although major geological events including uplift of the Mount Lofty Ranges and basalt flows in the Newer Volcanics Province possibly also played a role.

    Supplementary tables.docx - Tables containing information on ipyrad parameters and individual samples in the dataset used for analyses.

    Concatenated_alignment.nex - Nexus format concatenated alignment of all loci generated by ipyrad, used in phylogenetic analyses.

    One_SNP_per_5000bp_Egrandis_reference_genepop.txt - Genepop format file containing the filtered SNP dataset, with no more than one SNP per 5000 bp of the E. grandis reference genome used to assemble loci in ipyrad.
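The "no more than one SNP per 5000 bp" filter mentioned for the Genepop file can be illustrated with a minimal sketch. The listing does not say how the authors thinned their SNPs, so the greedy rule below, the function name `thin_snps`, and the example coordinates are all assumptions made for illustration only.

```python
from collections import defaultdict


def thin_snps(snps, min_spacing=5000):
    """Greedy thinning (an assumed rule, not the authors' documented method):
    keep a SNP only if it lies at least `min_spacing` bp after the last SNP
    kept on the same chromosome or scaffold."""
    positions_by_chrom = defaultdict(list)
    for chrom, pos in snps:
        positions_by_chrom[chrom].append(pos)

    kept = []
    for chrom in sorted(positions_by_chrom):
        last_kept = None
        for pos in sorted(positions_by_chrom[chrom]):
            if last_kept is None or pos - last_kept >= min_spacing:
                kept.append((chrom, pos))
                last_kept = pos
    return kept


# Hypothetical (chromosome, position) pairs, not taken from the dataset.
example = [("Chr01", 100), ("Chr01", 2500), ("Chr01", 5100),
           ("Chr01", 9000), ("Chr02", 40)]
thinned = thin_snps(example)
print(thinned)
```

A greedy pass over sorted positions is only one of several reasonable thinning strategies; a tool-based filter may pick different SNPs within each window.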

  2. Data from: Exploring the impact of read clustering thresholds on...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Apr 19, 2023
    Cite
    Hutter, Carl R. (2023). Data from: Exploring the impact of read clustering thresholds on RADseq-based systematics: an empirical example from European amphibians. [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7829242
    Explore at:
    Dataset updated
    Apr 19, 2023
    Dataset provided by
    Dufresnes, Christophe
    Crochet, Pierre-André
    Policain, Mathieu
    Hutter, Carl R.
    Elmer, Kathryn E.
    Babik, Wieslaw
    Arntzen, Jan W.
    Galan, Pedro
    Rancilhac, Loïs
    Deso, Grégory
    Duguet, Rémi
    Sabino-Pinto, Joana
    Vences, Miguel
    Capstick, Maria
    Priol, Pauline
    Pabijan, Maciej
    Sylvestre, Florent
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains genetic sequences obtained from Hybrid-Enrichment and RAD sequencing protocols of the amphibian genera Discoglossus, Lissotriton, Rana and Triturus, as well as phylogenetic trees inferred from the RADseq data. This data was generated for the manuscript "Exploring the impact of read clustering thresholds on RADseq-based systematics: an empirical example from European amphibians.", in which we tested the influence of the clustering threshold used to assemble RADseq data on downstream phylogenetic inferences. Details on the data generation and analyses can be found in the manuscript and related supplementary materials.

    The repository is organised as follows:

    -> Hybrid-Enrichment: alignments of the Hybrid-Enrichment markers in phylip/fasta format (with one subdirectory for each of the four datasets assembled: Discoglossus, Lissotriton, Rana, Triturus)

    -> RADseq: Assemblies and phylogenetic trees obtained from a RADseq protocol

        --> Assemblies: RADseq assemblies (complete loci sequences and SNP matrices, spreadsheets with assembly metrics). Divided into "iCT" (assemblies produced with 23 different intra-sample Clustering Thresholds [iCT] and a fixed between-samples Clustering Threshold [bCT]) and "bCT" (assemblies produced with a fixed iCT and 23 different bCTs). Both iCT and bCT are further divided into four sub-directories corresponding to the four datasets: Discoglossus, Lissotriton, Rana, Triturus.

        --> Trees: Phylogenetic trees inferred from the aforementioned assemblies. Divided into "iCT" (RAxML concatenation trees inferred from the assemblies with different iCTs) and "bCT" (RAxML concatenation trees and Tetrad species trees inferred from the assemblies with different bCTs).

  3. Genotyping-by-sequencing data for Corybas acotiniflorus complex...

    • data.csiro.au
    • researchdata.edu.au
    Updated Dec 2, 2020
    Cite
    Katharina Nargar; Natascha Wagner; Mark Clements; Lalita Simpson (2020). Genotyping-by-sequencing data for Corybas acotiniflorus complex (Orchidaceae) from eastern Australia. [Dataset]. http://doi.org/10.25919/5vyn-mh60
    Explore at:
    Dataset updated
    Dec 2, 2020
    Dataset provided by
    CSIRO http://www.csiro.au/
    Authors
    Katharina Nargar; Natascha Wagner; Mark Clements; Lalita Simpson
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Time period covered
    Apr 1, 2015 - Oct 26, 2020
    Area covered
    Dataset funded by
    CSIRO http://www.csiro.au/
    Georg August University Goettingen
    NSW Office of Environment and Heritage
    James Cook University
    Australian Tropical Herbarium
    Center for Australian National Biodiversity Research
    Description

    Genotyping-by-sequencing data (raw reads and assembled/aligned) for a conservation genomic study on the Corybas aconitiflorus species complex (Acianthinae, Diurideae, Orchidaceae). Reference: Natascha D. Wagner, Mark A. Clements, Lalita Simpson, Katharina Nargar: Conservation in the face of hybridisation: genome-wide study to evaluate taxonomic delimitation and conservation status of a threatened orchid species. Conservation Genetics (accepted manuscript).

    Lineage: Material: The dataset includes 70 samples from the Corybas aconitiflorus complex: C. aconitiflorus (24 samples, 9 localities), C. barbarae (32 samples, 5 localities), C. dowlingii (14 samples, 2 localities), and Corybas pruinosus (2 samples). Sampling focused on the south-eastern distribution of the C. aconitiflorus complex. It extended from the restricted distribution of C. dowlingii (between Port Macquarie and Newcastle, New South Wales) ca. 300 km northwards to the border between New South Wales and Queensland (Uralba), ca. 1,200 km southwards to Tasmania (Ulverstone), and ca. 600 km eastwards to Lord Howe Island.

    DNA extraction, ddRAD library preparation, and sequencing: Total DNA was extracted from silica-dried leaf material using a modified CTAB protocol (Weising et al. 2005). Double-digest restriction-site associated DNA (ddRAD) sequencing libraries were prepared following Peterson et al. (2012) with the enzyme combination PstI and NlaIII. Quality and reproducibility of libraries and DNA sequencing were assessed by running five samples in duplicate (6.9 % of all samples). Multiplexed libraries were sequenced on one lane of a NextSeq500 sequencing platform (Illumina Inc., San Diego, CA, USA) as single-ended, 150 bp reads at the Australian Genome Research Facility (AGRF; Melbourne, Victoria, Australia).

    Bioinformatics and data filtering: Quality of the sequence reads was examined using FastQC v.0.11.5 (Andrews 2010). Raw sequences were demultiplexed, trimmed and further processed using the ipyrad pipeline v.0.6.15 (Eaton and Overcast 2016). In an initial filtering step, reads with more than five low quality bases (Phred quality score < 20) were excluded from the data set. The Phred quality score offset was set to 33. The strict adapter trimming option was selected, and a minimum read length of 35 bp after trimming was required to retain a read in the dataset. After these quality-filtering steps, the reads were clustered within and across samples by similarity of 85% using the vclust function in VSEARCH (Edgar 2010). The alignment was carried out using MUSCLE (Edgar 2004) as implemented in ipyrad. Clusters with fewer than six reads were excluded in order to ensure accurate base calls. The resulting clusters represent putative RAD loci shared across samples. A maximum of five uncalled bases (‘Ns’) and a maximum of eight heterozygote sites (‘Hs’) were allowed in the consensus sequences. The maximum number of single nucleotide polymorphisms (SNPs) within a locus was set to ten and the maximum number of indels per locus to five. For the sample set including all accessions of the C. aconitiflorus complex as well as two accessions of C. pruinosus as outgroup, ipyrad runs for two different datasets were generated, i.e. based on loci shared by at least 20 individuals (m20) and on loci shared by at least 70 individuals (m70). Additionally, the same settings were used for ipyrad runs excluding the outgroup (C. pruinosus, 2 samples).
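The initial read filter described above (Phred offset 33, reads with more than five bases below Phred 20 excluded) can be sketched in a few lines. This is an illustration of the stated rule only, not ipyrad's implementation; the function names and the example quality strings are hypothetical.

```python
def low_quality_count(quality_string, offset=33, min_phred=20):
    """Count bases whose Phred score (ASCII code minus offset) is below min_phred."""
    return sum(1 for ch in quality_string if ord(ch) - offset < min_phred)


def passes_filter(quality_string, max_low_quality=5, offset=33, min_phred=20):
    """A read is retained unless it has MORE than max_low_quality low-quality bases."""
    return low_quality_count(quality_string, offset, min_phred) <= max_low_quality


# Hypothetical FASTQ quality strings: 'I' encodes Phred 40, '#' encodes Phred 2.
kept_read = "I" * 40 + "#" * 5      # exactly five low-quality bases: retained
dropped_read = "I" * 40 + "#" * 6   # six low-quality bases: excluded

print(passes_filter(kept_read), passes_filter(dropped_read))
```

With an offset of 33, the character `5` (ASCII 53) encodes Phred 20 and therefore does not count as low quality under the strict `< 20` comparison.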

  4. Dataset: The potential of genome-wide RAD sequences for resolving rapid...

    • figshare.com
    txt
    Updated Jun 3, 2023
    Cite
    Danilo Trabuco Amaral; Juliana Rodrigues Bombonato; Gislaine Angélica Rodrigues Silva; Gulzar Khan; Evandro Marsola Moraes; Sónia Cristina da Silva Andrade; Deren A.R. Eaton; Diego Peres Alonso; Paulo Eduardo Martins Ribolla; Nigel Taylor; Daniela C. Zappi; Fernando Faria Franco (2023). Dataset: The potential of genome-wide RAD sequences for resolving rapid radiations: a case study in Cactaceae [Dataset]. http://doi.org/10.6084/m9.figshare.12678551.v2
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    Figshare http://figshare.com/
    Authors
    Danilo Trabuco Amaral; Juliana Rodrigues Bombonato; Gislaine Angélica Rodrigues Silva; Gulzar Khan; Evandro Marsola Moraes; Sónia Cristina da Silva Andrade; Deren A.R. Eaton; Diego Peres Alonso; Paulo Eduardo Martins Ribolla; Nigel Taylor; Daniela C. Zappi; Fernando Faria Franco
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The reconstruction of relationships within recently radiated groups is challenging even when massive amounts of sequencing data are available. The use of restriction site-associated DNA sequencing (RAD-Seq) to this end is promising. Here, we assessed the performance of RAD-Seq to infer the species-level phylogeny of the rapidly radiating genus Cereus (Cactaceae). To examine how the amount of genomic data affects resolution in this group, we used distinct datasets and implemented different analyses. We sampled 52 individuals of Cereus, representing 18 of the 25 species currently recognized, plus members of the closely allied genera Cipocereus and Praecereus, and 11 other Cactaceae genera as outgroups. Three scenarios of permissiveness to missing data were carried out in iPyRAD, assembling datasets with 30% (333 loci), 45% (1440 loci), and 70% (6141 loci) of missing data. For each dataset, Maximum Likelihood (ML) trees were generated using two supermatrices, i.e., only SNPs and SNPs plus invariant sites. Accuracy and resolution were improved when the dataset with the highest number of loci was used (6141 loci), despite the high percentage of missing data included (70%). Coalescent trees estimated using SVDQuartets and ASTRAL are similar to those obtained by the ML reconstructions. Overall, we reconstruct a well-supported phylogeny of Cereus, which is resolved as monophyletic and composed of four main clades with high support in their internal relationships. Our findings also provide insights into the impact of missing data for phylogeny reconstruction using RAD loci.

    Sampling: Our dataset includes 63 samples spanning 52 ingroups of Cereus and 11 outgroups (Table 1).

    ddRAD library preparation and sequencing: Genomic DNA was extracted from root tissues using the DNeasy Plant Mini Kit (Qiagen). ddRAD libraries were prepared using high fidelity EcoRI and HPAII restriction enzymes following Campos et al. (2017) and Khan et al. (2019). Details of library preparation and sequencing are shown in Supplementary material.

    Bioinformatics analyses: Raw data were trimmed for adapters and quality filtered before SNP calling. The quality of sequencing data was checked with FastQC 0.11.2 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc), visualized in MultiQC 1.0 (https://github.com/ewels/MultiQC), and filtered with SeqyClean 1.9.12 (Zhbannikov et al., 2017) using the following settings: minimum quality (Phred Score 20), minimum size (>65 bp), and Illumina contaminants (UniVec.fas). We used the iPyRAD pipeline (available at http://github.com/dereneaton/ipyrad) to identify homology among reads, make SNP calls, and format output files. The following parameter settings were implemented: mindepth_majrule = 6 (minimum depth for majority-rule base calling), clust_threshold = 0.85 (clustering threshold for de novo assembly), filter_adapters = 2 (strict filter), max_Hs_consens = 6 (maximum heterozygotes in consensus), min_samples_locus (minimum percentage of samples per locus for output). For the latter, values varied in three distinct scenarios concerning the permissiveness to missing data. These scenarios considered that the final set of loci should have at least 39 samples (scenario 1, approximately 30% of missing data), 26 samples (scenario 2, approximately 45% of missing data), or 13 samples (scenario 3, approximately 70% of missing data). After SNP calling, CD-HIT (Li and Godzik, 2006; Fu et al., 2012) was used to identify reverse-complement duplicates in the loci recovered by iPyRAD.

  5. Number of putative loci identified across all samples for each iPYRAD...

    • plos.figshare.com
    xls
    Updated May 31, 2023
    Cite
    Lina M. Valencia; Amely Martins; Edgardo M. Ortiz; Anthony Di Fiore (2023). Number of putative loci identified across all samples for each iPYRAD pipeline and number retained after each filtering step. [Dataset]. http://doi.org/10.1371/journal.pone.0201254.t003
    Explore at:
    Available download formats: xls
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Lina M. Valencia; Amely Martins; Edgardo M. Ortiz; Anthony Di Fiore
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Number of putative loci identified across all samples for each iPYRAD pipeline and number retained after each filtering step.

  6. Supplementary materials for: Exploring the impact of read clustering...

    • search.dataone.org
    • data.niaid.nih.gov
    • +1 more
    Updated Nov 29, 2023
    Cite
    Loïs Rancilhac; Florent Sylvestre; Carl R. Hutter; Jan W. Arntzen; Wieslaw Babik; Pierre-André Crochet; Grégory Deso; Rémi Duguet; Pedro Galan; Maciej Pabijan; Mathieu Policain; Pauline Priol; Joana Sabino-Pinto; Maria Capstick; Kathryn R. Elmer; Christophe Dufresnes; Miguel Vences (2023). Supplementary materials for: Exploring the impact of read clustering thresholds on RADseq-based systematics: an empirical example from European amphibians [Dataset]. http://doi.org/10.5061/dryad.n2z34tn1z
    Explore at:
    Dataset updated
    Nov 29, 2023
    Dataset provided by
    Dryad Digital Repository
    Authors
    Loïs Rancilhac; Florent Sylvestre; Carl R. Hutter; Jan W. Arntzen; Wieslaw Babik; Pierre-André Crochet; Grégory Deso; Rémi Duguet; Pedro Galan; Maciej Pabijan; Mathieu Policain; Pauline Priol; Joana Sabino-Pinto; Maria Capstick; Kathryn R. Elmer; Christophe Dufresnes; Miguel Vences
    Time period covered
    Jan 1, 2023
    Description

    Restriction site-Associated DNA sequencing (RADseq) has great potential for genome-wide systematics studies of non-model organisms. However, accurately assembling RADseq reads into orthologous loci remains a major challenge in the absence of a reference genome. Traditional assembly pipelines cluster putative orthologous sequences based on a user-defined clustering threshold. Because improper clustering of orthologs is expected to affect results in downstream analyses, it is crucial to design pipelines for empirically optimizing the clustering threshold. While this issue has been largely discussed from a population genomics perspective, it remains understudied in the context of phylogenomics and coalescent species delimitation. To address this issue, we generated RADseq assemblies of representatives of the amphibian genera Discoglossus, Rana, Lissotriton and Triturus using a wide range of clustering thresholds. Particularly, we studied the effects of the intra-sample Clustering Threshold...

  7. Detection of cross-contamination and strong mitonuclear discordance in two...

    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Marko Prous; Kyung Min Lee; Marko Mutanen (2023). Detection of cross-contamination and strong mitonuclear discordance in two species groups of sawfly genus Empria (Hymenoptera, Tenthredinidae): ipyrad dataset and an R script [Dataset]. http://doi.org/10.6084/m9.figshare.7605404.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare http://figshare.com/
    Authors
    Marko Prous; Kyung Min Lee; Marko Mutanen
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The ipyrad dataset contains alignment of 19 413 ddRAD loci of 35 Empria specimens. The script takes as an input the ipyrad dataset and a table is produced for every locus (rows) and specimen (columns) where cells contain list of specimens that are identical to a specimen indicated in the column (the cell is empty if there are no identical specimens for a particular specimen and locus). Additional columns are added to get information per locus about identical specimens between two groups, the number of specimens, maximum, median and mean divergence. The two groups examined are longicornis (including E. tridentis, which taxonomically is not a member of the group, but closely related) and immersa groups. For both groups and for every locus, specimens are recorded that are identical to any member in the other group while different from specimens in its own group. The second table produced by the script lists the specimens in the dataset, the number of loci, and the normalised number of loci per specimen. Normalised numbers of loci were calculated as half of the maximum number of loci divided by the number of loci of a particular specimen in the dataset. Then the script proceeds to produce bar plots (output as pdf) for every specimen showing percent of loci and normalised percent of loci that are identical to a particular specimen while different from all others. Two additional bar plots are produced for longicornis and immersa groups to show percent of loci of a particular specimen that are identical to any specimen in the wrong group while different from specimens in its own group.
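The normalisation described above ("half of the maximum number of loci divided by the number of loci of a particular specimen") can be written out as a short sketch. The dataset ships an R script; the Python below is only an illustration of the stated formula, and the specimen names and locus counts are hypothetical.

```python
def normalised_loci(loci_per_specimen):
    """Return (max loci / 2) / (loci of specimen) for every specimen.

    Assumes every specimen has at least one locus, otherwise the division fails.
    """
    half_max = max(loci_per_specimen.values()) / 2
    return {name: half_max / n_loci for name, n_loci in loci_per_specimen.items()}


# Hypothetical locus counts, not taken from the Empria dataset.
counts = {"specimen_A": 18000, "specimen_B": 9000, "specimen_C": 4500}
norm = normalised_loci(counts)
print(norm)
```

Under this reading, a specimen with exactly half the maximum number of loci gets a value of 1, specimens with more loci get values below 1, and specimens with fewer loci get values above 1.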

  8. Data from: Molecular data of the Sphagnum cuspidatum complex relative to...

    • dataone.org
    • explore.openaire.eu
    • +5 more
    Updated Jul 13, 2025
    Cite
    Sean C. Robinson; Marta Nieto-Lugilde; Aaron M. Duffy; Katherine Martinez Munoz; Blanka Aguero; Amelia Merced; Kristian Hassel; Kjell Ivar Flatberg; A. Jonathan Shaw (2025). Molecular data of the Sphagnum cuspidatum complex relative to taxonomy [Dataset]. http://doi.org/10.5061/dryad.crjdfn39t
    Explore at:
    Dataset updated
    Jul 13, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Sean C. Robinson; Marta Nieto-Lugilde; Aaron M. Duffy; Katherine Martinez Munoz; Blanka Aguero; Amelia Merced; Kristian Hassel; Kjell Ivar Flatberg; A. Jonathan Shaw
    Time period covered
    Jan 1, 2023
    Description

    The use of species as a concept is an important metric for assessing biological diversity and ecosystem function. However, delimiting species based on morphological characters can be difficult, especially in aquatic plants that exhibit high levels of variation and overlap. The Sphagnum cuspidatum complex, which includes plants that dominate peatland hollows close to or at the water table, provides an example of challenges in species delimitation. Microscopic characters that have been used to define taxa and the possibility that these characters may simply be phenoplastic responses to variation in water availability make species delimitation in this group especially difficult. In particular, the use of leaf shape and serration, which have been used to separate species in the complex, has resulted in divergent taxonomic treatments. Using a combination of high-resolution population genomic data (RADseq) and a robust morphological assessment of plants representing the focal species, we pro...

    File cuspidatum_demultiplexed_illumina_reads_20230315.tar.gz: Zipped folder with 135 files of demultiplexed Illumina reads for Sphagnum samples included in the analyses.

    File cuspidatum_dataset_all_135_samples.phy: Phylip format alignment of 8367 RADseq loci generated by ipyrad for 135 Sphagnum samples.

    File cuspidatum_dataset_all_135_samples.loci: Loci format file of 8367 RADseq loci generated by ipyrad for 135 Sphagnum samples.

    File cuspidatum_plastid_loci_52_samples: Fasta format alignment of 5 concatenated plastid loci generated by ipyrad and identified by mapping to the Sphagnum fallax reference genome (52 samples).

    File cuspidatum_dataset_ingroup_57_samples.ustr: Structure format file (two lines per sample, considering S. torreyanum and S. mississippiense samples as diploid and the remaining species as haploid) with one randomly selected SNP per locus generated by ipyrad for 57 Sphagnum samples.

  9. Output files from ipyrad

    • figshare.com
    txt
    Updated Nov 18, 2021
    Cite
    Nicolas Finger; Keaka Farleigh; Jason Bracken; Adam D. Leaché; Olivier Francois; Ziheng Yang; Tomas Flouri; Tristan Charran; Tereza Jezkova; Dean Williams; Christopher Blair (2021). Output files from ipyrad [Dataset]. http://doi.org/10.6084/m9.figshare.17043086.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Nov 18, 2021
    Dataset provided by
    Figshare http://figshare.com/
    Authors
    Nicolas Finger; Keaka Farleigh; Jason Bracken; Adam D. Leaché; Olivier Francois; Ziheng Yang; Tomas Flouri; Tristan Charran; Tereza Jezkova; Dean Williams; Christopher Blair
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    All output files generated by running the raw GBS data through the ipyrad pipeline.

  10. Data from: A new species of bridled darter endemic to the Etowah River...

    • data.niaid.nih.gov
    • explore.openaire.eu
    Updated Aug 17, 2020
    Cite
    Emily L. Boring (2020). A new species of bridled darter endemic to the Etowah River system in Georgia (Percidae: Etheostomatinae: Percina) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3987477
    Explore at:
    Dataset updated
    Aug 17, 2020
    Dataset provided by
    Jeffrey W. Simmons
    Thomas J. Near
    Gerald R. Dinkins
    Daniel J. MacGuigan
    Emily L. Boring
    Benjamin P. Keck
    Richard C. Harrington
    Brett Albanese
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Georgia, Etowah River
    Description

    "alignments" folder contains concatenated ddRAD phylip alignments produced by iPyrad. mXXp naming scheme consistent with iPyrad parameter files.

    "bridledCOB.nex" contains the DNA sequence alignment for the mtDNA gene cytochrome b used in the Bayesian phylogenetic analysis.

    "bridledCOB.nex.con.tre" summarized posterior tree from Bayesian analysis of the mtDNA cytochrome b data.

    "bridledNUC.nex" contains the DNA sequence alignment for the 11 nuclear genes used in the Bayesian phylogenetic analysis.

    "bridledNUC.nex.con.tre" summarized posterior tree from Bayesian analysis of the 11 nuclear genes.

    "fastqs" folder contains demultiplexed fastq files containing ddRAD reads for each individual.

    "iPyrad_paramFiles" folder contains assembly parameters for iPyrad. mXXp indicates the minimum proportion of samples per locus (min_samples_locus iPyrad parameter). For example, m70p indicates that each locus is represented by at least 70% of the samples.

    "P_freemanorum_meristic_data.csv" the meristic data of Percina freemanorum.

    "P_freemanorum_meristic_specimen_info.csv" information associated with specimens of Percina freemanorum.

    "P_kusha_meristic_data.csv" the meristic data of Percina kusha.

    "P_kusha_meristic_specimen_info.csv" information associated with specimens of Percina kusha.

    "IQTree" folder contains bash script to run IQTree analyses and resulting treefiles. mXXp naming scheme consistent with iPyrad parameter files.

    "VCFs" folder contains bash script to run VCFTools filtering, outgroup file listing outgroup taxa to prune, input VCF file from iPyrad, and filtered VCF files. Unlinked indicates that SNPs have been pruned to include only one SNP per ddRAD locus. mXXp naming scheme consistent with iPyrad parameter files.
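The mXXp naming scheme used in the folders above (for example, m70p meaning each locus is represented by at least 70% of the samples) can be decoded programmatically. The helper below is a minimal sketch; the function name is hypothetical, and the accepted label format is inferred only from the m70p example given in the description.

```python
import re


def min_sample_proportion(label):
    """Convert an 'mXXp' label such as 'm70p' into a proportion such as 0.7."""
    match = re.fullmatch(r"m(\d{1,3})p", label)
    if match is None:
        raise ValueError(f"not an mXXp label: {label!r}")
    percent = int(match.group(1))
    if percent > 100:
        raise ValueError(f"percentage out of range in {label!r}")
    return percent / 100


print(min_sample_proportion("m70p"))
```

Labels that do not follow the pattern raise `ValueError` rather than being guessed at, since the listing documents only this one form.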

  11. Data from: Transpacific coalescent pathways of coconut rhinoceros beetle...

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    bin, text/x-python +1
    Updated May 28, 2022
    + more versions
    Cite
    Jonathan Bradley Reil; Camiel Doorenweerd; Michael San Jose; Sheina B. Sim; Scott M. Geib; Daniel Rubinoff (2022). Data from: Transpacific coalescent pathways of coconut rhinoceros beetle biotypes: resistance to biological control catalyzes resurgence of an old pest [Dataset]. http://doi.org/10.5061/dryad.f4g56
    Explore at:
    Available download formats: bin, txt, text/x-python
    Dataset updated
    May 28, 2022
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Jonathan Bradley Reil; Camiel Doorenweerd; Michael San Jose; Sheina B. Sim; Scott M. Geib; Daniel Rubinoff
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Biological control agents have several advantages over chemical control for pest management, including the capability to restore ecosystem balance with minimal non-target effects and a lower propensity for targets to develop resistance. These factors are particularly important in invasive species control. The coconut rhinoceros beetle (Oryctes rhinoceros Linnaeus) is a major palm pest that invaded many Pacific islands in the early 20th century through human-mediated dispersal. Application of the Oryctes nudivirus in the 1960s successfully halted the beetle's first invasion wave and made it a textbook example of successful biological control. However, a recently discovered O. rhinoceros biotype that is resistant to the nudivirus appears to be correlated with a new invasion wave. We performed a population genomics analysis of 172 O. rhinoceros from seven regions, including native and invasive populations, to reconstruct invasion pathways and explore correlation between recent invasions and biotypes. From ddRAD sequencing, we generated datasets ranging from 4,000 to 209,000 loci using STACKS and IPYRAD software pipelines and compared genetic signal in downstream clustering and phylogenetic analyses. Analysis suggests that the O. rhinoceros resurgence is mediated by the nudivirus-resistant biotype. Genomic data has proven essential to understanding the new O. rhinoceros biotype's invasion patterns and interactions with the original biotype. Such information is crucial to optimization of strategies for quarantine and control of resurgent pests. Our results demonstrate that while invasions are relatively rare events, new introductions can have significant ecological consequences, and quarantine vigilance is required even in previously invaded areas.

  12. Data from: Pleistocene dynamics of the Eurasian steppe as a driving force of...

    • datadryad.org
    • zenodo.org
    zip
    Updated Jul 30, 2022
    Cite
    Anze Zerdoner Calasan; Herbert Hurka; Dmitry A. German; Simon Pfanzelt; Frank Blattner; Anna Seidl; Barbara Neuffer (2022). Pleistocene dynamics of the Eurasian steppe as a driving force of evolution: Phylogenetic history of the genus Capsella (Brassicaceae) [Dataset]. http://doi.org/10.5061/dryad.j6q573nf3
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 30, 2022
    Dataset provided by
    Dryad
    Authors
    Anze Zerdoner Calasan; Herbert Hurka; Dmitry A. German; Simon Pfanzelt; Frank Blattner; Anna Seidl; Barbara Neuffer
    Time period covered
    Jul 29, 2021
    Description

    The supplementary material consists of two sets of files:

    'caps_0_95_min141_max005_outfiles' represent the raw output files of the ipyrad analysis of a taxon sample containing all 282 accessions of the five investigated Capsella species.

    0_95 refers to the ## [14] [clust_threshold]: Clustering threshold for de novo assembly in the ipyrad parameter file
    min141 refers to the ## [21] [min_samples_locus]: Min # samples per locus for output in the ipyrad parameter file
    max005 refers to the ## [22] [max_SNPs_locus]: Max # SNPs per locus in the ipyrad parameter file

    'ori_0_95_min117_max005_outfiles' represent the raw output files of the ipyrad analysis of a taxon sample containing 235 investigated Capsella orientalis accessions.

    0_95 refers to the ## [14] [clust_threshold]: Clustering threshold for de novo assembly in the ipyrad parameter file
    min117 refers to the ## [21] [min_samples_locus]: Min # samples per locus for output in the ipyrad parameter file
    max005 refers to the ## ...
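
    The folder-naming convention described above can be decoded mechanically. The following is a minimal sketch, not part of the dataset: the function name and the returned keys are hypothetical, and the `max` token is kept as a raw string because the description does not say whether "005" denotes a count or a proportion.

```python
import re

# Hypothetical pattern for names such as 'caps_0_95_min141_max005_outfiles'.
_NAME = re.compile(
    r"^(?P<sample_set>[A-Za-z]+)_(?P<clust>\d+_\d+)"
    r"_min(?P<min>\d+)_max(?P<max>\d+)_outfiles$"
)


def parse_outfiles_name(name: str) -> dict:
    """Split an output-folder name into the ipyrad settings it encodes."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognised folder name: {name!r}")
    return {
        "sample_set": match["sample_set"],
        # '0_95' encodes the clustering threshold 0.95 (parameter [14]).
        "clust_threshold": float(match["clust"].replace("_", ".")),
        # 'min141' encodes the minimum number of samples per locus (parameter [21]).
        "min_samples_locus": int(match["min"]),
        # 'max005' relates to parameter [22]; left uninterpreted on purpose.
        "max_SNPs_locus_raw": match["max"],
    }


if __name__ == "__main__":
    print(parse_outfiles_name("caps_0_95_min141_max005_outfiles"))
    print(parse_outfiles_name("ori_0_95_min117_max005_outfiles"))
```

    Keeping the last token as text avoids guessing at a setting that the description leaves open.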

  13. Data from: Genomic and phenotypic divergence‐with‐gene‐flow across an...

    • explore.openaire.eu
    • data.niaid.nih.gov
    • +2 more
    Updated May 20, 2022
    Cite
    María José Rodríguez-Cajarville; Leonardo Campagna; Cássia Alves Lima-Rezende; Martín Carboni; Pablo L. Tubaro; Gustavo Sebastián Cabanne (2022). Genomic and phenotypic divergence‐with‐gene‐flow across an ecological and elevational gradient in a neotropical bird [Dataset]. http://doi.org/10.5061/dryad.c866t1g8h
    Explore at:
    Dataset updated
    May 20, 2022
    Authors
    María José Rodríguez-Cajarville; Leonardo Campagna; Cássia Alves Lima-Rezende; Martín Carboni; Pablo L. Tubaro; Gustavo Sebastián Cabanne
    Description

    We sampled tissue of 23 Phytotoma rutila (Table S1.1) collected through the species’ whole breeding and altitudinal range (Fig. 2a). We purified genomic DNA from muscle or blood with the DNeasy Blood & Tissue purification kit (Qiagen). We sampled genomic markers by double-digest restriction site-associated DNA sequencing (ddRADseq; Peterson, Weber, Kay, Fisher, & Hoekstra, 2012), following the approach outlined by Peterson et al. (2012) and described in Thrasher, Butcher, Campagna, Webster, & Lovette (2018). Briefly, we digested samples with SbfI and MspI (New England Biolabs, MA) and ligated the digested DNA to adapters on both the 5’ and 3’ ends that allow multiplexing. The 5’ adapters included barcodes and were unique to each sample, while the 3’ barcode was common to groups of 20 samples. We pooled samples with unique 3’ barcodes and selected DNA fragments in the 400-700 bp range. The libraries were enriched, and the TruSeq adapters were incorporated by performing 13 cycles of PCR. Finally, libraries were combined in equimolar proportions and sequenced on an Illumina HiSeq 2500 lane at the Cornell University Institute for Biotechnology, obtaining single-end 101 bp sequences. We demultiplexed, trimmed, and filtered reads, assembled loci, and called single nucleotide polymorphisms (SNPs) with ipyrad 0.7.28 (Eaton & Overcast, 2020). Parameters of the ipyrad pipeline and further SNP filtering using VCFtools (Danecek et al., 2011) are in Table S1.2. We used the first dataset of 23 samples for exploratory analyses and the second dataset of 21 samples (see below) in the downstream analyses. The ipyrad outfiles of the 23-sample dataset were used for exploratory analyses such as STRUCTURE, maximum likelihood reconstruction, and PCA. Then, we removed two samples with excessive missing data (> 78%, MACN-Or-Ct 4079 and MACN-Or-Ct 2766) and reran ipyrad on the remaining 21 individuals.

    We proceeded with SNP filtering using VCFtools (Danecek et al., 2011). Our criteria to obtain the final RADseq dataset were: (a) remove sites with a quality score less than 30; (b) exclude sites with more than 80% missing data (N); (c) remove low-frequency SNPs by setting a minimum allele frequency of 0.05; (d) keep only biallelic SNPs; (e) set the maximum depth of coverage to 300x (mean depth of coverage per SNP in our sample + 1 standard deviation) and the minimum depth of coverage to 6x. Then, we randomly selected one SNP per locus, obtaining a final dataset of 4,893 unlinked SNPs (n = 21 samples, maximum missing rate per individual of 31.6%, average depth across all loci of 89.1). We used this final ddRADseq dataset for all the remaining genetic analyses, except for those previously stated. Also, we used the .loci ipyrad outfile to carry out G-PhoCS and DXY-FST analyses. This filter set was used to remove SNPs that might be the product of sequencing errors, as these can create biases in inferring demographic processes (Linck & Battey, 2019; O’Leary, Puritz, Willis, Hollenbeck, & Portnoy, 2018; Willis, Hollenbeck, Puritz, Gold, & Portnoy, 2017).

    Aim: Along environmental gradients, some species show significant differences in morphological and ecology-related traits. Those differences are commonly related to past events of allopatry but, alternatively, could be caused by natural selection in the presence of gene flow. We aimed to explore the prevalence of the divergence-with-gene-flow model across the Chaco-Andes dry forest belt, testing competing models of evolution in a Neotropical bird. Location: Central Andes Mountain range and Chaco region of Argentina and Bolivia. Taxon: Phytotoma rutila (Aves, Cotingidae). Methods: We studied ddRADseq loci (4,893 SNPs) of 21 tissue samples and body size variation of 146 specimens. We evaluated population genetic structure and tested the effects of altitude and distance on genomic divergence. To evaluate allopatry and divergence-with-gene-flow, we compared the divergence in phenotypic traits (bill, tarsus, and wing measurements) versus neutral genomic variation, conducted coalescent analyses to estimate gene flow and divergence time among populations, and calculated relative (FST) versus absolute (DXY) genomic divergence. Results: (a) genomic and phenotypic differentiation in P. rutila matched the highland-lowland axis, with altitudinal variation explaining genomic variation; (b) phenotypic variation was larger than neutral genomic variation; (c) gene flow between populations was asymmetric; (d) the pattern of relative and absolute genomic differentiation was compatible with divergence-with-gene-flow. Main conclusions: The mechanism behind the morphological and genomic diversification along the Chaco-Andes dry forest belt in P. rutila is divergence‐with‐gene‐flow. Far more complex than we traditionally thought, diversification in South America i...
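
    The site-level criteria (a) to (e) listed in the methods can be read as a single predicate over each SNP. The sketch below is an illustration in plain Python, not the authors' VCFtools command: the `Site` record and its field names are hypothetical, and the depth limits are applied to a per-site mean depth, which is an assumption since the description does not say whether depth was evaluated per site or per genotype.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Site:
    """Hypothetical, simplified view of one variant site."""
    qual: float                     # site quality score
    alleles: List[str]              # reference allele followed by alternates
    genotypes: List[Optional[int]]  # alt-allele count per diploid sample, None if missing
    mean_depth: float               # mean read depth across samples


def passes_filters(site: Site, min_qual: float = 30, max_missing: float = 0.80,
                   min_maf: float = 0.05, min_depth: float = 6,
                   max_depth: float = 300) -> bool:
    """Return True if the site satisfies criteria (a) to (e)."""
    if site.qual < min_qual:                          # (a) quality
        return False
    total = len(site.genotypes)
    called = [g for g in site.genotypes if g is not None]
    if total == 0 or not called:
        return False
    if (total - len(called)) / total > max_missing:   # (b) missing data
        return False
    if len(site.alleles) != 2:                        # (d) biallelic only
        return False
    alt_freq = sum(called) / (2 * len(called))
    if min(alt_freq, 1 - alt_freq) < min_maf:         # (c) minor allele frequency
        return False
    return min_depth <= site.mean_depth <= max_depth  # (e) depth window


if __name__ == "__main__":
    example = Site(qual=50, alleles=["A", "G"], genotypes=[0, 1, 2, None], mean_depth=80)
    print(passes_filters(example))
```

    The later step of keeping one random SNP per locus is separate from these per-site checks and is not covered here.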

  14. Data from: Molecular data of Sphagnum majus ssp. majus and ssp. norvegicum...

    • datadryad.org
    • portalinvestigacion.um.es
    zip
    Updated Mar 22, 2022
    Cite
    Marta Nieto-Lugilde; Sean Robinson; Blanka Aguero; Aaron Duffy; Karn Imwattana; Kristian Hassel; Kjell I. Flatberg; Hans K. Stenoien; Anna V. Shkurko; Vladimir E. Fedosov; Jonathan A. Shaw (2022). Molecular data of Sphagnum majus ssp. majus and ssp. norvegicum (Bryophyta: Sphagnaceae) relative to taxonomy and geography [Dataset]. http://doi.org/10.5061/dryad.r7sqv9sdv
    Explore at:
    Available download formats: zip
    Dataset updated
    Mar 22, 2022
    Dataset provided by
    Dryad
    Authors
    Marta Nieto-Lugilde; Sean Robinson; Blanka Aguero; Aaron Duffy; Karn Imwattana; Kristian Hassel; Kjell I. Flatberg; Hans K. Stenoien; Anna V. Shkurko; Vladimir E. Fedosov; Jonathan A. Shaw
    Time period covered
    Feb 21, 2022
    Description

    Zipped folder with 88 files of demultiplexed Illumina reads for the Sphagnum samples included in the analyses.
    Phylip format alignment of 6692 RADseq loci generated by ipyrad for 88 Sphagnum samples.
    Loci format file of 6692 RADseq loci generated by ipyrad for 88 Sphagnum samples.
    Fasta format alignment of 8 concatenated plastid loci generated by ipyrad and identified by mapping to the Sphagnum angustifolium reference genome (76 samples).
    Structure format file (two lines per sample, treating S. majus samples as diploid and the remaining species as haploid) with one randomly selected SNP per locus generated by ipyrad for 78 ingroup samples (S. majus and putative parental species).
    Structure format file (two lines per sample, treating S. majus samples as diploid) with one randomly selected SNP per locus generated by ipyrad for 63 ingroup S. majus samples.

  15. Supplementary data for "Testing the efficacy of different molecular tools...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 12, 2023
    Cite
    Mattia De Vivo (2023). Supplementary data for "Testing the efficacy of different molecular tools for parasite conservation genetics: a case study using horsehair worms (Phylum Nematomorpha)" [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_7659650
    Explore at:
    Dataset updated
    Jul 12, 2023
    Dataset authored and provided by
    Mattia De Vivo
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Supplementary data for "Testing the efficacy of different molecular tools for parasite conservation genetics: a case study using horsehair worms (Phylum Nematomorpha)"

    alignments: alignments used for BEAST ("bayes") and PopArt ("popart"). The "popart" folder also has a traits file per each species.

    bayesian_plots: TSVs ("tsv") and PDF files ("ogs") generated by BEAST. The "tsv" folder also has the scripts for plotting the results in R.

    easysfs: scripts, population file and results from the VCF to SFS conversion done by easySFS.

    fineRADstructure: fineRADstructure input files and output PDF plots ("plots") for C. formosanus ipyrad and Stacks ("stacks") data.

    logs: logs for ipyrad, ModelTest, PGDspider, PopArt ("popart") and Stacks ("stacks"). The "popart" folder also has the generated networks in a TXT file. The "stacks" folder also has ODS files for calculating the number of loci for each fixed M/n value.

    snapclust: STR files used with R for snapclust. Scripts included.

    stairway_plot: input (blueprint files) and outputs for Stairway Plot 2 analyses. The C. formosanus folder ("chordodes") also has scripts for R plotting.

    vcfs: VCF and HDF5 files used in this study. Also scripts for filtering/converting data and plotting the PCA with ipyrad (activate python first!) for C. formosanus.

    "acutogordius" = A. taiwanensis "chordodes" = C. formosanus "gordius" = G. chiashanus

  16. Data from: Caught in the act: Incipient speciation at the southern limit of...

    • dataone.org
    • data.niaid.nih.gov
    • +2 more
    Updated Jul 22, 2024
    Cite
    Carlos Alonso Maya-Lastra; Patrick Sweeney; Deren Eaton; Vania Torrez; Carla Maldonado; Malu Ore-Rengifo; Mónica Arakaki; Michael Donoghue; Erika Edwards (2024). Caught in the act: Incipient speciation at the southern limit of Viburnum in the Central Andes [Dataset]. http://doi.org/10.5061/dryad.4f4qrfjh6
    Explore at:
    Dataset updated
    Jul 22, 2024
    Dataset provided by
    Dryad Digital Repository
    Authors
    Carlos Alonso Maya-Lastra; Patrick Sweeney; Deren Eaton; Vania Torrez; Carla Maldonado; Malu Ore-Rengifo; Mónica Arakaki; Michael Donoghue; Erika Edwards
    Time period covered
    Jan 1, 2023
    Area covered
    Andes
    Description

    A fundamental objective of evolutionary biology is to understand the origin of independently evolving species. Phylogenetic studies of species radiations rarely are able to document ongoing speciation; instead, modes of speciation, entailing geographic separation and/or ecological differentiation, are posited retrospectively. The Oreinotinus clade of Viburnum has radiated recently from north to south through the cloud forests of Mexico and Central America to the Central Andes. Our analyses support a hypothesis of incipient speciation in Oreinotinus at the southern edge of its geographic range, from central Peru to northern Argentina. Although several species and infraspecific taxa have been recognized in this area, multiple lines of evidence and analytical approaches (including analyses of phylogenetic relationships, genetic structure, leaf morphology, and climatic envelopes) favor the recognition of just a single species, V. seemenii. We show that what has previously been recognized...

    We collected leaf tissues from herbarium samples or from our own recent collections of these plants from central Peru to southern Bolivia. We included five samples previously classified as V. incarum, 29 as V. seemenii, and 13 as “New Name 2”. Whenever possible, multiple individuals per population were included, and all collections were deposited in the Yale Herbarium (YU) except for three specimens located in NY and MO (Appendix 1). Total DNA was extracted from leaf tissues using DNeasy plant extraction kits (Qiagen Inc., Hilden, Germany). RAD-seq data were generated by Floragenex Inc. (Eugene, Oregon) by digesting genomic DNA with the PstI restriction enzyme, followed by sonication and size selection at 400 bp. Samples were ligated with 10 bp barcodes for multiplexing. Then samples were pooled and sequenced on Illumina HiSeq 2500 or 4000 to produce 100 bp SE reads. Samples were demultiplexed and assembled into orthologous loci with ipyrad v.0.9.85 (Eaton and Overcast 2020) using a refere...

    ipyrad is needed to open the HDF5 files.

    # Data from: Caught in the act: Incipient speciation at the southern limit of Viburnum in the Central Andes

    In this repository, we are storing the following:

    1. HDF5 file of the assembly produced by ipyrad of our RADseq reads (bolivia_history.seqs.hdf5)
    2. HDF5 files with the SNPs found in item 1 (bolivia_history.snps.hdf5)
    3. Phylip file containing the final alignment used for the tree reconstruction (10-bolivia-initial_mcov0.25_rcov0.1_ALLscaff_SelectiveSampling.phy)

    HDF5 files were produced by the ipyrad software (https://ipyrad.readthedocs.io/en/master/); each is an organized database that contains not only the DNA sequences but also additional information used by the software for further analysis. As a cross-compatible file, we provide a Phylip version of the final alignment.
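
    Because the Phylip alignment is the cross-compatible file, it can be read without ipyrad. The following is a minimal sketch using only the Python standard library. It assumes a sequential, relaxed Phylip layout (a header with the number of taxa and characters, then one line per sample with a whitespace-free name followed by its sequence); the function name is hypothetical, and the layout of the actual file should be checked before relying on it.

```python
from typing import Dict


def read_phylip(text: str) -> Dict[str, str]:
    """Parse sequential relaxed-Phylip text into {sample_name: sequence}."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        raise ValueError("empty alignment")
    ntax, nchar = (int(field) for field in lines[0].split()[:2])
    sequences: Dict[str, str] = {}
    for line in lines[1:]:
        name, rest = line.split(None, 1)
        # Remove any internal or trailing whitespace from the sequence part.
        sequences[name] = "".join(rest.split())
    if len(sequences) != ntax:
        raise ValueError(f"header declares {ntax} taxa, found {len(sequences)}")
    wrong = [name for name, seq in sequences.items() if len(seq) != nchar]
    if wrong:
        raise ValueError(f"sequence length differs from {nchar} for: {wrong}")
    return sequences


if __name__ == "__main__":
    demo = "2 8\nsampleA    ACGTNN-A\nsampleB    ACGTACGT\n"  # made-up toy alignment
    for sample, seq in read_phylip(demo).items():
        print(sample, seq)
```

    Checking the header counts against the parsed content gives an early warning if the file turns out to be interleaved or otherwise differs from the assumed layout.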

    For additional reproducibility material (scripts, notebooks) check: https://github.com/camayal/southern-oreinotinus

  17. Data from: Integrating genomics and biogeography to unravel the origin of a...

    • search.dataone.org
    • explore.openaire.eu
    • +1 more
    Updated Jul 20, 2024
    Cite
    Bernat Burriel-Carranza; Héctor Tejero-Cicuéndez; Albert Carné; Gabriel Riaño; Adrián Talavera; Saleh Al Saadi; Johannes Els; Jiří Šmíd; Karin Tamar; Pedro Tarroso; Salvador Carranza (2024). Integrating genomics and biogeography to unravel the origin of a mountain biota: The case of a reptile endemicity hotspot in Arabia [Dataset]. http://doi.org/10.5061/dryad.r7sqv9sj3
    Explore at:
    Dataset updated
    Jul 20, 2024
    Dataset provided by
    Dryad Digital Repository
    Authors
    Bernat Burriel-Carranza; Héctor Tejero-Cicuéndez; Albert Carné; Gabriel Riaño; Adrián Talavera; Saleh Al Saadi; Johannes Els; Jiří Šmíd; Karin Tamar; Pedro Tarroso; Salvador Carranza
    Time period covered
    Mar 29, 2024
    Description

    Advances in genomics have greatly enhanced our understanding of mountain biodiversity, providing new insights into the complex and dynamic mechanisms that drive the formation of mountain biotas. These span from broad biogeographic patterns to population dynamics and adaptations to these environments. However, significant challenges remain in integrating large-scale and fine-scale findings to develop a comprehensive understanding of mountain biodiversity. One significant challenge is the lack of genomic data, particularly in historically understudied arid regions where reptiles are a particularly diverse vertebrate group. In the present study, we assembled a de novo genome-wide SNP dataset for the complete endemic reptile fauna of a mountain range (19 described species with more than 600 specimens sequenced), and integrated state-of-the-art biogeographic analyses at the population, species and community level. Thus, we provide for the first time a holistic integration of how a whole ende...

    Raw ddRADseq demultiplexed reads are available and can be processed with ipyrad (https://ipyrad.readthedocs.io/). Supplementary figures, supplementary tables, datasets and tree files are uploaded as separate files.

    # Title of Dataset: Integrating genomics and biogeography to unravel the origin of a mountain biota: The case of a reptile endemicity hotspot in Arabia

    Brief summary of dataset contents.

    Data archive for:
    Integrating genomics and biogeography to unravel the origin of a mountain biota: The case of a reptile endemicity hotspot in Arabia

    by Bernat Burriel-Carranza, Héctor Tejero-Cicuéndez, Albert Carné, Gabriel Riaño, Adrián Talavera, Saleh Al Saadi, Johannes Els, Jiří Šmíd, Karin Tamar, Pedro Tarroso and Salvador Carranza

    This dataset contains ddRADseq raw reads, data files that should allow replication of the workflow, and the resulting phylogenomic and phylogenetic trees produced in this work. It also contains supplementary material (Figures and Tables) and the extended methods related to this publication.

    In the present work, we assembled a large genomic database (n = 661) for all endemic reptiles of the Hajar Mountains. We investigated the diversity, population stru...

  18. Data from: History of the terrestrial isopod genus Ligidium in Japan based...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Jun 20, 2023
    Cite
    Wakana Harigai; Aya Saito; Chika Zemmoto; Shigenori Karasawa; Touta Yokoi; Atsushi Nagano; Hitoshi Suzuki; Masanobu Yamamoto (2023). History of the terrestrial isopod genus Ligidium in Japan based on phylogeographic analysis [Dataset]. http://doi.org/10.5061/dryad.zkh1893bw
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 20, 2023
    Dataset provided by
    Ryukoku University
    Osaka University
    Hokkaido University
    Tottori University
    Authors
    Wakana Harigai; Aya Saito; Chika Zemmoto; Shigenori Karasawa; Touta Yokoi; Atsushi Nagano; Hitoshi Suzuki; Masanobu Yamamoto
    License

    https://spdx.org/licenses/CC0-1.0.html

    Area covered
    Japan
    Description

    Background: Phylogeographical approaches explain the genetic diversity of local organisms in the context of their geological and geographic environments. Thus, genetic diversity can be a proxy for geological history. Here we propose a genus of woodland isopod, Ligidium, as a marker of geological history in relation to orogeny and the Quaternary glacial cycle.

    Results: Mitochondrial analysis of 721 individuals from 97 sites across Japan revealed phylogenetic divergence between the northeastern and southwestern Japan arcs from 7 to 3.5 million years ago. It also showed repeated population expansions in northeastern Japan in response to Quaternary glacial and interglacial cycles. Genome-wide analysis of 83 selected individuals revealed multiple genetic nuclear clusters. The genomic groupings were consistent with the local geographic distribution, indicating that the Ligidium phylogeny reflects its migration history.

    Conclusion: Ligidium DNA sequence analysis can provide insight into the geological, geographical, and paleoenvironmental history of the studied region.

    Methods

    Sample collection: We surveyed Ligidium populations in Japan. We collected 828 Ligidium specimens from 97 sites and sequenced 721 samples (Fig. 1, Table 1). Samples were preserved in 70–99.5% ethanol in 2-mL microtubes at 4°C or room temperature.

    DNA extraction, amplification, and sequencing: Genomic DNA was isolated from the muscles of the abdomen and legs with a DNA Mini Kit (Qiagen, Germantown, MD, USA). PCR amplification was conducted using the primer pair LCO-1490 and HCO-2198. Amplification and cycling conditions were as in a previous work. Details of the experimental conditions are provided in the Supplemental Data. Sequencing was conducted with an ABI 3130 Genetic Analyzer (Applied Biosystems, Waltham, MA, USA). Sequences were checked and assembled using MEGA7. Mitochondrial gene locus (CO1) sequences for 721 individuals were determined and aligned using ClustalW.

    RAD-seq: We performed RAD-seq analysis to search for SNPs in individuals from Niigata and Hokkaido obtained in previous studies, two regions in northern Japan (Aomori and Sendai), and two regions in western Japan (Shizuoka and Sendai). Tables 1 and S1 list the samples used, and the sampling sites are shown in Fig. 1c. Genomic DNA was isolated from almost whole-body tissues with a DNA Mini Kit (Qiagen). Libraries for RAD-seq were prepared with EcoRI and BglII restriction enzymes. The library was sequenced with 150 + 150 bp paired-end reads in one lane of an Illumina HiSeqX instrument (Illumina, San Diego, CA, USA) by Macrogen (Seoul, South Korea). Raw reads were trimmed using Trimmomatic-0.39 with the following parameters: ILLUMINACLIP: adapter.fasta:2:30:10:keepBothReads, SLIDINGWINDOW: 4:15, CROP: 132, HEADCROP: 2, and MINLEN: 130. Sequences are available at the DNA Data Bank of Japan (DDBJ) Sequence Read Archive (DRA014204). We used two pipeline programs to call the SNPs: denovo_map.pl provided by Stacks and ipyrad. Following a previous work, we varied the combinations of the denovo_map.pl parameters as follows: (n, M) = (2, 1), (3, 2), (4, 3), (5, 6), and selected (n, M) = (2, 1), which called the most SNPs. We used Stacks’ populations program to analyze populations of individual samples, calculate population genetics statistics, and export data in various output formats for analysis. PLINK v1.90b6.18 was used for data handling. Alleles with a frequency of < 1% and sites with > 50% heterozygosity were removed. Only SNPs shared by ~80% of the individuals were retained. With ipyrad, loci with > 50% heterozygosity were removed, and SNPs shared by ~70% of the local populations were retained. We retained SNPs shared by at least two individuals and filtered out individuals that did not have 80% of all SNPs using TASSEL 5. After filtering with TASSEL 5, we used PGDSpider to convert the VCF files for other analyses.

    Genetic structure analysis: We tested the ability of Structure v. 2.3.4 to determine the genetic structure of populations using Bayesian cluster analysis. Ten simulations were run, with the burn-in period and Markov chain Monte Carlo iterations set to 10^5 and 10^6, respectively. The maximum value of K was determined based on the mtDNA results and geographical distribution. For the Structure analysis, one SNP was randomly sampled from each locus to avoid the effect of linkage disequilibrium. The Python script vcf_single_snp.py (from the pimbongaerts/radseq repository on GitHub) was used to obtain one SNP per locus from the ipyrad output, and drawings were created using the R package pophelper. In addition, PCA was performed to visualize the genetic differences among populations using the adegenet package in R. We obtained pairwise Fst values for the RAD-seq dataset using Arlequin 3.5.1.2. Fst values were used to test population structure, supported by cluster analysis, and statistical significance was based on 1000 restored extractions.
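
    The idea of sampling one random SNP per locus, used above to limit linkage disequilibrium, can be sketched in a few lines. This is not the cited vcf_single_snp.py script but a minimal stand-in: the function name is hypothetical, and it assumes that the first (CHROM) column of each VCF record identifies the RAD locus, which should be verified for the file at hand.

```python
import random
from collections import defaultdict
from typing import Dict, Iterable, List, Optional


def one_snp_per_locus(vcf_lines: Iterable[str], seed: Optional[int] = None) -> List[str]:
    """Keep header lines and one randomly chosen record per locus."""
    rng = random.Random(seed)
    header: List[str] = []
    by_locus: Dict[str, List[str]] = defaultdict(list)
    for raw in vcf_lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            header.append(line)
        else:
            # Assumption: the CHROM field names the locus.
            locus = line.split("\t", 1)[0]
            by_locus[locus].append(line)
    # Loci are emitted in order of first appearance.
    picked = [rng.choice(records) for records in by_locus.values()]
    return header + picked


if __name__ == "__main__":
    toy = [  # made-up records for illustration only
        "##fileformat=VCFv4.0",
        "#CHROM\tPOS\tID\tREF\tALT",
        "locus_1\t12\t.\tA\tG",
        "locus_1\t57\t.\tC\tT",
        "locus_2\t8\t.\tG\tA",
    ]
    print("\n".join(one_snp_per_locus(toy, seed=1)))
```

    Passing a fixed `seed` makes the random selection repeatable, which helps when the thinned file has to be regenerated.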

  19. Data for: Desert landscape features influencing the microgeographic genetic...

    • datadryad.org
    • data.niaid.nih.gov
    zip
    Updated Nov 7, 2024
    Cite
    Ella Vázquez-Domínguez; Gissella Pineda-Sánchez (2024). Data for: Desert landscape features influencing the microgeographic genetic structure of Nelson’s pocket mouse Chaetodipus nelsoni [Dataset]. http://doi.org/10.5061/dryad.qbzkh18sz
    Explore at:
    Available download formats: zip
    Dataset updated
    Nov 7, 2024
    Dataset provided by
    Dryad
    Authors
    Ella Vázquez-Domínguez; Gissella Pineda-Sánchez
    Time period covered
    Oct 22, 2024
    Description

    Individuals were sampled in the wild, and tissue samples were taken for DNA extraction. Library preparation and sequencing were performed at the University of Wisconsin Biotechnology Center (UWBC) following a genotyping-by-sequencing approach. Libraries were prepared using the ApeKI enzyme (cutting site: C[AT]G) and sequenced on a single lane of an Illumina HiSeq 2500 with single-end 101 bp reads. A de novo assembly was performed using raw data processed with ipyrad (Eaton and Overcast 2020; https://ipyrad.readthedocs.io/en/master/). We further processed our data using VCFtools v.0.1.13 (see publication for the bioinformatics description).

  20. Data from: Spatial and host related genomic variation in partially sympatric...

    • figshare.com
    application/gzip
    Updated Oct 11, 2021
    Cite
    Daniel Poveda-Martínez; Laura Varone; Malena Fuentes Corona; Stephen Hight; Guillermo Logarzo; Esteban Hasson (2021). Spatial and host related genomic variation in partially sympatric cactophagous moth species [Dataset]. http://doi.org/10.6084/m9.figshare.13118249.v2
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Oct 11, 2021
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Daniel Poveda-Martínez; Laura Varone; Malena Fuentes Corona; Stephen Hight; Guillermo Logarzo; Esteban Hasson
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Raw reads were demultiplexed, filtered and assembled using ipyrad v0.9.59 (Eaton and Overcast, 2020). Demultiplexing was done using the unique barcode and adapter sequences. Reads were then filtered with the stricter filter for Illumina adapters and trimmed to clean up low-quality read edges. A reference-based assembly was performed using the draft reference genome CactoFuEDEI.fa. We set the maximum number of unique alleles allowed in consensus reads to 2 and ran ipyrad to obtain four datasets, requiring each locus to be present in at least 25%, 50%, 80% or 90% of samples, respectively (the minimum number of samples per locus, msl).
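
    The four msl settings are given as percentages, while a minimum-samples threshold is ultimately a sample count. The sketch below shows one way to convert between the two. The function name is hypothetical, rounding up is an assumption, and the sample size of 40 in the usage example is invented because the description does not state how many samples were assembled.

```python
def min_samples_for(n_samples: int, percent: int) -> int:
    """Smallest sample count covering at least `percent` of `n_samples`."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (n_samples * percent + 99) // 100


if __name__ == "__main__":
    n = 40  # hypothetical sample size, for illustration only
    for pct in (25, 50, 80, 90):
        print(f"msl {pct}% of {n} samples -> {min_samples_for(n, pct)}")
```

    Stricter thresholds retain fewer loci with less missing data, which is presumably why several datasets were produced side by side.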

Cite
Patrick Fahey; Rachael Fowler; Todd McLay; Frank Udovicic; David Cantrill; Michael Bayly (2021). Data from: Divergent lineages in a semi-arid mallee species, Eucalyptus behriana, correspond to a major geographic break in south-eastern Australia [Dataset]. http://doi.org/10.5061/dryad.v9s4mw6sm

Data from: Divergent lineages in a semi-arid mallee species, Eucalyptus behriana, correspond to a major geographic break in south-eastern Australia

Related Article
Explore at:
26 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Aug 28, 2021
Authors
Patrick Fahey; Rachael Fowler; Todd McLay; Frank Udovicic; David Cantrill; Michael Bayly
Area covered
Australia, Eastern states of Australia
Description

Aim: To infer relationships between populations of the semi-arid, mallee eucalypt, Eucalyptus behriana, to build hypotheses regarding evolution of major disjunctions in the species’ distribution and to expand understanding of the biogeographical history of south-eastern Australia. Location: South-eastern Australia Taxon: Eucalyptus behriana (Myrtaceae, Angiospermae) Methods: We developed a large dataset of anonymous genomic loci for 97 samples from 11 populations of E. behriana using double digest restriction site associated DNA sequencing (ddRAD-seq), to determine genetic relationships between the populations. These relationships, along with species distribution models, were used to construct hypotheses regarding environmental processes that have driven fragmentation of the species’ distribution. Results: Greatest genetic divergence was between populations on either side of the Lower Murray Basin. Populations west of the Basin showed greater genetic divergence between one another than the eastern populations. The most genetically distinct population in the east (Long Forest) was separated from others by the Great Dividing Range. A close relationship was found between the outlying northernmost population (near West Wyalong) and those in the Victorian Goldfields despite a large disjunction between them. Conclusions: Patterns of genetic variation are consistent with a history of vicariant differentiation of disjunct populations. We infer that an early disjunction to develop in the species distribution was that across the Lower Murray Basin, an important biogeographical barrier separating many dry sclerophyll plant taxa in south-eastern Australia. 
Additionally, our results suggest that the western populations fragmented earlier than the eastern ones, with this fragmentation, both west and east of the Murray Basin, likely tied to climatic changes associated with glacial-interglacial cycles, although major geological events including uplift of the Mount Lofty Ranges and basalt flows in the Newer Volcanics Province possibly also played a role.

Supplementary tables.docx - Tables containing information on ipyrad parameters and individual samples in the dataset used for analyses
Concatenated_alignment.nex - Nexus format concatenated alignment of all loci generated by ipyrad, used in phylogenetic analyses
One_SNP_per_5000bp_Egrandis_reference_genepop.txt - Genepop format file containing the filtered SNP dataset, with no more than one SNP per 5000 bp of the E. grandis reference genome used to assemble loci in ipyrad
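
The "no more than one SNP per 5000 bp" rule for the Genepop file can be pictured as distance-based thinning along the reference genome. The sketch below is one possible reading, not the authors' procedure: it greedily keeps a SNP only if it lies at least 5000 bp from the previously kept SNP on the same scaffold. The function name and the toy coordinates are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def thin_snps(snps: Iterable[Tuple[str, int]], min_gap: int = 5000) -> List[Tuple[str, int]]:
    """Greedily keep SNPs that are at least `min_gap` bp apart per scaffold."""
    kept: List[Tuple[str, int]] = []
    last_kept: Dict[str, int] = {}
    # Sorting groups SNPs by scaffold and orders them by position.
    for scaffold, position in sorted(snps):
        previous = last_kept.get(scaffold)
        if previous is None or position - previous >= min_gap:
            kept.append((scaffold, position))
            last_kept[scaffold] = position
    return kept


if __name__ == "__main__":
    toy = [  # made-up coordinates for illustration only
        ("chr1", 100), ("chr1", 3000), ("chr1", 5100), ("chr1", 9000), ("chr2", 50),
    ]
    print(thin_snps(toy))
```

A greedy pass keeps the first SNP encountered in each window; other valid choices exist, such as picking a random SNP per 5000 bp window.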
