4 datasets found
  1. Data from: Dsuite - fast D-statistics and related admixture evidence from...

    • repository.uantwerpen.be
    • data.niaid.nih.gov
    • +2 more
    Updated 2020
    Cite
    Malinsky, Milan; Matschiner, Michael; Svardal, Hannes (2020). Dsuite - fast D-statistics and related admixture evidence from VCF files [Dataset]. http://doi.org/10.5061/dryad.tdz08kpxt
    Dataset updated
    2020
    Dataset provided by
    Faculty of Sciences. Biology
    University of Antwerp
    Dryad
    Authors
    Malinsky, Milan; Matschiner, Michael; Svardal, Hannes
    Description

    Patterson’s D, also known as the ABBA-BABA statistic, and related statistics such as the f4-ratio, are commonly used to assess evidence of gene flow between populations or closely related species. Currently available implementations often require custom file formats, implement only small subsets of the available statistics, and are impractical to evaluate all gene flow hypotheses across datasets with many populations or species due to computational inefficiencies. Here we present a new software package Dsuite, an efficient implementation allowing genome scale calculations of the D and f4-ratio statistics across all combinations of tens or hundreds of populations or species directly from a variant call format (VCF) file. Our program also implements statistics suited for application to genomic windows, providing evidence of whether introgression is confined to specific loci and it can also aid in interpretation of a system of f4-ratio results with the use of the ‘f-branch’ method. Dsuite is available at https://github.com/millanek/Dsuite, is straightforward to use, substantially more computationally efficient than comparable programs, and provides a convenient suite of tools and statistics, including some not previously available in any software package. Thus, Dsuite facilitates the assessment of evidence for gene flow, especially across larger genomic datasets.
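For reference, the D statistic named in the description is usually written as a normalized difference between two site-pattern counts. The block below is a sketch of that standard definition for a trio P1, P2, P3 with outgroup O. The symbols (p for the derived-allele frequency at site i, n for the pattern counts) are notation assumed here and are not taken from the dataset record.

```latex
% Patterson's D for the trio ((P1, P2), P3) with outgroup O.
% \hat{p}_{ij}: derived-allele frequency at site i in population j (assumed notation).
\begin{align}
  n_{ABBA} &= \sum_i (1 - \hat{p}_{i1})\,\hat{p}_{i2}\,\hat{p}_{i3}\,(1 - \hat{p}_{iO}) \\
  n_{BABA} &= \sum_i \hat{p}_{i1}\,(1 - \hat{p}_{i2})\,\hat{p}_{i3}\,(1 - \hat{p}_{iO}) \\
  D &= \frac{n_{ABBA} - n_{BABA}}{n_{ABBA} + n_{BABA}}
\end{align}
```

Under incomplete lineage sorting alone the two counts are expected to be equal, so D is expected to be zero; a significant excess of either pattern is read as evidence of gene flow between P3 and one of P1 or P2.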

  2. Data for: Among-species rate variation produces false signals of...

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    bin, pdf
    Updated Jul 12, 2024
    Cite
    Thore Koppetsch; Milan Malinsky; Michael Matschiner (2024). Data for: Among-species rate variation produces false signals of introgression [Dataset]. http://doi.org/10.5061/dryad.sf7m0cgbs
    Available download formats: bin, pdf
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Thore Koppetsch; Milan Malinsky; Michael Matschiner
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    The role of interspecific hybridization in the context of diversification dynamics has recently seen increasing attention. Genomic research has now made it abundantly clear that both hybridization and introgression - the exchange of genetic material through hybridization - are far more common than previously thought. Moreover, even highly divergent species were found to hybridize and backcross. These findings raise the question whether commonly used methods for the detection of introgression are applicable to such divergent hybridizing species, given that most of these methods were originally developed for analyses at the level of populations and recently diverged species. In particular, the assumption of constant evolutionary rates, which is implicit in many commonly used approaches, is more likely to be violated as evolutionary divergence increases. To test the limitations of introgression detection methods when being applied to divergent species, we simulated thousands of genomic datasets under a wide range of settings, with varying degrees of among-species rate variation and introgression. Using these simulated datasets, we were able to show that commonly applied statistical methods, including the D-statistic and tests based on sets of phylogenetic trees, produce false-positive signals of introgression between highly divergent taxa when these have different rates of evolution. These misleading signals are caused by the presence of homoplasies that occur at different rates when rate variation is present. To distinguish between the patterns caused by rate variation and genuine introgression, we developed a new test that is based on the expected clustering of introgressed sites and implemented this test in the program Dsuite.

  3. Data from: Origin and fate of supergenes in Atlantic cod

    • data.niaid.nih.gov
    Updated Aug 1, 2021
    Cite
    Jentoft (2021). Origin and fate of supergenes in Atlantic cod [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4560274
    Dataset updated
    Aug 1, 2021
    Dataset provided by
    Matschiner, Michael
    Jakobsen
    Brieuc, Marine Servane Ono
    Pampoulie, Christophe
    Tørresen, Ole Kristian
    Bradbury
    Barth, Julia Maria Isis
    Star, Bastiaan
    Baalsrud, Helle Tessand
    Jentoft
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Input and output files for analyses of divergence times of Gadinae with StarBEAST2

    gadinae_starbeast_alignments.tgz gadinae_starbeast.xml gadinae_starbeast.log gadinae_starbeast.trees gadinae_starbeast.tre

    Input and output files for analyses of divergence times and introgression among Gadus, Arctogadus, and Boreogadus with AIM

    gadus_arctogadus_boreogadus_aim_alignments.tgz gadus_arctogadus_boreogadus_aim.xml gadus_arctogadus_boreogadus_aim.log gadus_arctogadus_boreogadus_aim.trees gadus_arctogadus_boreogadus_aim_bpp0763.tre gadus_arctogadus_boreogadus_aim_bpp0234.tre

    Dataset used for analyses of introgression among Gadus, Arctogadus, and Boreogadus with Dsuite

    gadus_arctogadus_boreogadus_dsuite.vcf.gz

    Alignments and output trees for analyses of tree-based signals of introgression among Gadus, Arctogadus, and Boreogadus with IQ-TREE

    gadus_arctogadus_boreogadus_iqtree_alignments.tgz gadus_arctogadus_boreogadus_iqtree_trees.tgz

    Datasets used for supergene-specific analyses of linkage disequilibrium in Gadus morhua with PLINK

    gadus_morhua_supergene_lg01_plink.map gadus_morhua_supergene_lg01_plink.ped

    gadus_morhua_supergene_lg02_plink.map gadus_morhua_supergene_lg02_plink.ped

    gadus_morhua_supergene_lg07_plink.map gadus_morhua_supergene_lg07_plink.ped

    gadus_morhua_supergene_lg12_plink.map gadus_morhua_supergene_lg12_plink.ped

    Input and output files for analyses of divergence times among Gadus morhua populations with SNAPP

    gadus_morhua_outside_supergenes_snapp.vcf.gz gadus_morhua_outside_supergenes_snapp.xml gadus_morhua_outside_supergenes_snapp.log gadus_morhua_outside_supergenes_snapp.tre gadus_morhua_outside_supergenes_snapp.trees

    Dataset used for analyses of introgression among Gadus morhua populations with Dsuite

    gadus_morhua_outside_supergenes_dsuite.vcf.gz

    Dataset used for analyses of demography of Gadus morhua populations with Relate

    gadus_morhua_outside_supergenes_relate.vcf.gz

    Input and output files for supergene-specific analyses of divergence times among Gadus morhua populations with SNAPP

    gadus_morhua_supergene_lg01_snapp.vcf.gz gadus_morhua_supergene_lg01_snapp.xml gadus_morhua_supergene_lg01_snapp.log gadus_morhua_supergene_lg01_snapp.trees gadus_morhua_supergene_lg01_snapp.tre

    gadus_morhua_supergene_lg02_snapp.vcf.gz gadus_morhua_supergene_lg02_snapp.xml gadus_morhua_supergene_lg02_snapp.log gadus_morhua_supergene_lg02_snapp.trees gadus_morhua_supergene_lg02_snapp.tre

    gadus_morhua_supergene_lg07_snapp.vcf.gz gadus_morhua_supergene_lg07_snapp.xml gadus_morhua_supergene_lg07_snapp.log gadus_morhua_supergene_lg07_snapp.trees gadus_morhua_supergene_lg07_snapp.tre

    gadus_morhua_supergene_lg12_snapp.vcf.gz gadus_morhua_supergene_lg12_snapp.xml gadus_morhua_supergene_lg12_snapp.log gadus_morhua_supergene_lg12_snapp.trees gadus_morhua_supergene_lg12_snapp.tre

    Datasets used for supergene-specific analyses of introgression among Gadus morhua populations with Dsuite

    gadus_morhua_supergene_lg01_dsuite.vcf.gz gadus_morhua_supergene_lg02_dsuite.vcf.gz gadus_morhua_supergene_lg07_dsuite.vcf.gz gadus_morhua_supergene_lg12_dsuite.vcf.gz

    Datasets used for supergene-specific analyses of demography of Gadus morhua populations with Relate

    gadus_morhua_supergene_lg01_relate.vcf.gz gadus_morhua_supergene_lg02_relate.vcf.gz gadus_morhua_supergene_lg07_relate.vcf.gz gadus_morhua_supergene_lg12_relate.vcf.gz

  4. Data from: Phylogenetic conflict between species tree and maternally...

    • search.dataone.org
    • data.niaid.nih.gov
    • +1 more
    Updated Dec 27, 2023
    Cite
    Dezhi Zhang (2023). Phylogenetic conflict between species tree and maternally inherited gene trees in a clade of Emberiza buntings (Aves: Emberizidae) [Dataset]. http://doi.org/10.5061/dryad.wstqjq2r1
    Dataset updated
    Dec 27, 2023
    Dataset provided by
    Dryad Digital Repository
    Authors
    Dezhi Zhang
    Time period covered
    Jan 1, 2023
    Description

    Different genomic regions may reflect conflicting phylogenetic topologies on account of incomplete lineage sorting and/or gene flow. Genomic data are necessary to reconstruct the true species tree and explore potential causes of phylogenetic conflict. Here, we investigate the phylogenetic relationships of four Emberiza species (Aves: Emberizidae) and discuss the potential causes of the observed mitochondrial non-monophyly of Emberiza godlewskii (Godlewski's bunting) using phylogenomic analyses based on whole genome resequencing data from 41 birds. Phylogenetic analyses based on both the whole mitochondrial genome and ~39 kilobases from the non-recombining W chromosome reveal that the northern and southern populations of E. godlewskii are sister to E. cioides and E. cia, respectively. In contrast, phylogenetic analysis based on genome-wide data supports the monophyly of E. godlewskii with the following tree topology: (((E. godlewskii, E. cia), E. cioides), E. jankowskii). Using D-sta...

    Dataset and supporting information from the manuscript "Phylogenetic conflict between species tree and maternally inherited gene trees in a clade of Emberiza buntings (Aves: Emberizidae)".

    These files contain the original FASTA or VCF files for the phylogenetic analyses in manuscript USYB-2023-045.

    • File "Concatenated_NRW.fas" contains aligned NRW sequences.
    • File "Dsuite_output_data.txt" contains the output data of the Dsuite analysis. The labels huimei, liban, sandaomei, south and north represent E. cia, E. jankowskii, E. cioides, southern E. godlewskii and northern E. godlewskii, respectively. BBAA, ABBA and BABA represent the counts of sites that conform to the corresponding site patterns; for other parameters, please refer to the Dsuite tutorial.
    • File "ldfil.recode.vcf.gz" contains the linkage disequilibrium pruned VCF file.
    • File "mitochondrial_cytb.fas" contains aligned mitochondrial cytb sequences.
    • File "mitochondrial_other_sequences.fas" contains aligned mitochondrial sequences without the cytb gene.
    • File "MP-EST_input_20000 trees.rar" contains 20000 sliding window trees for the input to the MP-EST analysis.
    • File "ALSTRAL_input_trees_of_Z_chromosome.trees" contains 3583 sliding window trees from chromosome Z for the i...
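As an illustration of how the BBAA, ABBA and BABA counts described above relate to a D value, here is a minimal Python sketch. It assumes a tab-separated table with a header row and columns named P1, P2, P3, ABBA and BABA; that layout, the population labels (popA to popD) and every number are hypothetical and are not taken from "Dsuite_output_data.txt".

```python
import csv
import io
import math

# Hypothetical example table: the column names, labels and counts are invented
# for illustration and do not come from the dataset described above.
SAMPLE = (
    "P1\tP2\tP3\tBBAA\tABBA\tBABA\n"
    "popA\tpopB\tpopC\t1200.0\t300.0\t200.0\n"
    "popA\tpopB\tpopD\t900.0\t150.0\t150.0\n"
)


def d_from_counts(abba, baba):
    """Return D = (ABBA - BABA) / (ABBA + BABA), or NaN if both counts are zero."""
    total = abba + baba
    if total == 0:
        return math.nan
    return (abba - baba) / total


def read_trios(handle):
    """Parse a tab-separated table and return (P1, P2, P3, D) for each row."""
    results = []
    for row in csv.DictReader(handle, delimiter="\t"):
        d_value = d_from_counts(float(row["ABBA"]), float(row["BABA"]))
        results.append((row["P1"], row["P2"], row["P3"], d_value))
    return results


trios = read_trios(io.StringIO(SAMPLE))
for p1, p2, p3, d_value in trios:
    print(f"{p1}\t{p2}\t{p3}\tD={d_value:.3f}")
```

To use the sketch on a real file, pass an open file handle to `read_trios` in place of the `io.StringIO` object, after checking that the file's actual column names match the ones assumed here.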

