100+ datasets found
  1. Summary Statistics.

    • figshare.com
    xls
    Updated Jun 3, 2023
    Cite
    Jason M. Fletcher (2023). Summary Statistics. [Dataset]. http://doi.org/10.1371/journal.pone.0050576.t001
    Available download formats: xls
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Jason M. Fletcher
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    NHANES 1991–1994 Genetic Sample (N = 6,178). Notes: Author’s calculations from NHANES data. Sample weights used.

  2. Data for the course "Population Genomics" at Aarhus University

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip, bin
    Updated Mar 9, 2023
    Cite
    Samuele Soraggi; Kasper Munch (2023). Data for the course "Population Genomics" at Aarhus University [Dataset]. http://doi.org/10.5281/zenodo.7660785
    Available download formats: bin, application/gzip
    Dataset updated
    Mar 9, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Samuele Soraggi; Kasper Munch
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets, conda environments, and software for the course "Population Genomics" taught by Prof. Kasper Munch.

    1. Data.tar.gz Contains the datasets and executable files for some of the software.
    2. Course_Env.packed.tar.gz Contains the conda environment used for the course. It must be unpacked so that all the prefixes are adjusted. Do this on the command line by:
      1. Creating the folder Course_Env: mkdir Course_Env
      2. Untarring the file: tar -zxf Course_Env.packed.tar.gz -C Course_Env
      3. Activating the environment: conda activate ./Course_Env
      4. Running the unpacking script (this can take quite some time): conda-unpack
    3. Course_Env.unpacked.tar.gz The same environment as above, but it works only if untarred into the folder /usr/Material, so use the packed version above if you are working in another folder. This file is mainly for running the course in our own cloud environment.
    4. environment_with_args.yml The file needed to generate the conda environment. Create and activate the environment with the following commands:
      1. conda env create -f environment_with_args.yml -p ./Course_Env
      2. conda activate ./Course_Env

    The data is connected to the following repository: https://github.com/hds-sandbox/Popgen_course_aarhus. The original course material from Prof Kasper Munch is at https://github.com/kaspermunch/PopulationGenomicsCourse.

    Description

    After the course, participants will have detailed knowledge of the methods and applications required to perform a typical population genomic study.

    At the end of the course, participants must be able to:

    • Identify an experimental platform relevant to a population genomic analysis.
    • Apply commonly used population genomic methods.
    • Explain the theory behind common population genomic methods.
    • Reflect on strengths and limitations of population genomic methods.
    • Interpret and analyze results of population genomic inference.
    • Formulate population genetics hypotheses based on data.

    The course introduces key concepts in population genomics, from the generation of population genetic data sets to the most common population genetic analyses and association studies. The first part of the course focuses on the generation of population genetic data sets. The second part introduces the most common population genetic analyses and their theoretical background. Topics here include analysis of demography, population structure, recombination, and selection. The last part of the course focuses on applications of population genetic data sets for association studies in relation to human health.

    Curriculum

    The curriculum for each week is listed below. "Coop" refers to a set of lecture notes by Graham Coop that we will use throughout the course.

    Course plan

    1. Course intro and overview:
    2. Drift and the coalescent:
    3. Recombination:
    4. Population structure and incomplete lineage sorting:
    5. Hidden Markov models:
    6. Ancestral recombination graphs:
    7. Past population demography:
    8. Direct and linked selection:
    9. Admixture:
    10. Genome-wide association study (GWAS):
    11. Heritability:
    12. Evolution and disease:
  3. Gene Autry, OK Population Breakdown by Gender Dataset: Male and Female...

    • neilsberg.com
    csv, json
    Updated Feb 24, 2025
    + more versions
    Cite
    Neilsberg Research (2025). Gene Autry, OK Population Breakdown by Gender Dataset: Male and Female Population Distribution // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/b2344209-f25d-11ef-8c1b-3860777c1fe6/
    Available download formats: json, csv
    Dataset updated
    Feb 24, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Gene Autry, Oklahoma
    Variables measured
    Male Population, Female Population, Male Population as Percent of Total Population, Female Population as Percent of Total Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we initially analyzed and categorized the data for each of the gender classifications (biological sex) reported by the US Census Bureau. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the population of Gene Autry by gender, including both male and female populations. This dataset can be utilized to understand the population distribution of Gene Autry across both sexes and to determine which sex constitutes the majority.

    Key observations

    Females form a slight majority, at 50.92% of the total population. Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Scope of gender:

    Please note that the American Community Survey asks a question about the respondent's current sex, but not about gender, sexual orientation, or sex at birth. The question is intended to capture data for biological sex, not gender. Respondents are expected to answer either Male or Female. Our research and this dataset mirror the data reported as Male and Female for gender distribution analysis. No further analysis is done on the data reported by the Census Bureau.

    Variables / Data Columns

    • Gender: This column displays the Gender (Male / Female)
    • Population: This column shows the population of each gender in Gene Autry.
    • % of Total Population: This column displays the percentage distribution of each gender as a proportion of Gene Autry's total population. Please note that the sum of all percentages may not equal one due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to ask about the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research's aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Gene Autry Population by Race & Ethnicity.

  4. Number of nominally significant genes before and after filtering.

    • plos.figshare.com
    xls
    Updated Jun 4, 2023
    Cite
    Li Liu; Aniko Sabo; Benjamin M. Neale; Uma Nagaswamy; Christine Stevens; Elaine Lim; Corneliu A. Bodea; Donna Muzny; Jeffrey G. Reid; Eric Banks; Hillary Coon; Mark DePristo; Huyen Dinh; Tim Fennel; Jason Flannick; Stacey Gabriel; Kiran Garimella; Shannon Gross; Alicia Hawes; Lora Lewis; Vladimir Makarov; Jared Maguire; Irene Newsham; Ryan Poplin; Stephan Ripke; Khalid Shakir; Kaitlin E. Samocha; Yuanqing Wu; Eric Boerwinkle; Joseph D. Buxbaum; Edwin H. Cook Jr; Bernie Devlin; Gerard D. Schellenberg; James S. Sutcliffe; Mark J. Daly; Richard A. Gibbs; Kathryn Roeder (2023). Number of nominally significant genes before and after filtering. [Dataset]. http://doi.org/10.1371/journal.pgen.1003443.t004
    Available download formats: xls
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Li Liu; Aniko Sabo; Benjamin M. Neale; Uma Nagaswamy; Christine Stevens; Elaine Lim; Corneliu A. Bodea; Donna Muzny; Jeffrey G. Reid; Eric Banks; Hillary Coon; Mark DePristo; Huyen Dinh; Tim Fennel; Jason Flannick; Stacey Gabriel; Kiran Garimella; Shannon Gross; Alicia Hawes; Lora Lewis; Vladimir Makarov; Jared Maguire; Irene Newsham; Ryan Poplin; Stephan Ripke; Khalid Shakir; Kaitlin E. Samocha; Yuanqing Wu; Eric Boerwinkle; Joseph D. Buxbaum; Edwin H. Cook Jr; Bernie Devlin; Gerard D. Schellenberg; James S. Sutcliffe; Mark J. Daly; Richard A. Gibbs; Kathryn Roeder
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Note: Significance level is 0.01, not corrected for multiple testing. The analyses in the first two rows cover all genes that have at least one MAC in the Baylor and Broad datasets. The last rows are restricted to genes that have more than 15 minor alleles after combining the Baylor and Broad datasets.
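
    The before/after counting described in this note can be sketched as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the `GeneResult` record, its field names, and the example values are hypothetical, and "at least one MAC" is read here as a combined minor allele count of at least 1. Only the 0.01 significance level and the "more than 15 minor alleles" threshold come from the note.

```python
from dataclasses import dataclass

ALPHA = 0.01            # nominal significance level from the note (uncorrected)
MIN_COMBINED_MAC = 15   # filter keeps genes with MORE than 15 minor alleles


@dataclass
class GeneResult:
    """Hypothetical per-gene record; field names are illustrative only."""
    gene: str
    p_value: float
    mac_baylor: int   # minor allele count in the Baylor data
    mac_broad: int    # minor allele count in the Broad data


def count_nominally_significant(results, apply_filter):
    """Count genes with p < ALPHA, optionally after the combined-MAC filter."""
    count = 0
    for r in results:
        combined = r.mac_baylor + r.mac_broad
        if combined < 1:
            continue  # assumption: genes with no minor alleles are never analysed
        if apply_filter and combined <= MIN_COMBINED_MAC:
            continue  # dropped by the "more than 15 minor alleles" filter
        if r.p_value < ALPHA:
            count += 1
    return count


# Hypothetical example values, chosen only to exercise each branch.
example_results = [
    GeneResult("GENE_A", 0.004, 12, 9),    # significant, combined MAC 21
    GeneResult("GENE_B", 0.008, 3, 2),     # significant, removed by the filter
    GeneResult("GENE_C", 0.200, 20, 15),   # not significant
    GeneResult("GENE_D", 0.001, 0, 0),     # no minor alleles at all
]

before = count_nominally_significant(example_results, apply_filter=False)
after = count_nominally_significant(example_results, apply_filter=True)
print(before, after)
```

    Because the threshold is uncorrected for multiple testing, counts produced this way are descriptive only.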

  5. Sacramento trawl – Genetic Determination of Population of Origin 2017-2023

    • portal.edirepository.org
    csv
    Updated Mar 18, 2025
    + more versions
    Cite
    Scott Blankenship (2025). Sacramento trawl – Genetic Determination of Population of Origin 2017-2023 [Dataset]. http://doi.org/10.6073/pasta/a7e9f998a6843f0f8ad99f18da221d21
    Available download formats: csv (160344 bytes)
    Dataset updated
    Mar 18, 2025
    Dataset provided by
    EDI
    Authors
    Scott Blankenship
    Time period covered
    Dec 12, 2016 - Jul 12, 2023
    Variables measured
    ID, Julian, FieldID, LAD_Race, ForkLength, SampleDate, Prob_Assignment, Genetic_Assignment, greb1L_classification
    Description

    Central Valley Chinook Salmon populations differ in their Endangered Species Act listing status. It is often difficult to distinguish individuals from the different Evolutionarily Significant Units. As such, many of the salmon monitoring and evaluation efforts in the Central Valley and San Francisco Bay-Delta are hampered by uncertainty about population (stock) identification and proportional effects of management actions (Dekar et al. 2013; IEP 2019). Studies have found that the current identification method (length-at-date models) for juvenile Chinook Salmon (Fisher 1992) captured in the watershed varies in its accuracy, particularly for spring-run (NMFS 2013; Harvey et al. 2014; Merz et al. 2014). The inaccuracy of the size-based methods is likely due to differences in fish distribution during early rearing, habitat-specific growth rates, and inter-annual variability in temperatures and food availability that lead to overlap in size ranges among stocks. The primary objective of this project was the genetic classification (to race; Evolutionarily Significant Unit) of Chinook Salmon captured from State Water Project and Central Valley Project fish protection facilities and Interagency Ecological Program monitoring programs. The population of origin was determined for sampled fish by comparing their genotypes to reference genetic baselines. Genetic methods, having less statistical uncertainty than size-based models for population identification, were intended to directly target (and reduce) one source of uncertainty in the estimation of loss (take) from water diversions (operations) and develop the information necessary for understanding stock-specific distribution, habitat utilization, abundance, and life history variation.
    This project supports recommendations from the Interagency Ecological Program’s Salmon and Sturgeon Assessment of Indicators by Life Stage and Interagency Ecological Program Science Agenda efforts to improve Central Valley salmonid monitoring (Johnson et al. 2017; IEP 2019).
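
    One way to compare the size-based and genetic classifications in a file like this is to cross-tabulate the two assignment columns. The sketch below is an illustration only: the column names `LAD_Race`, `Genetic_Assignment`, and `Prob_Assignment` are taken from the "Variables measured" list, but the inline sample rows, the race labels, and the 0.8 probability cutoff are hypothetical placeholders, not values from the dataset.

```python
import csv
import io
from collections import Counter

# Hypothetical rows for illustration; the real file's values may be coded differently.
SAMPLE_CSV = """ID,LAD_Race,Genetic_Assignment,Prob_Assignment
1,Spring,Fall,0.99
2,Fall,Fall,0.97
3,Winter,Winter,0.95
4,Spring,Spring,0.60
"""


def cross_tabulate(rows, min_prob=0.0):
    """Count (length-at-date race, genetic race) pairs among rows whose
    assignment probability is at least min_prob."""
    table = Counter()
    for row in rows:
        if float(row["Prob_Assignment"]) < min_prob:
            continue  # skip low-confidence genetic assignments
        table[(row["LAD_Race"], row["Genetic_Assignment"])] += 1
    return table


def agreement_rate(table):
    """Fraction of fish for which the two methods give the same race."""
    total = sum(table.values())
    if total == 0:
        return None
    agree = sum(n for (lad, genetic), n in table.items() if lad == genetic)
    return agree / total


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
table = cross_tabulate(rows, min_prob=0.8)  # 0.8 is an arbitrary example cutoff
rate = agreement_rate(table)
print(dict(table), rate)
```

    To run this against the real file, replace `io.StringIO(SAMPLE_CSV)` with an open file handle, after checking how the race labels are actually coded in each column.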

       Note that the genetic data provided here may be included in other data repositories. Regarding Sacramento trawl activities, refer to the Interagency Ecological Program: Over four decades of juvenile fish monitoring data from the San Francisco Estuary, collected by the Delta Juvenile Fish Monitoring Program, 1976-2020
    
       Package ID: edi.244.8 
    
       Literature Cited
    
       Dekar, M., P. Brandes, J. Kirsch, L. Smith, J. Speegle, P. Cadrett and M. Marshall. 2013. USFWS Delta Juvenile Fish Monitoring Program Review. Background Document. Prepared for IEP Science Advisory Group, June 2013. US Fish and Wildlife Service, Stockton Fish and Wildlife Office, Lodi, CA. 224 p.
       Fisher, F.W. 1992. Chinook Salmon, Oncorhynchus tshawytscha, growth and occurrence in the Sacramento-San Joaquin River system. California Department of Fish and Game, Inland Fisheries Divisions, draft office report, Redding. 
       Harvey, B.N., D.P. Jacobson, M.A. Banks. 2014. Quantifying the uncertainty of a juvenile Chinook Salmon Race Identification Method for a Mixed-Race Stock. North American Journal of Fisheries Management. 
       IEP, Interagency Ecological Program. 2019. Interagency Ecological Program Science Strategy 2020-2024: Investment Priorities for Interagency Collaborative Science.
       Johnson, R.C., S. Windell, P. L. Brandes, J. L. Conrad, J. Ferguson, P. A. L. Goertler, B. N. Harvey, J. Heublein, J. A. Israel, D. W. Kratville, J. E. Kirsch, R. W. Perry, J. Pisciotto, W. R. Poytress, K. Reece, and B. G. Swart. 2017. Increasing the management value of life stage monitoring networks for three imperiled fishes in California's regulated rivers: case study Sacramento Winter-run Chinook salmon. San Francisco Estuary and Watershed Science 15: 1-41.
       National Marine Fisheries Service (NMFS). 2013. Endangered and Threatened Species: Designation of a Nonessential Experimental Population of Central Valley Spring-Run Chinook Salmon Below Friant Dam in the San Joaquin River, CA. Federal Register 70: 79622, December 31, 2013.
    
  6. Data for "Comparison of genetic variants in matched samples using thesaurus...

    • ora.ox.ac.uk
    Updated Jan 1, 2017
    + more versions
    Cite
    Konopka, T; Nijman, SMB (2017). Data for "Comparison of genetic variants in matched samples using thesaurus annotation" [Dataset]. http://doi.org/10.6084/m9.figshare.4555441
    Dataset updated
    Jan 1, 2017
    Dataset provided by
    University of Oxford
    Authors
    Konopka, T; Nijman, SMB
    License

    https://ora.ox.ac.uk/terms_of_use

    Description

    Data for "Comparison of genetic variants in matched samples using thesaurus annotation"



    Background


    The files in this set hold data corresponding to the figures in the following research article:
    Konopka, Tomasz, and Sebastian MB Nijman. "Comparison of genetic variants in matched samples using thesaurus annotation." Bioinformatics (2015): btv654.

    Figure 1


    Figure 1 in the publication is a schematic of the computational method described in the article. Thus, there are no data files that correspond to this figure.



    Figure 2


    File: thesaurus.fig2.tsv
    Data in this figure represent summary statistics from an in-silico benchmarking study.

    Figure 3


    File: thesaurus.fig3.tsv
    Data in this figure represent summary statistics from an in-silico benchmarking study.

    Figure 4


    Files: thesaurus.fig4-MQ16.tsv, thesaurus.fig4-MQ0.tsv
    Data in this figure are based on high-throughput sequencing data for samples NA12877, NA12878, and NA12882 from dataset ERP001960. The raw sequencing data are publicly available at https://www.ebi.ac.uk/ena/data/view/PRJEB3381

    Notes


    Please refer to the published manuscript for methods and data interpretation.

    Motivation: Calling changes in DNA, e.g. as a result of somatic events in cancer, requires analysis of multiple matched sequenced samples. Events in low-mappability regions of the human genome are difficult to encode in variant call files and have been under-reported as a result. However, they can be described accurately through thesaurus annotation, a technique that links multiple genomic loci together to explicate a single variant.

    Results: We here describe software and benchmarks for using thesaurus annotation to detect point changes in DNA from matched samples. In benchmarks on matched normal/tumor samples we show that the technique can recover between five and ten percent more true events than conventional approaches, while strictly limiting false discovery and being fully consistent with popular variant analysis workflows. We also demonstrate the utility of the approach for analysis of de novo mutations in parents/child families.

    Availability and implementation: Software performing thesaurus annotation is implemented in Java; available in source code on GitHub at GeneticThesaurus ( https://github.com/tkonopka/GeneticThesaurus ) and as an executable on SourceForge at geneticthesaurus ( https://sourceforge.net/projects/geneticthesaurus ). Mutation calling is implemented in an R package available on GitHub at RGeneticThesaurus ( https://github.com/tkonopka/RGeneticThesaurus ).

    Supplementary information: Supplementary data are available at Bioinformatics online.

  7. Data from: The genetic basis of variation in sexual aggression: evolution...

    • datadryad.org
    • zenodo.org
    zip
    Updated Mar 2, 2022
    Cite
    Andrew Scott; Janice Yan; Carling Baxter; Ian Dworkin; Reuven Dukas (2022). The genetic basis of variation in sexual aggression: evolution versus social plasticity [Dataset]. http://doi.org/10.5061/dryad.zs7h44jbr
    Available download formats: zip
    Dataset updated
    Mar 2, 2022
    Dataset provided by
    Dryad
    Authors
    Andrew Scott; Janice Yan; Carling Baxter; Ian Dworkin; Reuven Dukas
    Time period covered
    2022
    Description

    Please see attached README file for information about the data sets and experiment-specific notes on missing values.

  8. Data from: Genetics of plasminogen activator inhibitor-1 (PAI-1) in a...

    • zenodo.org
    • search.dataone.org
    • +2 more
    txt
    Updated May 28, 2022
    + more versions
    Cite
    Marquitta J. White; Nuri M. Kodaman; Reed H. Harder; Folkert W. Asselbergs; Douglas E. Vaughan; Nancy J. Brown; Jason H. Moore; Scott M. Williams (2022). Data from: Genetics of plasminogen activator inhibitor-1 (PAI-1) in a Ghanaian population [Dataset]. http://doi.org/10.5061/dryad.j2dj4
    Available download formats: txt
    Dataset updated
    May 28, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Marquitta J. White; Nuri M. Kodaman; Reed H. Harder; Folkert W. Asselbergs; Douglas E. Vaughan; Nancy J. Brown; Jason H. Moore; Scott M. Williams
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Plasminogen activator inhibitor 1 (PAI-1), a major modulator of the fibrinolytic system, is an important factor in cardiovascular disease (CVD) susceptibility and severity. PAI-1 is highly heritable, but the few genes associated with it explain only a small portion of its variation. Studies of PAI-1 typically employ linear regression to estimate the effects of genetic variants on PAI-1 levels, but PAI-1 is not normally distributed, even after transformation. Therefore, alternative statistical methods may provide greater power to identify important genetic variants. Additionally, most genetic studies of PAI-1 have been performed on populations of European descent, limiting the generalizability of their results. We analyzed >30,000 variants for association with PAI-1 in a Ghanaian population, using median regression, a non-parametric alternative to linear regression. Three variants were associated with median PAI-1, the most significant of which was in the gene arylsulfatase B (ARSB) (p = 1.09 × 10⁻⁷). We also analyzed the upper quartile of PAI-1, the most clinically relevant part of the distribution, and found 19 SNPs significantly associated in this quartile. Of note, an association was found in period circadian clock 3 (PER3). Our results reveal novel associations with median and elevated PAI-1 in an understudied population. The lack of overlap between the two analyses indicates that the genetic effects on PAI-1 are not uniform across its distribution. They also provide evidence of the generalizability of the circadian pathway's effect on PAI-1, as a recent meta-analysis performed in Caucasian populations identified another circadian clock gene (ARNTL).

  9. Data from: Are genetic variation and demographic performance linked?

    • datadryad.org
    • data.niaid.nih.gov
    zip
    Updated Oct 4, 2022
    Cite
    Lauren Carley; William Morris; Roberta Walsh; Donna Riebe; Tom Mitchell-Olds (2022). Are genetic variation and demographic performance linked? [Dataset]. http://doi.org/10.5061/dryad.k98sf7m70
    Available download formats: zip
    Dataset updated
    Oct 4, 2022
    Dataset provided by
    Dryad
    Authors
    Lauren Carley; William Morris; Roberta Walsh; Donna Riebe; Tom Mitchell-Olds
    Time period covered
    2021
    Description

    Usage notes are provided in the enclosed README.txt file.

  10. Data from: Plant Expression Database

    • agdatacommons.nal.usda.gov
    • s.cnmilf.com
    • +1 more
    bin
    Updated Feb 9, 2024
    + more versions
    Cite
    Sudhansu S. Dash; John Van Hemert; Lu Hong; Roger P. Wise; Julie A. Dickerson (2024). Plant Expression Database [Dataset]. https://agdatacommons.nal.usda.gov/articles/dataset/Plant_Expression_Database/24661179
    Available download formats: bin
    Dataset updated
    Feb 9, 2024
    Dataset provided by
    PLEXdb
    Authors
    Sudhansu S. Dash; John Van Hemert; Lu Hong; Roger P. Wise; Julie A. Dickerson
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    [NOTE: PLEXdb is no longer available online. Oct 2019.] PLEXdb (Plant Expression Database) is a unified gene expression resource for plants and plant pathogens. PLEXdb is a genotype-to-phenotype, hypothesis-building information warehouse, leveraging highly parallel expression data with seamless portals to related genetic, physical, and pathway data. PLEXdb (http://www.plexdb.org), in partnership with community databases, supports comparisons of gene expression across multiple plant and pathogen species, encouraging individuals and/or consortia to upload genome-scale data sets and contrast them with previously archived data. These analyses facilitate the interpretation of the structure, function, and regulation of genes in economically important plants. A list of Gene Atlas experiments highlights data sets that give responses across different developmental stages, conditions, and tissues. Tools at PLEXdb allow users to perform complex analyses quickly and easily. The Model Genome Interrogator (MGI) tool supports mapping gene lists onto corresponding genes from model plant organisms, including rice and Arabidopsis. MGI predicts homologies and displays gene structures and supporting information for annotated genes and full-length cDNAs. The gene list-processing wizard guides users through PLEXdb functions for creating, analyzing, annotating, and managing gene lists. Users can upload their own lists or create them from the output of PLEXdb tools, and then apply diverse higher-level analyses, such as ANOVA and clustering. PLEXdb also provides methods for users to track how gene expression changes across many different experiments using the Gene OscilloScope. This tool can identify interesting expression patterns, such as up-regulation under diverse conditions, or check any gene’s suitability as a steady-state control.

    Resources in this dataset:
    Resource Title: Website Pointer for Plant Expression Database, Iowa State University.
    File Name: Web Page, url: https://www.bcb.iastate.edu/plant-expression-database
    Project description for the Plant Expression Database (PLEXdb) and integrated tools.

  11. Adaptive Introgression in Modern Human Circadian Rhythm Genes Datasets

    • zenodo.org
    application/gzip, bin +1
    Updated Apr 23, 2025
    + more versions
    Cite
    Chris Kendall (2025). Adaptive Introgression in Modern Human Circadian Rhythm Genes Datasets [Dataset]. http://doi.org/10.5281/zenodo.15265830
    Available download formats: bin, application/gzip, csv
    Dataset updated
    Apr 23, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Chris Kendall
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Apr 22, 2025
    Description

    README:

    Modern human genetic data with evidence of adaptive introgression from Neanderthals or Denisovans within circadian rhythm genes. The data was generated from the phased gnomAD 1KGP + HGDP callset (Koenig et al., 2024) and introgressed segments were identified by SPrime (Browning et al., 2018). Genes of interest were downloaded from the Circadian Genome Database (CGDB) (Li et al., 2017). Additional variants, haplotypes, and genes that have been previously reported to influence circadian rhythm or chronotype that are thought to be derived from Neanderthals and Denisovans were compiled from Dannemann & Kelso (2017), McArthur et al. (2021), Dannemann et al. (2022), and Velazquez-Arcelay et al. (2023).

    SPrime ND_Match Files

    Raw SPrime identified files that we used for our entire analysis. These were modified to include the archaic allele, archaic allele frequency, and average introgressed segment allele frequency. Note that these have been lifted over (Hinrichs et al., 2006) from GRCh38 (hg38) to GRCh37 (hg19) coordinates to match the genome builds of the archaic samples used in our study. As such, any manually generated variant IDs (chromosome:position:ReferenceAllele_AlternativeAllele naming convention) may no longer match the position they are currently sitting on as they were generated with hg38 coordinates. However, all of these were subsequently filtered out of our final results and any proper SNP IDs (dbSNP labels) will be accurate.
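
    The caveat about manually generated IDs can be checked mechanically. The sketch below is an assumption-laden illustration, not part of the dataset's tooling: it assumes manual IDs look exactly like `chromosome:position:Ref_Alt` with a numeric position and A/C/G/T alleles, and that dbSNP labels look like `rs` followed by digits. The function names and the example IDs are hypothetical.

```python
import re

# Assumed layout of a manually generated ID: chromosome:position:Ref_Alt
MANUAL_ID = re.compile(
    r"^(?P<chrom>[^:]+):(?P<pos>\d+):(?P<ref>[ACGT]+)_(?P<alt>[ACGT]+)$"
)
# Assumed layout of a dbSNP label: "rs" followed by digits
DBSNP_ID = re.compile(r"^rs\d+$")


def classify_variant_id(variant_id):
    """Return 'dbsnp', 'manual', or 'unknown' for a variant ID string."""
    if DBSNP_ID.match(variant_id):
        return "dbsnp"
    if MANUAL_ID.match(variant_id):
        return "manual"
    return "unknown"


def manual_id_matches_position(variant_id, current_pos):
    """For a manual ID, report whether the position embedded in the ID equals
    the coordinate the record currently sits on (after liftover they can differ).
    Returns None for IDs that are not in the manual format."""
    m = MANUAL_ID.match(variant_id)
    if m is None:
        return None
    return int(m.group("pos")) == current_pos


# Hypothetical IDs, for illustration only.
print(classify_variant_id("rs123"))                    # dbsnp
print(classify_variant_id("9:1000:A_G"))               # manual
print(manual_id_matches_position("9:1000:A_G", 990))   # False
```

    A `False` result flags an ID whose embedded position no longer agrees with the record's current coordinate, which is the situation the paragraph above warns about.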

    Supplementary Tables

    All supplementary tables have an associated README as the first sheet that explains in detail the contents.

    NEXUS Files

    NEXUS files were used to generate haplotype networks in PopArt (Leigh & Bryant, 2015). There is a larger, master haplotype file and a smaller subset file. The larger file contains 668 haplotypes from all populations generated in the phased gnomAD 1KGP + HGDP callset (Koenig et al., 2024) for the SUSD1 core haplotype. The smaller subset file is the top 50 haplotypes and ties based on frequency, all Oceanic haplotypes with frequencies of at least 2, and the Neanderthal and Denisovan haplotypes for SUSD1.
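
    The subset rule described above (top 50 haplotypes and ties, plus Oceanic haplotypes with frequency of at least 2, plus the archaic haplotypes) can be sketched as a small selection function. This is an illustration under assumptions, not the author's script: the function names, the dictionary inputs, and the example haplotype labels are hypothetical, and the "frequency of at least 2" condition is read here as a count within Oceanic samples.

```python
def top_n_with_ties(freqs, n):
    """Return the haplotypes ranked in the top n by frequency, keeping every
    haplotype tied with the n-th one."""
    if n <= 0 or not freqs:
        return set()
    ranked = sorted(freqs.values(), reverse=True)
    cutoff = ranked[min(n, len(ranked)) - 1]  # frequency of the n-th haplotype
    return {hap for hap, f in freqs.items() if f >= cutoff}


def select_subset(freqs, oceanic_freqs, archaic, n=50, min_oceanic=2):
    """Union of: top-n-with-ties haplotypes, Oceanic haplotypes seen at least
    min_oceanic times, and the archaic (Neanderthal/Denisovan) haplotypes."""
    keep = top_n_with_ties(freqs, n)
    keep |= {hap for hap, f in oceanic_freqs.items() if f >= min_oceanic}
    keep |= set(archaic)
    return keep


# Hypothetical counts, with n=2 instead of 50 to keep the example small.
freqs = {"H1": 10, "H2": 7, "H3": 7, "H4": 1}
oceanic_freqs = {"H4": 2, "H5": 1}
subset = select_subset(freqs, oceanic_freqs, ["NEA", "DEN"], n=2)
print(sorted(subset))
```

    Keeping ties means the subset can contain more than n haplotypes from the frequency ranking alone, which is why the description says "top 50 haplotypes and ties".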

    TRAITS file

    Accompanies the NEXUS files to create pie graphs for the haplotype network and contains frequency counts of number of haplotypes per region.

  12. Genetic Classification Discrepancy Dataset

    • opendatabay.com
    Updated May 27, 2025
    Cite
    DataDooix LTD (2025). Genetic Classification Discrepancy Dataset [Dataset]. https://www.opendatabay.com/data/science-research/b1be7488-492b-4ab2-8b48-851c409f889a
    Dataset updated
    May 27, 2025
    Dataset authored and provided by
    DataDooix LTD
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Public Health & Epidemiology
    Description

    Provide a brief description of the dataset, including its purpose, context, and significance.

    Dataset Features

    List and describe each column or key feature of the dataset.

    • Column 1 Name: Description of what this column represents.
    • Column 2 Name: Add as needed...

    Distribution

    Detail the format, size, and structure of the dataset.

    • Data Volume: Number of rows/records, number of columns, etc.

    Usage

    This dataset is ideal for a variety of applications:

    • Application: Brief description of the first use case.
    • Application: Add more as needed.

    Coverage

    Explain the scope and coverage of the dataset:

    • Geographic Coverage: Region, country, or global.
    • Time Range: Start date - End date of data collection.
    • Demographics (if applicable): Age groups, gender, industries, etc.

    License

    CC0

    Who Can Use It

    List examples of intended users and their use cases:

    • Data Scientists: For training machine learning models.
    • Researchers: For academic or scientific studies.
    • Businesses: For analysis, insights, or AI development.

    Include any additional notes or context about the dataset that might be helpful for users.

  13. e

    Chipps Island trawl – Genetic Determination of Population of Origin...

    • portal.edirepository.org
    csv
    Updated Mar 18, 2025
    + more versions
    Cite
    Scott Blankenship (2025). Chipps Island trawl – Genetic Determination of Population of Origin 2017-2023 [Dataset]. http://doi.org/10.6073/pasta/823439b7368cf59b8ba580d838d8c59d
    Explore at:
    Available download formats: csv (341542 bytes)
    Dataset updated
    Mar 18, 2025
    Dataset provided by
    EDI
    Authors
    Scott Blankenship
    Time period covered
    Dec 16, 2016 - Jul 17, 2023
    Area covered
    Variables measured
    ID, Julian, FieldID, LAD_Race, ForkLength, SampleDate, Prob_Assignment, Genetic_Assignment, greb1L_classification
    Description

    Central Valley Chinook Salmon populations differ in their Endangered Species Act listing status. It is often difficult to distinguish individuals from the different Evolutionarily Significant Units. As such, many of the salmon monitoring and evaluation efforts in the Central Valley and San Francisco Bay-Delta are hampered by uncertainty about population (stock) identification and proportional effects of management actions (Dekar et al. 2013; IEP 2019). Studies have found that the current identification method (length-at-date models) for juvenile Chinook salmon (Fisher 1992) captured in the watershed varies in its accuracy, particularly for spring-run (NMFS 2013; Harvey et al. 2014; Merz et al. 2014). The inaccuracy of the size-based methods is likely due to differences in fish distribution during early rearing, habitat-specific growth rates, and inter-annual variability in temperatures and food availability that lead to overlap in size ranges among stocks. The primary objective of this project was the genetic classification (to race; Evolutionarily Significant Unit) of Chinook Salmon captured from State Water Project and Central Valley Project fish protection facilities and Interagency Ecological Program monitoring programs. The population-of-origin was determined for sampled fish by comparing their genotypes to reference genetic baselines. Genetic methods, having less statistical uncertainty than size-based models for population identification, were intended to directly target (and reduce) one source of uncertainty in the estimation of loss (take) from water diversions (operations) and develop the information necessary for understanding stock-specific distribution, habitat utilization, abundance, and life history variation.
    This project supports recommendations from the Interagency Ecological Program’s Salmon and Sturgeon Assessment of Indicators by Life Stage and Interagency Ecological Program Science Agenda efforts to improve Central Valley salmonid monitoring (Johnson et al. 2017; IEP 2019).

       Note that the genetic data provided here may be included in other data repositories. Regarding Chipps Island trawl activities, refer to the Interagency Ecological Program: Over four decades of juvenile fish monitoring data from the San Francisco Estuary, collected by the Delta Juvenile Fish Monitoring Program, 1976-2020
    
       Package ID: edi.244.8 
    
       Literature Cited
    
       Dekar, M., P. Brandes, J. Kirsch, L. Smith, J. Speegle, P. Cadrett and M. Marshall. 2013. USFWS Delta Juvenile Fish Monitoring Program Review. Background Document. Prepared for IEP Science Advisory Group, June 2013. US Fish and Wildlife Service, Stockton Fish and Wildlife Office, Lodi, CA. 224 p.
       Fisher, F.W. 1992. Chinook Salmon, Oncorhynchus tshawytscha, growth and occurrence in the Sacramento-San Joaquin River system. California Department of Fish and Game, Inland Fisheries Divisions, draft office report, Redding. 
       Harvey, B.N., D.P. Jacobson, M.A. Banks. 2014. Quantifying the uncertainty of a juvenile Chinook Salmon Race Identification Method for a Mixed-Race Stock. North American Journal of Fisheries Management. 
       IEP, Interagency Ecological Program. 2019. Interagency Ecological Program Science Strategy 2020-2024: Investment Priorities for Interagency Collaborative Science.
       Johnson, R.C., S. Windell, P. L. Brandes, J. L. Conrad, J. Ferguson, P. A. L. Goertler, B. N. Harvey, J.Heublein, J. A. Israel, D. W. Kratville, J. E. Kirsch, R. W. Perry, J. Pisciotto, W. R. Poytress, K. Reece, and B. G. Swart. 2017. Increasing the management value of life stage monitoring networks for three imperiled fishes in California's regulated rivers: case study Sacramento Winter-run Chinook salmon. San Francisco Estuary and Watershed Science 15: 1-41.
       National Marine Fisheries Service (NMFS). 2013. Endangered and Threatened Species: Designation of a Nonessential Experimental Population of Central Valley Spring-Run Chinook Salmon Below Friant Dam in the San Joaquin River, CA. Federal Register 70: 79622, December 31, 2013.
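
    Since the dataset lists both LAD_Race (the length-at-date call) and Genetic_Assignment among its variables, one natural check is how often the two agree. The following is a minimal sketch under stated assumptions: the column names come from the "Variables measured" list above, but the sample rows, the race labels, and the function name are hypothetical and do not come from the actual CSV.

```python
import csv
import io
from collections import Counter

# Hypothetical rows for illustration only; the real file has more columns
# and its race labels may be coded differently.
SAMPLE = """ID,LAD_Race,Genetic_Assignment
1,Spring,Fall
2,Fall,Fall
3,Winter,Winter
4,Spring,Spring
"""


def race_agreement(handle):
    """Cross-tabulate LAD_Race against Genetic_Assignment and return
    (pair counts, fraction of fish where both calls match)."""
    pairs = Counter()
    for row in csv.DictReader(handle):
        pairs[(row["LAD_Race"], row["Genetic_Assignment"])] += 1
    total = sum(pairs.values())
    agree = sum(n for (lad, gen), n in pairs.items() if lad == gen)
    return pairs, (agree / total if total else float("nan"))


pairs, rate = race_agreement(io.StringIO(SAMPLE))
print(rate)  # 0.75 for the four hypothetical rows above
```

    To run this on the downloaded file, pass an open file handle instead of the in-memory sample; the off-diagonal entries of `pairs` show which length-at-date calls the genetic assignment disagrees with.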
    
  14. u

    IFGC_Summary-statistics_Data-sharing

    • rdr.ucl.ac.uk
    txt
    Updated May 30, 2023
    Cite
    International FTD-Genomics Consortium (2023). IFGC_Summary-statistics_Data-sharing [Dataset]. http://doi.org/10.5522/04/13042166.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    University College London
    Authors
    International FTD-Genomics Consortium
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Summary statistics data generated in Ferrari et al, 2014, Lancet Neurol (PMID: 24943344). The International FTD-Genetics Consortium (IFGC) shares the summary results data to allow other researchers to explore variants and/or loci for hypothesis-driven work. The data provides information on ~6M markers and includes: marker, trait, allele 1 and 2, OR or Beta, standard error, p-value, chromosome and bp position. To prevent identification of individuals, allele frequency data are not released. The data consists of the summary statistics generated during the discovery phase (phase I) of the study, including: bvFTD (n=1377 vs 2754 ctrls) AND/OR SD (n=308 vs 616 ctrls) AND/OR PNFA (n=269 vs 538 ctrls) AND/OR FTD-MND (n=200 vs 400 ctrls) AND/OR subtypes meta-analysis.

    Note:
    1. The IFGC requests to be included among co-authors in publications that might result from the use of this data as “The International FTD-Genetics Consortium (IFGC)” following Pubmed guidelines where Consortia or working group authors shall be listed on PubMed as collaborators rather than authors, where collaborator names are searchable on PubMed in the same way as authors. The acknowledgments associated with the IFGC as well as the IFGC members are provided as a separate pdf document, together with the summary statistics.
    2. Publications (including but not limited to manuscripts, presentation, patent, grant) based on this IFGC dataset shall include the citation of the original work (Ferrari et al, 2014, Lancet Neurol, PMID: 24943344) and add the following to the acknowledgement section: “We thank the International FTD-Genetics Consortium (IFGC) for summary data”.
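
    The listed fields (OR or Beta, standard error, p-value) are typically related through a Wald test, which gives a quick sanity check on a downloaded row. The sketch below is an assumption-laden illustration, not the consortium's procedure: it assumes the standard error is on the log-odds scale when the effect is an odds ratio, and the function name and arguments are hypothetical.

```python
import math


def wald_p_value(effect, se, is_odds_ratio=False):
    """Return (z, two-sided p) for an effect estimate and its standard error.

    Assumption: when is_odds_ratio is True, `se` is the standard error of
    log(OR), so the odds ratio is log-transformed before computing z.
    """
    beta = math.log(effect) if is_odds_ratio else effect
    z = beta / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


z, p = wald_p_value(1.0, 0.1, is_odds_ratio=True)
print(z, p)  # an odds ratio of 1 gives z = 0.0 and p = 1.0
```

    If a recomputed p-value differs substantially from the reported one, the reported statistic probably comes from a different test or a different scale for the standard error, which is worth checking against the file's own documentation.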

  15. u

    Data from: Genetic sampling for estimating density of common species

    • open.library.ubc.ca
    • borealisdata.ca
    • +1more
    Updated May 19, 2021
    + more versions
    Cite
    Cheng, Ellen; Hodges, Karen E.; Sollmann, Rahel; Mills, L. Scott (2021). Data from: Genetic sampling for estimating density of common species [Dataset]. http://doi.org/10.14288/1.0397868
    Explore at:
    Dataset updated
    May 19, 2021
    Authors
    Cheng, Ellen; Hodges, Karen E.; Sollmann, Rahel; Mills, L. Scott
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Time period covered
    Jun 24, 2020
    Area covered
    Flathead National Forest, Glacier National Park, Montana
    Description

    Abstract
    Understanding population dynamics requires reliable estimates of population density, yet this basic information is often surprisingly difficult to obtain. With rare or difficult-to-capture species, genetic surveys from noninvasive collection of hair or scat have proved cost-efficient for estimating densities. Here, we explored whether noninvasive genetic sampling (NGS) also offers promise for sampling a relatively common species, the snowshoe hare (Lepus americanus Erxleben, 1777), in comparison with traditional live trapping. We optimized a protocol for single-session NGS sampling of hares. We compared spatial capture–recapture population estimates from live trapping to estimates derived from NGS, and assessed NGS costs. NGS provided population estimates similar to those derived from live trapping, but a higher density of sampling plots was required for NGS. The optimal NGS protocol for our study entailed deploying 160 sampling plots for 4 days and genotyping one pellet per plot. NGS laboratory costs ranged from approximately $670 to $3000 USD per field site. While live trapping does not incur laboratory costs, its field costs can be considerably higher than for NGS, especially when study sites are difficult to access. We conclude that NGS can work for common species, but that it will require field and laboratory pilot testing to develop cost-effective sampling protocols.

  16. d

    Data from: GOOML Big Kahuna Forecast Modeling and Genetic Optimization Files...

    • catalog.data.gov
    • gdr.openei.org
    • +2more
    Updated Jan 20, 2025
    + more versions
    Cite
    Upflow (2025). GOOML Big Kahuna Forecast Modeling and Genetic Optimization Files [Dataset]. https://catalog.data.gov/dataset/gooml-big-kahuna-forecast-modeling-and-genetic-optimization-files-01985
    Explore at:
    Dataset updated
    Jan 20, 2025
    Dataset provided by
    Upflow
    Description

    This submission includes example files associated with the Geothermal Operational Optimization using Machine Learning (GOOML) Big Kahuna fictional power plant, which uses synthetic data to model a fictional power plant. A forecast was produced using the GOOML data model framework and fictional input data, and a genetic optimization is included which determines optimal flash plant parameters. The inputs and outputs associated with the forecast and genetic optimization are included. The input and output files consist of data, configuration files, and plots. A link to the Physics-Guided Neural Networks (phygnn) GitHub repository is also included, which augments a traditional neural network loss function with a generic loss term that can be used to guide the neural network to learn physical or theoretical constraints. phygnn is used by the GOOML framework to help integrate its machine learning models into the relevant physics and engineering applications. Note that the data included in this submission are intended to provide a demonstration of GOOML's capabilities. Additional files that have not been released to the public are needed for users to run these models and reproduce these results. Units can be found in the readme data resource.

  17. I

    Data from: FastMulRFS: Statistically consistent polynomial time species tree...

    • databank.illinois.edu
    Updated May 15, 2024
    Cite
    Erin K. Molloy; Tandy Warnow (2024). Data from: FastMulRFS: Statistically consistent polynomial time species tree estimation under gene duplication [Dataset]. http://doi.org/10.13012/B2IDB-5721322_V1
    Explore at:
    Dataset updated
    May 15, 2024
    Authors
    Erin K. Molloy; Tandy Warnow
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Dataset funded by
    National Science Foundation (http://www.nsf.gov/)
    Ira and Debra Cohen Graduate Fellowship in Computer Science
    Description

    This repository includes scripts and datasets for the paper, "FastMulRFS: Fast and accurate species tree estimation under generic gene duplication and loss models." Note: The results from estimating species trees with ASTRID-multi (included in this repository) are not included in the FastMulRFS paper. We estimated species trees with ASTRID-multi in the fall of 2019, but ASTRID-multi had an important bug fix in January 2020. Therefore, the ASTRID-multi species trees in this repository should be ignored.

  18. d

    Leiden Open Variation Database

    • dknet.org
    Updated Jan 29, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2022). Leiden Open Variation Database [Dataset]. http://identifiers.org/RRID:SCR_006566/resolver
    Explore at:
    Dataset updated
    Jan 29, 2022
    Description

    Freely available tool for gene-centered collection and display of DNA variations. It also provides patient-centered data storage and storage of Next Generation Sequencing (NGS) data, even of variants outside of genes. Please note that LOVD provides a system for storage of information on genes and allelic variants. To obtain information about any genes or variants, do not download the LOVD package; this information should be obtained from the respective databases (http://www.lovd.nl/2.0/index_list.php). In total: 2,507,027 variants (2,208,937 unique) in 170,935 individuals in 62,619 genes in 88 LOVD installations (Aug. 2013). LOVD 3.0 shared installation: http://databases.lovd.nl/shared/genes. To maintain a high quality of the data stored, LOVD connects with various resources, like HGNC, NCBI, EBI and Mutalyzer. You can download LOVD in ZIP and GZIPped TARball formats.

  19. r

    Data from: Incorporating non-equilibrium dynamics into demographic history...

    • researchdata.edu.au
    • esango.cput.ac.za
    • +5more
    Updated Jun 11, 2022
    Cite
    Victoria J. Rowntree; Rochelle Constantine; Robert Harcourt; Rachael Alderman; Peter B. Best; Per J. Palsbøll; Oscar E. Gaggiotti; Nathalie J. Patenaude; Martine Bérubé; Mandy Watson; Luciano O. Valenzuela; Louisiane Lemaire; Laura Boren; Ken Findlay; Jon Seger; John L. Bannister; Emma L. Carroll; Debbie Steel; C. Scott Baker (2022). Data from: Incorporating non-equilibrium dynamics into demographic history inferences of a migratory marine species [Dataset]. http://doi.org/10.5061/DRYAD.VV5347P
    Explore at:
    Dataset updated
    Jun 11, 2022
    Dataset provided by
    Macquarie University
    Authors
    Victoria J. Rowntree; Rochelle Constantine; Robert Harcourt; Rachael Alderman; Peter B. Best; Per J. Palsbøll; Oscar E. Gaggiotti; Nathalie J. Patenaude; Martine Bérubé; Mandy Watson; Luciano O. Valenzuela; Louisiane Lemaire; Laura Boren; Ken Findlay; Jon Seger; John L. Bannister; Emma L. Carroll; Debbie Steel; C. Scott Baker
    Description

    Understanding how dispersal and gene flow link geographically separated populations over evolutionary history is challenging, particularly in migratory marine species. In southern right whales (SRWs, Eubalaena australis), patterns of genetic diversity are likely influenced by the glacial climate cycle and recent history of whaling. Here we use a dataset of mitochondrial DNA (mtDNA) sequences (n=1,327) and nuclear markers (17 microsatellite loci, n=222) from major wintering grounds to investigate circumpolar population structure, historical demography, and effective population size. Analyses of nuclear genetic variation identify two population clusters that correspond to the South Atlantic and Indo-Pacific ocean basins that have similar effective breeder estimates. In contrast, all wintering grounds show significant differentiation for mtDNA, but no sex-biased dispersal was detected using the microsatellite genotypes. An approximate Bayesian computation (ABC) approach with microsatellite markers compared scenarios with gene flow through time, or isolation and secondary contact between ocean basins, while modeling declines in abundance linked to whaling. Secondary-contact scenarios yield the highest posterior probabilities, implying that populations in different ocean basins were largely isolated and came into secondary contact within the last 25,000 years, but the role of whaling in changes in genetic diversity and gene flow over recent generations could not be resolved. We hypothesise that these findings are driven by factors that promote isolation, such as female philopatry, and factors that could promote dispersal, such as oceanographic changes. These findings highlight the application of ABC approaches to infer connectivity in mobile species with complex population histories and currently low levels of differentiation.

    Usage Notes

    Genotype data for Carroll et al: Microsatellite genotype data for southern right whales in Carroll et al 2018 (Carroll_et_al_2018_Genotypes.xlsx)

  20. o

    Data from: A lack of genetic diversity and minimal adaptive evolutionary...

    • explore.openaire.eu
    • search.dataone.org
    • +2more
    Updated Jan 9, 2024
    Cite
    Rebecca Cheek (2024). A lack of genetic diversity and minimal adaptive evolutionary divergence in introduced Mysis shrimp after 50 years [Dataset]. http://doi.org/10.5061/dryad.8cz8w9gwx
    Explore at:
    Dataset updated
    Jan 9, 2024
    Authors
    Rebecca Cheek
    Description

    A lack of genetic diversity and minimal adaptive evolutionary divergence in introduced Mysis shrimp after 50 years

    Description of the data and file structure

    • blast_fasta_RDA.fasta: 350 bp consensus sequences around each candidate SNP flagged by redundancy analysis.
    • blast_fasta_pcadapt.fasta: 350 bp consensus sequences around each candidate SNP flagged by PCAdapt. Note that some of the reads were flagged as truncated by samtools.
    • mysis_clean_10_for_neutral_popstats.vcf: 3,803 SNPs generated following the recommendations of Schmidt et al. 2021 for population genetic statistics. None of these SNPs were found in the adaptive SNP list from PCAdapt or RDA, so they are likely unbiased by strong FST outliers or selection.
    • mysis_clean_imputed_names_fix.vcf: 18,441 SNPs imputed in Linkimpute that were used for redundancy analysis, PCAdapt, and SNMF.
    • mysis_clean_imputed_neutral_names_fix.vcf: 18,220 putative neutral SNPs with SNPs flagged as FST outliers from PCAdapt filtered out. Dataset was imputed with Linkimpute before undergoing further filtering.
    • mysis_clean_non_imputed.vcf: 18,441 SNPs prior to imputation in Linkimpute so users can impose stricter filtering regimes based on missingness or replicate ADMIXTURE analyses.
    • mysis_env_final.csv: Environmental data for all sampled lakes. Empty values are missing (blank) or NA. Variables measured by DS and BJ for all lakes apart from Clearwater Lake include:
        • state: The state of the United States of America where the sampled lake is located. Either CO (Colorado) or MI (Minnesota)
        • county: State county where the sampled lake is located
        • water_name: Name of the sampled lake or reservoir
        • code: Short code for sampled lakes. CAR = Carter Lake, CLER = Clearwater Lake, DIL = Dillon Reservoir, GDL = Grand Lake, GRO = Gross Reservoir, JEF = Jefferson Reservoir, TWL = Twin Lakes, RUE = Ruedi Reservoir.
        • date_sampled: Date in m/d/y when the genotyped shrimp were sampled.
        • area-hectares: Surface area of sampled lake in hectares
        • depth-meters: Max water depth in meters of the sampled lake
        • elevation-meters_above_sea_level: Elevation of sampled lake in meters above sea level
        • cond-microSiemens: Conductivity of sampled lake in microSiemens
        • secchi-meters: Secchi depth of sampled lake in meters
        • turbidity-Nephelometric_Turbidity_Units: Turbidity of sampled lake in nephelometric turbidity units
        • min_Dissolved_Oxygen_on_bottom-mg_per_L: Minimum dissolved oxygen on the bottom of the sampled lake in mg per L
        • mean_august_surface_temp_C: Mean August surface temperature of the sampled lake in C
        • num_yrs_Aug_temp: Number of years of collected data to measure mean August surface temperature
        • mysis_detected: Yes (Y) or no (N) if Mysis have been found in the sampled lake. Only used for invaded (Colorado) lakes
        • number_stocking_events: Number of times Mysis are known to have been stocked in a sampled lake based on historical records.
        • stocked: Yes (Y) or no (N) if Mysis are known to have been stocked in a sampled lake
        • connected_to_stocked_lake: Yes (Y) or no (N) if sampled lake is known to be connected to a lake known to have been stocked with Mysis
    • mysis_popmap_final.txt: Popmap used for populations in Stacks. Individuals are separated by lake.
    • mysis_adaptive_snps.vcf: 221 putative adaptive SNPs identified by PCAdapt and redundancy analysis
    • mysis_clean_imputed.vcf: 18,441 SNPs imputed in Linkimpute
    • mysis_neutral_snps.vcf: 18,220 neutral SNPs (candidate SNPs removed) for PCA and DAPC analysis

    Code/Software

    Scripts are all in R Markdown files with annotations and some personal commentary to help interpret the data.

    • 001- Bioinformatics Pipeline Final.Rmd: Workflow used to de-multiplex and filter the raw ddRAD data from Admera Health.
    • stacks_tests.xlsx: Shows the result of multiple runs of Stacks denovo_map.pl to optimize the parameter used for final genotype calling.
    • 002- Population_Genetics_final.Rmd: Workflow to analyse neutral population structure.
    • 003-Identification_of_Population_Structure_Associated_with_Habitat_final.Rmd: Workflow to generate the FST outlier SNP list that was filtered out of the dataset for "neutral" population genetics, plus code for redundancy analysis and adaptive PCA.

    The successes of introduced populations in novel habitats often provide powerful examples of evolution and adaptation. In the 1950s, opossum shrimp (Mysis diluviana) individuals from Clearwater Lake in Minnesota, USA were transported and introduced to Twin Lakes in Colorado, USA by fisheries managers to supplement food sources for trout. Shrimp were subsequently introduced from Twin Lakes into numerous lakes throughout Colorado. Because managers kept detailed records of the timing of the introductions, we had the opportunity to test for evolutionary divergence within a known time interval. Here, we used reduced representation genomic data to investigate patterns of genetic diversity and test for genetic divergence between popul...
