Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Datasets, conda environments and software for the course "Population Genomics" by Prof. Kasper Munch. This course material is maintained by the Health Data Science Sandbox. This webpage shows the latest version of the course material.
The data is connected to the following repository: https://github.com/hds-sandbox/Popgen_course_aarhus. The original course material from Prof Kasper Munch is at https://github.com/kaspermunch/PopulationGenomicsCourse.
Description
After the course, the participants will have detailed knowledge of the methods and applications required to perform a typical population genomic study.
The participants must at the end of the course be able to:
The course introduces key concepts in population genomics, from the generation of population genetic data sets to the most common population genetic analyses and association studies. The first part of the course focuses on the generation of population genetic data sets. The second part introduces the most common population genetic analyses and their theoretical background. Topics here include the analysis of demography, population structure, recombination and selection. The last part of the course focuses on applications of population genetic data sets for association studies in relation to human health.
Curriculum
The curriculum for each week is listed below. "Coop" refers to a set of lecture notes by Graham Coop that we will use throughout the course.
Course plan
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset includes GWAS (Genome-Wide Association Studies) summary statistics for the top 10,000 genetic markers associated with three traits across five diverse ancestries. These traits and ancestries form part of the study outlined in the manuscript "A new method for multiancestry polygenic prediction improves performance across diverse populations". The manuscript can be accessed via this link: https://www.biorxiv.org/content/10.1101/2022.03.24.485519v5.abstract. The three traits explored in this dataset are height, sing back musical note (the ability to replicate a musical note), and morning person. These traits were examined across five ancestral backgrounds: African American (AFR), Native American (AMR), European (EUR), East Asian (EAS), and South Asian (SAS).
Notes: na, not applicable.
1 Number of MT+ and MT− sequences analyzed for each gene.
2 Polymorphism rate for silent sites (non-coding and synonymous) ×1000. Standard deviation in parentheses. Values are given for all sequences (total) and for the MT+ and MT− isolates separately. MT+ and MT− values that differ from the total value by >1 standard deviation are shown in bold.
3 Population differentiation between MT+ and MT− isolates. Values near 0 correspond to no differentiation and values near 1 correspond to complete differentiation. Bold values correspond to those genes showing significant differentiation between MT+ and MT− isolates.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
NHANES 1991–1994 Genetic Sample (N = 6,178). Notes: Author’s calculations from NHANES Data. Sample weights used.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Note: Significance level is 0.01, not corrected for multiple testing. The analyses in the first two rows are for all genes that have at least one MAC in the Baylor and Broad datasets. The last rows are restricted to the genes that have more than 15 minor alleles after combining the Baylor and Broad datasets.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Note: These analyses are restricted to the genes that have more than 4 minor alleles in the samples used in each study. and are calculated based on the median and the 1st quantile of the p-value distribution, respectively. PC adjustment is based on the common variants (CVs) eigen-vectors.
Panel: W = wild, P = primitive, I = improved; L = alignment length in basepairs; l = number of synonymous sites; S = number of segregating synonymous sites; π = nucleotide diversity for synonymous sites; θ = Watterson's theta for synonymous sites; Sig. = ML-HKA significance: ns = not significant, P<0.001 = ***, P<0.01 = **, P<0.05 = *. Bold genes are those that showed significant evidence of selection. Note: we were unable to successfully sequence the IPT5 gene in P.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
[NOTE: PLEXdb is no longer available online. Oct 2019.] PLEXdb (Plant Expression Database) is a unified gene expression resource for plants and plant pathogens. PLEXdb is a genotype to phenotype, hypothesis building information warehouse, leveraging highly parallel expression data with seamless portals to related genetic, physical, and pathway data. PLEXdb (http://www.plexdb.org), in partnership with community databases, supports comparisons of gene expression across multiple plant and pathogen species, promoting individuals and/or consortia to upload genome-scale data sets to contrast them to previously archived data. These analyses facilitate the interpretation of structure, function and regulation of genes in economically important plants. A list of Gene Atlas experiments highlights data sets that give responses across different developmental stages, conditions and tissues. Tools at PLEXdb allow users to perform complex analyses quickly and easily. The Model Genome Interrogator (MGI) tool supports mapping gene lists onto corresponding genes from model plant organisms, including rice and Arabidopsis. MGI predicts homologies, displays gene structures and supporting information for annotated genes and full-length cDNAs. The gene list-processing wizard guides users through PLEXdb functions for creating, analyzing, annotating and managing gene lists. Users can upload their own lists or create them from the output of PLEXdb tools, and then apply diverse higher level analyses, such as ANOVA and clustering. PLEXdb also provides methods for users to track how gene expression changes across many different experiments using the Gene OscilloScope. This tool can identify interesting expression patterns, such as up-regulation under diverse conditions or checking any gene’s suitability as a steady-state control. Resources in this dataset:Resource Title: Website Pointer for Plant Expression Database, Iowa State University. 
File Name: Web Page, url: https://www.bcb.iastate.edu/plant-expression-database [NOTE: PLEXdb is no longer available online. Oct 2019.] Project description for the Plant Expression Database (PLEXdb) and integrated tools.
data_readme: A readme file explaining the information loaded to Dryad.
NL map file: plink-style map file with marker locations for the NL dataset. The map file is in an amended plink format with chromosome in column 1, SNP name in column 2, cM position in column 3 (note this is different from default plink map files, which have distances in Morgans) and genome order in column 4 (again this is different from the plink format, which would usually have bp position in this column).
Chromosome codings in the map file (column 1) are as follows:
- Chromosomes 1-15 and 17-28 are the corresponding chromosomes 1-15 and 17-28 in the great tit genome
- Chromosome 29 = chromosome 1A
- Chromosome 30 = chromosome 4A
- Chromosome 31 = Z chromosome
- Chromosome 32 = Linkage group LGE22
File: NL_numeric_ids.map
NL genotype file: Genotypes for the 1,407 NL individuals. These are standard plink files (see http://pngu.mgh.harvard.edu/~purcell/plink/) where the first column gives a family id, the second column an individual id (in this ...
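The amended map format and chromosome codings described above can be read with a few lines of Python. This is a minimal sketch, not code shipped with the dataset: it assumes the columns are whitespace-delimited (as in standard plink map files) and that the genome order in column 4 is an integer. The names `Marker`, `parse_map_line` and the example SNP names are hypothetical.

```python
from typing import NamedTuple

# Chromosome codes above 28, as listed in the dataset description.
SPECIAL_CHROMOSOMES = {29: "1A", 30: "4A", 31: "Z", 32: "LGE22"}


class Marker(NamedTuple):
    chromosome: str      # great tit chromosome name, e.g. "7", "1A", "Z"
    snp_name: str
    position_cm: float   # column 3 holds centimorgans, not Morgans
    genome_order: int    # column 4 holds marker order, not a bp position


def parse_map_line(line: str) -> Marker:
    """Parse one line of the amended map format (assumed whitespace-delimited)."""
    code, snp_name, position_cm, genome_order = line.split()
    # Codes 1-15 and 17-28 map to themselves; 29-32 are translated.
    chromosome = SPECIAL_CHROMOSOMES.get(int(code), code)
    return Marker(chromosome, snp_name, float(position_cm), int(genome_order))


# Made-up example lines; the SNP names and values are hypothetical.
example = ["29 snp_a 12.5 1", "3 snp_b 0.0 2"]
markers = [parse_map_line(line) for line in example]
```

Keeping the cM position and genome order under explicit names makes it harder to pass them to tools that expect Morgans or bp positions in those columns.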
Detecting genetic variants under selection using FST outlier analysis (OA) and environmental association analyses (EAA) are popular approaches that provide insight into the genetic basis of local adaptation. Despite the frequent use of OA and EAA approaches and their increasing attractiveness for detecting signatures of selection, their application to field-based empirical data has not been synthesized. Here, we review 66 empirical studies that use Single Nucleotide Polymorphisms (SNPs) in OA and EAA. We report trends and biases across biological systems, sequencing methods, approaches, parameters, environmental variables and their influence on detecting signatures of selection. We found striking variability in both the use and reporting of environmental data and statistical parameters. For example, linkage disequilibrium among SNPs and numbers of unique SNP associations identified with EAA were rarely reported. The proportion of putatively adaptive SNPs detected varied widely among studies, and decreased with the number of SNPs analyzed. We found that genomic sampling effort had a greater impact than biological sampling effort on the proportion of identified SNPs under selection. OA identified a higher proportion of outliers when more individuals were sampled, but this was not the case for EAA. To facilitate repeatability, interpretation and synthesis of studies detecting selection, we recommend that future studies consistently report geographic coordinates, environmental data, model parameters, linkage disequilibrium, and measures of genetic structure. Identifying standards for how OA and EAA studies are designed and reported will aid future transparency and comparability of SNP-based selection studies and help to progress landscape and evolutionary genomics.
Usage Notes
Table S1 - Full data set. Data was collected by reading papers associated with environmental association analyses. Data includes location, species, methods used, genetic parameters of data sets reviewed, and analytical parameters of the analyses. File: Table S1_data.xlsx
R code for mixed-effects linear models: The R code used to create the figures and estimate regressions of the data set. File: Ahrens et al 2018_MolEcol_review.R
Across the human genome, there are large-scale fluctuations in genetic diversity caused by the indirect effects of selection. This can be thought of as a “linked selection signal” that reflects the impact of selection varying according to the placement of functional regions and recombination rates along the genome. Previous work has shown that negative selection against the steady influx of new deleterious mutations into conserved regions is the predominant mode of selection in humans. However, the theoretical model that underpins these results, classic Background Selection theory, is only applicable when new mutations are so deleterious that they cannot fix in the population. Here, we develop a statistical method based on a quantitative genetics view of linked selection, which models the effects of weak draft created according to how polygenic additive fitness variance is distributed along the genome. We use a recent model that jointly predicts the equilibrium fitness variance and su...
These Python pickle files contain the model outputs from bgspy (http://github.com/vsbuffalo/bprime/) for the CADD 6%, CADD 8%, PhastCons Priority, and Feature Priority Models.
# Main model fits and substitution rate predictions for: A quantitative genetic model of background selection in humans
All files are in standard Python file formats. To load the pickle files, install the accompanying bprime software available on GitHub.
Note that all TSV files here were written by analyses in Jupyter notebooks that are available on the bprime GitHub page.
There are pickle files of model results, generated by bgspy collect.
- cadd6_decode_altgrid.pkl: CADD 6%
- cadd8_decode_altgrid.pkl: CADD 8%
- CDS_genes_phastcons_decode_altgrid.pkl: Feature Priority
- phastcons_CDS_genes_decode_altgrid.pkl: PhastCons Priority
- empiricalB_chr10_expansion_false_h_0.5_results.npz: simulation B
"empirical" B maps for fixed demography
empiricalB_chr10_expansion_1.004_9.3_h_0.5_results.npz: ...
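Loading one of the .pkl files listed above could look like the sketch below. It uses only the standard library; the helper name `load_model_fit` is hypothetical. The structure of the stored objects is not described here, so the sketch returns whatever was pickled without inspecting it.

```python
import pickle
from pathlib import Path


def load_model_fit(path):
    """Unpickle one model-results file and return the stored object.

    Unpickling needs the classes of the stored object to be importable,
    which is presumably why the README asks for the bprime software to
    be installed first. Only unpickle files from a source you trust.
    """
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


# Hypothetical usage, assuming the file sits in the working directory:
# fit = load_model_fit("cadd6_decode_altgrid.pkl")
```

The .npz files are NumPy archives rather than pickles; `numpy.load` opens them and exposes the stored arrays by name.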
Genetic diversity statistics assayed per Portuguese P. nigra population and SSR locus. Notes: na – observed number of alleles; ne – effective number of alleles (Kimura and Crow 1964); I – Shannon’s Information Index (Lewontin 1972); h – Nei’s gene diversity index (Nei 1973); Ho – observed heterozygosity; He – expected heterozygosity (Levene 1949); F – fixation index; and s.d. – standard deviation.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains meta-analyzed GWAS summary statistics for 35 biomarker traits described in the following preprint:
N. Sinnott-Armstrong*, Y. Tanigawa*, et al, Genetics of 38 blood and urine biomarkers in the UK Biobank. bioRxiv, 660506 (2019). doi:10.1101/660506
Note that we are preparing a revised version of the manuscript and this dataset contains 35 (instead of 38) biomarker phenotypes.
We provide the list of 35 biomarkers in "list_of_35_biomarkers.tsv". We used the "Phenotype_name" column in this table for the file names. For each phenotype, we provide two compressed tab-delimited files, named "[Phenotype_name].array.gz" and "[Phenotype_name].imp.gz", which contain the summary statistics for genetic variants on the genotyping array and the imputed dataset, respectively.
We used METAL for the meta-analysis for 4 populations (White British, non-British White, African, and South Asian) within UK Biobank. The files have the following columns:
- CHROM: the chromosome
- POS: the position
- MarkerName: the variant identifier
- REF: the reference allele
- ALT: the alternate allele
- Effect: the effect size (BETA) estimate
- StdErr: the standard error of effect size estimate
- P-value: the p-value of the association
- Direction: the direction of effect size
- HetISq, HetChiSq, HetDf, HetPVal: heterogeneity statistics from METAL
Note that we used GRCh37/hg19 genome reference in the analysis and the BETA is always reported for the alternate allele.
Please also check the METAL documentation (https://genome.sph.umich.edu/wiki/METAL_Documentation).
The summary statistic files are compressed with bgzip and indexed with tabix (the .tbi files). One should be able to read those files with the standard gzip/zcat.
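Because bgzip output is gzip-compatible, the files can also be streamed with Python's standard library. The sketch below is illustrative only: it assumes each file has a single tab-delimited header line carrying the column names listed above (a leading "#" is tolerated, since some tabix-indexed files use one) and that every P-value field parses as a number. The function name `read_sumstats`, the threshold default and the demo rows are hypothetical.

```python
import csv
import gzip
import io

# Column names as listed in the dataset description.
COLUMNS = ["CHROM", "POS", "MarkerName", "REF", "ALT", "Effect", "StdErr",
           "P-value", "Direction", "HetISq", "HetChiSq", "HetDf", "HetPVal"]


def read_sumstats(source, p_threshold=5e-8):
    """Yield rows (as dicts) whose P-value is below p_threshold.

    `source` may be a file name or a binary file object.
    """
    with gzip.open(source, mode="rt", newline="") as handle:
        # Read the header ourselves so a leading '#' can be dropped.
        header = handle.readline().rstrip("\n").lstrip("#").split("\t")
        reader = csv.DictReader(handle, fieldnames=header, delimiter="\t")
        for row in reader:
            if float(row["P-value"]) < p_threshold:
                yield row


# Tiny in-memory stand-in for "[Phenotype_name].array.gz"; values are made up.
rows = [
    ["1", "1000", "rs_hypothetical_1", "A", "G", "0.05", "0.01", "1e-9",
     "++++", "0", "1.2", "3", "0.75"],
    ["1", "2000", "rs_hypothetical_2", "C", "T", "-0.01", "0.02", "0.6",
     "-+-+", "10", "3.3", "3", "0.35"],
]
text = "\n".join("\t".join(r) for r in [COLUMNS] + rows) + "\n"
demo = io.BytesIO(gzip.compress(text.encode()))
hits = list(read_sumstats(demo))
```

Since BETA is reported for the alternate allele, the sign of `Effect` in each returned row should be read relative to `ALT`.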
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains summary statistics for the discovery and the replication stages of the large-scale genome-wide association study for varicose veins of lower extremities. The discovery stage was based on genetic association data provided by the Neale Lab (http://www.nealelab.is/) for 337,199 UK Biobank individuals. Phenotype “varicose veins of lower extremities” was defined based on the International Classification of Diseases (ICD-10) billing code “I83” present in the electronic patient record. Data were adjusted for two potential confounders – body mass index and deep venous thrombosis. A replication cohort (N=71,256) was generated by means of reverse meta-analysis of two overlapping datasets: genetic association data for 408,455 UK Biobank participants provided by the Gene ATLAS database (http://geneatlas.roslin.ed.ac.uk/), and the above-mentioned data provided by the Neale Lab.
Please note that in Shadrina et al. (PLOS Genetics 2019) we used only the "discovery" dataset, while in the bioRxiv preprint (https://doi.org/10.1101/368365) both the discovery and replication datasets were used.
The data are provided on an "AS-IS" basis, without warranty of any type, expressed or implied, including but not limited to any warranty as to their performance, merchantability, or fitness for any particular purpose. If investigators use these data, any and all consequences are entirely their responsibility. By downloading and using these data, you agree that you will cite the appropriate publication in any communications or publications arising directly or indirectly from these data; for utilisation of data available prior to publication, you agree to respect the requested responsibilities of resource users under 2003 Fort Lauderdale principles; you agree that you will never attempt to identify any participant.
When using downloaded data, please cite the corresponding paper and this repository:
Shadrina, A. S., Sharapov, S. Z., Shashkova, T. I. & Tsepilov, Y. A. Varicose veins of lower extremities: Insights from the first large-scale genetic study. PLOS Genet. 15, e1008110 (2019).
Funding:
The work of ASS was supported by the Russian Science Foundation [Project No 17-75-20223].
The work of YAT was supported by the Russian Ministry of Science and Education under the 5-100 Excellence Programme.
The work of SZS was supported by the Institute of Cytology and Genetics [Project No 0324-2018-0017].
Column headers - discovery
Column headers - replication
http://www.gnu.org/licenses/lgpl-3.0.html
pip install gget
import gget
gget.ref(species=None, list_species=True)[:10]
['acanthochromis_polyacanthus', 'accipiter_nisus', 'ailuropoda_melanoleuca', 'amazona_collaria', 'amphilophus_citrinellus', 'amphiprion_ocellaris', 'amphiprion_percula', 'anabas_testudineus', 'anas_platyrhynchos', 'anas_platyrhynchos_platyrhynchos']
gget.ref(species='mus_musculus')
{'mus_musculus': {'transcriptome_cdna': {'ftp': 'http://ftp.ensembl.org/pub/release-108/fasta/mus_musculus/cdna/Mus_musculus.GRCm39.cdna.all.fa.gz', 'ensembl_release': 108, 'release_date': '2022-10-04', 'release_time': '19:32', 'bytes': '49M'}, 'genome_dna': {'ftp': 'http://ftp.ensembl.org/pub/release-108/fasta/mus_musculus/dna/Mus_musculus.GRCm39.dna.primary_assembly.fa.gz', 'ensembl_release': 108, 'release_date': '2022-10-04', 'release_time': '18:37', 'bytes': '769M'}, 'annotation_gtf': {'ftp': 'http://ftp.ensembl.org/pub/release-108/gtf/mus_musculus/Mus_musculus.GRCm39.108.gtf.gz', 'ensembl_release': 108, 'release_date': '2022-10-04', 'release_time': '19:16', 'bytes': '31M'}, 'coding_seq_cds': {'ftp': 'http://ftp.ensembl.org/pub/release-108/fasta/mus_musculus/cds/Mus_musculus.GRCm39.cds.all.fa.gz', 'ensembl_release': 108, 'release_date': '2022-10-04', 'release_time': '19:32', 'bytes': '16M'}, 'non-coding_seq_ncRNA': {'ftp': 'http://ftp.ensembl.org/pub/release-108/fasta/mus_musculus/ncrna/Mus_musculus.GRCm39.ncrna.fa.gz', 'ensembl_release': 108, 'release_date': '2022-10-04', 'release_time': '19:45', 'bytes': '7.6M'}, 'protein_translation_pep': {'ftp': 'http://ftp.ensembl.org/pub/release-108/fasta/mus_musculus/pep/Mus_musculus.GRCm39.pep.all.fa.gz', 'ensembl_release': 108, 'release_date': '2022-10-04', 'release_time': '19:32', 'bytes': '11M'}}}
import urllib.request

dl_links = gget.ref(species='mus_musculus', which=['ncrna'], ftp=True)
urllib.request.urlretrieve(dl_links[0], './GRCm39_rna.fa.gz')
('./GRCm39_rna.fa.gz',
from pyGeno.Genome import *

# load a genome
ref = Genome(name = 'GRCh37.75')
# load a gene
gene = ref.get(Gene, name = 'TPST2')[0]
# print the sequences of all the isoforms
for prot in gene.get(Protein) :
    print(prot.sequence)

# load a personalized genome; myFilter is a user-defined SNP filter class that is not shown in this snippet
pers = Genome(name = 'GRCh37.75', SNPs = ["RNA_S1"], SNPFilter = myFilter())
Sequencing data was derived through RAD-sequencing of four F2 cross families (F0s and F2s sequenced). Phenotype data was derived by phenotyping lab-reared individuals according to the methods in Whiting et al. 2022. The linkage map was made using LepMap3.
Linkage maps constructed from genetic analysis of gene order and crossover frequency provide few clues to the basis of the genomewide distribution of meiotic recombination, such as chromosome structure, that influences meiotic recombination. To bridge this gap, we have generated the first cytological recombination map that identifies individual autosomes in the male mouse. We prepared meiotic chromosome (synaptonemal complex [SC]) spreads from 110 mouse spermatocytes, identified each autosome by multicolor fluorescence in situ hybridization of chromosome-specific DNA libraries, and mapped 12,000 sites of recombination along individual autosomes, using immunolocalization of MLH1, a mismatch repair protein that marks crossover sites. We show that SC length is strongly correlated with crossover frequency and distribution. Although the length of most SCs corresponds to that predicted from their mitotic chromosome length rank, several SCs are longer or shorter than expected, with correspond...
SC Spreads and Immunostaining
Three juvenile (20–21 d old) C57BL/6J mice (the same line analyzed by the Mouse Genome Sequencing Project) were used to prepare and immunolabel the SC spreads, as described elsewhere (Anderson et al. 1999). Complete sets of SCs in which the SCs were well separated but not obviously stretched or broken and that had ≥ 19 MLH1 foci were selected for analysis. Three fluorescent images (4′,6-diamidino-2-phenylindole [DAPI], SCP3, and MLH1) were captured for each SC set.
mFISH
After image acquisition of the immunofluorescence signals, the spermatocyte preparations were subjected to two or three rounds of denaturation and FISH. To identify each autosome, chromosome-specific painting probes (Rabbitts et al. 1995) were combinatorially labeled with fluorescein isothiocyanate (FITC)–2-deoxyuridine 5-triphosphate (dUTP), Cy5-dUTP (both from Amersham), or 6-carboxytetramethylrhodamine (TAMRA)-dUTP (Applied Biosystems) and were combined to form two different probe pools ...
# Male mouse recombination maps for each autosome identified by chromosome painting
The data are presented in an Excel spreadsheet with 22 sheets. Sheet 1 (karyotype-absolute positi calc) defines the average length of each mouse SC, after identification using chromosome-specific DNA probes. Sheet 2 (Notes) contains definitions of the headings used for Sheet 3 (raw data sorted by SC) and Sheets 4 through 22 ("SC1 abs" through "SC19 abs"). Sheet 2 also contains explanations for how the karyotype was derived, and two references in which this data was used for publication are also presented. Sheet 3 contains the positions of all MLH1 foci observed on all of the SCs with each MLH1 focus position expressed as a fraction of SC length from the centromere. Sheets 4 – 22 (labeled as "SC1 abs", "SC2 abs", "SC3 abs", "SC4 abs", SC5 abs, "SC6 abs", "SC7 abs", "SC8 abs", "SC9 abs", "SC10 abs", "SC11 abs", "SC12 abs", "SC13 abs", "SC14 abs", "SC15 ...
This dataset contains whole-genome variant frequencies for 1000 Swedish individuals generated within the SweGen project. The frequency data is intended to be used as a resource for the research community and clinical genetics laboratories.
Please note that the 1000 individuals included in the SweGen project represent a cross-section of the Swedish population and that no disease information has been used for the selection. The frequency data may therefore include genetic variants that are associated with, or causative of, disease.
We request that any use of data from the SweGen project cite this article in the European Journal of Human Genetics.
Individual positions in the genome can be viewed using the Beacon or Graphical Browser. To download the variant frequency file you need to register.
A high confidence set of HLA allele frequencies is available for download under Dataset Access. For a detailed description of the SweGen HLA analysis, please see this bioRxiv preprint.
Background
An appropriate normalization strategy is crucial for the analysis of data from real-time reverse transcription polymerase chain reactions (RT-qPCR). Identifying and validating stable reference genes is widely supported, since no single biological gene is stably expressed between cell types or within cells under different conditions. Different algorithms exist to validate optimal reference genes for normalization. Using human cells, we here compare the three main methods to the RefFinder tool, which is available online and integrates these algorithms, along with R-based software packages which include the NormFinder and GeNorm algorithms.
Results
14 candidate reference genes were assessed by RT-qPCR in two sample sets, i.e. a set of samples of human testicular tissue containing carcinoma in situ (CIS), and a set of samples from the human adult Sertoli cell line (FS1) either cultured alone or in co-culture with the seminoma-like cell line (TCam-2) or with equine bone marrow derived mesenchymal stem cells (eBM-MSC). Expression stabilities of the reference genes were evaluated using geNorm, NormFinder, and BestKeeper. Similar results were obtained by the three approaches for the most and least stably expressed genes. The R-based packages NormqPCR, SLqPCR and the NormFinder for R script gave identical gene rankings. Interestingly, different outputs were obtained between the original software packages and the RefFinder tool, which is based on raw Cq values for input. When the raw data were reanalysed assuming 100% efficiency for all genes, the outputs of the original software packages were similar to those of the RefFinder software, indicating that RefFinder outputs may be biased because PCR efficiencies are not taken into account.
Conclusions
This report shows that assay efficiency is an important parameter for reference gene validation. New software tools that incorporate these algorithms should be carefully validated prior to use.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
********* Observed data *********
The file ObsData.arp contains the sequences of the mtDNA hypervariable I region from 720 individuals belonging to 25 Southeast Asian populations. It is used as the input file to compute the summary statistics with Arlequin. For further details on the format and the available summary statistics, see the Arlequin manual.
********* Input files for simulations *********
For each evolutionary scenario (NONE, LGP, LDD and LGP&LDD) there is a folder (named after the scenario) containing the input files to perform 100 simulations. To run the simulations, open a command line in that folder and execute:
./ABCsampler abc_sensitivity.input
Input files for SPLATCHE3, Arlequin and ABCtoolbox are included (for further details on them, see the manuals of these programs).
********* Selection of the best-fitting evolutionary scenario *********
The R script (ModelSelection.R) can be used to select the evolutionary scenario that best fits the observed data, using the multinomial logistic regression method and the neural-network-based method.
Firstly, one will need the summary statistics obtained from observed data (the file entitled ObsSS.txt). Then, one will need the files containing the output files of the simulations under each scenario, i.e., the genetic parameters used under each simulation and the computed summary statistics. Please, note that the output of the ABCtoolbox is a single file containing all this information, but we prefer to use a file with the summary statistics and another with the parameters. Here, we provide example files obtained from 100 simulations of each scenario:
- ssNONE.txt, the summary statistics computed from 100 simulations under the scenario NONE
- parNONE.txt, the genetic and demographic parameters per simulation under the scenario NONE
- ssLGP.txt, the summary statistics computed from 100 simulations under the scenario LGP
- parLGP.txt, the genetic and demographic parameters per simulation under the scenario LGP
- ssLDD.txt, the summary statistics computed from 100 simulations under the scenario LDD
- parLDD.txt, the genetic and demographic parameters per simulation under the scenario LDD
- ssLGP_LDD.txt, the summary statistics computed from 100 simulations under the scenario LGP&LDD
- parLGP_LDD.txt, the genetic and demographic parameters per simulation under the scenario LGP&LDD
To run the script the directory containing these files has to be specified in the script.
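Before pointing ModelSelection.R at a directory, it can help to confirm that every input file named above is present. The following is a small sketch of such a check, not part of the distributed scripts: the function names are hypothetical, and it assumes ObsSS.txt sits in the same directory as the per-scenario files.

```python
from pathlib import Path

# Scenario labels as they appear in the file names listed above.
SCENARIOS = ["NONE", "LGP", "LDD", "LGP_LDD"]


def expected_files():
    """Return the input file names the model-selection step is described as needing."""
    names = ["ObsSS.txt"]  # assumed to live alongside the simulation outputs
    for scenario in SCENARIOS:
        names.append(f"ss{scenario}.txt")   # summary statistics
        names.append(f"par{scenario}.txt")  # genetic and demographic parameters
    return names


def missing_files(directory):
    """List the expected files that are absent from the given directory."""
    folder = Path(directory)
    return [name for name in expected_files() if not (folder / name).is_file()]
```

An empty list from `missing_files` means all nine files were found in that directory.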
For details see Csilléry, et al. (2012): "Approximate Bayesian computation (ABC) in R: a Vignette."
********* Parameters estimation *********
The folder named ParametersEstimation contains all the input files to estimate the genetic and demographic parameters under the selected evolutionary scenario (LGP&LDD). Within the folder, one will find the summary statistics obtained under the selected scenario and the corresponding parameters (completeEstimator_LGP-LDD.txt), the summary statistics from observed data (obs11SS.txt) and all the remaining input files to run ABCestimator (for further details on these files, see the ABCtoolbox manual).