100+ datasets found
  1. Codon Usage Tabulated from GenBank

    • bioregistry.io
    Updated Nov 10, 2022
    Cite
    (2022). Codon Usage Tabulated from GenBank [Dataset]. https://bioregistry.io/cutg
    Explore at:
    Dataset updated
    Nov 10, 2022
    Description

    Codon usage in individual genes has been calculated using the nucleotide sequence data obtained from the GenBank Genetic Sequence Database. The compilation of codon usage is synchronized with each major release of GenBank.

  2. Codon similarity data in ATTED-II ver 8.0 (Ath, Gma, Osa, Sly, Vvi)

    • data.niaid.nih.gov
    Updated Jul 10, 2023
    + more versions
    Cite
    Obayashi, Takeshi; Aoki, Yuichi (2023). Codon similarity data in ATTED-II ver 8.0 (Ath, Gma, Osa, Sly, Vvi) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8123039
    Explore at:
    Dataset updated
    Jul 10, 2023
    Dataset provided by
    Tohoku University
    Authors
    Obayashi, Takeshi; Aoki, Yuichi
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Codon similarity data in ATTED-II ver 8.0

    The gene-to-gene codon similarity data is organized in the form of tables, each named according to the Entrez Gene ID of a particular query gene. Each table encompasses three columns, specifying: the Entrez Gene ID of a corresponding gene, an MR (Mutual Rank) value (where a smaller number signifies a stronger relationship), and a Pearson correlation coefficient (where a larger number suggests a stronger association).

    Protein-coding sequences utilized in this study were retrieved from NCBI's RefSeq database. For each gene, a 61-dimensional vector was derived from the count of codons in the protein-coding sequence. In instances where multiple RefSeq sequences were associated with a single gene, the longest sequence was selected for the codon usage calculation. Pearson correlation coefficients (PCCs) were determined between the vectors of any two given genes. These PCCs were subsequently converted into MRs, employed as an index to evaluate the similarity in codon usage between the genes.
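
    The pipeline described above (codon count vector, Pearson correlation, conversion to MR) can be sketched as follows. This is a minimal illustration, not the ATTED-II implementation: the function names are hypothetical, and the MR formula (geometric mean of the two directional ranks) is an assumption, since the description does not spell it out.

```python
import math
from itertools import product

# The 61 sense codons of the standard genetic code (64 codons minus 3 stops).
STOPS = {"TAA", "TAG", "TGA"}
SENSE_CODONS = [c for c in ("".join(p) for p in product("TCAG", repeat=3))
                if c not in STOPS]

def codon_vector(cds):
    """Count each sense codon in an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    counts = dict.fromkeys(SENSE_CODONS, 0)
    for i in range(0, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if codon in counts:  # skips stop codons and ambiguous bases
            counts[codon] += 1
    return [counts[c] for c in SENSE_CODONS]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # correlation is undefined for a constant vector
    return cov / math.sqrt(vx * vy)

def mutual_ranks(vectors):
    """Map each gene pair (a, b), a < b, to an MR value.

    Assumption: MR = sqrt(rank of b in a's list * rank of a in b's list),
    where rank 1 is the partner with the highest PCC.
    """
    genes = sorted(vectors)
    pcc = {(a, b): pearson(vectors[a], vectors[b])
           for a in genes for b in genes if a != b}
    rank = {}
    for a in genes:
        partners = sorted((b for b in genes if b != a),
                          key=lambda b: -pcc[(a, b)])
        for r, b in enumerate(partners, start=1):
            rank[(a, b)] = r
    return {(a, b): math.sqrt(rank[(a, b)] * rank[(b, a)])
            for a in genes for b in genes if a < b}
```

    Calling mutual_ranks with a dictionary of gene ID to codon_vector output gives one MR per gene pair; a smaller MR means the two genes rank each other highly, matching the "smaller number signifies a stronger relationship" reading above.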

  3. Codon Usage - UCI

    • kaggle.com
    zip
    Updated Nov 25, 2021
    + more versions
    Cite
    Salik Hussaini (2021). Codon Usage - UCI [Dataset]. https://www.kaggle.com/salikhussaini49/codon-usage
    Explore at:
    Available download formats: zip (2035077 bytes)
    Dataset updated
    Nov 25, 2021
    Authors
    Salik Hussaini
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0), https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Context

    We examined codon usage frequencies in the genomic coding DNA of a large sample of diverse organisms from different taxa tabulated in the CUTG database, where we further manually curated and harmonized these existing entries by re-classifying CUTG's bacteria (bct) class into archaea (arc), plasmids (plm), and bacteria proper (keeping with the original label 'bct'). The reclassification in the original 'bct' domain was simplified by extracting from files 'qbxxx.spsum.txt' (where xxx = bct (bacteria), inv (invertebrates), mam (mammals), pln (plants), pri (primates), rod (rodents), vrt (vertebrates)) the different genus names of the entries, and making the classification by genus. There were 514 different genus names. The different genus categories were checked and relabeled as 'arc' where appropriate. In the eubacterial entries, the distinction was made of the bacterial genomes proper (keeping with the original label 'bct'), and bacterial plasmids (now labeled 'plm').

    Content

    Column 1: Kingdom
    Column 2: DNAtype
    Column 3: SpeciesID
    Column 4: Ncodons
    Column 5: SpeciesName
    Columns 6-69: codon (header: nucleotide bases; entries: frequency of usage (5 digit floating point number))

    The 'Kingdom' is a 3-letter code corresponding to 'xxx' in the CUTG database name: 'arc' (archaea), 'bct' (bacteria), 'phg' (bacteriophage), 'plm' (plasmid), 'pln' (plant), 'inv' (invertebrate), 'vrt' (vertebrate), 'mam' (mammal), 'rod' (rodent), 'pri' (primate), and 'vrl' (virus) sequence entries. Note that the CUTG database does not contain 'arc' and 'plm'; these classes were added through the manual curation described above.

    The 'DNAtype' is denoted as an integer for the genomic composition in the species: 0-genomic, 1-mitochondrial, 2-chloroplast, 3-cyanelle, 4-plastid, 5-nucleomorph, 6-secondary_endosymbiont, 7-chromoplast, 8-leucoplast, 9-NA, 10-proplastid, 11-apicoplast, and 12-kinetoplast.

    The species identifier ('SpeciesID') is an integer, which uniquely indicates the entries of an organism. It is an accession identifier for each different species in the original CUTG database, followed by the first item listed in each genome.

    The number of codons ('Ncodons') is the sum of the counts listed for the different codons in an entry of CUTG. Codon frequencies are normalized to the total codon count: each frequency listed in the data file is the number of occurrences of that codon divided by 'Ncodons'.

    The species' name ('SpeciesName') is represented in strings purged of 'comma' (which are now replaced by 'space'). This is a descriptive label of the name of the species for data interpretations.

    Lastly, the codon frequencies ('codon') including 'UUU', 'UUA', 'UUG', 'CUU', etc., are recorded as floats (with decimals in 5 digits).
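
    The normalization described for 'Ncodons' and the codon columns can be sketched as below. This is an illustration only: the function name is hypothetical, and rounding to five decimal places is an assumption based on the "5 digit" wording, not a documented rule of the dataset.

```python
def codon_frequencies(counts):
    """Return (Ncodons, frequencies) for a mapping of codon -> raw count.

    Ncodons is the sum of all codon counts; each frequency is the
    codon's count divided by Ncodons, rounded to 5 decimals (assumed).
    """
    ncodons = sum(counts.values())
    if ncodons == 0:
        raise ValueError("entry contains no codons")
    freqs = {codon: round(n / ncodons, 5) for codon, n in counts.items()}
    return ncodons, freqs

# Toy entry with three codons only, for illustration.
ncodons, freqs = codon_frequencies({"UUU": 3, "UUA": 1, "UUG": 4})
```

    Multiplying a listed frequency by 'Ncodons' recovers the approximate raw count, which is useful when entries with very different sizes need to be pooled.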

    Acknowledgements

    Khomtchouk BB: 'Codon usage bias levels predict taxonomic identity and genetic composition'. bioRxiv, 2020, doi: 10.1101/2020.10.26.356295.

    Nakamura Y, Gojobori T, Ikemura T: 'Codon usage tabulated from international DNA sequence databases: status for the year 2000'. Nucleic Acids Research, 2000, 28:292.

    Inspiration

    Extend Biology Research.

  4. Data from: tRic: a user-friendly data portal to explore the expression...

    • tandf.figshare.com
    • datasetcatalog.nlm.nih.gov
    xlsx
    Updated Feb 15, 2024
    Cite
    Zhao Zhang; Hang Ruan; Chun-Jie Liu; Youqiong Ye; Jing Gong; Lixia Diao; An-Yuan Guo; Leng Han (2024). tRic: a user-friendly data portal to explore the expression landscape of tRNAs in human cancers [Dataset]. http://doi.org/10.6084/m9.figshare.9699140.v2
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Feb 15, 2024
    Dataset provided by
    Taylor & Francis
    Authors
    Zhao Zhang; Hang Ruan; Chun-Jie Liu; Youqiong Ye; Jing Gong; Lixia Diao; An-Yuan Guo; Leng Han
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Transfer RNAs (tRNAs) play critical roles in human cancer. Currently, no database provides the expression landscape and clinical relevance of tRNAs across a variety of human cancers. Utilizing miRNA-seq data from The Cancer Genome Atlas, we quantified the relative expression of tRNA genes and merged them into the codon level and amino level across 31 cancer types. The expression of tRNAs is associated with clinical features of patient smoking history and overall survival, and disease stage, subtype, and grade. We further analysed codon frequency and amino acid frequency for each protein coding gene and linked alterations of tRNA expression with protein translational efficiency. We include these data resources in a user-friendly data portal, tRic (tRNA in cancer, https://hanlab.uth.edu/tRic/ or http://bioinfo.life.hust.edu.cn/tRic/), which can be of significant interest to the research community.

  5. Codon bias and codon pair bias scores for FPR1 variants calculated based on...

    • figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Heini M. Miettinen (2023). Codon bias and codon pair bias scores for FPR1 variants calculated based on the codons for ten validated SNPs. [Dataset]. http://doi.org/10.1371/journal.pone.0028712.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Heini M. Miettinen
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Haplotype designations 1A-16A are by Sahagun-Ruiz et al. [2]. B, C and D show haplotypes in which the SNP does not change the amino acid compared to A [4]. The table includes the FPR1 SNPs in the following order: c.32C>T/p.T11I, c.140T>C/p.V47A, c.301G>C/p.V101L, c.306T>C/p.F102F, c.348C>T/p.I116I, c.546C>A/p.P182P, c.568A>T/p.R190W, c.576T>G>C/p.N192K, c.993C>T/p.T331T, c.1037C>A/p.A356E. The codon bias results show the differences between the various haplotypes based on the total of each SNP codon usage score, as obtained from the GenBank Homo sapiens Codon Usage Database (http://www.kazusa.or.jp/codon/cgi-bin/showcodon.cgi?species=9606). The codon pair bias results show the differences between the various haplotypes based on the total of each SNP codon pair score, as calculated from the Supplemental Material by Coleman et al. (www.sciencemag.org/cgi/content/full/320/5884/1784/DC1) [6]. Amino acids are shown in single letter code. The nucleotide in the 3rd position of the synonymous codons is as shown.

  6. Data on Stop Codon Usage in Bacteria and Its Correlation with Release Factor...

    • researchdata.se
    • data.europa.eu
    Updated Jun 26, 2025
    + more versions
    Cite
    Gürkan Korkmaz (2025). Data on Stop Codon Usage in Bacteria and Its Correlation with Release Factor Abundance [Dataset]. http://doi.org/10.57804/83hw-m374
    Explore at:
    Available download formats: (77355124), (106760384)
    Dataset updated
    Jun 26, 2025
    Dataset provided by
    Uppsala University
    Authors
    Gürkan Korkmaz
    Description

    We present a comprehensive analysis of stop codon usage in bacteria by analyzing over eight million coding sequences of 4684 bacterial sequences. Using a newly developed program called "stop codon counter," the frequencies of the three classical stop codons TAA, TAG, and TGA were analyzed, and a publicly available stop codon database was built.
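
    The kind of tally described above can be sketched in a few lines. This is not the authors' Java "stop codon counter"; it is a hypothetical Python illustration that assumes each coding sequence is given as a DNA string whose final three bases are the terminating codon.

```python
from collections import Counter

STOP_CODONS = ("TAA", "TAG", "TGA")

def count_stop_codons(coding_sequences):
    """Tally the terminal stop codon of each coding sequence."""
    counts = Counter({stop: 0 for stop in STOP_CODONS})
    for seq in coding_sequences:
        last = seq.strip().upper()[-3:]
        if last in STOP_CODONS:  # sequences without a terminal stop are skipped
            counts[last] += 1
    return counts

def stop_codon_frequencies(counts):
    """Convert stop codon tallies to fractions of all counted stops."""
    total = sum(counts.values())
    return {stop: (counts[stop] / total if total else 0.0)
            for stop in STOP_CODONS}

# Toy input: three sequences ending in a stop codon, one without.
counts = count_stop_codons(["ATGAAATAA", "ATGTGA", "ATGCCCTAA", "ATGCCC"])
frequencies = stop_codon_frequencies(counts)
```

    Only the last codon is inspected, so in-frame stop triplets elsewhere in a sequence do not affect the tally.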

    Datafiles contain: 1) Complete data set of stop codon usage of all analyzed sequences as described in the publication "Comprehensive Analysis of Stop Codon Usage in Bacteria and Its Correlation with Release Factor Abundance" by Korkmaz et al (2014).

    2) The Java program that was used for the analysis of the coding sequences. Execute the file in Program\ProjectStopCodonCounter\dist

    The dataset was originally published in DiVA and moved to SND in 2024.

  7. Data from: A simple model based on mutation and selection explains trends in...

    • catalog.data.gov
    • data.virginia.gov
    • +1 more
    Updated Sep 6, 2025
    + more versions
    Cite
    National Institutes of Health (2025). A simple model based on mutation and selection explains trends in codon and amino-acid usage and GC composition within and across genomes [Dataset]. https://catalog.data.gov/dataset/a-simple-model-based-on-mutation-and-selection-explains-trends-in-codon-and-amino-acid-usa
    Explore at:
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background: Correlations between genome composition (in terms of GC content) and usage of particular codons and amino acids have been widely reported, but poorly explained. We show here that a simple model of processes acting at the nucleotide level explains codon usage across a large sample of species (311 bacteria, 28 archaea and 257 eukaryotes). The model quantitatively predicts responses (slope and intercept of the regression line on genome GC content) of individual codons and amino acids to genome composition. Results: Codons respond to genome composition on the basis of their GC content relative to their synonyms (explaining 71-87% of the variance in response among the different codons, depending on measure). Amino-acid responses are determined by the mean GC content of their codons (explaining 71-79% of the variance). Similar trends hold for genes within a genome. Position-dependent selection for error minimization explains why individual bases respond differently to directional mutation pressure. Conclusions: Our model suggests that GC content drives codon usage (rather than the converse). It unifies a large body of empirical evidence concerning relationships between GC content and amino-acid or codon usage in disparate systems. The relationship between GC content and codon and amino-acid usage is ahistorical; it is replicated independently in the three domains of living organisms, reinforcing the idea that genes and genomes at mutation/selection equilibrium reproduce a unique relationship between nucleic acid and protein composition. Thus, the model may be useful in predicting amino-acid or nucleotide sequences in poorly characterized taxa.

  8. RSCU values per gene in the yeast genomes

    • figshare.com
    application/x-gzip
    Updated Sep 20, 2024
    Cite
    Abigail LaBella (2024). RSCU values per gene in the yeast genomes [Dataset]. http://doi.org/10.6084/m9.figshare.27074467.v1
    Explore at:
    Available download formats: application/x-gzip
    Dataset updated
    Sep 20, 2024
    Dataset provided by
    figshare (http://figshare.com/)
    Authors
    Abigail LaBella
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Relative synonymous codon usage (RSCU) was calculated for all genomes in the Y1000+ database. The RSCU for orders with a CTG codon reassignment was computed taking the reassignment into account. The script used to generate the RSCU values can be found here: https://github.com/The-Lab-LaBella/RSCU_Calculation_Analysis. The file contains the names of the assemblies, as listed in supplemental data 1 of Opulente et al. 2024. The columns contain the clade/order and codons analyzed.
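
    For orientation, RSCU is commonly defined as a codon's observed count divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below follows that common definition; it is not the linked script, the function name is hypothetical, and reassigning CTG to serine is only one illustrative choice of reassignment.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_CODE = {"".join(c): aa
                 for c, aa in zip(product(BASES, repeat=3), AMINO)}

def rscu(cds, code=STANDARD_CODE):
    """Return codon -> RSCU for an in-frame coding sequence.

    RSCU = observed count * (number of synonymous codons) / (total count
    of the amino acid). Amino acids absent from the sequence are omitted.
    """
    cds = cds.upper().replace("U", "T")
    counts = Counter(cds[i:i + 3] for i in range(0, len(cds) - 2, 3))
    families = defaultdict(list)
    for codon, aa in code.items():
        if aa != "*":  # stop codons are excluded
            families[aa].append(codon)
    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values

# Hypothetical reassigned code: CTG read as serine instead of leucine.
CTG_SER_CODE = dict(STANDARD_CODE, CTG="S")
```

    Passing a modified code table, as with CTG_SER_CODE, moves CTG into the serine family, so both the serine and leucine RSCU values change. This is why the reassignment has to be taken into account before comparing orders.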

  9. Data from: Serine codon-usage bias in deep phylogenomics: pancrustacean...

    • search.dataone.org
    • data-staging.niaid.nih.gov
    • +2 more
    Updated Jun 10, 2025
    Cite
    Omar Rota-Stabelli; Nicolas Lartillot; Herve Philippe; Davide Pisani (2025). Serine codon-usage bias in deep phylogenomics: pancrustacean relationships as a case study [Dataset]. http://doi.org/10.5061/dryad.7p1k8304
    Explore at:
    Dataset updated
    Jun 10, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Omar Rota-Stabelli; Nicolas Lartillot; Herve Philippe; Davide Pisani
    Time period covered
    Oct 12, 2012
    Description

    Phylogenomic analyses of ancient relationships are usually performed using amino acid data, but it is unclear whether amino acids or nucleotides should be preferred. With the 2-fold aim of addressing this problem and clarifying pancrustacean relationships, we explored the signals in the 62 protein-coding genes carefully assembled by Regier et al. in 2010. With reference to the pancrustaceans, this data set infers a highly supported nucleotide tree that is substantially different to the corresponding, but poorly supported, amino acid one. We show that the discrepancy between the nucleotide-based and the amino acids-based trees is caused by substitutions within synonymous codon families (especially those of serine—TCN and AGY). We show that different arthropod lineages are differentially biased in their usage of serine, arginine, and leucine synonymous codons, and that the serine bias is correlated with the topology derived from the nucleotides, but not the amino acids. We suggest that a ...

  10. Data files for downstream analysis of the “mammalian codon usage” manuscript...

    • figshare.com
    txt
    Updated Jun 3, 2023
    Cite
    Konrad L M Rudolph; Bianca M Schmitt; Diego Villar; Robert J White; John C Marioni; Claudia Kutter; Duncan T Odom (2023). Data files for downstream analysis of the “mammalian codon usage” manuscript [Dataset]. http://doi.org/10.6084/m9.figshare.2056227.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    figshare (http://figshare.com/)
    Authors
    Konrad L M Rudolph; Bianca M Schmitt; Diego Villar; Robert J White; John C Marioni; Claudia Kutter; Duncan T Odom
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Snapshot of the data used to run downstream analysis. A detailed methods description can be found in the manuscript and the associated analysis code.

  11. Codon and Codon-Pair Usage Tables

    • neuinfo.org
    • dknet.org
    • +2 more
    Updated Jan 28, 2025
    + more versions
    Cite
    (2025). Codon and Codon-Pair Usage Tables [Dataset]. http://identifiers.org/RRID:SCR_018504
    Explore at:
    Dataset updated
    Jan 28, 2025
    Description

    Database includes genomic codon-pair and dinucleotide statistics of all organisms with sequenced genome. Facilitates genetic variation analyses and recombinant gene design. Derived from all available GenBank and RefSeq data.

  12. BIOGRID CURATED DATA FOR PUBLICATION: Host adaptation of codon usage in...

    • thebiogrid.org
    zip
    Updated Sep 28, 2022
    Cite
    BioGRID Project (2022). BIOGRID CURATED DATA FOR PUBLICATION: Host adaptation of codon usage in SARS-CoV-2 from mammals indicates potential natural selection and viral fitness. [Dataset]. https://thebiogrid.org/244971/publication/host-adaptation-of-codon-usage-in-sars-cov-2-from-mammals-indicates-potential-natural-selection-and-viral-fitness.html
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 28, 2022
    Dataset authored and provided by
    BioGRID Project
    License

    MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Protein-Protein, Genetic, and Chemical Interactions for Fu Y (2022): Host adaptation of codon usage in SARS-CoV-2 from mammals indicates potential natural selection and viral fitness. Curated by BioGRID (https://thebiogrid.org). ABSTRACT: SARS-CoV-2 infection, which is the cause of the COVID-19 pandemic, has expanded across various animal hosts, and the virus can be transmitted particularly efficiently in minks. It is still not clear how SARS-CoV-2 is selected and evolves in its hosts, or how mutations affect viral fitness. In this report, sequences of SARS-CoV-2 isolated from human and animal hosts were analyzed, and the binding energy and capacity of the spike protein to bind human ACE2 and the mink receptor were compared. Codon adaptation index (CAI) analysis indicated the optimization of viral codons in some animals such as bats and minks, and a neutrality plot demonstrated that natural selection had a greater influence on some SARS-CoV-2 sequences than mutational pressure. Molecular dynamics simulation results showed that the mutations Y453F and N501T in mink SARS-CoV-2 could enhance the binding of the viral spike to the mink receptor, indicating the involvement of these mutations in natural selection and viral fitness. Receptor binding analysis revealed that the mink SARS-CoV-2 spike interacted more strongly with the mink receptor than the human receptor. Tracking the variations and codon bias of SARS-CoV-2 is helpful for understanding the fitness of the virus in virus transmission, pathogenesis, and immune evasion.

  13. Data from: Transcriptome-wide meta-analysis of codon usage in Escherichia...

    • zenodo.org
    zip
    Updated Sep 13, 2023
    + more versions
    Cite
    Anima Sutradhar; Jonathan Pointon; Christopher Lennon; Giovanni Stracquadanio (2023). Transcriptome-wide meta-analysis of codon usage in Escherichia coli [Dataset]. http://doi.org/10.5281/zenodo.8305120
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 13, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Anima Sutradhar; Jonathan Pointon; Christopher Lennon; Giovanni Stracquadanio
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data generated by CUBseq on Escherichia coli RNA-seq data.

  14. Data from: Gene expression levels are correlated with synonymous codon...

    • datadryad.org
    • data.niaid.nih.gov
    • +1 more
    zip
    Updated Feb 5, 2013
    Cite
    Anna Williford; Jeffery P. Demuth (2013). Gene expression levels are correlated with synonymous codon usage, amino acid composition and gene architecture in the red flour beetle, Tribolium castaneum [Dataset]. http://doi.org/10.5061/dryad.r0t1q
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 5, 2013
    Dataset provided by
    Dryad
    Authors
    Anna Williford; Jeffery P. Demuth
    Time period covered
    Jul 15, 2012
    Description

    ExpressionData: The expression data from Tribolium castaneum whole body and reproductive tract samples provided here is the output of ArrayStar gene expression software that was used to process and normalize NimbleGen-generated raw expression data (Prince, Kirkland and Demuth 2010, Genome Biol Evol 2:336-346).

  15. Data from: The Selective Advantage of Synonymous Codon Usage Bias in...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Mar 11, 2016
    Cite
    Hughes, Diarmaid; Brandis, Gerrit (2016). The Selective Advantage of Synonymous Codon Usage Bias in Salmonella [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001820410
    Explore at:
    Dataset updated
    Mar 11, 2016
    Authors
    Hughes, Diarmaid; Brandis, Gerrit
    Description

    The genetic code in mRNA is redundant, with 61 sense codons translated into 20 different amino acids. Individual amino acids are encoded by up to six different codons but within codon families some are used more frequently than others. This phenomenon is referred to as synonymous codon usage bias. The genomes of free-living unicellular organisms such as bacteria have an extreme codon usage bias and the degree of bias differs between genes within the same genome. The strong positive correlation between codon usage bias and gene expression levels in many microorganisms is attributed to selection for translational efficiency. However, this putative selective advantage has never been measured in bacteria and theoretical estimates vary widely. By systematically exchanging optimal codons for synonymous codons in the tuf genes we quantified the selective advantage of biased codon usage in highly expressed genes to be in the range 0.2–4.2 × 10⁻⁴ per codon per generation. These data quantify for the first time the potential for selection on synonymous codon choice to drive genome-wide sequence evolution in bacteria, and in particular to optimize the sequences of highly expressed genes. This quantification may have predictive applications in the design of synthetic genes and for heterologous gene expression in biotechnology.

  16. BIOGRID CURATED DATA FOR PUBLICATION: Codon usage affects the structure and...

    • thebiogrid.org
    zip
    Updated Aug 1, 2016
    Cite
    BioGRID Project (2016). BIOGRID CURATED DATA FOR PUBLICATION: Codon usage affects the structure and function of the Drosophila circadian clock protein PERIOD. [Dataset]. https://thebiogrid.org/203841/publication/codon-usage-affects-the-structure-and-function-of-the-drosophila-circadian-clock-protein-period.html
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 1, 2016
    Dataset authored and provided by
    BioGRID Project
    License

    MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Protein-Protein, Genetic, and Chemical Interactions for Fu J (2016): Codon usage affects the structure and function of the Drosophila circadian clock protein PERIOD. Curated by BioGRID (https://thebiogrid.org). ABSTRACT: Codon usage bias is a universal feature of all genomes, but its in vivo biological functions in animal systems are not clear. To investigate the in vivo role of codon usage in animals, we took advantage of the sensitivity and robustness of the Drosophila circadian system. By codon-optimizing parts of Drosophila period (dper), a core clock gene that encodes a critical component of the circadian oscillator, we showed that dper codon usage is important for circadian clock function. Codon optimization of dper resulted in conformational changes of the dPER protein, altered dPER phosphorylation profile and stability, and impaired dPER function in the circadian negative feedback loop, which manifests into changes in molecular rhythmicity and abnormal circadian behavioral output. This study provides an in vivo example that demonstrates the role of codon usage in determining protein structure and function in an animal system. These results suggest a universal mechanism in eukaryotes that uses a codon usage "code" within genetic codons to regulate cotranslational protein folding.

  17. Data from: Variation and selection on codon usage bias across an entire...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Jul 31, 2019
    Cite
    Hittinger, Chris Todd; Rokas, Antonis; Steenwyk, Jacob L.; LaBella, Abigail L.; Opulente, Dana A. (2019). Variation and selection on codon usage bias across an entire subphylum [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000066729
    Explore at:
    Dataset updated
    Jul 31, 2019
    Authors
    Hittinger, Chris Todd; Rokas, Antonis; Steenwyk, Jacob L.; LaBella, Abigail L.; Opulente, Dana A.
    Description

    Variation in synonymous codon usage is abundant across multiple levels of organization: between codons of an amino acid, between genes in a genome, and between genomes of different species. It is now well understood that variation in synonymous codon usage is influenced by mutational bias coupled with both natural selection for translational efficiency and genetic drift, but how these processes shape patterns of codon usage bias across entire lineages remains unexplored. To address this question, we used a rich genomic data set of 327 species that covers nearly one third of the known biodiversity of the budding yeast subphylum Saccharomycotina. We found that, while genome-wide relative synonymous codon usage (RSCU) for all codons was highly correlated with the GC content of the third codon position (GC3), the usage of codons for the amino acids proline, arginine, and glycine was inconsistent with the neutral expectation where mutational bias coupled with genetic drift drive codon usage. Examination between genes’ effective numbers of codons and their GC3 contents in individual genomes revealed that nearly a quarter of genes (381,174/1,683,203; 23%), as well as most genomes (308/327; 94%), significantly deviate from the neutral expectation. Finally, by evaluating the imprint of translational selection on codon usage, measured as the degree to which genes’ adaptiveness to the tRNA pool were correlated with selective pressure, we show that translational selection is widespread in budding yeast genomes (264/327; 81%). These results suggest that the contribution of translational selection and drift to patterns of synonymous codon usage across budding yeasts varies across codons, genes, and genomes; whereas drift is the primary driver of global codon usage across the subphylum, the codon bias of large numbers of genes in the majority of genomes is influenced by translational selection.

  18. Data from: Mitochondrial phylogenomics of early land plants: mitigating the...

    • search.dataone.org
    • data.niaid.nih.gov
    • +1 more
    Updated Jun 12, 2025
    + more versions
    Cite
    Yang Liu; Cymon J. Cox; Wei Wang; Bernard Goffinet (2025). Mitochondrial phylogenomics of early land plants: mitigating the effects of saturation, compositional heterogeneity, and codon-usage bias [Dataset]. http://doi.org/10.5061/dryad.7b470
    Explore at:
    Dataset updated
    Jun 12, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Yang Liu; Cymon J. Cox; Wei Wang; Bernard Goffinet
    Time period covered
    Jan 1, 2014
    Description

    Phylogenetic analyses using concatenation of genomic-scale data have been seen as the panacea to resolving the incongruences among inferences from few or single genes. However, phylogenomics may also suffer from systematic errors, due to the, perhaps cumulative, effects of saturation, among-taxa compositional (GC content) heterogeneity, or codon-usage bias plaguing the individual nucleotide loci that are concatenated. Here we provide an example of how these factors affect the inferences of the phylogeny of early land plants based on mitochondrial genomic data. Mitochondrial sequences evolve slowly in plants and hence are thought to be suitable for resolving deep relationships. We newly assembled mitochondrial genomes from 20 bryophytes, complemented these with 40 other streptophytes (land plants plus algal outgroups), compiling a data matrix of 60 taxa and 41 mitochondrial genes. Homogeneous analyses of the concatenated nucleotide data resolve mosses as sister-group to the remaining lan...

  19. Data from: Translational selection frequently overcomes genetic drift in...

    • search.dataone.org
    • datadryad.org
    Updated Jun 17, 2025
    Cite
    Aoife Doherty; James O. McInerney (2025). Translational selection frequently overcomes genetic drift in shaping synonymous codon usage patterns in vertebrates [Dataset]. http://doi.org/10.5061/dryad.4k887
    Explore at:
    Dataset updated
    Jun 17, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Aoife Doherty; James O. McInerney
    Time period covered
    Aug 5, 2013
    Description

    Synonymous codon usage (SCU) patterns are shaped by a balance between mutation, drift, and natural selection. To date, detection of translational selection in vertebrates has proven to be a challenging task, obscured by small long-term effective population sizes in larger animals and the existence of isochores in some species. The consensus is that, in such species, natural selection is either completely ineffective at overcoming mutational pressures and genetic drift or perhaps is effective but so weak that it is not detectable. The aim of this research is to understand the interplay between mutation, selection, and genetic drift in vertebrates. We observe that although variation in mutational bias is undoubtedly the dominant force influencing codon usage, translational selection acts as a weak additional factor influencing synonymous codon usage. These observations indicate that translational selection is a widespread phenomenon in vertebrates and is not limited to a few species.

  20. Codons that are significantly over-enriched in high RFP count positions in...

    • plos.figshare.com
    xls
    Updated Jun 15, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Gabriel Wright; Anabel Rodriguez; Jun Li; Patricia L. Clark; Tijana Milenković; Scott J. Emrich (2023). Codons that are significantly over-enriched in high RFP count positions in at least 10 of the 14 data sets considered (% Enriched > 70). [Dataset]. http://doi.org/10.1371/journal.pone.0232003.t002
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 15, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Gabriel Wright; Anabel Rodriguez; Jun Li; Patricia L. Clark; Tijana Milenković; Scott J. Emrich
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These codons are significantly enriched at the estimated A-site in the top 10% of normalized footprint counts using a Bonferroni-corrected p-value of 8.2 × 10⁻⁴ (0.05/61). These codons are also analyzed with respect to each bias measure, such that a larger negative number indicates a stronger correspondence with the model. Note that there are only four bias measures listed (compared to the five codon usage models analyzed earlier) as the High-Phi %MinMax and High-Phi CAI models use the same underlying CUB measure.

Cite
(2022). Codon Usage Tabulated from GenBank [Dataset]. https://bioregistry.io/cutg

Codon Usage Tabulated from GenBank

Explore at:
106 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Nov 10, 2022
Description

Codon usage in individual genes has been calculated using the nucleotide sequence data obtained from the GenBank Genetic Sequence Database. The compilation of codon usage is synchronized with each major release of GenBank.
