100+ datasets found

COG
rrid.site
neuinfo.org
Updated Oct 21, 2025
Cite
(2025). COG [Dataset]. http://identifiers.org/RRID:SCR_007139
Unique identifier
https://identifiers.org/RRID:SCR_007139
Dataset updated
Oct 21, 2025
Description
A database for the phylogenetic classification of proteins encoded in complete genomes. Clusters of Orthologous Groups of proteins (COGs) were delineated by comparing protein sequences encoded in complete genomes representing major phylogenetic lineages. Each COG consists of individual proteins or groups of paralogs from at least 3 lineages and thus corresponds to an ancient conserved domain. Please be aware that COGs has not been updated in many years and will not be updated.
Classification of the UBCF_13 COG based on COG database in NCBI
datadryad.org
search.dataone.org
zip
Updated Aug 2, 2021
Cite
Raudhatul Fatiah; Irfan Suliansyah; Djong Hon Tjong; Lily Syukriani; Roza Yunita; Robi Trivano; Nurefni Azizah; Jamsari Jamsari (2021). Classification of the UBCF_13 COG based on COG database in NCBI [Dataset]. http://doi.org/10.5061/dryad.sn02v6x4g
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.sn02v6x4g
Dataset updated
Aug 2, 2021
Dataset provided by
Dryad
Authors
Raudhatul Fatiah; Irfan Suliansyah; Djong Hon Tjong; Lily Syukriani; Roza Yunita; Robi Trivano; Nurefni Azizah; Jamsari Jamsari
Time period covered
Jul 21, 2021
Description
Background: Serratia plymuthica UBCF_13 is a phylloplane-associated plant bacterium showing antifungal activity. The whole-genome sequence provides insight into its evolution and the unique traits in its genome, and makes it possible to explore the potential of this microorganism in future studies. Here, we report the genome sequence of S. plymuthica UBCF_13 and its comparison with seventeen other strains.

Methods: Continuous short reads were obtained from Illumina sequencing runs, and reads of 150 bp were merged into a single dataset. A pan-genome-based method was used to identify the core genome of the S. plymuthica species and the unique genes in UBCF_13.

Results: Assembly of the Illumina reads of S. plymuthica strain UBCF_13 produced a 5.46 Mb circular genome sequence. 3315 genes were found to belong to the core genome shared by the 18 strains evaluated. The UBCF_13 genome harbors 488 unique genes, 300 of which are found only in this strain. The raw and assemble...
Abundance of myxobacterial PTPs in contrast to the 66 genome COG database.
plos.figshare.com
xls
Updated Jun 1, 2023
Cite
Anke Treuner-Lange (2023). Abundance of myxobacterial PTPs in contrast to the 66 genome COG database. [Dataset]. http://doi.org/10.1371/journal.pone.0011164.t003
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0011164.t003
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Anke Treuner-Lange
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The three numbers in the COG database row represent the following: total number of proteins in the 66 genome COG database/total number of genomes in which those proteins were found/highest number of proteins per single bacterial genome.
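
The slash-separated triple described above can be split into named fields. The following is a minimal sketch, not part of the dataset: the names `CogRow` and `parse_cog_row` and the example cell value are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class CogRow(NamedTuple):
    """One 'COG database' cell: proteins / genomes / max per genome."""
    total_proteins: int           # total proteins in the 66-genome COG database
    genomes_with_proteins: int    # number of genomes in which they were found
    max_proteins_per_genome: int  # highest count in a single bacterial genome


def parse_cog_row(cell: str) -> CogRow:
    """Split a cell such as '120/40/9' into its three integer fields."""
    total, genomes, max_per_genome = (int(part) for part in cell.split("/"))
    return CogRow(total, genomes, max_per_genome)


# Hypothetical cell value, used only to show the field order.
row = parse_cog_row("120/40/9")
print(row.total_proteins, row.genomes_with_proteins, row.max_proteins_per_genome)
```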
COG Pathways
bioregistry.io
Updated Aug 12, 2021
Cite
(2021). COG Pathways [Dataset]. https://bioregistry.io/cog.pathway
Dataset updated
Aug 12, 2021
Description
Database of Clusters of Orthologous Genes grouped by pathways and functional systems. It includes the complete genomes of 1,187 bacteria and 122 archaea that map into 1,234 genera.
ProOpDB
neuinfo.org
dknet.org
Updated Oct 8, 2011
Cite
(2011). ProOpDB [Dataset]. http://identifiers.org/RRID:SCR_006111
Unique identifier
https://identifiers.org/RRID:SCR_006111
Dataset updated
Oct 8, 2011
Description
The Prokaryotic Operon DataBase (ProOpDB) is one of the most precise and complete repositories of operon predictions currently available. Using our novel and highly accurate operon algorithm, we have predicted the operon structures of more than 1,200 prokaryotic genomes. ProOpDB offers several ways to retrieve a set of operon predictions, including: i) organism name, ii) metabolic pathways, as defined by the KEGG database, iii) gene orthology, as defined by the COG database, iv) conserved protein motifs, as defined by the Pfam database, v) reference gene, and vi) reference operon, among others. In order to limit the operon output to non-redundant organisms, ProOpDB offers an efficient protocol to select the most representative organisms based on a precompiled phylogenetic distance matrix. In addition, the ProOpDB operon predictions are used directly as the input data of our Gene Context Tool (GeConT) to visualize their genomic context and retrieve the sequence of their corresponding 5′ regulatory regions, as well as the nucleotide or amino acid sequences of their genes.

The prediction algorithm: The algorithm is a multilayer perceptron neural network (MLP) classifier that uses as input the intergenic distances of contiguous genes and the functional relationship scores of the STRING database between the different groups of orthologous proteins, as defined in the COG database. Nevertheless, the operon prediction of our method is not restricted to genes with a COG assignment, since we successfully defined new groups of orthologous genes and obtained, by extrapolation, a set of equivalent STRING-like scores based on conserved gene pairs in different genomes. Since the STRING functional relationship scores are determined in an unbiased manner and efficiently integrate a large amount of information coming from different sources and kinds of evidence, the predictions made by our MLP are considerably less influenced by the bias imposed by training on one specific organism.
Phylogenetic Clusters of Orthologous Groups Ranking
neuinfo.org
scicrunch.org
Updated Jan 29, 2022
Cite
(2022). Phylogenetic Clusters of Orthologous Groups Ranking [Dataset]. http://identifiers.org/RRID:SCR_008223
Unique identifier
https://identifiers.org/RRID:SCR_008223
Dataset updated
Jan 29, 2022
Description
THIS RESOURCE IS NO LONGER IN SERVICE, documented on August 20, 2019. The COG database has become a powerful tool in the field of comparative genomics. The construction of this database is based on sequence homologies of proteins from different completely sequenced genomes. Highly homologous proteins are assigned to clusters of orthologous groups. The updated collection of orthologous protein sets for prokaryotes and eukaryotes is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and genome-wide evolutionary studies. The availability of multiple, essentially complete genome sequences of prokaryotes and eukaryotes spurred both the demand and the opportunity for the construction of an evolutionary classification of genes from these genomes. Such a classification system based on orthologous relationships between genes appears to be a natural framework for comparative genomics and should facilitate both functional annotation of genomes and large-scale evolutionary studies. Here is a major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes and the construction of clusters of predicted orthologs for 7 eukaryotic genomes, which we named KOGs after eukaryotic orthologous groups. The COG collection currently consists of 138,458 proteins, which form 4873 COGs and comprise 75% of the 185,505 (predicted) proteins encoded in 66 genomes of unicellular organisms. The eukaryotic orthologous groups (KOGs) include proteins from 7 eukaryotic genomes: three animals (the nematode Caenorhabditis elegans, the fruit fly Drosophila melanogaster and Homo sapiens), one plant, Arabidopsis thaliana, two fungi (Saccharomyces cerevisiae and Schizosaccharomyces pombe), and the intracellular microsporidian parasite Encephalitozoon cuniculi.
The current KOG set consists of 4852 clusters of orthologs, which include 59,838 proteins, or approximately 54% of the 110,655 analyzed eukaryotic gene products. Compared to the coverage of the prokaryotic genomes with COGs, a considerably smaller fraction of eukaryotic genes could be included in the KOGs; addition of new eukaryotic genomes is expected to result in a substantial increase in the coverage of eukaryotic genomes with KOGs. Examination of the phyletic patterns of KOGs reveals a conserved core represented in all analyzed species and consisting of approximately 20% of the KOG set. This conserved portion of the KOG set is much greater than the ubiquitous portion of the COG set (approximately 1% of the COGs). In part, this difference is probably due to the small number of included eukaryotic genomes, but it could also reflect the relative compactness of eukaryotes as a clade and the greater evolutionary stability of eukaryotic genomes.
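
The two coverage figures quoted in this description (75% for COGs, approximately 54% for KOGs) follow directly from the protein counts it gives. A minimal check, using only the numbers stated above; the function name `coverage_percent` is an illustrative assumption:

```python
def coverage_percent(clustered: int, total: int) -> float:
    """Share of proteins assigned to clusters, as a percentage."""
    return 100.0 * clustered / total


# Counts taken from the description above.
cog_coverage = coverage_percent(138_458, 185_505)  # proteins in COGs / all predicted proteins
kog_coverage = coverage_percent(59_838, 110_655)   # proteins in KOGs / all eukaryotic gene products

print(f"COG coverage: {cog_coverage:.1f}%")
print(f"KOG coverage: {kog_coverage:.1f}%")
```

Rounded to whole percentages, both values agree with the figures in the text.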
COG-curation_masterTallySheet
figshare.com
xlsx
Updated Jun 14, 2023
Cite
Colbie Reed (2023). COG-curation_masterTallySheet [Dataset]. http://doi.org/10.6084/m9.figshare.23515527.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.23515527.v1
Dataset updated
Jun 14, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Colbie Reed
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Master tally sheet of the total curation process of generating a new list of COGs representative of gene/protein families involved in tRNA modifications as per published gene-/protein-modification pairs curated from the literature. The original COG Pathway list (via the COG Database, June 2022) contained 59 COGs; the final list (see other Object, namely 4-S3) totalled 89 COGs, of which 52 were retained from the original list and 37 were newly added. Of the original 59, 7 were removed.
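
The counts in this description are internally consistent, which can be confirmed with a few lines of arithmetic. This is only a sanity check on the numbers quoted above; the variable names are illustrative.

```python
original = 59  # COGs in the original COG Pathway list
removed = 7    # COGs dropped from the original list
added = 37     # COGs newly added during curation

retained = original - removed  # COGs carried over from the original list
final = retained + added       # size of the curated list

print(retained, final)
```

The results match the 52 retained and 89 total COGs reported in the description.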
Data from: COG Categories
bioregistry.io
Updated Mar 9, 2025
Cite
(2025). COG Categories [Dataset]. https://bioregistry.io/cog.category
Dataset updated
Mar 9, 2025
Description
Higher-level classifications of COG Pathways
COG-BCI database: A multi-session and multi-task EEG cognitive dataset for...
zenodo.org
data.niaid.nih.gov
bin, pdf, txt, zip
Updated Jul 16, 2024
Cite
Marcel F. Hinss; Emilie S. Jahanpour; Bertille Somon; Lou Pluchon; Frédéric Dehais; Raphaëlle N. Roy (2024). COG-BCI database: A multi-session and multi-task EEG cognitive dataset for passive brain-computer interfaces [Dataset]. http://doi.org/10.5281/zenodo.6874129
Available download formats: zip, bin, txt, pdf
Unique identifier
https://doi.org/10.5281/zenodo.6874129
Dataset updated
Jul 16, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Marcel F. Hinss; Emilie S. Jahanpour; Bertille Somon; Lou Pluchon; Frédéric Dehais; Raphaëlle N. Roy
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Brain-Computer Interfaces, and especially passive Brain-Computer Interfaces (pBCI), with their ability to estimate and detect mental states, are receiving increasing attention from both the scientific and the research and development communities. Many pBCIs aim to increase the safety of complex work environments such as in the aeronautical domain. Therefore, mental workload, vigilance and decision-making are some of the most commonly examined aspects of cognition within this field of research. A large proportion of pBCIs involve a component of machine learning and signal processing, as the data that are collected need to be transformed into a reliable estimate of the users’ current mental state (e.g. mental workload). Improving this component is a major challenge for researchers, requiring large quantities of data. While data sharing is common for the active BCI community, open pBCI datasets are scarcer and generally incomplete with regard to the information they report. This is particularly true for datasets encompassing several tasks or sessions, which are of importance for tackling the challenges of transfer learning. Testing new pipelines, feature extraction algorithms and classifiers are central issues for future advances in research within this domain, as well as for algorithm benchmarking and research reproducibility.

The COG-BCI database presented here comprises the recordings of 29 participants over 3 individual sessions with 4 different tasks designed to elicit different cognitive states. This results in a total of over 100 hours of open electrophysiological (EEG) and electrocardiogram (ECG) data. The project was validated by the local ethical committee of the University of Toulouse (CER number 2021-342). The dataset was validated on a subjective, behavioral and physiological level (i.e. cardiac and cerebral activity), to ensure its usefulness to the pBCI community. This body of work represents a large effort to promote the use of pBCIs, as well as the use of open science.

The data are in the Brain Imaging Data Structure (BIDS) format. For more information, please read the COG-BCI_info.pdf file.
Code + Data for COG Identification
figshare.com
Updated Nov 14, 2025
Cite
Troy Osborn (2025). Code + Data for COG Identification [Dataset]. http://doi.org/10.6084/m9.figshare.30615452.v1
Unique identifier
https://doi.org/10.6084/m9.figshare.30615452.v1
Dataset updated
Nov 14, 2025
Dataset provided by
Figshare (http://figshare.com/)
Authors
Troy Osborn
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
To identify clusters of orthologous genes (COGs) that correlate with nutrient limitation in the modern ocean, we examined the Ocean Microbial Reference Gene Catalog v2 (OM-RGC.v2) from the Tara Oceans Project. The OM-RGC.v2 includes relative gene abundances of all COGs (n = 4,787) in 139 Tara Oceans metagenomic samples, along with metadata including phosphate, oxygen, and nitrate/nitrite concentrations. (Nitrate/nitrite values were reported together for OM-RGC.v2.) Iron concentrations for Tara Oceans samples were not available and were thus estimated using the PISCES2 model, based on model predictions of iron concentration for Tara Oceans sampling locations as described in Table S1 of Caputi et al., 2019. Iron concentrations were predicted for the surface and the deep chlorophyll maximum (DCM) only; iron concentrations for samples from the mesopelagic zone were not available under the PISCES2 model. All other metadata for Tara Oceans samples were obtained directly from Salazar et al., 2019.

Correlations between COGs and metadata were estimated using regression models. Compound Poisson linear models were fitted in bulk using the MaAsLin2 software package (v. 1.18.0). Separate models were fit for each COG to analyze the effect of metadata variables on individual COG abundances. While the main focus was to investigate correlation with nutrient abundance, environmental metadata was included in the model to control for as many potential confounding effects as the data allowed. The following predictors were included in the final model (based on variables available from the Tara Oceans dataset): the size fraction at which the sample was taken, mean temperature, depth, salinity, mean oxygen concentration, PO4 concentration, NO2 + NO3 concentration, iron concentration, and absolute latitude. Of these, the following predictors were log-transformed to improve model fit: depth, PO4 concentration, and NO2 + NO3 concentration. To the same end, the iron concentration was transformed by taking the square root, and the absolute value of the latitude was taken. Otherwise, no transformation or normalization was performed. No abundance cutoff was applied, but COGs present in fewer than one-third of the Tara Oceans samples were discarded in order to ensure that the COGs identified by the statistical model were meaningful.
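
The predictor transformations and the one-third prevalence filter described above can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' code: the field names (`depth`, `po4`, `no2_no3`, `iron`, `latitude`), the use of the natural logarithm, the definition of "present" as abundance greater than zero, and all example values are hypothetical. The model fitting itself (MaAsLin2) is not reproduced here.

```python
import math


def transform_predictors(sample: dict) -> dict:
    """Apply the described transformations to one sample's metadata."""
    out = dict(sample)
    # Log-transform depth, PO4 and NO2 + NO3.
    # Assumption: natural log, and strictly positive input values.
    for key in ("depth", "po4", "no2_no3"):
        out[key] = math.log(sample[key])
    # Square-root transform of the iron concentration.
    out["iron"] = math.sqrt(sample["iron"])
    # Replace signed latitude with its absolute value.
    out["abs_latitude"] = abs(out.pop("latitude"))
    return out


def keep_prevalent(abundance_by_cog: dict, n_samples: int) -> dict:
    """Discard COGs present in fewer than one-third of samples.

    Assumption: a COG counts as present when its abundance is > 0.
    """
    return {
        cog: values
        for cog, values in abundance_by_cog.items()
        if 3 * sum(v > 0 for v in values) >= n_samples
    }


# Hypothetical sample and abundance table, for illustration only.
sample = {"depth": 1.0, "po4": 1.0, "no2_no3": 1.0, "iron": 4.0,
          "latitude": -30.0, "temperature": 20.0}
transformed = transform_predictors(sample)

abundances = {"COG_A": [0.0, 0.0, 0.2], "COG_B": [0.0, 0.0, 0.0]}
kept = keep_prevalent(abundances, n_samples=3)
```

Untransformed predictors such as temperature pass through unchanged, matching the statement that no other transformation or normalization was performed.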
2022 Connecticut Parcel and CAMA Data by COG
catalog.data.gov
data.ct.gov
Updated Jun 21, 2025
Cite
data.ct.gov (2025). 2022 Connecticut Parcel and CAMA Data by COG [Dataset]. https://catalog.data.gov/dataset/2022-connecticut-town-parcels-and-cama-tables
Dataset updated
Jun 21, 2025
Dataset provided by
data.ct.gov
Area covered
Connecticut
Description
The Connecticut (CT) town Parcels and Computer-Assisted Mass Appraisal (CAMA) data for 2022 are provided as a zipped file containing two items: CT parcels in geodatabases organized by COG, and the associated CAMA files. The parcel information covers 169 out of 169 towns, organized into geodatabases for each of the 9 Councils of Governments (COGs). Most of the parcel data sets can be linked to the CAMA data, which has attribute information (e.g. value of house, number of bedrooms) about real property. The parcel features for each town are in shapefiles, feature classes, or within a geodatabase. Most parcels are organized by town and COG and placed within geodatabases. The CAMA data sets have information about real property within the towns of CT. They may be linked to the parcels using a join process within a GIS package like ArcGIS Pro or QGIS. 154 out of 169 towns have complete CAMA information. Of the remaining 15 towns, four have no information and the other 11 have some limited information mixed into the parcel attribute tables. These files were gathered from the CT towns by the COGs and then submitted to CT OPM. Town data is organized by COG. Attribute names, primary key, secondary key, naming conventions, and file formats are not fully consistent, but some cleaning and reorganization was conducted to improve quality. This file was created on 03/08/2023 from data collected in 2021-2022.
Data from: Towards understanding the first genome sequence of a crenarchaeon...
catalog.data.gov
odgavaprod.ogopendata.com
Updated Sep 7, 2025
Cite
National Institutes of Health (2025). Towards understanding the first genome sequence of a crenarchaeon by genome annotation using clusters of orthologous groups of proteins (COGs) [Dataset]. https://catalog.data.gov/dataset/towards-understanding-the-first-genome-sequence-of-a-crenarchaeon-by-genome-annotation-usi
Dataset updated
Sep 7, 2025
Dataset provided by
National Institutes of Health
Description
Background: Standard archival sequence databases have not been designed as tools for genome annotation and are far from being optimal for this purpose. We used the database of Clusters of Orthologous Groups of proteins (COGs) to reannotate the genomes of two archaea, Aeropyrum pernix, the first member of the Crenarchaea to be sequenced, and Pyrococcus abyssi. Results: A. pernix and P. abyssi proteins were assigned to COGs using the COGNITOR program; the results were verified on a case-by-case basis and augmented by additional database searches using the PSI-BLAST and TBLASTN programs. Functions were predicted for over 300 proteins from A. pernix, which could not be assigned a function using conventional methods with a conservative sequence similarity threshold, an approximately 50% increase compared to the original annotation. A. pernix shares most of the conserved core of proteins that were previously identified in the Euryarchaeota. Cluster analysis or distance matrix tree construction based on the co-occurrence of genomes in COGs showed that A. pernix forms a distinct group within the archaea, although grouping with the two species of Pyrococci, indicative of similar repertoires of conserved genes, was observed. No indication of a specific relationship between Crenarchaeota and eukaryotes was obtained in these analyses. Several proteins that are conserved in Euryarchaeota and most bacteria are unexpectedly missing in A. pernix, including the entire set of de novo purine biosynthesis enzymes, the GTPase FtsZ (a key component of the bacterial and euryarchaeal cell-division machinery), and the tRNA-specific pseudouridine synthase, previously considered universal. A. pernix is represented in 48 COGs that do not contain any euryarchaeal members. Many of these proteins are TCA cycle and electron transport chain enzymes, reflecting the aerobic lifestyle of A. pernix. 
Conclusions: Special-purpose databases organized on the basis of phylogenetic analysis and carefully curated with respect to known and predicted protein functions provide for a significant improvement in genome annotation. A differential genome display approach helps in a systematic investigation of common and distinct features of gene repertoires and in some cases reveals unexpected connections that may be indicative of functional similarities between phylogenetically distant organisms and of lateral gene exchange.
KO-to-COGmapping
figshare.com
datasetcatalog.nlm.nih.gov
xlsx
Updated Jun 14, 2023
Cite
Colbie Reed (2023). KO-to-COGmapping [Dataset]. http://doi.org/10.6084/m9.figshare.23515770.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.23515770.v1
Dataset updated
Jun 14, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Colbie Reed
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Worksheet includes the mapping of both tRNA modification-relevant and -irrelevant K numbers to their respective overlapping COGs. Representative sequences of Object 4-S2 informed overlap at the sequence-level, maintaining the theme of data being generated and curated corresponding to support provided by published data. Additional tabs include the same data with expanded names as well as other KEGG K number and representative sequence entry-sourced data (e.g., EC numbers).
Data elements available in COG and PHIS.
datasetcatalog.nlm.nih.gov
plos.figshare.com
Updated Nov 25, 2015
Cite
Fisher, Brian T.; Adamson, Peter C.; Huang, Yuan-Shung; Getz, Kelly D.; Alonzo, Todd A.; Bagatell, Rochelle; Gerbing, Robert B.; Aplenc, Richard; Gamis, Alan; Sung, Lillian; Seif, Alix E.; Hall, Matt; Li, Yimei (2015). Data elements available in COG and PHIS. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001847785
Dataset updated
Nov 25, 2015
Authors
Fisher, Brian T.; Adamson, Peter C.; Huang, Yuan-Shung; Getz, Kelly D.; Alonzo, Todd A.; Bagatell, Rochelle; Gerbing, Robert B.; Aplenc, Richard; Gamis, Alan; Sung, Lillian; Seif, Alix E.; Hall, Matt; Li, Yimei
Description
Data elements available in COG and PHIS.
Cabot Oil & Gas Corporation Alternative Data Analytics
meyka.com
Updated Sep 20, 2025
Cite
Meyka (2025). Cabot Oil & Gas Corporation Alternative Data Analytics [Dataset]. https://meyka.com/stock/COG/alt-data/
Dataset updated
Sep 20, 2025
Dataset provided by
Meyka
Description
Non-traditional data signals from social media and employment platforms for COG stock analysis
Annotation of Acipenser sinensis unigenes in the NR, NT, SwissProt, KEGG,...
datasetcatalog.nlm.nih.gov
springernature.figshare.com
Updated Jun 4, 2019
Cite
Yang, Jing; Jian, Jianbo; Chen, Jianwei; Liu, Xueqing; Xia, Jinquan; Gao, Yong; Du, Hejun; Chen, Lei; Xiao, Kan; Wang, Binzhong (2019). Annotation of Acipenser sinensis unigenes in the NR, NT, SwissProt, KEGG, COG, InterPro and GO database. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000152846
Dataset updated
Jun 4, 2019
Authors
Yang, Jing; Jian, Jianbo; Chen, Jianwei; Liu, Xueqing; Xia, Jinquan; Gao, Yong; Du, Hejun; Chen, Lei; Xiao, Kan; Wang, Binzhong
Description
Annotation of Acipenser sinensis unigenes in the NR, NT, SwissProt, KEGG, COG, InterPro and GO database.
Cog Services Trans Export Import Data | Eximpedia
eximpedia.app
Updated Sep 13, 2025
Cite
(2025). Cog Services Trans Export Import Data | Eximpedia [Dataset]. https://www.eximpedia.app/companies/cog-services-trans/54119853
Dataset updated
Sep 13, 2025
Description
Cog Services Trans Export Import Data. Follow the Eximpedia platform for HS code, importer-exporter records, and customs shipment details.
Additional file 1: Tables S1, S2 and S3. of COGcollator: a web server for...
springernature.figshare.com
xlsx
Updated Jun 4, 2023
Cite
Daria Dibrova; Kirill Konovalov; Vadim Perekhvatov; Konstantin Skulachev; Armen Mulkidjanian (2023). Additional file 1: Tables S1, S2 and S3. of COGcollator: a web server for analysis of distant relationships between homologous protein families [Dataset]. http://doi.org/10.6084/m9.figshare.5648683.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.5648683.v1
Dataset updated
Jun 4, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Daria Dibrova; Kirill Konovalov; Vadim Perekhvatov; Konstantin Skulachev; Armen Mulkidjanian
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Table S1. Representative list of 124 genomes sampled from the 711 genomes of the current COG database release [2]. Table S2. Representative list of 27 eukaryotic genomes sampled manually. Table S3. Results of the similarity assessment for the homologs of the catalytic β-subunit of the bacterial FOF1-type ATP synthase by applying the HHpred algorithm [19]. The top hits for the α- and β-subunits of the F-type ATP synthase of E. coli and the B- and A-subunits of the A-type ATP synthase of Methanosarcina mazei (cf. Table 1) are colored red. (XLSX 29 kb)
SARS-CoV-2 viral sequencing data (COG-UK data) - Lineage/Variant Data -...
healthdatagateway.org
unknown
Updated Aug 20, 2021
Cite
COG-UK (2021). SARS-CoV-2 viral sequencing data (COG-UK data) - Lineage/Variant Data - Scotland [Dataset]. https://healthdatagateway.org/dataset/72
Available download formats: unknown
Dataset updated
Aug 20, 2021
Dataset provided by
COVID-19 Genomics UK Consortium
Authors
COG-UK
License
https://publichealthscotland.scot/services/data-research-and-innovation-services/electronic-data-research-and-innovation-service-edris/services-we-offer/
Description
File contains basic public metadata, including sequence_name, location, date, pangolin lineage assignment, version and associated scores, scorpio VOC/VUI constellation call and associated scores, key spike protein mutation calls and a list of all nucleotide mutations found.
Classification of the genes with different levels of expression according to...
figshare.com
xls
Updated Jun 3, 2023
Cite
Amir Miraj Ul Hussain Shah; Ye Zhao; Yunfei Wang; Guoquan Yan; Qikun Zhang; Liangyan Wang; Bing Tian; Huan Chen; Yuejin Hua (2023). Classification of the genes with different levels of expression according to the Cluster of Orthologous Groups of proteins (COG) database. [Dataset]. http://doi.org/10.1371/journal.pone.0106341.t002
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0106341.t002
Dataset updated
Jun 3, 2023
Dataset provided by
PLOS ONE
Authors
Amir Miraj Ul Hussain Shah; Ye Zhao; Yunfei Wang; Guoquan Yan; Qikun Zhang; Liangyan Wang; Bing Tian; Huan Chen; Yuejin Hua
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
J: Translation, ribosomal structure and biogenesis; K: Transcription; L: Replication, recombination and repair; B: Chromatin structure and dynamics; D: Cell cycle control, cell division, chromosome partitioning; V: Defense mechanisms; T: Signal transduction mechanisms; M: Cell wall/membrane/envelope biogenesis; N: Cell motility; U: Intracellular trafficking, secretion, and vesicular transport; O: Posttranslational modification, protein turnover, chaperones; C: Energy production and conversion; G: Carbohydrate transport and metabolism; E: Amino acid transport and metabolism; F: Nucleotide transport and metabolism; H: Coenzyme transport and metabolism; I: Lipid transport and metabolism; P: Inorganic ion transport and metabolism; Q: Secondary metabolites biosynthesis, transport and catabolism; S: Function unknown; R: General function prediction only. 2. The total number of significant genes/number of total genes in this COG. Classification of the genes with different levels of expression according to the Cluster of Orthologous Groups of proteins (COG) database.
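
The one-letter functional categories listed in this description map naturally onto a lookup table. The sketch below is illustrative only: the names `COG_CATEGORIES` and `describe` are hypothetical, and the letter `I` for lipid transport and metabolism follows the standard COG category scheme. Because a COG can carry more than one letter, the helper accepts a string of letters.

```python
# COG functional categories, keyed by their one-letter codes.
COG_CATEGORIES = {
    "J": "Translation, ribosomal structure and biogenesis",
    "K": "Transcription",
    "L": "Replication, recombination and repair",
    "B": "Chromatin structure and dynamics",
    "D": "Cell cycle control, cell division, chromosome partitioning",
    "V": "Defense mechanisms",
    "T": "Signal transduction mechanisms",
    "M": "Cell wall/membrane/envelope biogenesis",
    "N": "Cell motility",
    "U": "Intracellular trafficking, secretion, and vesicular transport",
    "O": "Posttranslational modification, protein turnover, chaperones",
    "C": "Energy production and conversion",
    "G": "Carbohydrate transport and metabolism",
    "E": "Amino acid transport and metabolism",
    "F": "Nucleotide transport and metabolism",
    "H": "Coenzyme transport and metabolism",
    "I": "Lipid transport and metabolism",
    "P": "Inorganic ion transport and metabolism",
    "Q": "Secondary metabolites biosynthesis, transport and catabolism",
    "S": "Function unknown",
    "R": "General function prediction only",
}


def describe(letters: str) -> list[str]:
    """Expand a category string such as 'KT' into its descriptions."""
    return [COG_CATEGORIES[letter] for letter in letters]


print(describe("KT"))
```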

COG
RRID:SCR_007139, OMICS_01688, nif-0000-02672, COG (RRID:SCR_007139), COG Cluster, COG Function, COG Pathway, COG Database, Clusters of Orthologous Groups of proteins, COGs, COGs - Clusters of Orthologous Groups of proteins, COGs - Phylogenetic classification of proteins encoded in complete genomes
3 scholarly articles cite this dataset