100+ datasets found
  1. COG

    • rrid.site
    • neuinfo.org
    Updated Jul 19, 2025
    Cite
    (2025). COG [Dataset]. http://identifiers.org/RRID:SCR_007139
    Dataset updated
    Jul 19, 2025
    Description

    A database for the phylogenetic classification of proteins encoded in complete genomes. Clusters of Orthologous Groups of proteins (COGs) were delineated by comparing protein sequences encoded in complete genomes representing major phylogenetic lineages. Each COG consists of individual proteins or groups of paralogs from at least 3 lineages and thus corresponds to an ancient conserved domain. Please be aware that the COGs have not been updated in many years and will not be.

  2. Classification of the UBCF_13 COG based on COG database in NCBI

    • search.dataone.org
    • datadryad.org
    Updated Apr 29, 2025
    Cite
    Raudhatul Fatiah; Irfan Suliansyah; Djong Hon Tjong; Lily Syukriani; Roza Yunita; Robi Trivano; Nurefni Azizah; Jamsari Jamsari (2025). Classification of the UBCF_13 COG based on COG database in NCBI [Dataset]. http://doi.org/10.5061/dryad.sn02v6x4g
    Dataset updated
    Apr 29, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Raudhatul Fatiah; Irfan Suliansyah; Djong Hon Tjong; Lily Syukriani; Roza Yunita; Robi Trivano; Nurefni Azizah; Jamsari Jamsari
    Time period covered
    Jan 1, 2021
    Description

    Background: Serratia plymuthica UBCF_13 is a phylloplane-associated plant bacterium showing antifungal activity. The whole-genome sequence provides insight into its evolution and the unique traits in its genome, and makes it possible to explore the potential of this microorganism in future studies. Here, we report the genome sequence of S. plymuthica UBCF_13 and its comparison with seventeen other strains.

    Methods: Continuous short reads were obtained from Illumina sequencing runs, and reads of 150 bp were merged into a single dataset. A pan-genome-based method was used to identify the core genome of the S. plymuthica species and the unique genes in UBCF_13.

    Results: Assembly of the Illumina reads of S. plymuthica strain UBCF_13 produced a 5.46 Mb circular genome sequence. 3315 genes were found to belong to the core genome shared by the 18 strains evaluated. The UBCF_13 genome harbors 488 unique genes, 300 of which are found only in this strain. The raw and asse...

  3. ProOpDB

    • neuinfo.org
    • dknet.org
    Updated Oct 8, 2011
    Cite
    (2011). ProOpDB [Dataset]. http://identifiers.org/RRID:SCR_006111
    Dataset updated
    Oct 8, 2011
    Description

    The Prokaryotic Operon DataBase (ProOpDB) is one of the most precise and complete repositories of operon predictions currently available. Using our novel and highly accurate operon algorithm, we have predicted the operon structures of more than 1,200 prokaryotic genomes. ProOpDB offers several ways to retrieve a set of operon predictions, including: i) organism name, ii) metabolic pathways, as defined by the KEGG database, iii) gene orthology, as defined by the COG database, iv) conserved protein motifs, as defined by the Pfam database, v) reference gene, and vi) reference operon, among others. To limit the operon output to non-redundant organisms, ProOpDB offers an efficient protocol for selecting the most representative organisms based on a precompiled matrix of phylogenetic distances. In addition, the ProOpDB operon predictions are used directly as input to our Gene Context Tool (GeConT) to visualize their genomic context and retrieve the sequences of their corresponding 5′ regulatory regions, as well as the nucleotide or amino acid sequences of their genes.

    The prediction algorithm: The algorithm is a multilayer perceptron (MLP) neural network classifier that uses as input the intergenic distances of contiguous genes and the STRING database functional relationship scores between the different groups of orthologous proteins, as defined in the COG database. Nevertheless, the operon prediction of our method is not restricted to genes with a COG assignment, since we successfully defined new groups of orthologous genes and obtained, by extrapolation, a set of equivalent STRING-like scores based on conserved gene pairs in different genomes. Since the STRING functional relationship scores are determined in an unbiased manner and efficiently integrate a large amount of information from different sources and kinds of evidence, the predictions made by our MLP are considerably less influenced by the bias imposed by training on one specific organism.

  4. COG-curation_masterTallySheet

    • figshare.com
    xlsx
    Updated Jun 14, 2023
    Cite
    Colbie Reed (2023). COG-curation_masterTallySheet [Dataset]. http://doi.org/10.6084/m9.figshare.23515527.v1
    Available download formats: xlsx
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Colbie Reed
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Master tally sheet of the total curation process of generating a new list of COGs representative of gene/protein families involved in tRNA modifications as per published gene-/protein-modification pairs curated from the literature. Original COG Pathway list (via the COG Database, June 2022) contained 59 COGs; the final list (see other Object, namely 4-S3) totalled 89 COGs, 52 retained from the original list and 37 were added to contribute to the new list. Of the original 59, 7 were removed.
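
    The counts in this description can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only the numbers quoted in the description; the variable names are illustrative and are not part of the dataset.

```python
# Counts quoted in the dataset description (variable names are illustrative).
original_cogs = 59   # COGs in the original COG Pathway list (June 2022)
removed_cogs = 7     # original COGs dropped during curation
added_cogs = 37      # COGs newly added during curation

# COGs kept from the original list, and the size of the final list.
retained_cogs = original_cogs - removed_cogs
final_cogs = retained_cogs + added_cogs

print(f"retained: {retained_cogs}, final: {final_cogs}")
```

    The retained and final counts computed this way can be compared directly with the 52 and 89 stated in the description.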

  5. COG-BCI database: A multi-session and multi-task EEG cognitive dataset for...

    • zenodo.org
    bin, pdf, txt, zip
    Updated Jul 16, 2024
    Cite
    Marcel F. Hinss; Emilie S. Jahanpour; Bertille Somon; Lou Pluchon; Frédéric Dehais; Raphaëlle N. Roy (2024). COG-BCI database: A multi-session and multi-task EEG cognitive dataset for passive brain-computer interfaces [Dataset]. http://doi.org/10.5281/zenodo.6874129
    Available download formats: zip, bin, txt, pdf
    Dataset updated
    Jul 16, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Marcel F. Hinss; Emilie S. Jahanpour; Bertille Somon; Lou Pluchon; Frédéric Dehais; Raphaëlle N. Roy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Brain-Computer Interfaces, and especially passive Brain-Computer Interfaces (pBCI), with their ability to estimate and detect mental states, are receiving increasing attention from both the scientific and the research and development communities. Many pBCIs aim to increase the safety of complex work environments such as in the aeronautical domain. Therefore, mental workload, vigilance and decision-making are some of the most commonly examined aspects of cognition within this field of research. A large proportion of pBCIs involve a component of machine learning and signal processing, as the data that are collected need to be transformed into a reliable estimate of the users’ current mental state (e.g. mental workload). Improving this component is a major challenge for researchers, requiring large quantities of data. While data sharing is common for the active BCI community, open pBCI datasets are scarcer and generally incomplete with regard to the information they report. This is particularly true for datasets encompassing several tasks or sessions, which are of importance for tackling the challenges of transfer learning. Testing new pipelines, feature extraction algorithms and classifiers are central issues for future advances in research within this domain, as well as for algorithm benchmarking and research reproducibility.

    The COG-BCI database presented here comprises the recordings of 29 participants over 3 individual sessions with 4 different tasks designed to elicit different cognitive states. This results in a total of over 100 hours of open electrophysiological (EEG) and electrocardiogram (ECG) data. The project was validated by the local ethical committee of the University of Toulouse (CER number 2021-342). The dataset was validated on a subjective, behavioral and physiological level (i.e. cardiac and cerebral activity), to ensure its usefulness to the pBCI community. This body of work represents a large effort to promote the use of pBCIs, as well as the use of open science.

    The data are in the Brain Imaging Data Structure (BIDS) format. For more information, please read the COG-BCI_info.pdf file.

  6. Phylogenetic Clusters of Orthologous Groups Ranking

    • neuinfo.org
    Updated Oct 16, 2024
    Cite
    (2024). Phylogenetic Clusters of Orthologous Groups Ranking [Dataset]. http://identifiers.org/RRID:SCR_008223
    Dataset updated
    Oct 16, 2024
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE, documented on August 20, 2019. The COG database has become a powerful tool in the field of comparative genomics. The construction of this database is based on sequence homologies of proteins from different completely sequenced genomes. Highly homologous proteins are assigned to clusters of orthologous groups. The updated collection of orthologous protein sets for prokaryotes and eukaryotes is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and genome-wide evolutionary studies. The availability of multiple, essentially complete genome sequences of prokaryotes and eukaryotes spurred both the demand and the opportunity for the construction of an evolutionary classification of genes from these genomes. Such a classification system based on orthologous relationships between genes appears to be a natural framework for comparative genomics and should facilitate both functional annotation of genomes and large-scale evolutionary studies. Here is a major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes and the construction of clusters of predicted orthologs for 7 eukaryotic genomes, which we named KOGs after eukaryotic orthologous groups. The COG collection currently consists of 138,458 proteins, which form 4873 COGs and comprise 75% of the 185,505 (predicted) proteins encoded in 66 genomes of unicellular organisms. The eukaryotic orthologous groups (KOGs) include proteins from 7 eukaryotic genomes: three animals (the nematode Caenorhabditis elegans, the fruit fly Drosophila melanogaster and Homo sapiens), one plant, Arabidopsis thaliana, two fungi (Saccharomyces cerevisiae and Schizosaccharomyces pombe), and the intracellular microsporidian parasite Encephalitozoon cuniculi. The current KOG set consists of 4852 clusters of orthologs, which include 59,838 proteins, or approximately 54% of the analyzed eukaryotic 110,655 gene products. Compared to the coverage of the prokaryotic genomes with COGs, a considerably smaller fraction of eukaryotic genes could be included into the KOGs; addition of new eukaryotic genomes is expected to result in substantial increase in the coverage of eukaryotic genomes with KOGs. Examination of the phyletic patterns of KOGs reveals a conserved core represented in all analyzed species and consisting of approximately 20% of the KOG set. This conserved portion of the KOG set is much greater than the ubiquitous portion of the COG set (approximately 1% of the COGs). In part, this difference is probably due to the small number of included eukaryotic genomes, but it could also reflect the relative compactness of eukaryotes as a clade and the greater evolutionary stability of eukaryotic genomes.
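
    The coverage percentages quoted in this description follow from the protein counts it gives. The following is a minimal sketch of that arithmetic; the variable names are illustrative and the figures are copied from the description.

```python
# Figures quoted in the description (variable names are illustrative).
cog_proteins, unicellular_proteins = 138_458, 185_505
kog_proteins, eukaryotic_products = 59_838, 110_655

# Fraction of the analyzed proteins that fall into COGs and KOGs.
cog_coverage = cog_proteins / unicellular_proteins
kog_coverage = kog_proteins / eukaryotic_products

print(f"COG coverage: {cog_coverage:.1%}")
print(f"KOG coverage: {kog_coverage:.1%}")
```

    Rounded to whole percentages, the two ratios agree with the 75% and 54% stated in the description.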

  7. Data from: COG Categories

    • bioregistry.io
    Updated Mar 9, 2025
    Cite
    (2025). COG Categories [Dataset]. https://bioregistry.io/cog.category
    Dataset updated
    Mar 9, 2025
    Description

    Higher-level classifications of COG Pathways

  8. Phylogenetic Clusters of Orthologous Groups Ranking

    • rrid.site
    • dknet.org
    Updated Aug 17, 2025
    Cite
    (2025). Phylogenetic Clusters of Orthologous Groups Ranking [Dataset]. http://identifiers.org/RRID:SCR_008223
    Dataset updated
    Aug 17, 2025
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE, documented on August 20, 2019. The COG database has become a powerful tool in the field of comparative genomics. The construction of this database is based on sequence homologies of proteins from different completely sequenced genomes. Highly homologous proteins are assigned to clusters of orthologous groups. The updated collection of orthologous protein sets for prokaryotes and eukaryotes is expected to be a useful platform for functional annotation of newly sequenced genomes, including those of complex eukaryotes, and genome-wide evolutionary studies. The availability of multiple, essentially complete genome sequences of prokaryotes and eukaryotes spurred both the demand and the opportunity for the construction of an evolutionary classification of genes from these genomes. Such a classification system based on orthologous relationships between genes appears to be a natural framework for comparative genomics and should facilitate both functional annotation of genomes and large-scale evolutionary studies. Here is a major update of the previously developed system for delineation of Clusters of Orthologous Groups of proteins (COGs) from the sequenced genomes of prokaryotes and unicellular eukaryotes and the construction of clusters of predicted orthologs for 7 eukaryotic genomes, which we named KOGs after eukaryotic orthologous groups. The COG collection currently consists of 138,458 proteins, which form 4873 COGs and comprise 75% of the 185,505 (predicted) proteins encoded in 66 genomes of unicellular organisms. The eukaryotic orthologous groups (KOGs) include proteins from 7 eukaryotic genomes: three animals (the nematode Caenorhabditis elegans, the fruit fly Drosophila melanogaster and Homo sapiens), one plant, Arabidopsis thaliana, two fungi (Saccharomyces cerevisiae and Schizosaccharomyces pombe), and the intracellular microsporidian parasite Encephalitozoon cuniculi. The current KOG set consists of 4852 clusters of orthologs, which include 59,838 proteins, or approximately 54% of the analyzed eukaryotic 110,655 gene products. Compared to the coverage of the prokaryotic genomes with COGs, a considerably smaller fraction of eukaryotic genes could be included into the KOGs; addition of new eukaryotic genomes is expected to result in substantial increase in the coverage of eukaryotic genomes with KOGs. Examination of the phyletic patterns of KOGs reveals a conserved core represented in all analyzed species and consisting of approximately 20% of the KOG set. This conserved portion of the KOG set is much greater than the ubiquitous portion of the COG set (approximately 1% of the COGs). In part, this difference is probably due to the small number of included eukaryotic genomes, but it could also reflect the relative compactness of eukaryotes as a clade and the greater evolutionary stability of eukaryotic genomes.

  9. 2022 Connecticut Parcel and CAMA Data by COG

    • catalog.data.gov
    • data.ct.gov
    Updated Jun 21, 2025
    Cite
    data.ct.gov (2025). 2022 Connecticut Parcel and CAMA Data by COG [Dataset]. https://catalog.data.gov/dataset/2022-connecticut-town-parcels-and-cama-tables
    Dataset updated
    Jun 21, 2025
    Dataset provided by
    data.ct.gov
    Area covered
    Connecticut
    Description

    The Connecticut (CT) town Parcels and Computer-Assisted Mass Appraisal (CAMA) data for 2022 are provided as a zipped file containing two items: CT parcels in geodatabases organized by COG, and the associated CAMA files. The parcel information covers all 169 towns, organized into geodatabases for each of the 9 Councils of Governments (COGs). Most of the parcel data sets can be linked to the CAMA data, which holds attribute information (e.g. value of house, number of bedrooms) about real property. The parcel features for each town are in shapefiles, feature classes, or within a geodatabase. Most parcels are organized by town and COG and placed within a geodatabase. The CAMA data sets have information about real property within the towns of CT. They may be linked to the parcels using a join process within a GIS package like ArcGIS Pro or QGIS. 154 out of 169 towns have complete CAMA information. Of the remaining 15 towns, four have no information and the others have some limited information mixed into the parcel attribute tables. These files were gathered from the CT towns by the COGs and then submitted to CT OPM. Town data is organized by COG. Attribute names, primary keys, secondary keys, naming conventions, and file formats are not fully consistent, but some cleaning and reorganization was conducted to improve quality. This file was created on 03/08/2023 from data collected in 2021-2022.
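
    The join between parcels and CAMA records mentioned in this description can be sketched outside a GIS package as well. The example below is a minimal illustration in plain Python, not the publisher's workflow: the field names (`parcel_id`, `value`, `bedrooms`) and the sample rows are hypothetical, and the description itself notes that real attribute names and keys vary between towns.

```python
# Hypothetical records: the field names and values are assumptions made for
# illustration only; real attribute names and keys differ between towns.
parcels = [
    {"parcel_id": "A-001", "town": "Example Town"},
    {"parcel_id": "A-002", "town": "Example Town"},
]
cama = [
    {"parcel_id": "A-001", "value": 250_000, "bedrooms": 3},
]

# Index the CAMA rows by key, then left-join them onto the parcels so that
# parcels without CAMA information are kept with only their own fields.
cama_by_id = {row["parcel_id"]: row for row in cama}
joined = [
    {**parcel, **cama_by_id.get(parcel["parcel_id"], {})}
    for parcel in parcels
]

for row in joined:
    print(row)
```

    A left join is used here because the description states that some towns have incomplete or missing CAMA information, so dropping unmatched parcels would lose data.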

  10. SARS-CoV-2 viral sequencing data (COG-UK data) - Lineage/Variant Data - NI

    • healthdatagateway.org
    unknown
    Updated Aug 16, 2021
    Cite
    ACKNOWLEDGEMENT The authors would like to acknowledge the help provided by the staff of the Honest Broker Service (HBS) within the Business Services Organisation Northern Ireland (BSO). The HBS is funded by the BSO and the Department of Health (DoH). The authors alone are responsible for the interpretation of the data and any views or opinions presented are solely those of the author and do not necessarily represent those of the BSO. (2021). SARS-CoV-2 viral sequencing data (COG-UK data) - Lineage/Variant Data - NI [Dataset]. https://healthdatagateway.org/dataset/13
    Available download formats: unknown
    Dataset updated
    Aug 16, 2021
    Dataset authored and provided by
    ACKNOWLEDGEMENT The authors would like to acknowledge the help provided by the staff of the Honest Broker Service (HBS) within the Business Services Organisation Northern Ireland (BSO). The HBS is funded by the BSO and the Department of Health (DoH). The authors alone are responsible for the interpretation of the data and any views or opinions presented are solely those of the author and do not necessarily represent those of the BSO.
    License

    https://bso.hscni.net/directorates/digital-operations/honest-broker-service/

    Description

    File contains basic public metadata, including sequence_name, location, date, pangolin lineage assignment, version and associated scores, scorpio VOC/VUI constellation call and associated scores, key spike protein mutations calls, and a list of all nucleotide mutations found.

  11. KO-to-COGmapping

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    xlsx
    Updated Jun 14, 2023
    Cite
    Colbie Reed (2023). KO-to-COGmapping [Dataset]. http://doi.org/10.6084/m9.figshare.23515770.v1
    Available download formats: xlsx
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Colbie Reed
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Worksheet includes the mapping of both tRNA modification-relevant and -irrelevant K numbers to their respective overlapping COGs. Representative sequences of Object 4-S2 informed overlap at the sequence-level, maintaining the theme of data being generated and curated corresponding to support provided by published data. Additional tabs include the same data with expanded names as well as other KEGG K number and representative sequence entry-sourced data (e.g., EC numbers).

  12. Data elements available in COG and PHIS.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 1, 2023
    Cite
    Yimei Li; Matt Hall; Brian T. Fisher; Alix E. Seif; Yuan-Shung Huang; Rochelle Bagatell; Kelly D. Getz; Todd A. Alonzo; Robert B. Gerbing; Lillian Sung; Peter C. Adamson; Alan Gamis; Richard Aplenc (2023). Data elements available in COG and PHIS. [Dataset]. http://doi.org/10.1371/journal.pone.0143480.t001
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Yimei Li; Matt Hall; Brian T. Fisher; Alix E. Seif; Yuan-Shung Huang; Rochelle Bagatell; Kelly D. Getz; Todd A. Alonzo; Robert B. Gerbing; Lillian Sung; Peter C. Adamson; Alan Gamis; Richard Aplenc
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data elements available in COG and PHIS.

  13. Annotation of Acipenser sinensis unigenes in the NR, NT, SwissProt, KEGG,...

    • datasetcatalog.nlm.nih.gov
    • springernature.figshare.com
    Updated Jun 4, 2019
    Cite
    Yang, Jing; Jian, Jianbo; Chen, Jianwei; Liu, Xueqing; Xia, Jinquan; Gao, Yong; Du, Hejun; Chen, Lei; Xiao, Kan; Wang, Binzhong (2019). Annotation of Acipenser sinensis unigenes in the NR, NT, SwissProt, KEGG, COG, InterPro and GO database. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000152846
    Dataset updated
    Jun 4, 2019
    Authors
    Yang, Jing; Jian, Jianbo; Chen, Jianwei; Liu, Xueqing; Xia, Jinquan; Gao, Yong; Du, Hejun; Chen, Lei; Xiao, Kan; Wang, Binzhong
    Description

    Annotation of Acipenser sinensis unigenes in the NR, NT, SwissProt, KEGG, COG, InterPro and GO database.

  14. Councils of Government (COG) Regions

    • data-nccommerce.opendata.arcgis.com
    • hub.arcgis.com
    Updated Dec 10, 2018
    Cite
    North Carolina Department of Commerce (2018). Councils of Government (COG) Regions [Dataset]. https://data-nccommerce.opendata.arcgis.com/datasets/nccommerce::councils-of-government-cog-regions/about
    Dataset updated
    Dec 10, 2018
    Dataset authored and provided by
    North Carolina Department of Commerce (https://www.commerce.nc.gov/)
    Description

    The sixteen regional councils in North Carolina serve their member governments through a broad range of services. Some of those are traditional: delivery of federal and state programs in aging, transportation planning, workforce development, community planning – GIS mapping services and convening of regional leaders for problem solving. A more robust range of services has emerged through member demand for administrative and financial services, interim executive management, financial administration, human services program delivery and economic development. For more information, visit https://www.ncregions.org/regional-councils/

  15. Additional file 1: Tables S1, S2 and S3. of COGcollator: a web server for...

    • springernature.figshare.com
    xlsx
    Updated Jun 4, 2023
    Cite
    Daria Dibrova; Kirill Konovalov; Vadim Perekhvatov; Konstantin Skulachev; Armen Mulkidjanian (2023). Additional file 1: Tables S1, S2 and S3. of COGcollator: a web server for analysis of distant relationships between homologous protein families [Dataset]. http://doi.org/10.6084/m9.figshare.5648683.v1
    Available download formats: xlsx
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Daria Dibrova; Kirill Konovalov; Vadim Perekhvatov; Konstantin Skulachev; Armen Mulkidjanian
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Table S1. Representative list of 124 genomes sampled from the 711 genomes of the current COG database release [2]. Table S2. Representative list of 27 eukaryotic genomes sampled manually. Table S3. Results of the similarity assessment for the homologs of the catalytic β-subunit of the bacterial FOF1-type ATP synthase by applying the HHpred algorithm [19]. The top hits for the α- and β-subunits of the F-type ATP synthase of E. coli and the B- and A-subunits of the A-type ATP synthase of Methanosarcina mazei (cf. Table 1) are colored red. (XLSX 29 kb)

  16. Data from: Towards understanding the first genome sequence of a crenarchaeon...

    • odgavaprod.ogopendata.com
    • res1catalogd-o-tdatad-o-tgov.vcapture.xyz
    html
    Updated Jul 23, 2025
    Cite
    National Institutes of Health (2025). Towards understanding the first genome sequence of a crenarchaeon by genome annotation using clusters of orthologous groups of proteins (COGs) [Dataset]. https://odgavaprod.ogopendata.com/dataset/towards-understanding-the-first-genome-sequence-of-a-crenarchaeon-by-genome-annotation-using-cl
    Available download formats: html
    Dataset updated
    Jul 23, 2025
    Dataset provided by
    National Institutes of Health
    Description

    Background: Standard archival sequence databases have not been designed as tools for genome annotation and are far from being optimal for this purpose. We used the database of Clusters of Orthologous Groups of proteins (COGs) to reannotate the genomes of two archaea, Aeropyrum pernix, the first member of the Crenarchaea to be sequenced, and Pyrococcus abyssi.

    Results: A. pernix and P. abyssi proteins were assigned to COGs using the COGNITOR program; the results were verified on a case-by-case basis and augmented by additional database searches using the PSI-BLAST and TBLASTN programs. Functions were predicted for over 300 proteins from A. pernix, which could not be assigned a function using conventional methods with a conservative sequence similarity threshold, an approximately 50% increase compared to the original annotation. A. pernix shares most of the conserved core of proteins that were previously identified in the Euryarchaeota. Cluster analysis or distance matrix tree construction based on the co-occurrence of genomes in COGs showed that A. pernix forms a distinct group within the archaea, although grouping with the two species of Pyrococci, indicative of similar repertoires of conserved genes, was observed. No indication of a specific relationship between Crenarchaeota and eukaryotes was obtained in these analyses. Several proteins that are conserved in Euryarchaeota and most bacteria are unexpectedly missing in A. pernix, including the entire set of de novo purine biosynthesis enzymes, the GTPase FtsZ (a key component of the bacterial and euryarchaeal cell-division machinery), and the tRNA-specific pseudouridine synthase, previously considered universal. A. pernix is represented in 48 COGs that do not contain any euryarchaeal members. Many of these proteins are TCA cycle and electron transport chain enzymes, reflecting the aerobic lifestyle of A. pernix.

    Conclusions: Special-purpose databases organized on the basis of phylogenetic analysis and carefully curated with respect to known and predicted protein functions provide for a significant improvement in genome annotation. A differential genome display approach helps in a systematic investigation of common and distinct features of gene repertoires and in some cases reveals unexpected connections that may be indicative of functional similarities between phylogenetically distant organisms and of lateral gene exchange.
    
  17. SARS-CoV-2 viral sequencing data (COG-UK data) - Lineage/Variant Data - NI

    • dtechtive.com
    • finddatagovscot.dtechtive.com
    Updated Nov 13, 2023
    Cite
    HEALTH AND SOCIAL CARE NORTHERN IRELAND (2023). SARS-CoV-2 viral sequencing data (COG-UK data) - Lineage/Variant Data - NI [Dataset]. https://dtechtive.com/datasets/25688
    Dataset updated
    Nov 13, 2023
    Dataset provided by
    HEALTH AND SOCIAL CARE NORTHERN IRELAND
    Area covered
    United Kingdom
    Description

    Summary of SARS-CoV-2 lineages and mutations

  18. Cog Hill Drive Cross Street Data in Honey Brook, PA

    • ownerly.com
    Updated Dec 9, 2021
    Cite
    Ownerly (2021). Cog Hill Drive Cross Street Data in Honey Brook, PA [Dataset]. https://www.ownerly.com/pa/honey-brook/cog-hill-dr-home-details
    Dataset updated
    Dec 9, 2021
    Dataset authored and provided by
    Ownerly
    Area covered
    Honey Brook, Pennsylvania, Cog Hill Drive
    Description

    This dataset provides information about the number of properties, residents, and average property values for Cog Hill Drive cross streets in Honey Brook, PA.

  19. Global export data of Lcd Cog

    • volza.com
    csv
    Updated Dec 8, 2025
    Cite
    Volza FZ LLC (2025). Global export data of Lcd Cog [Dataset]. https://www.volza.com/p/lcd-cog/export/export-from-china/
    Available download formats: csv
    Dataset updated
    Dec 8, 2025
    Dataset authored and provided by
    Volza FZ LLC
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Count of exporters, Sum of export value, 2014-01-01/2021-09-30, Count of export shipments
    Description

    82 Global export shipment records of Lcd Cog with prices, volume & current Buyer's suppliers relationships based on actual Global export trade database.

  20. 3D-Genomics Database

    • dknet.org
    • scicrunch.org
    Updated Jan 29, 2022
    Cite
    (2022). 3D-Genomics Database [Dataset]. http://identifiers.org/RRID:SCR_007430
    Dataset updated
    Jan 29, 2022
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE, documented August 29, 2016. Database containing structural annotations for the proteomes of just under 100 organisms. Using data derived from public databases of translated genomic sequences, representatives from the major branches of Life are included: Prokaryota, Eukaryota and Archaea. The annotations stored in the database may be accessed in a number of ways. The help page provides information on how to access the database. 3D-GENOMICS is now part of a larger project, called e-Protein. The project brings together similar databases at three sites: Imperial College London, University College London and the European Bioinformatics Institute. e-Protein's mission statement is: to provide a fully automated distributed pipeline for large-scale structural and functional annotation of all major proteomes via the use of cutting-edge computer GRID technologies. The following databases are incorporated: NRprot, SCOP, ASTRAL, PFAM, Prosite, taxonomy, COG. The following eukaryotic genomes are incorporated: Anopheles gambiae, protein sequences from the mosquito genome; Arabidopsis thaliana, protein sequences from the Arabidopsis genome; Caenorhabditis briggsae, protein sequences from the C. briggsae genome; Caenorhabditis elegans, protein sequences from the worm genome; Ciona intestinalis, protein sequences from the sea squirt genome; Danio rerio, protein sequences from the zebrafish genome; Drosophila melanogaster, protein sequences from the fruitfly genome; Encephalitozoon cuniculi, protein sequences from the E. cuniculi genome; Fugu rubripes, protein sequences from the pufferfish genome; Guillardia theta, protein sequences from the G. theta genome; Homo sapiens, protein sequences from the human genome; Mus musculus, protein sequences from the mouse genome; Neurospora crassa, protein sequences from the N. crassa genome; Oryza sativa, protein sequences from the rice genome; Plasmodium falciparum, protein sequences from the P. falciparum genome; Rattus norvegicus, protein sequences from the rat genome; Saccharomyces cerevisiae, protein sequences from the yeast genome; Schizosaccharomyces pombe, protein sequences from the yeast genome.

COG

RRID:SCR_007139, OMICS_01688, nif-0000-02672, COG (RRID:SCR_007139), COG, COG Cluster, COG Function, COG Pathway, COG Database, Clusters of Orthologous Groups of proteins, COGs, COGs - Clusters of Orthologous Groups of proteins, COGs - Phylogenetic classification of proteins encoded in complete genomes

3 scholarly articles cite this dataset