IVDB hosts complete genome sequences of influenza A virus generated by BGI and curates all other published influenza virus sequences after expert annotations. IVDB provides a series of tools and viewers for analyzing the viral genomes, genes, genetic polymorphisms and phylogenetic relationships comparatively.

VIDA is a new virus database that organizes open reading frames (ORFs) from partial and complete genomic sequences from animal viruses. Currently VIDA includes all sequences from GenBank for Herpesviridae, Coronaviridae and Arteriviridae. The ORFs are organized into homologous protein families, which are identified on the basis of sequence similarity relationships. Conserved sequence regions of potential functional importance are identified and can be retrieved as sequence alignments. We use a controlled taxonomical and functional classification for all the proteins and protein families in the database. When available, protein structures that are related to the families have also been included. The database is available for online search and sequence information retrieval at http://www.biochem.ucl.ac.uk/bsm/virus_database/VIDA.html.

Database that organizes information on genomes including sequences, maps, chromosomes, assemblies, and annotations in six major organism groups: Archaea, Bacteria, Eukaryotes, Viruses, Viroids, and Plasmids. Genomes of over 1,200 organisms can be found in this database, representing both completely sequenced organisms and those for which sequencing is in progress. Users can browse by organism, and view genome maps and protein clusters. Links to other prokaryotic and archaeal genome projects, as well as BLAST tools and access to the rest of the NCBI online resources are available.

The presence and replication of honeybee deformed wing virus variant A (DWV-A) was recently confirmed in the red imported fire ant, Solenopsis invicta Buren. Reported here is the complete genome sequence of this virus from S. invicta, which is valuable for future research on DWV.

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Details
This is a gzipped tar file that includes the plant virus database files required to run the Kodoja workflow (https://github.com/abaizan/kodoja) [1]. Kodoja is a workflow for the detection of plant virus sequences in RNA-seq data files that uses two previously published tools, Kraken [2] and Kaiju [3].
This file contains databases for Kraken [2] and Kaiju [3]. The file includes the Kraken database files database.idx, database.kdb, nodes.dmp and names.dmp, and the Kaiju database file kaij_library.fmi.
These k-mer databases are based on virus sequences in RefSeq [4] (https://www.ncbi.nlm.nih.gov/refseq/) with plant hosts as defined in the Virus-Host Database [5] (https://www.genome.jp/virushostdb/).
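After downloading, the archive's contents can be compared against the file names listed above. The following is only a minimal sketch: the helper name `missing_files` is an illustration rather than part of Kodoja, and the expected names are copied exactly as written in this description.

```python
import tarfile
from pathlib import PurePosixPath

# File names as given in the description above (copied as written).
EXPECTED_FILES = (
    "database.idx",
    "database.kdb",
    "nodes.dmp",
    "names.dmp",
    "kaij_library.fmi",
)


def missing_files(archive_path, expected=EXPECTED_FILES):
    """Return the expected file names that are not present in the archive.

    Only base names are compared, so the files may sit in any subdirectory.
    """
    # "r:*" lets tarfile detect the compression (gzip here) automatically.
    with tarfile.open(archive_path, "r:*") as archive:
        present = {
            PurePosixPath(member.name).name
            for member in archive.getmembers()
            if member.isfile()
        }
    return [name for name in expected if name not in present]
```

An empty list from `missing_files(path_to_archive)` would suggest that all listed database files are present.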
Version 1.0
kodojaDB_v1.0 is based on RefSeq v89 and the Virus-Host Database (accessed 03/09/2018, when it was based on RefSeq 89 and GenBank 226.0). The viral genome partition of RefSeq v89 comprises 7946 viruses (ftp://ftp.ncbi.nlm.nih.gov/genomes/refseq/viral/assembly_summary.txt).
kodojaDB_v1.0 was created using kodoja_retrieve.py which is part of the kodoja workflow (v0.05) (https://github.com/abaizan/kodoja).
References
[1] Baizan-Edge,A., Cock,P., MacFarlane,S., McGavin,W., Torrance,T. and Jones,S. Kodoja: A workflow for virus detection in plants using k-mer analysis of RNA-sequencing data (under review, Nucleic Acids Research).
[2] Wood,D.E. and Salzberg,S.L. (2014) Kraken: ultrafast metagenomic sequence classification using exact alignments. Genome Biol., 15, R46.
[3] Menzel,P., Ng,K.L. and Krogh,A. (2016) Fast and sensitive taxonomic classification for metagenomics with Kaiju. Nat. Commun., 7, 1–9.
[4] O’Leary,N.A., Wright,M.W., Brister,J.R., Ciufo,S., Haddad,D., McVeigh,R., Rajput,B., Robbertse,B., Smith-White,B., Ako-Adjei,D., et al. (2016) Reference sequence (RefSeq) database at NCBI: Current status, taxonomic expansion, and functional annotation. Nucleic Acids Res., 44, D733–D745.
[5] Mihara,T., Nishimura,Y., Shimizu,Y., Nishiyama,H., Yoshikawa,G., Uehara,H., Hingamp,P., Goto,S. and Ogata,H. (2016) Linking virus genomes with host taxonomy. Viruses, 8, 10–15.

Virosaurus (from virus thesaurus) is a curated virus genome database, aimed at facilitating clinical metagenomics analysis. The data comprise clustered and annotated sequences of vertebrate viruses, other viruses (insect, fungus, eukaryotic microorganism) or plant viruses in FASTA format. Virosaurus also provides a complete-sequence dataset for all those viruses, which comprises complete genomes for nonsegmented viruses and complete segments for segmented viruses. Complete sequences: This dataset contains full-length genomes (monopartite virus) or segments (segmented virus) for all vertebrate virus families. Virosaurus: Virus reference sequence databases for clinical metagenomics. All complete sequences were clustered at 90% to remove redundancy in Virosaurus Vertebrate 90 (23,615 FASTAs), or clustered at 98% in Virosaurus Vertebrate 98 (73,160 FASTAs). Many clusters can belong to the same virus species. For example, there are 100 Lassa virus clusters in Virosaurus90 and 638 in Virosaurus98. The FASTA headers have been annotated with metadata to facilitate metagenomic analysis. For instance, viral nucleic acid is annotated as RNA, DNA or RNA/DNA, thereby improving interpretation from sequencing either molecule.
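Record counts such as those quoted above can be checked by counting the entries in a downloaded FASTA file. The sketch below is a plain FASTA reader written for illustration; the function names are assumptions, and it does not try to interpret the Virosaurus header metadata, whose exact layout is not given here.

```python
def iter_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def count_records(path):
    """Count the FASTA records in the file at `path`."""
    with open(path) as handle:
        return sum(1 for _ in iter_fasta(handle))
```

Because `iter_fasta` is a generator, large files are read one record at a time rather than loaded into memory at once.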

IVDB hosts complete genome sequences of influenza A virus generated by BGI and curates all other published influenza virus sequences after expert annotations. For the convenience of efficient data utilization, our Q-Filter system classifies and ranks all nucleotide sequences into 7 categories according to sequence content and integrity. IVDB provides a series of tools and viewers for analyzing the viral genomes, genes, genetic polymorphisms and phylogenetic relationships comparatively. A searching system is developed for users to retrieve a combination of different data types by setting various search options. To facilitate analysis of the global viral transmission and evolution, the IV Sequence Distribution Tool (IVDT) is developed to display worldwide geographic distribution of the viral genotypes and to couple genomic data with epidemiological data. The BLAST, multiple sequence alignment tools and phylogenetic analysis tools were integrated for online data analysis. Furthermore, IVDB offers instant access to the pre-computed alignments and polymorphism analysis of influenza virus genes and proteins and presents the results by SNP distribution plots and minor allele distributions. IVDB aims to be a powerful information resource and an analysis workbench for scientists working on IV genetics, evolution, diagnostics, vaccine development, and drug design.

THIS RESOURCE IS NO LONGER IN SERVICE, documented August 19, 2016. The RNA Virus Database is a database and web application describing the genome organization and providing analytical tools for the 938 known species of RNA virus. It can identify submitted nucleotide sequences, can place them into multiple whole-genome alignments (in species where more than one isolate has been fully sequenced) and contains translated genome sequences for all species. It has been created for two main purposes: to facilitate the comparative analysis of RNA viruses and to become a hub for other, more specialised virus Web sites.

VIDA contains a collection of homologous protein families derived from open reading frames from complete and partial virus genomes. For each family, users can get an alignment of the conserved regions, functional and taxonomy information, and links to DNA sequences and structures.
* Search homologous protein families from particular virus families
* Links to complete genome sequence: Arteriviridae, Coronaviridae, Herpesviridae, Poxviridae
The Virus Database at University College London has been developed as a system to organize animal virus open reading frame sequences. All known and predicted protein sequences from complete and partial genomes of particular virus families are extracted from GenBank and filtered to remove 100% redundancy. On the basis of sequence similarity the sequences are then clustered into homologous protein families (HPFs). The families are enriched with annotations including function and functional classification, related protein structures, taxonomy, length of the proteins, boundaries of the conserved region/s, virus-specific gene name and links to EMBL entries and SWISSPROT.
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.

VIRsiRNAdb is a curated database of experimentally validated viral siRNA/shRNA targeting diverse genes of 42 important human viruses including influenza, SARS and Hepatitis viruses. Submissions are welcome. Currently, the database provides detailed experimental information on 1358 siRNA/shRNA, which includes siRNA sequence, virus subtype, target gene, GenBank accession, design algorithm, cell type, test object, test method and efficacy (mostly quantitative efficacies). Further, wherever available, information regarding alternative efficacies of more than 300 siRNAs derived from different assays has also been incorporated. The database has facilities such as search, advanced search (using Boolean operators AND, OR), browsing (with data sorting option), internal linking and external linking to other databases (PubMed, GenBank, ICTV). Additionally, useful siRNA analysis tools are provided, e.g. siTarAlign for aligning the siRNA sequence with reference viral genomes or user-defined sequences. VIRsiRNAdb would prove useful for RNAi researchers, especially in siRNA-based antiviral therapeutics development.

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Compressed database files used by vConTACT3 to classify virus sequences.
This database is version 228 and is derived from NCBI Virus RefSeq release 228, incorporating information from the Virus-Host DB (using GenBank release 264).
This file is downloaded by vConTACT3 during database setup, and then users can select which domains to use during analysis. Alternatively, users can download these databases manually (they are provided through Zenodo.org), expand them to any desired location, and pass the path to the resulting folder to vConTACT3.
Full details are provided through the vConTACT3 website and accompanying documentation.

The Database: Kraken2 [1] database built from a classification tree containing over 700k metagenomic viruses from JGI IMG/VR [2].
[1] Wood, D. E., Lu, J., & Langmead, B. (2019). Improved metagenomic analysis with Kraken 2. Genome Biol., 20(1), 1–13. doi: 10.1186/s13059-019-1891-0
[2] Paez-Espino D, Chen I-MA, Palaniappan K, Ratner A, Chu K, Szeto E, et al. IMG/VR: a database of cultured and uncultured DNA Viruses and retroviruses. Nucleic Acids Res. 2017;45:D457–65.
For Paper:
Title: A k-mer based approach for virus classification in metatranscriptomic and metagenomic samples identifies viral associations in the Populus phytobiome and autism brains
Abstract
Background Viruses are underrepresented taxa in the study and identification of microbiome constituents; however, they play an important role in health, microbiome regulation, and transfer of genetic material. Only a few thousand viruses have been isolated, sequenced, and assigned a taxonomy, which further limits the ability to identify and quantify viruses in the microbiome. Additionally, the vast diversity of viruses represents a challenge for classification, not only in constructing a viral taxonomy, but also in identifying similarities between a virus' genotype and its phenotype. However, the diversity of viral sequences can be leveraged to classify their sequences in metagenomic and metatranscriptomic samples.
Methods To identify viruses in transcriptomic and genomic samples, we developed a dynamic programming algorithm for creating a classification tree out of 715,672 metagenome viruses. To create the classification tree, we clustered proportional similarity scores generated from the k-mer profiles of each of the metagenome viruses. We then integrated the viral classification tree with the NCBI taxonomy for use with ParaKraken, a metagenomic/transcriptomic classifier.
Results To illustrate the breadth of our utility for classifying viruses with ParaKraken, we analyzed data from a plant metagenome study identifying the differences between two Populus genotypes in three different compartments and from a human metatranscriptome study identifying the differences between Autism Spectrum Disorder patients and controls in post mortem brain biopsies. In the Populus study, we identified genotype and compartment specific viral signatures, while in the Autism study we identified a significantly increased abundance of eight viral sequences in Autism brain biopsies.
Conclusion Viruses represent an important aspect of the microbiome. The ability to classify viruses represents the first step in being able to better understand their role in the microbiome. The viral classification method presented here allows for more complete identification of viral sequences for use in identifying associations between viruses and the host and viruses and other microbiome members.
Acknowledgements and Funding This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This research was also supported by the Plant-Microbe Interfaces Scientific Focus Area in the Genomic Science Program, the Office of Biological and Environmental Research (BER) in the U.S. Department of Energy Office of Science, and by the Department of Energy, Laboratory Directed Research and Development funding (ProjectID 8321), at the Oak Ridge National Laboratory. Oak Ridge National Laboratory is managed by UT-Battelle, LLC, for the US DOE under contract DE-AC05-00OR22725. This research used resources of the Compute and Data Environment for Science (CADES).
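The Methods refer to similarity scores computed from k-mer profiles but do not give the formula. As a rough illustration only, the sketch below builds relative k-mer frequency profiles and compares them with one common "proportional similarity" index (the sum of the smaller relative frequency over shared k-mers). Both the index and the default `k=4` are assumptions for this example and may differ from what the authors used.

```python
from collections import Counter


def kmer_profile(sequence, k=4):
    """Return relative k-mer frequencies for a nucleotide sequence."""
    seq = sequence.upper()
    # Slide a window of width k across the sequence and count each k-mer.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:  # sequence shorter than k
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def proportional_similarity(profile_a, profile_b):
    """Sum of the smaller relative frequency for each shared k-mer.

    The score is 0 for profiles with no k-mers in common and 1 for
    identical profiles.
    """
    shared = profile_a.keys() & profile_b.keys()
    return sum(min(profile_a[kmer], profile_b[kmer]) for kmer in shared)
```

A matrix of such pairwise scores is the kind of input a clustering step could work from, although the tree-building algorithm itself is not reproduced here.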

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The file "viral.genomic.gbk.tar.gz" contains all the RefSeq viral database information in GenBank format, used as the gold standard for the comparisons. It should be supplied as is when using the script "genecounter.py" to count the number of genes, and it is the second (mandatory) input file when counting true positives (TP), false positives (FP) and false negatives (FN) with "coordinateschecker.py". It can also be used for other evaluation purposes.
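To make the TP/FP/FN terminology concrete, here is a minimal sketch that compares predicted gene coordinates against a reference set. It is not the logic of "coordinateschecker.py", whose matching rules are not described here; the exact-match criterion and the `(start, end, strand)` tuple representation are assumptions made for illustration.

```python
def confusion_counts(reference, predicted):
    """Count TP, FP and FN using exact coordinate matches.

    Both arguments are iterables of hashable gene descriptions,
    for example (start, end, strand) tuples.
    """
    ref, pred = set(reference), set(predicted)
    tp = len(ref & pred)   # predicted genes that are in the reference
    fp = len(pred - ref)   # predicted genes absent from the reference
    fn = len(ref - pred)   # reference genes that were not predicted
    return tp, fp, fn


def precision_recall(tp, fp, fn):
    """Return (precision, recall), using 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Real evaluations often allow some tolerance at gene boundaries, so exact matching as used here gives a strict lower bound on agreement.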

License: https://creativecommons.org/publicdomain/zero/1.0/
Studying the genomes of different viruses helps associate them with previously known viruses, which in turn can help with precautionary treatment.
The dataset contains genome sequences of strains of five viruses: Corona, Ebola, Zika, MERS and Dengue. Each line holds part of a DNA sequence, written in groups of 10 nucleobases. All five are RNA viruses; their genomes are recorded here as the corresponding DNA sequences.
https://www.viprbrc.org/brc/viprStrainDetails.spg?ncbiAccession=MG757593&decorator=corona
This dataset will help us study patterns in viral genomes and their similarities and differences.
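Before comparing genomes, the grouped sequence lines need to be joined into one continuous string. The sketch below assumes a GenBank-like layout (blocks of bases separated by spaces, possibly with a leading position number); that layout and the function names are assumptions, since the exact file format is not specified here.

```python
from collections import Counter


def join_sequence_lines(lines):
    """Concatenate sequence lines, dropping spaces, digits and other non-letters."""
    return "".join(ch for line in lines for ch in line if ch.isalpha()).upper()


def base_composition(sequence):
    """Return the fraction of each base in the sequence."""
    total = len(sequence)
    if total == 0:
        return {}
    return {base: n / total for base, n in Counter(sequence).items()}
```

Base composition is only a first, coarse comparison; two genomes can share the same composition and still differ widely in sequence.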

The RNA Virus Database is a database and web application describing the genome organization and providing analytical tools for the 938 known species of RNA virus. It can identify submitted nucleotide sequences, can place them into multiple whole-genome alignments and contains translated genome sequences for all species.

Database containing .csv files with samples of dengue protein sequences labeled with the severity degree of infection in human hosts.

Background Freshwater planarians are widely used as models for investigation of pattern formation and studies on genetic variation in populations. Despite extensive information on the biology and genetics of planaria, the occurrence and distribution of viruses in these animals remains an unexplored area of research.
Results Using a combination of Suppression Subtractive Hybridization (SSH) and Mirror Orientation Selection (MOS), we compared the genomes of two strains of freshwater planarian, Girardia tigrina. The novel extrachromosomal DNA-containing virus-like element denoted PEVE (Planarian Extrachromosomal Virus-like Element) was identified in one planarian strain. The PEVE genome (about 7.5 kb) consists of two unique regions (Ul and Us) flanked by inverted repeats. Sequence analyses reveal that PEVE comprises two helicase-like sequences in the genome, of which the first is a homolog of a circoviral replication initiator protein (Rep), and the second is similar to the papillomavirus E1 helicase domain. PEVE genome exists in at least two variant forms with different arrangements of single-stranded and double-stranded DNA stretches that correspond to the Us and Ul regions. Using PCR analysis and whole-mount in situ hybridization, we characterized PEVE distribution and expression in the planarian body.
Conclusions PEVE is the first viral element identified in free-living flatworms. This element differs from all known viruses and viral elements, and comprises two potential helicases that are homologous to proteins from distant viral phyla. PEVE is unevenly distributed in the worm body, and is detected in specific parenchyma cells.

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Viral genomes and genome fragments from the Global Coral Viruses Database (GCVDB).

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Sequenced genomes of 88 viruses belonging to Peribunyaviridae (Orthobunyavirus or Pacuvirus).