100+ datasets found
  1. Alternative Splicing Annotation Project II Database

    • dknet.org
    • scicrunch.org
    • +3 more
    Updated Jan 29, 2022
    Cite
    (2022). Alternative Splicing Annotation Project II Database [Dataset]. http://identifiers.org/RRID:SCR_000322
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE, documented on 8/12/13. An expanded version of the Alternative Splicing Annotation Project (ASAP) database with a new interface and integration of comparative features using UCSC BLASTZ multiple alignments. It supports 9 vertebrate species, 4 insects, and nematodes, and provides extensive alternative splicing analysis along with the resulting splice variants. For human alternative splicing data, newly added EST libraries were classified and merged into the previous tissue and cancer classification, and the lists of tissue and cancer (normal) specific alternatively spliced genes were recalculated and updated. The authors created novel orthologous exon and intron databases, with their splice variants, based on multiple alignments among several species. These orthologous exon and intron databases can give more comprehensive homologous gene information than protein-similarity-based methods. Furthermore, splice junction and exon identity among species can be valuable resources for elucidating species-specific genes. The ASAP II database can be easily integrated with pygr (unpublished, the Python Graph Database Framework for Bioinformatics) and its powerful features, such as graph query and multi-genome alignment query. ASAP II can be searched by several different criteria, such as gene symbol, gene name, and ID (UniGene, GenBank, etc.). The web interface provides 7 different kinds of views: (I) user query, UniGene annotation, orthologous genes and genome browsers; (II) genome alignment; (III) exons and orthologous exons; (IV) introns and orthologous introns; (V) alternative splicing; (VI) isoform and protein sequences; (VII) tissue and cancer vs. normal specificity. ASAP II shows genome alignments of isoforms, exons, and introns in a UCSC-like genome browser. All alternative splicing relationships, with supporting evidence, types of alternative splicing patterns, and inclusion rates for skipped exons, are listed in separate tables. Users can also search human data for tissue- and cancer-specific splice forms at the bottom of the gene summary page. The p-values for tissue specificity are reported as log-odds (LOD) scores, and results with LOD >= 3 and at least 3 EST sequences are highlighted.
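The highlighting rule described for ASAP II (LOD >= 3 and at least 3 EST sequences) can be sketched as a simple filter. This is only an illustration: the `SpliceFormHit` record, its field names, and the example values are hypothetical and are not taken from the ASAP II schema or data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpliceFormHit:
    """Hypothetical record for one tissue-specific splice form (not the ASAP II schema)."""
    gene: str
    tissue: str
    lod: float       # log-odds score for tissue specificity
    est_count: int   # number of supporting EST sequences


def highlight(hits, min_lod=3.0, min_ests=3):
    """Keep hits meeting both thresholds quoted in the description."""
    return [h for h in hits if h.lod >= min_lod and h.est_count >= min_ests]


# Made-up example values, for illustration only.
example_hits = [
    SpliceFormHit("GENE_A", "brain", 4.2, 5),
    SpliceFormHit("GENE_B", "liver", 3.5, 2),   # too few ESTs
    SpliceFormHit("GENE_C", "lung", 2.1, 9),    # LOD below threshold
    SpliceFormHit("GENE_D", "testis", 3.0, 3),  # exactly at both thresholds
]
selected = highlight(example_hits)
```

Both conditions are inclusive, matching the ">=" and "at least" wording above.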

  2. List of bioinformatics tools and databases students used.

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    João Carlos Sousa; Manuel João Costa; Joana Almeida Palha (2023). List of bioinformatics tools and databases students used. [Dataset]. http://doi.org/10.1371/journal.pone.0000481.t002
    Dataset provided by
    PLOS ONE
    Authors
    João Carlos Sousa; Manuel João Costa; Joana Almeida Palha
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    List of bioinformatics tools and databases students used.

  3. Bio Resource for Array Genes Database

    • dknet.org
    • scicrunch.org
    • +1 more
    Updated Jan 29, 2022
    Cite
    (2022). Bio Resource for Array Genes Database [Dataset]. http://identifiers.org/RRID:SCR_000748
    Description

    Bio Resource for array genes is a free online resource for easy access to collective and integrated information from various public biological resources for human, mouse, rat, fly and C. elegans genes. The resource includes information about the genes that are represented in UniGene clusters. It provides interactive tools to selectively view, analyze and interpret gene expression patterns against the background of gene and protein functional information. Different query options are provided to mine the biological relationships represented in the underlying database; the Search button leads to the list of available query tools. The resource is designed as an online platform to assist researchers in analyzing the results of microarray experiments and developing a biological interpretation of those results. The site is mainly intended for interpreting unique gene expression patterns as biological changes that can lead to new diagnostic procedures and drug targets. It allows users to selectively view a variety of information about gene functions stored in the underlying database. Although other online resources provide comprehensive annotation and summaries of genes, this resource differs by further enabling researchers to mine biological relationships among the genes captured in the database using new query tools, thus providing a unique way of interpreting microarray results based on knowledge of the cellular roles of genes and proteins. A total of six query tools are provided, each offering different search features, analysis options, and forms of data display and visualization. The data is collected in a relational database from public resources: UniGene, LocusLink, OMIM, NCBI dbEST, protein domains from NCBI CDD, Gene Ontology, Pathways (KEGG, GenMAPP and BioCarta) and BIND (protein interactions). Data is dynamically collected and compiled twice a week from public databases. Search options make it possible to organize and cluster genes based on their interactions in biological pathways, their association with Gene Ontology terms, tissue- or organ-specific expression, or any other user-chosen functional grouping of genes. A color-coding scheme is used to highlight differential gene expression patterns against a background of gene functional information. Concept hierarchies (Anatomy and Diseases) of MeSH (Medical Subject Headings) terms are used to organize and display the data related to tissue-specific expression and diseases. Sponsors: The BioRag database is maintained by the Bioinformatics group at Arizona Cancer Center. The material presented here is compiled from different public databases. BioRag is hosted by the Biotechnology Computing Facility of the University of Arizona. 2002, 2003 University of Arizona.

  4. Molecular Biology Databases Published in Nucleic Acids Research between...

    • databank.illinois.edu
    Updated Feb 1, 2024
    Cite
    Heidi Imker (2024). Molecular Biology Databases Published in Nucleic Acids Research between 1991-2016 [Dataset]. http://doi.org/10.13012/B2IDB-4311325_V1
    Authors
    Heidi Imker
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset was developed to create a census of sufficiently documented molecular biology databases to answer several preliminary research questions. Articles published in the annual Nucleic Acids Research (NAR) “Database Issues” were used to identify a population of databases for study. Namely, the questions addressed herein include: 1) what is the historical rate of database proliferation versus rate of database attrition?, 2) to what extent do citations indicate persistence?, and 3) are databases under active maintenance and does evidence of maintenance likewise correlate to citation? An overarching goal of this study is to provide the ability to identify subsets of databases for further analysis, both as presented within this study and through subsequent use of this openly released dataset.

  5. PROSITE profiles

    • ebi.ac.uk
    Updated Feb 5, 2025
    Cite
    (2025). PROSITE profiles [Dataset]. https://www.ebi.ac.uk/interpro/
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    PROSITE is a database of protein families and domains. It consists of biologically significant sites, patterns and profiles that help to reliably identify to which known protein family a new sequence belongs. PROSITE is based at the Swiss Institute of Bioinformatics (SIB), Geneva, Switzerland.

  6. Data_Sheet_1_riceExplorer: Uncovering the Hidden Potential of a National...

    • frontiersin.figshare.com
    zip
    Updated Jun 6, 2023
    Cite
    Clive T. Darwell; Samart Wanchana; Vinitchan Ruanjaichon; Meechai Siangliw; Burin Thunnom; Wanchana Aesomnuk; Theerayut Toojinda (2023). Data_Sheet_1_riceExplorer: Uncovering the Hidden Potential of a National Genomic Resource Against a Global Database.zip [Dataset]. http://doi.org/10.3389/fpls.2022.781153.s001
    Dataset provided by
    Frontiers
    Authors
    Clive T. Darwell; Samart Wanchana; Vinitchan Ruanjaichon; Meechai Siangliw; Burin Thunnom; Wanchana Aesomnuk; Theerayut Toojinda
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Agricultural crop breeding programs, particularly at the national level, typically consist of a core panel of elite breeding cultivars alongside a number of local landrace varieties (or other endemic cultivars) that provide additional sources of phenotypic and genomic variation or contribute as experimental materials (e.g., in GWAS studies). Three issues commonly arise. First, focusing primarily on core development accessions may mean that the potential contributions of landraces or other secondary accessions may be overlooked. Second, elite cultivars may accumulate deleterious alleles away from nontarget loci due to the strong effects of artificial selection. Finally, a tendency to focus solely on SNP-based methods may cause incomplete or erroneous identification of functional variants. In practice, integration of local breeding programs with findings from global database projects may be challenging. First, local GWAS experiments may only indicate useful functional variants according to the diversity of the experimental panel, while other potentially useful loci—identifiable at a global level—may remain undiscovered. Second, large-scale experiments such as GWAS may prove prohibitively costly or logistically challenging for some agencies. Here, we present a fully automated bioinformatics pipeline (riceExplorer) that can easily integrate local breeding program sequence data with international database resources, without relying on any phenotypic experimental procedure. It identifies associated functional haplotypes that may prove more robust in determining the genotypic determinants of desirable crop phenotypes. In brief, riceExplorer evaluates a global crop database (IRRI 3000 Rice Genomes) to identify haplotypes that are associated with extreme phenotypic variation at the global level and recorded in the database. 
It then examines which potentially useful variants are present in the local crop panel, before distinguishing between those that are already incorporated into the elite breeding accessions and those only found among secondary varieties (e.g., landraces). Results highlight the effectiveness of our pipeline, identifying potentially useful functional haplotypes across the genome that are absent from elite cultivars and found among landraces and other secondary varieties in our breeding program. riceExplorer can automatically conduct a full genome analysis and produces annotated graphical output of chromosomal maps, potential global diversity sources, and summary tables.
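The two-stage comparison described above (global candidates checked against the local panel, then split between elite and secondary accessions) amounts to a few set operations. The sketch below is an assumption-laden illustration, not riceExplorer's actual code: the function name, the haplotype identifiers, and the representation of panels as plain sets are all hypothetical.

```python
def split_by_panel(global_candidates, elite_panel, secondary_panel):
    """Split globally identified haplotypes by where they occur locally.

    All arguments are sets of haplotype identifiers (hypothetical format).
    Returns (in_elite, secondary_only):
      in_elite       -- candidates already present in elite accessions
      secondary_only -- candidates found only among secondary varieties
    """
    local = elite_panel | secondary_panel
    present_locally = global_candidates & local
    in_elite = present_locally & elite_panel
    secondary_only = present_locally - elite_panel
    return in_elite, secondary_only


# Made-up identifiers, for illustration only.
global_candidates = {"hapA", "hapB", "hapC", "hapD"}
elite_panel = {"hapA", "hapX"}
secondary_panel = {"hapA", "hapB", "hapC"}

in_elite, secondary_only = split_by_panel(
    global_candidates, elite_panel, secondary_panel
)
```

Here `secondary_only` corresponds to the haplotypes the description calls absent from elite cultivars but found among landraces and other secondary varieties.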

  7. Funding and Operating Organizations for Long-Lived Molecular Biology...

    • databank.illinois.edu
    Cite
    Heidi Imker, Funding and Operating Organizations for Long-Lived Molecular Biology Databases [Dataset]. http://doi.org/10.13012/B2IDB-3993338_V1
    Authors
    Heidi Imker
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The organizations that contribute to the longevity of 67 long-lived molecular biology databases published in Nucleic Acids Research (NAR) between 1991-2016 were identified to address two research questions: 1) which organizations fund these databases? and 2) which organizations maintain these databases? Funders were determined by examining funding acknowledgements in each database's most recent NAR Database Issue update article (published prior to 2017), and organizations operating the databases were determined through review of database websites.

  8. Bioinformatics for Researchers in Life Sciences: Tools and Learning...

    • data.iadb.org
    csv, pdf
    Updated Apr 10, 2025
    Cite
    IDB Datasets (2025). Bioinformatics for Researchers in Life Sciences: Tools and Learning Resources [Dataset]. http://doi.org/10.60966/kwvb-wr19
    Available download formats: csv(276253), pdf(2989058), csv(355108)
    Dataset provided by
    IDB Datasets
    License

    Attribution-NonCommercial-NoDerivs 3.0 (CC BY-NC-ND 3.0) https://creativecommons.org/licenses/by-nc-nd/3.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2020 - Jan 1, 2021
    Description

    The COVID-19 pandemic has shown that bioinformatics--a multidisciplinary field that combines biological knowledge with computer programming concerned with the acquisition, storage, analysis, and dissemination of biological data--has a fundamental role in scientific research strategies in all disciplines involved in fighting the virus and its variants. It aids in sequencing and annotating genomes and their observed mutations; analyzing gene and protein expression; simulation and modeling of DNA, RNA, proteins and biomolecular interactions; and mining of biological literature, among many other critical areas of research. Studies suggest that bioinformatics skills in the Latin American and Caribbean region are relatively incipient, and thus its scientific systems cannot take full advantage of the increasing availability of bioinformatic tools and data. This dataset is a catalog of bioinformatics software for researchers and professionals working in life sciences. It includes more than 300 different tools for varied uses, such as data analysis, visualization, repositories and databases, data storage services, scientific communication, marketplace and collaboration, and lab resource management. Most tools are available as web-based or desktop applications, while others are programming libraries. It also includes 10 suggested entries for other third-party repositories that could be of use.

  9. Bioinformatics Market size was USD 12.76 Billion in 2022!

    • cognitivemarketresearch.com
    pdf,excel,csv,ppt
    Updated Apr 30, 2025
    Cite
    Cognitive Market Research (2025). Bioinformatics Market size was USD 12.76 Billion in 2022! [Dataset]. https://www.cognitivemarketresearch.com/bioinformatics-market-report
    Dataset authored and provided by
    Cognitive Market Research
    License

    https://www.cognitivemarketresearch.com/privacy-policy

    Time period covered
    2021 - 2033
    Area covered
    Global
    Description

    Global Bioinformatics market size was USD 12.76 Billion in 2022 and is forecast to reach USD 29.32 Billion by 2030. The bioinformatics industry's compound annual growth rate (CAGR) will be 10.4% from 2023 to 2030.

    What are the driving factors for the Bioinformatics market?

    The primary factors propelling the global bioinformatics industry are advances in genomics, rising demand for protein sequencing, and rising public-private sector investment in bioinformatics. Large volumes of data are being produced by the expanding use of next-generation sequencing (NGS) and other genomic technologies; these data must be analyzed using advanced bioinformatics tools. Furthermore, the global bioinformatics industry may benefit from the development of emerging advanced technologies. However, the bioinformatics discipline involves intricate algorithms and massive amounts of data, which can be difficult for researchers and demand a lot of processing power.

    What is Bioinformatics?

    Bioinformatics is related to genetics and genomics and involves the use of computer technology to collect, store, analyze, and disseminate biological information and data, such as DNA and amino acid sequences or annotations about these sequences. Researchers and medical professionals use databases that organize and index this biological data to better understand health and disease and, in some circumstances, as a component of patient care. Through the creation of software and algorithms, bioinformatics is primarily used to extract knowledge from biological data. Bioinformatics is frequently used in the analysis of genomics, proteomics, 3D protein structure modeling, image analysis, drug creation, and many other fields.
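The market-size figures in this description can be related through the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below is a generic illustration of that formula and not the publisher's calculation. Using 2022 as the base year (8 periods to 2030) is an assumption; the report quotes its rate for 2023 to 2030 without giving a 2023 starting value, so the result need not match the stated 10.4%.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate as a fraction (e.g. 0.10 for 10%)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures quoted in the description; the 2022 base year is an assumption.
implied_rate = cagr(12.76, 29.32, 2030 - 2022)
print(f"Implied CAGR 2022-2030: {implied_rate:.2%}")
```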

  10. SFLD

    • ebi.ac.uk
    Updated Sep 7, 2018
    Cite
    (2018). SFLD [Dataset]. https://www.ebi.ac.uk/interpro/
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    SFLD (Structure-Function Linkage Database) is a hierarchical classification of enzymes that relates specific sequence-structure features to specific chemical capabilities.

  11. Pharokka Databases

    • zenodo.org
    application/gzip
    Updated Jan 24, 2023
    Cite
    George Bouras (2023). Pharokka Databases [Dataset]. http://doi.org/10.5281/zenodo.7081772
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    George Bouras
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is intended to hold the databases for Pharokka (https://github.com/gbouras13/pharokka)

    It includes the PHROGs database, and mmseqs2 compatible versions of the CARD and VFDB databases.

  12. University of Pittsburgh Bioinformatics Resources Collection

    • rrid.site
    Updated Jul 27, 2025
    Cite
    (2025). University of Pittsburgh Bioinformatics Resources Collection [Dataset]. http://identifiers.org/RRID:SCR_005845
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE, documented August 23, 2016. To bridge the gap between the rising information needs of biological and medical researchers and the rapidly growing number of online bioinformatics resources, we have created the Online Bioinformatics Resources Collection (OBRC) at the Health Sciences Library System at the University of Pittsburgh. The OBRC, containing 1542 major online bioinformatics databases and software tools, was constructed using the HSLS content management system built on the Zope Web application server. To enhance the output of search results, we further implemented the Vivisimo Clustering Engine, which automatically organizes the search results into categories created dynamically based on the textual information of the retrieved records. As the largest online collection of its kind and the only one with advanced search results clustering, OBRC is aimed at becoming a one-stop guided information gateway to the major bioinformatics databases and software tools on the Web. OBRC is available at the University of Pittsburgh's Health Sciences Library System.

  13. Data from: PATRIC, the bacterial bioinformatics database and analysis...

    • explore.openaire.eu
    Updated Apr 12, 2019
    Cite
    Alice R. Wattam; David Abraham; Oral Dalay; Terry L. Disz; Timothy Driscoll; Joseph L. Gabbard; Joseph J. Gillespie; Roger Gough; Deborah Hix; Ronald W. Kenyon; Dustin Machi; Chunhong Mao; Eric K. Nordberg; Robert Olson; Ross Overbeek; Gordon D. Pusch; Maulik Shukla; Julie Schulman; Rick L. Stevens; Daniel E. Sullivan; Veronika Vonstein; Andrew S. Warren; Rebecca Will; Meredith J. C. Wilson; Hyunseung Yoo; Chengdong Zhang; Yan Zhang; Bruno Sobral (2019). PATRIC, the bacterial bioinformatics database and analysis resource [Dataset]. https://explore.openaire.eu/search/other?pid=10919%2F88954
    Authors
    Alice R. Wattam; David Abraham; Oral Dalay; Terry L. Disz; Timothy Driscoll; Joseph L. Gabbard; Joseph J. Gillespie; Roger Gough; Deborah Hix; Ronald W. Kenyon; Dustin Machi; Chunhong Mao; Eric K. Nordberg; Robert Olson; Ross Overbeek; Gordon D. Pusch; Maulik Shukla; Julie Schulman; Rick L. Stevens; Daniel E. Sullivan; Veronika Vonstein; Andrew S. Warren; Rebecca Will; Meredith J. C. Wilson; Hyunseung Yoo; Chengdong Zhang; Yan Zhang; Bruno Sobral
    Description

    The Pathosystems Resource Integration Center (PATRIC) is the all-bacterial Bioinformatics Resource Center (BRC) (http://www.patricbrc.org). A joint effort by two of the original National Institute of Allergy and Infectious Diseases-funded BRCs, PATRIC provides researchers with an online resource that stores and integrates a variety of data types [e.g. genomics, transcriptomics, protein-protein interactions (PPIs), three-dimensional protein structures and sequence typing data] and associated metadata. Data types are summarized for individual genomes and across taxonomic levels. All genomes in PATRIC, currently more than 10 000, are consistently annotated using RAST, the Rapid Annotations using Subsystems Technology. Summaries of different data types are also provided for individual genes, where comparisons of different annotations are available, and also include available transcriptomic data. PATRIC provides a variety of ways for researchers to find data of interest and a private workspace where they can store both genomic and gene associations, and their own private data. Both private and public data can be analyzed together using a suite of tools to perform comparative genomic or transcriptomic analysis. PATRIC also includes integrated information related to disease and PPIs. All the data and integrated analysis and visualization tools are freely available. This manuscript describes updates to the PATRIC since its initial report in the 2007 NAR Database Issue. Funding for open access charge: National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services [Contract No. HHSN272200900040C].

  14. IDPredictor: predict database links in biomedical database. Supplementary...

    • doi.ipk-gatersleben.de
    Updated Jan 1, 2012
    Cite
    Matthias Lange; Hendrik Mehlhorn; Uwe Scholz; Falk Schreiber (2012). IDPredictor: predict database links in biomedical database. Supplementary material A.3 for the paper [Dataset]. https://doi.ipk-gatersleben.de/DOI/ce9f7e62-56e5-4554-bb11-d7ab29e6fa1d/dd34a994-daf0-4b7f-9809-d875c1e771d2/2
    Dataset provided by
    e!DAL - Plant Genomics and Phenomics Research Data Repository (PGP), IPK Gatersleben, Seeland OT Gatersleben, Corrensstraße 3, D-06466, Germany
    Authors
    Matthias Lange; Hendrik Mehlhorn; Uwe Scholz; Falk Schreiber
    Description

    Supplementary material A.3 for the paper 'IDPredictor: predict database links in biomedical database'. Abstract: Knowledge found in biomedical databases, in particular in Web information systems, is a major bioinformatics resource. In general, this biological knowledge is represented worldwide in a network of databases. These data are spread among thousands of databases, which overlap in content but differ substantially with respect to content detail, interface, formats and data structure. To support functional annotation of lab data, such as protein sequences, metabolites or DNA sequences, as well as semi-automated data exploration in information retrieval environments, an integrated view of databases is essential. Search engines have the potential to assist in data retrieval from these structured sources, but fall short of providing a comprehensive knowledge excerpt out of the interlinked databases. A prerequisite for supporting the concept of an integrated data view is to acquire insights into cross-references among database entities. But only a fraction of all possible cross-references are explicitly tagged in the particular biomedical information systems. In this work, we investigate to what extent an automated construction of an integrated data network is possible. We propose a method that predicts and extracts cross-references from multiple life science databases and their possible referenced data targets. We study the retrieval quality of our method and the relationship between manually crafted relevance ranking and relevance ranking based on cross-references, and report on first, promising results.

  15. Bioinformatics Links Directory

    • rrid.site
    • neuinfo.org
    • +3 more
    Updated Jul 19, 2025
    Cite
    (2025). Bioinformatics Links Directory [Dataset]. http://identifiers.org/RRID:SCR_008018/resolver
    Description

    Database of curated links to molecular resources, tools and databases selected on the basis of recommendations from bioinformatics experts in the field. This resource relies on input from its community of bioinformatics users for suggestions. Since 2003, it has also listed all links contained in the NAR Webserver issue. The different types of information available in this portal:

    * Computer Related: This category contains links to resources relating to programming languages often used in bioinformatics. Other tools of the trade, such as web development and database resources, are also included here.
    * Sequence Comparison: Tools and resources for the comparison of sequences including sequence similarity searching, alignment tools, and general comparative genomics resources.
    * DNA: This category contains links to useful resources for DNA sequence analyses such as tools for comparative sequence analysis and sequence assembly. Links to programs for sequence manipulation, primer design, and sequence retrieval and submission are also listed here.
    * Education: Links to information about the techniques, materials, people, places, and events of the greater bioinformatics community. Included are current news headlines, literature sources, educational material and links to bioinformatics courses and workshops.
    * Expression: Links to tools for predicting the expression, alternative splicing, and regulation of a gene sequence are found here. This section also contains links to databases, methods, and analysis tools for protein expression, SAGE, EST, and microarray data.
    * Human Genome: This section contains links to draft annotations of the human genome in addition to resources for sequence polymorphisms and genomics. Also included are links related to ethical discussions surrounding the study of the human genome.
    * Literature: Links to resources related to published literature, including tools to search for articles and through literature abstracts. Additional text mining resources, open access resources, and literature goldmines are also listed.
    * Model Organisms: Included in this category are links to resources for various model organisms ranging from mammals to microbes. These include databases and tools for genome scale analyses.
    * Other Molecules: Bioinformatics tools related to molecules other than DNA, RNA, and protein. This category will include resources for the bioinformatics of small molecules as well as for other biopolymers including carbohydrates and metabolites.
    * Protein: This category contains links to useful resources for protein sequence and structure analyses. Resources for phylogenetic analyses, prediction of protein features, and analyses of interactions are also found here.
    * RNA: Resources include links to sequence retrieval programs, structure prediction and visualization tools, motif search programs, and information on various functional RNAs.

  16. Data from: PeTMbase: A database of plant endogenous target mimics (eTMs)

    • data.mendeley.com
    • plos.figshare.com
    Updated Nov 23, 2016
    Cite
    Gökhan Karakülah (2016). PeTMbase: A database of plant endogenous target mimics (eTMs) [Dataset]. http://doi.org/10.17632/htgxryrcv2.1
    Authors
    Gökhan Karakülah
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MicroRNAs (miRNAs) are small endogenous RNA molecules that regulate target gene expression at the post-transcriptional level. In addition, miRNA activity can be controlled by a newly discovered regulatory mechanism called endogenous target mimicry (eTM). In target mimicry, eTMs bind to the corresponding miRNAs to block their binding to specific transcripts, leading to increased mRNA expression. Because miRNA-eTM-target-mRNA regulation modules are involved in a wide range of biological processes, an increasing need for a comprehensive eTM database has arisen. Except for miRSponge, which holds a limited number of Arabidopsis eTM data, no database or repository has yet been developed and released for plant eTMs. Here, we present an online plant eTM database, called PeTMbase (http://petmbase.org), with a highly efficient search tool. To establish the repository, a number of identified eTMs were obtained from high-throughput RNA-sequencing data of 11 plant species. Each transcriptome library is first mapped to the corresponding plant genome, and then long non-coding RNA (lncRNA) transcripts are characterized. Furthermore, additional lncRNAs retrieved from GREENC and PNRD were incorporated into the lncRNA catalog. Then, using the lncRNA and miRNA sources, a total of 2,728 eTMs were successfully predicted. Our regularly updated database, PeTMbase, provides high-quality information regarding miRNA:eTM modules and will aid functional genomics studies, particularly on miRNA regulatory networks.

  17. d

    3D-Genomics Database

    • dknet.org
    • scicrunch.org
    • +3more
    Updated May 13, 2025
    Cite
    (2025). 3D-Genomics Database [Dataset]. http://identifiers.org/RRID:SCR_007430
    Explore at:
    Dataset updated
    May 13, 2025
    Description

    THIS RESOURCE IS NO LONGER IN SERVICE, documented August 29, 2016. Database containing structural annotations for the proteomes of just under 100 organisms. Using data derived from public databases of translated genomic sequences, representatives from the major branches of Life are included: Prokaryota, Eukaryota and Archaea. The annotations stored in the database may be accessed in a number of ways. The help page provides information on how to access the database. 3D-GENOMICS is now part of a larger project, called e-Protein. The project brings together similar databases at three sites: Imperial College London, University College London and the European Bioinformatics Institute. e-Protein's mission statement is: "To provide a fully automated distributed pipeline for large-scale structural and functional annotation of all major proteomes via the use of cutting-edge computer GRID technologies." The following databases are incorporated: NRprot, SCOP, ASTRAL, PFAM, Prosite, taxonomy, COG. The following eukaryotic genomes are incorporated: Anopheles gambiae, protein sequences from the mosquito genome; Arabidopsis thaliana, protein sequences from the Arabidopsis genome; Caenorhabditis briggsae, protein sequences from the C.briggsae genome; Caenorhabditis elegans, protein sequences from the worm genome; Ciona intestinalis, protein sequences from the sea squirt genome; Danio rerio, protein sequences from the zebrafish genome; Drosophila melanogaster, protein sequences from the fruitfly genome; Encephalitozoon cuniculi, protein sequences from the E.cuniculi genome; Fugu rubripes, protein sequences from the pufferfish genome; Guillardia theta, protein sequences from the G.theta genome; Homo sapiens, protein sequences from the human genome; Mus musculus, protein sequences from the mouse genome; Neurospora crassa, protein sequences from the N.crassa genome; Oryza sativa, protein sequences from the rice genome; Plasmodium falciparum, protein sequences from the P.falciparum genome; Rattus norvegicus, protein sequences from the rat genome; Saccharomyces cerevisiae, protein sequences from the yeast genome; Schizosaccharomyces pombe, protein sequences from the yeast genome.

  18. e

    CATH-Gene3D

    • ebi.ac.uk
    Updated Oct 21, 2020
    Cite
    (2020). CATH-Gene3D [Dataset]. https://www.ebi.ac.uk/interpro/
    Explore at:
    Dataset updated
    Oct 21, 2020
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The CATH-Gene3D database describes protein families and domain architectures in complete genomes. Protein families are formed using a Markov clustering algorithm, followed by multi-linkage clustering according to sequence identity. Predicted structure and sequence domains are mapped using hidden Markov model libraries representing CATH and Pfam domains. CATH-Gene3D is based at University College London, UK.

  19. i

    GCBN de.NBI User Training - PLANT 2030 Summer School - Basis Bioinformatics...

    • doi.ipk-gatersleben.de
    Updated Oct 5, 2017
    Cite
    Uwe Scholz; Andrea Bräutigam; Martin Mascher; Matthias Lange; Yusheng Zhao; Uwe Scholz (2017). GCBN de.NBI User Training - PLANT 2030 Summer School - Basis Bioinformatics Training for Biologists [Dataset]. https://doi.ipk-gatersleben.de/DOI/966a00f1-1a75-470a-a2b8-195f34bcde3e/cdeec50e-3923-48da-8a03-365443002f79/6/
    Explore at:
    Dataset updated
    Oct 5, 2017
    Dataset provided by
    e!DAL - Plant Genomics and Phenomics Research Data Repository (PGP), IPK Gatersleben, Seeland OT Gatersleben, Corrensstraße 3, 06466, Germany
    Authors
    Uwe Scholz; Andrea Bräutigam; Martin Mascher; Matthias Lange; Yusheng Zhao; Uwe Scholz
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    The 4th German Crop BioGreenformatics Network (GCBN, https://www.denbi.de/gcbn) user training provided a hands-on introduction to useful bioinformatics tools for biologists with little or no previous knowledge. The training enabled biologists to process their own small and large datasets using R and Linux-based methods and was entirely computer-based with interspersed lectures. The first part started with an introduction to the use and basic administration (software installation) of the Linux distribution Ubuntu and demonstrated the first steps in using the R software (trainer Andrea Bräutigam, folder AB). In part two, the use of the command-line version of Blast+, of simple Linux commands like 'cut', of Perl scripts and of the graphical user interface of the phylogeny tool 'seaview' was demonstrated (trainer Uwe Scholz, folder US). The third session introduced basic concepts and practical tools for processing biological datasets in Linux. In particular, 'awk' and 'sed' were used. Moreover, 'SAMtools' and 'BEDTools' were applied (trainer Martin Mascher, folder MM). In the fourth part, 'Introduction to Databases', a quick-start guide to using relational databases was presented. Through easy examples, this lesson laid the foundations and motivation for using relational database systems as a daily bioinformatics tool to store, retrieve and even analyze -omics data in the big data age (trainer Matthias Lange, folder ML). The last part introduced basic statistics to biologists, taught some commonly used statistical methods and demonstrated the creation of graphical visualizations with the R software (trainer Yusheng Zhao, folder YZ).

  20. o

    Supporting data for "Sequence Compression Benchmark (SCB) database"

    • explore.openaire.eu
    Updated Jan 1, 2020
    Cite
    Kirill Kryukov; Mahoko, Takahashi Ueda; So Nakagawa; Tadashi Imanishi (2020). Supporting data for "Sequence Compression Benchmark (SCB) database" [Dataset]. http://doi.org/10.5524/100762
    Explore at:
    Dataset updated
    Jan 1, 2020
    Authors
    Kirill Kryukov; Mahoko, Takahashi Ueda; So Nakagawa; Tadashi Imanishi
    Description

    Nearly all molecular sequence databases currently use gzip for data compression. Ongoing rapid accumulation of stored data calls for more efficient compression tools. Although numerous compressors exist, both specialized and general-purpose, choosing one of them was difficult because no comprehensive analysis of their comparative advantages for sequence compression was available. We systematically benchmarked 430 settings of 48 compressors (including 29 specialized sequence compressors and 19 general-purpose compressors) on representative FASTA-formatted datasets of DNA, RNA and protein sequences. Each compressor was evaluated on 17 performance measures, including compression strength, as well as time and memory required for compression and decompression. We used 27 test datasets including individual genomes of various sizes, DNA and RNA datasets, and standard protein datasets. We summarized the results as the Sequence Compression Benchmark database (SCB database) that allows building custom visualizations for selected subsets of benchmark results. We found that modern compressors offer a large improvement in compactness and speed compared to gzip. Our benchmark allows comparing compressors and their settings using a variety of performance measures, offering the opportunity to select the optimal compressor based on the data type and usage scenario specific to a particular application.
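
    As a rough illustration of the kind of measurement described above, the following Python sketch times three general-purpose codecs from the Python standard library (gzip, bz2, lzma) on a synthetic FASTA record and reports compression ratio and compression/decompression time. It is not the SCB benchmark itself: the codecs, the random test sequence, and the helper names make_fasta and benchmark are assumptions chosen for this example, and memory use is not measured.

```python
import bz2
import gzip
import lzma
import random
import time


def make_fasta(n_bases=200_000, seed=0):
    """Build a synthetic single-record DNA FASTA file as bytes (illustrative data only)."""
    rng = random.Random(seed)
    seq = "".join(rng.choice("ACGT") for _ in range(n_bases))
    # Wrap the sequence at 60 characters per line, a common FASTA convention.
    lines = [seq[i:i + 60] for i in range(0, n_bases, 60)]
    return (">synthetic_sequence\n" + "\n".join(lines) + "\n").encode("ascii")


def benchmark(data):
    """Return {codec_name: (compression_ratio, compress_seconds, decompress_seconds)}."""
    codecs = {
        "gzip": (gzip.compress, gzip.decompress),
        "bz2": (bz2.compress, bz2.decompress),
        "lzma": (lzma.compress, lzma.decompress),
    }
    results = {}
    for name, (compress, decompress) in codecs.items():
        t0 = time.perf_counter()
        packed = compress(data)
        t1 = time.perf_counter()
        restored = decompress(packed)
        t2 = time.perf_counter()
        assert restored == data  # the round trip must be lossless
        results[name] = (len(data) / len(packed), t1 - t0, t2 - t1)
    return results


if __name__ == "__main__":
    for name, (ratio, c_s, d_s) in benchmark(make_fasta()).items():
        print(f"{name:5s} ratio={ratio:5.2f} compress={c_s:.3f}s decompress={d_s:.3f}s")
```

    Real genomes contain repeats and biased composition that random sequence lacks, so ratios from this sketch say little about real data; the point is only the shape of the measurement loop.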

(2022). Alternative Splicing Annotation Project II Database [Dataset]. http://identifiers.org/RRID:SCR_000322

Alternative Splicing Annotation Project II Database

RRID:SCR_000322, nif-0000-02572, Alternative Splicing Annotation Project II Database (RRID:SCR_000322), ASAP II, ASAP II Database, Alternative Splicing Annotation Project II

Explore at:
Dataset updated
Jan 29, 2022
Description

THIS RESOURCE IS NO LONGER IN SERVICE, documented on 8/12/13. An expanded version of the Alternative Splicing Annotation Project (ASAP) database with a new interface and integration of comparative features using UCSC BLASTZ multiple alignments. It supports 9 vertebrate species, 4 insects, and nematodes, and provides extensive alternative splicing analysis and splice variants. For human alternative splicing data, newly added EST libraries were classified and included in the previous tissue and cancer classification, and lists of tissue and cancer (normal) specific alternatively spliced genes were recalculated and updated. They have created novel orthologous exon and intron databases, with their splice variants, based on multiple alignments among several species. These orthologous exon and intron databases can give more comprehensive homologous gene information than protein-similarity-based methods. Furthermore, splice junction and exon identity among species can be valuable resources for elucidating species-specific genes. The ASAP II database can be easily integrated with pygr (unpublished, the Python Graph Database Framework for Bioinformatics) and its powerful features such as graph query and multi-genome alignment query. ASAP II can be searched by several different criteria such as gene symbol, gene name and ID (UniGene, GenBank etc.). The web interface provides 7 different kinds of views: (I) user query, UniGene annotation, orthologous genes and genome browsers; (II) genome alignment; (III) exons and orthologous exons; (IV) introns and orthologous introns; (V) alternative splicing; (VI) isoform and protein sequences; (VII) tissue and cancer vs. normal specificity. ASAP II shows genome alignments of isoforms, exons, and introns in a UCSC-like genome browser. All alternative splicing relationships with supporting evidence information, types of alternative splicing patterns, and inclusion rates for skipped exons are listed in separate tables. Users can also search human data for tissue- and cancer-specific splice forms at the bottom of the gene summary page. Tissue-specificity p-values are reported as log-odds (LOD) scores, and results with LOD >= 3 and at least 3 EST sequences are highlighted.
