100+ datasets found

ITS1-5.8s_ITS2 Fungal Sequences and Search Results .xlsx
auckland.figshare.com
figshare.com
xlsx
Updated May 17, 2019
Cite
Elizabeth McKenzie (2019). ITS1-5.8s_ITS2 Fungal Sequences and Search Results .xlsx [Dataset]. http://doi.org/10.17608/k6.auckland.8142947.v1
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.17608/k6.auckland.8142947.v1
Dataset updated
May 17, 2019
Dataset provided by
The University of Auckland
Authors
Elizabeth McKenzie
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
DNA sequences used to identify fungi cultured from human faeces. The ITS1‑5.8s‑ITS2 region of the extracted rDNA of fungal isolates was chosen for amplification based on its success in identifying a wide range of fungal species [53]. For DNA amplification, 10.0 µL of REDExtract-N-Amp™ PCR Ready Mix; 7.8 µL of PCR-grade H2O; 0.8 µL of 10 µM forward primer (ITS1, sequence TCCGTAGGTGAACCTGCGG); 0.8 µL of 10 µM reverse primer (ITS4, sequence TCCTCCGCTTATTGATATGC); and 1.0 µL of extracted fungal DNA sample were added to a 200 µL Eppendorf PCR tube. The same method was used to prepare the negative control. PCR amplification was performed with a preliminary step of polymerase activation at 94 °C for 2 minutes; 35 cycles of denaturation at 94 °C for 30 seconds, annealing at 51 °C for 20 seconds, and extension at 77 °C for 1 minute; and a final extension step at 72 °C for 8 minutes, using the Eppendorf vapo.protect™ Mastercycler® Pro S.
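As a quick sanity check on the two primers quoted above, the following minimal Python sketch computes their length, GC fraction, and reverse complement. Only the two primer sequences come from the description; the helper names (gc_fraction, reverse_complement) are illustrative and are not part of the dataset or of any tool the authors used.

```python
# Primer sequences as quoted in the dataset description.
ITS1 = "TCCGTAGGTGAACCTGCGG"   # forward primer
ITS4 = "TCCTCCGCTTATTGATATGC"  # reverse primer

# Watson-Crick pairing for unambiguous DNA bases only (assumption: no IUPAC codes).
_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in seq."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


if __name__ == "__main__":
    for name, seq in (("ITS1", ITS1), ("ITS4", ITS4)):
        print(
            f"{name}: length={len(seq)} nt, "
            f"GC={gc_fraction(seq):.1%}, "
            f"reverse complement={reverse_complement(seq)}"
        )
```

The reverse complement of ITS4 is the orientation in which the reverse primer site would appear when reading the forward strand, which can help when trimming primer regions from a consensus sequence.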

To confirm successful fungal DNA extraction and amplification, 4 µL of the amplified fungal rDNA PCR product was loaded onto a 1% (w/v) agarose gel in 1x Tris/Borate/EDTA (TBE) buffer, and 1 µL of the cyanine dye SYBR® DNA gel stain was added for visualisation. One kilobase (1 kb) plus DNA ladder (5 µL) and 5 µL of the negative control were also loaded onto the agarose gel. Following gel electrophoresis, PCR products were visualised with the GelDoc™ XR Plus System (BIO‑RAD, USA). The 1 kb plus DNA ladder was used to determine the size of the amplified fungal DNA fragments using the GelAnalyzer 2010a quantification programme. The fungal rDNA fragments of the ITS1‑5.8s‑ITS2 region obtained from PCR were then transferred to the Centre of Genomics, Proteomics and Metabolomics DNA sequencing facility for sequencing.

Capillary electrophoresis DNA sequencing (Sanger sequencing) was used to obtain the DNA sequences of the amplified ITS1‑5.8s‑ITS2 region. Two reactions were performed for each sample containing fungal DNA template, one for each primer, and each was mixed with the ABI PRISM™ BigDye Terminator Sequencing Kit version 3.1 (ThermoFisher Scientific), which contains DNA polymerase enzyme, a buffer, four DNA nucleotides and four chain-terminating dideoxy nucleotides with fluorescent dyes. The samples were then subjected to cycle sequencing on the Applied Biosystems GeneAmp® PCR System 9700 thermal cycler using standard cycling conditions: a preliminary step of polymerase activation at 96 °C for 1 minute; 25 cycles of denaturation at 96 °C for 10 seconds, annealing at 50 °C for 5 seconds, and extension at 60 °C for 4 minutes. Following cycle sequencing, the samples were purified using Agencourt® CleanSEQ® magnetic beads to remove excess fluorescent dyes, nucleotides, salts and other contaminants. The purified DNA samples were then separated by size by capillary electrophoresis with the ABI PRISM™ 3130XL Genetic Analyzer using 50 cm capillaries and POP7 polymer. The final data output of the ITS1‑5.8s‑ITS2 region DNA sequences was based on the detection of the attached fluorescent dyes excited by a laser.

Geneious programme version 11.1.5 (www.geneious.com) was used to analyse the raw data [54]. The data included both forward and reverse rDNA sequences for each fungal isolate. These sequences were aligned, and ends showing poor-quality reads were trimmed, to obtain a consensus sequence. A tool within the Geneious programme, BLAST (Basic Local Alignment Search Tool) developed by Altschul et al. [55], optimised for fast and high similarity search (MegaBLAST version), was used to compare the consensus query sequence with known DNA sequences in GenBank (NCBI genetic sequence database), EMBL (European Molecular Biology Laboratory), DDBJ (DNA DataBank of Japan) and PDB (Protein Data Bank, Worldwide). The search results included: grade percentage score showing combinatorial results of the query input sequence coverage, expectation-value (e-value) and identity value for each hit against the database; identities match and percentage score indicating the extent to which the query DNA sequence matched the database nucleotide sequence; and bit-score showing the quality of alignment and measuring sequence similarity [56]. The higher the score of each result, the higher the certainty of identification of the fungal species. A grade percentage score of >98% was considered a correct genomic identification.
Corn Fungal Resistance Associated Sequences Database
neuinfo.org
scicrunch.org
Updated Jan 29, 2022
Cite
(2022). Corn Fungal Resistance Associated Sequences Database [Dataset]. http://identifiers.org/RRID:SCR_010644
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_010644
Dataset updated
Jan 29, 2022
Description
A relational database with dynamic querying and data integration that can be used by researchers to identify genetic sequences with a high probability of being associated with aflatoxin accumulation resistance, according to multiple lines of evidence. CFRAS-DB integrates genomic, proteomic, and genetic data from multiple studies in maize dealing with aflatoxin accumulation or Aspergillus flavus resistance. THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 16, 2025.
Fungi Sequencing Projects
neuinfo.org
dknet.org
Updated Aug 8, 2024
Cite
(2024). Fungi Sequencing Projects [Dataset]. http://identifiers.org/RRID:SCR_008524
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_008524
Dataset updated
Aug 8, 2024
Description
Fungal genomes available from the Sanger Institute. Data are accessible in a number of ways; for each organism there is a BLAST server, allowing search of the sequences. Sequences can also be downloaded directly by FTP. In addition, for those organisms being sequenced using a cosmid approach, finished and annotated cosmids are submitted to EMBL and other public databases.
UNITE - Unified system for the DNA based fungal species linked to the...
smng.net
demo.gbif.org
Updated May 21, 2024
Cite
UNITE Community (2024). UNITE - Unified system for the DNA based fungal species linked to the classification [Dataset]. http://doi.org/10.15468/mkpcy3
Explore at:
Unique identifier
https://doi.org/10.15468/mkpcy3
Dataset updated
May 21, 2024
Dataset provided by
Global Biodiversity Information Facility https://www.gbif.org/
PlutoF
Authors
UNITE Community
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
UNITE is an rDNA sequence database designed to provide a stable and reliable platform for sequence-borne identification of all fungal species. UNITE provides a unified way for delimiting, identifying, communicating, and working with DNA-based Species Hypotheses (SH). All fungal ITS sequences in the International Nucleotide Sequence Databases (INSD: GenBank, ENA, DDBJ) are clustered to approximately the species level by applying a set of dynamic distance values (0.5 - 3.0%). All species hypotheses are given a unique, stable name in the form of a DOI, and their taxonomic and ecological annotations are verified through distributed, web-based third-party annotation efforts. SHs are connected to a taxon name and its classification as far as possible (phylum, class, order, etc.) by taking into account identifications for all sequences in the SH. An automatically or manually designated sequence is chosen to represent each such SH. These sequences are released (https://unite.ut.ee/repository.php) for use by the scientific community in, for example, local sequence similarity searches and next-generation sequencing analysis pipelines. The system and the data are updated automatically as the number of public fungal ITS sequences grows.
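To illustrate how the choice of distance value changes the number of clusters, here is a toy Python sketch. It is not UNITE's clustering procedure: it assumes equal-length, pre-aligned sequences, uses a plain mismatch proportion as the distance, and applies a simple greedy rule in which each sequence joins the first cluster whose representative lies within the threshold. All function names and the synthetic sequences are hypothetical.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches / len(a)


def cluster_at_threshold(seqs, threshold):
    """Greedy clustering: the first member of each cluster is its representative."""
    clusters = []
    for seq in seqs:
        for cluster in clusters:
            if p_distance(seq, cluster[0]) <= threshold:
                cluster.append(seq)
                break
        else:
            # No existing representative is close enough, so start a new cluster.
            clusters.append([seq])
    return clusters


def mutate(seq: str, positions) -> str:
    """Return a copy of seq with a substitution at each given position (toy helper)."""
    bases = list(seq)
    for pos in positions:
        bases[pos] = "C" if bases[pos] == "A" else "A"
    return "".join(bases)


if __name__ == "__main__":
    base = "ACGT" * 50                          # synthetic 200 nt sequence
    near = mutate(base, [0])                    # 1 mismatch  -> distance 0.5%
    far = mutate(base, [0, 40, 80, 120, 160])   # 5 mismatches -> distance 2.5%
    for threshold in (0.01, 0.03):
        n = len(cluster_at_threshold([base, near, far], threshold))
        print(f"threshold {threshold:.1%}: {n} cluster(s)")
```

With a stricter threshold the more divergent sequence forms its own cluster, while a looser threshold merges all three, which is the general effect of moving along the 0.5 - 3.0% range.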
Data from: Cogeme Phytopathogenic Fungi and Oomycete EST Database
ore.exeter.ac.uk
application/x-gzip
Updated Jul 30, 2025
Cite
Darren Soanes; Nicholas J. Talbot (2025). Cogeme Phytopathogenic Fungi and Oomycete EST Database [Dataset]. https://ore.exeter.ac.uk/articles/dataset/Cogeme_Phytopathogenic_Fungi_and_Oomycete_EST_Database/29673914
Explore at:
Available download formats: application/x-gzip
Dataset updated
Jul 30, 2025
Dataset provided by
University of Exeter
Authors
Darren Soanes; Nicholas J. Talbot
License
https://www.rioxx.net/licenses/all-rights-reserved
Description
Expressed sequence tags (ESTs) have been obtained from eighteen species of plant pathogenic fungi, two species of phytopathogenic oomycete and three species of saprophytic fungi. Hierarchical clustering software was used to classify together ESTs representing the same gene and produce a single contig, or consensus sequence. The unisequence set for each pathogen therefore represents a set of unique gene sequences, each one consisting of either a single EST or a contig sequence made from a group of ESTs. Unisequences were annotated based on top hits against the NCBI non-redundant protein database using blastx.
Molecular database for the identification of fungi
bioregistry.io
Updated Apr 26, 2021
Cite
(2021). Molecular database for the identification of fungi [Dataset]. http://identifiers.org/re3data:r3d100011316
Explore at:
Unique identifier
https://identifiers.org/re3data:r3d100011316
Dataset updated
Apr 26, 2021
Description
UNITE is a fungal rDNA internal transcribed spacer (ITS) sequence database. It focuses on high-quality ITS sequences generated from fruiting bodies collected and identified by experts and deposited in public herbaria. Entries may be supplemented with metadata describing locality, habitat, soil, climate, and interacting taxa.
PROTAX-fungi: a web-based tool for probabilistic taxonomic placement of...
data.niaid.nih.gov
datadryad.org
zip
Updated May 29, 2019
Cite
Kessy Abarenkov; Panu Somervuo; R. Henrik Nilsson; Paul M. Kirk; Tea Huotari; Nerea Abrego; Otso Ovaskainen (2019). PROTAX-fungi: a web-based tool for probabilistic taxonomic placement of fungal ITS sequences [Dataset]. http://doi.org/10.5061/dryad.9dr6j0c
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.9dr6j0c
Dataset updated
May 29, 2019
Dataset provided by
University of Gothenburg
Royal Botanic Gardens, Kew
University of Tartu
University of Helsinki
Authors
Kessy Abarenkov; Panu Somervuo; R. Henrik Nilsson; Paul M. Kirk; Tea Huotari; Nerea Abrego; Otso Ovaskainen
License
https://spdx.org/licenses/CC0-1.0.html
Area covered
Zackenberg, Greenland
Description
• Incompleteness of reference sequence databases and unresolved taxonomic relationships complicate taxonomic placement of fungal sequences. We developed PROTAX-fungi, a general tool for taxonomic placement of fungal ITS sequences, and implemented it into the PlutoF platform of the UNITE database for molecular identification of fungi.
• PROTAX-fungi outperformed the SINTAX and RDP classifiers in terms of increased accuracy and decreased calibration error when applied to data on mock communities representing species groups with poor sequence database coverage.
• With empirical data on root- and wood-associated fungi, PROTAX-fungi reliably identified (with at least 90% identification probability) the majority of sequences to the order level but only ca. one fifth of them to the species level, reflecting the current limited coverage of the databases.
• When applied to examine the internal consistencies of the Index Fungorum and UNITE databases, PROTAX-fungi revealed inconsistencies in the taxonomy database as well as mislabelling and sequence quality problems in the reference database. The corresponding improvements were implemented in both databases.
• PROTAX-fungi provides a robust tool for performing statistically reliable identifications of fungi in spite of the incompleteness of extant reference sequence databases and unresolved taxonomic relationships.
Fungal 18S Ribosomal RNA (SSU) RefSeq Targeted Loci Project
gbif.org
canadensys.net
Updated Nov 16, 2025
Cite
Barbara Robbertse (2025). Fungal 18S Ribosomal RNA (SSU) RefSeq Targeted Loci Project [Dataset]. http://doi.org/10.15468/gpmmya
Explore at:
Unique identifier
https://doi.org/10.15468/gpmmya
Dataset updated
Nov 16, 2025
Dataset provided by
National Center for Biotechnology Information http://www.ncbi.nlm.nih.gov/
Global Biodiversity Information Facility https://www.gbif.org/
Authors
Barbara Robbertse
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The 18S ribosomal RNA targeted loci project is a RefSeq curated data set sourced from INSDC records. At a minimum, the sequence contains most of the variable V4 region and part of the V5 region, and each record contains a collection identifier (predominantly type material) from a public collection. The presence of the 18S signature has been verified by the ribovore pipeline (https://github.com/nawrockie/ribovore) using hidden Markov and covariance models. Other verification steps, for example checking for vector sequences, too many ambiguous nucleotides, and misassembled sequences, are also included. SSU RefSeq accessions (NG_ ) include sequences mostly obtained from type specimens and a few from reference specimens. Type and reference identifiers are curated by NCBI Taxonomy. The collection source of type material is indicated in each record, and collection acronyms follow the collection codes maintained at https://www.ncbi.nlm.nih.gov/biocollections/. All sequences will have the same project ID and can be found as such. Database URL: http://www.ncbi.nlm.nih.gov/bioproject/PRJNA39195.
Table_1_Whole Genome Shotgun Sequencing Detects Greater Lichen Fungal...
frontiersin.figshare.com
datasetcatalog.nlm.nih.gov
xlsx
Updated May 31, 2023
Cite
Kyle Garrett Keepers; Cloe S. Pogoda; Kristin H. White; Carly R. Anderson-Stewart; Jordan R. Hoffman; Ana Maria Ruiz; Christy M. McCain; James C. Lendemer; Nolan Coburn Kane; Erin A. Tripp (2023). Table_1_Whole Genome Shotgun Sequencing Detects Greater Lichen Fungal Diversity Than Amplicon-Based Methods in Environmental Samples.xlsx [Dataset]. http://doi.org/10.3389/fevo.2019.00484.s002
Explore at:
Available download formats: xlsx
Unique identifier
https://doi.org/10.3389/fevo.2019.00484.s002
Dataset updated
May 31, 2023
Dataset provided by
Frontiers
Authors
Kyle Garrett Keepers; Cloe S. Pogoda; Kristin H. White; Carly R. Anderson-Stewart; Jordan R. Hoffman; Ana Maria Ruiz; Christy M. McCain; James C. Lendemer; Nolan Coburn Kane; Erin A. Tripp
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In this study we demonstrate the utility of whole genome shotgun (WGS) metagenomics in study organisms with small genomes to improve upon amplicon-based estimates of biodiversity and microbial diversity in environmental samples for the purpose of understanding ecological and evolutionary processes. We generated a database of full-length and near-full-length ribosomal DNA sequence complexes from 273 lichenized fungal species and used this database to facilitate fungal species identification in the southern Appalachian Mountains using low coverage WGS at higher resolution and without the biases of amplicon-based approaches. Using this new database and methods herein developed, we detected between 2.8 and 11 times as many species from lichen fungal propagules by aligning reads from WGS-sequenced environmental samples compared to a traditional amplicon-based approach. We then conducted complete taxonomic diversity inventories of the lichens in each one-hectare plot to assess overlap between standing taxonomic diversity and diversity detected based on propagules present in environmental samples (i.e., the “potential” of diversity). From the environmental samples, we detected 94 species not observed in organism-level sampling in these ecosystems with high confidence using both WGS and amplicon-based methods. This study highlights the utility of WGS sequence-based approaches in detecting hidden species diversity and demonstrates that amplicon-based methods likely miss important components of fungal diversity. We suggest that the adoption of this method will not only improve understanding of biotic constraints on the distributions of biodiversity but will also help to inform important environmental policy.
Next-gen sequencing and metadata analyses of Great Lakes fungal data
databank.illinois.edu
Updated Dec 18, 2017
Cite
Andrew N. Miller (2017). Next-gen sequencing and metadata analyses of Great Lakes fungal data [Dataset]. http://doi.org/10.13012/B2IDB-9320144_V2
Explore at:
Unique identifier
https://doi.org/10.13012/B2IDB-9320144_V2
Dataset updated
Dec 18, 2017
Authors
Andrew N. Miller
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
The Great Lakes
Dataset funded by
U.S. National Institutes of Health (NIH)
Description
The data set consists of Illumina sequences derived from 48 sediment samples, collected in 2015 from Lake Michigan and Lake Superior for the purpose of inventorying the fungal diversity in these two lakes. DNA was extracted from ca. 0.5 g of sediment using the MoBio PowerSoil DNA isolation kits following the Earth Microbiome protocol. PCR was completed with the fungal primers ITS1F and fITS7 using the Fluidigm Access Array. The resulting amplicons were sequenced using the Illumina HiSeq 2500 platform with rapid 2 x 250 nt paired-end reads. The enclosed data sets contain the forward read files for both primers, both fixed-header index files, and the associated map files needed for processing in QIIME. In addition, enclosed are two rarefied OTU files used to evaluate fungal diversity. All decimal latitude and decimal longitude coordinates of our collecting sites are also included.

File descriptions:
Great_lakes_Map_coordinates.xlsx = coordinates of sample sites

QIIME Processing ITS1 region: These are the raw files used to process the ITS1 Illumina reads in QIIME. ***only forward reads were processed
GL_ITS1_HW_mapFile_meta.txt = This is the map file used in QIIME.
ITS1F_Miller_Fludigm_I1_fixedheader.fastq = Index file from Illumina. Headers were fixed to match the forward reads (R1) file in order to process in QIIME.
ITS1F_Miller_Fludigm_R1.fastq = Forward Illumina reads for the ITS1 region.

QIIME Processing ITS2 region: These are the raw files used to process the ITS2 Illumina reads in QIIME. ***only forward reads were processed
GL_ITS2_HW_mapFile_meta.txt = This is the map file used in QIIME.
ITS7_Miller_Fludigm_I1_Fixedheaders.fastq = Index file from Illumina. Headers were fixed to match the forward reads (R1) file in order to process in QIIME.
ITS7_Miller_Fludigm_R1.fastq = Forward Illumina reads for the ITS2 region.

Resulting OTU Table and OTU table with taxonomy
ITS1 Region
wahl_ITS1_R1_otu_table.csv = File contains representative OTUs based on the ITS1 region for all the R1 data and the number of each OTU found in each sample.
wahl_ITS1_R1_otu_table_w_tax.csv = File contains representative OTUs based on the ITS1 region for all the R1 data and the number of each OTU found in each sample, along with taxonomic determination based on the following database: sh_taxonomy_qiime_ver7_97_s_31.01.2016_dev
ITS2 Region
wahl_ITS2_R1_otu_table.csv = File contains representative OTUs based on the ITS2 region for all the R1 data and the number of each OTU found in each sample.
wahl_ITS2_R1_otu_table_w_tax.csv = File contains representative OTUs based on the ITS2 region for all the R1 data and the number of each OTU found in each sample, along with taxonomic determination based on the following database: sh_taxonomy_qiime_ver7_97_s_31.01.2016_dev

Rarefied Illumina dataset for each ITS region
ITS1_R1_nosing_rare_5000.csv = Environmental parameters and rarefied OTU dataset for ITS1 region.
ITS2_R1_nosing_rare_5000.csv = Environmental parameters and rarefied OTU dataset for ITS2 region.

Column headings:
#SampleID = code including researcher initials and sequential run number
BarcodeSequence =
LinkerPrimerSequence = two sequences used: CTTGGTCATTTAGAGGAAGTAA or GTGARTCATCGAATCTTTG
ReversePrimer = two sequences used: GCTGCGTTCTTCATCGATGC or TCCTCCGCTTATTGATATGC
run_prefix = initials of run operator
Sample = location code; see thesis figures 1 and 2 for mapped locations and Great_lakes_Map_coordinates.xlsx for exact coordinates
DepthGroup = S = shallow (50-100 m), MS = mid-shallow (101-150 m), MD = mid-deep (151-200 m), and D = deep (>200 m)
Depth_Meters = depth in meters
Lake = lake name, Michigan or Superior
Nitrogen %
Carbon %
Date = mm/dd/yyyy
pH = acidity, potential of Hydrogen (pH) scale
SampleDescription = sample or control
X = sequential run number
OTU ID = operational taxonomic unit ID
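The DepthGroup coding described in the column headings can be reproduced from Depth_Meters with a small helper. This is a minimal Python sketch: the function name is hypothetical, the handling of non-integer depths that fall between the stated ranges (for example 100.5 m) is an assumption, and depths shallower than 50 m are rejected because the description defines no group for them.

```python
def depth_group(depth_m: float) -> str:
    """Map a depth in meters to the DepthGroup code used in the map files.

    Ranges follow the dataset description: S (50-100 m), MS (101-150 m),
    MD (151-200 m), D (>200 m). Fractional depths between the stated
    ranges are assigned to the deeper group (assumption).
    """
    if depth_m < 50:
        raise ValueError(f"no DepthGroup is defined for depths below 50 m: {depth_m}")
    if depth_m <= 100:
        return "S"
    if depth_m <= 150:
        return "MS"
    if depth_m <= 200:
        return "MD"
    return "D"


if __name__ == "__main__":
    for depth in (75, 101, 200, 250):
        print(depth, depth_group(depth))
```

A helper like this can be used to cross-check the DepthGroup column against Depth_Meters after loading one of the rarefied CSV files.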
UNITE
rrid.site
scicrunch.org
Updated Jan 29, 2022
Cite
(2022). UNITE [Dataset]. http://identifiers.org/RRID:SCR_006518
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_006518
Dataset updated
Jan 29, 2022
Description
A fungal rDNA internal transcribed spacer (ITS) sequence database (although additional genes and genetic markers are also welcome) to facilitate identification of environmental samples of fungal DNA. Additional important features include user annotation of INSD sequences to add metadata on, e.g., locality, habitat, soil, climate, and interacting taxa. The user can furthermore annotate INSD sequences with additional species identifications that will appear in the results of any analyses done. UNITE focuses on high-quality ITS sequences generated from fruiting bodies collected and identified by experts and deposited in public herbaria. In addition, it also holds all fungal ITS sequences in the International Nucleotide Sequence Databases (INSD: NCBI, EMBL, DDBJ). Both sets of sequences may be used in any analyses carried out. UNITE is accompanied by a project management system called PlutoF, where users can store field data, document the sequencing lab procedures, manage sequences, and run analyses. PlutoF intends to make it possible for taxonomists, ecologists, and biogeographers to use a common platform for data storage, handling, and analyses, with the intent of facilitating an integration of these disciplines. A user can have an unlimited number of projects but still run analyses across any project data available to them.

MycoMobilome: A non-redundant database of transposable element consensus...

zenodo.org

application/gzip, bin

Updated Oct 28, 2025

Cite

Tobias Baril; Daniel Croll (2025). MycoMobilome: A non-redundant database of transposable element consensus sequences for the fungal kingdom [Dataset]. http://doi.org/10.5281/zenodo.17037469

Explore at:

Available download formats: application/gzip, bin

Unique identifier

https://doi.org/10.5281/zenodo.17037469

Dataset updated

Oct 28, 2025

Dataset provided by

Zenodo http://zenodo.org/

Authors

Tobias Baril; Daniel Croll

License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

MycoMobilome: A non-redundant database of transposable element consensus sequences for the fungal kingdom.

For more information on using the database, how it was constructed, best practices, and how to contribute, please visit the MycoMobilome GitHub.

Three versions of the database are provided:

MycoMobilome_v1.0-allConsensus_TE_library.fasta: All known and unknown TE consensus sequences detected across fungal diversity. Most useful for most use cases.
MycoMobilome_v1.0-proteinEvidence_TE_library.fasta: All TE consensus sequences with ORF hits to known TE proteins. Note the evidence markers in sequence headers and that this subset will not contain any non-autonomous TEs (i.e. SINEs, MITEs, solo LTRs, etc).
MycoMobilome_v1.0-unknown_TE_library.fasta: All TE consensus sequences with NO protein evidence supporting their status as true TEs. These have the potential to be real given little existing knowledge of TE diversity across the kingdom. Many of these are likely non-autonomous elements, such as MITEs (non-autonomous DNA elements), solo LTRs, and SINEs, which will NOT be found in the proteinEvidence subset. However, some sequences are also likely to be erroneous, so use carefully.

In addition to these three database files, the following files are also provided:

MycoMobilome_v1.0_assemblyRecord.xlsx: A record of all publicly available genome assemblies used to generate MycoMobilome. Here, you will find information on assembly length, N50, L50, GC content, species phylogenetic information, genome assembly source and ID, publication, and BUSCO scores.
MycoMobilome-hitsToKnownTransposonProteins-repetPfam35.txt: A TAB-separated file showing hmmscan hits for each MycoMobilome consensus sequence open reading frame to TE domains from the REPET Pfam 35.0 and Gypsy DB curated TE domain dataset. Here, qseqid ends with _n, where n is the ORF number. The query sequence to match to MycoMobilome sequence headers can be found in the column named qseqid_noFrame.
MycoMobilome-hitsToKnownTransposonProteins-rmRepeatPeps.txt: A TAB-separated file showing BLASTp hits for each MycoMobilome consensus sequence open reading frame to TE domains from the RepeatMasker RepeatPeps.lib file supplied with RepeatMasker v4.1.9.

More About MycoMobilome Curation

MycoMobilome was generated from all publicly available fungal genome resources from JGI (excluding restricted assemblies) and NCBI. This is an uncurated database, but all consensus sequences have been generated using a consistent and reproducible curation process.

The MycoMobilome database was generated using a standardised de novo TE curation approach among all publicly available fungal genome resources (n=4,309 genomes). A table containing information on all assemblies used to generate version 1.0 of this database is provided within MycoMobilome_v1.0 in the file MycoMobilome_v1.0_assemblyRecord.xlsx.

Each genome was used to generate putative TE consensus sequences using earlGreyLibConstruct in Earl Grey (v4.4.0)[1], configured with Dfam curated elements (v3.7)[2], using default settings. All putative consensus sequences were combined into a single FASTA file containing 773,843 entries. A non-redundant TE library was constructed using a scalable cascaded clustering approach using MMseqs2[3] easy-cluster with --min-seq-id 0.8 -c 0.8 --cov-mode 1 --cluster-reassign, resulting in 354,315 non-redundant sequences. Representative sequences for each cluster were extracted and labelled with the species name from which the representative originated.
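The clustering step above can be written out as a command line. The Python sketch below only assembles and prints the MMseqs2 invocation using the options quoted in the description; the three path arguments are hypothetical placeholders, and nothing is executed.

```python
import shlex


def build_mmseqs_cluster_command(input_fasta: str, output_prefix: str, tmp_dir: str) -> list:
    """Assemble an `mmseqs easy-cluster` call with the options quoted in the description."""
    return [
        "mmseqs", "easy-cluster",
        input_fasta,      # combined FASTA of putative consensus sequences
        output_prefix,    # prefix for the cluster output files
        tmp_dir,          # scratch directory required by MMseqs2
        "--min-seq-id", "0.8",
        "-c", "0.8",
        "--cov-mode", "1",
        "--cluster-reassign",
    ]


if __name__ == "__main__":
    # File names below are placeholders, not the authors' actual paths.
    cmd = build_mmseqs_cluster_command("all_consensus.fasta", "clustered", "tmp")
    print(shlex.join(cmd))
```

Keeping the arguments in a list makes it straightforward to pass the same command to subprocess.run once the paths point at real files.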

Open reading frames (ORFs) were detected in all six frames of each consensus sequence using transeq in EMBOSS (v6.6.0)[4] with -clean -frame 6. Matches to known host proteins were identified using the Fungi RefSeq[5] database (Release 228) and Diamond BLASTp[6] with --sensitive --matrix BLOSUM62 --evalue 1e-3. Potential hits were combined for each query sequence. Sequences with hits to RefSeq, and either no hits to known TE protein domains, or partial hits to known TE protein domains that do not overlap with RefSeq hit coordinates, were labelled as potential host genes and removed from the MycoMobilome dataset. Any hits to proteins labelled as uncharacterized|hypothetical|low quality|predicted protein were kept due to the potential to be TE-derived.

Matches to known TE proteins were identified using two complementary approaches: (i) Using HMMscan in HMMER (v3.4)[7] to detect homology to known TE protein domains curated by the REPET group. Matches were identified using hmmscan -E 10 --noali. Hits were filtered to retain those where fseq_evalue <=0.001 and fseq_bitscore >= 50. Hits were retained as potential TEs unless the query also matched RefSeq proteins, in which case they were removed to avoid including host genes or chimeric TE–host gene models.

(ii) Using BLASTp to detect homology to known TE protein domains supplied with RepeatMasker (v4.1.5) RepeatPeps.lib (repeatmasker.org). Matches were identified using blastp -evalue 1e-3. Nested hits were removed to retain the highest quality protein hit for each query, followed by combining adjacent and overlapping hits. Hits were retained as potential TEs unless non-overlapping hits to the same query were also found in the RefSeq hits set, in which case these were removed due to the potential that these hits could be host genes, or chimeric TE-host gene models.
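The hmmscan filter in approach (i) can be sketched in a few lines of Python. The thresholds (fseq_evalue <= 0.001, fseq_bitscore >= 50) and the column names fseq_evalue, fseq_bitscore and qseqid_noFrame come from the description; the dictionary-based record layout, the function name, and the example values are assumptions made for illustration only.

```python
def filter_hmmscan_hits(hits, refseq_queries, max_evalue=0.001, min_bitscore=50.0):
    """Keep hits passing both thresholds whose query has no RefSeq protein match.

    hits: iterable of dicts with keys qseqid_noFrame, fseq_evalue, fseq_bitscore.
    refseq_queries: set of query names that also matched RefSeq proteins.
    """
    kept = []
    for hit in hits:
        passes_thresholds = (
            hit["fseq_evalue"] <= max_evalue and hit["fseq_bitscore"] >= min_bitscore
        )
        if passes_thresholds and hit["qseqid_noFrame"] not in refseq_queries:
            kept.append(hit)
    return kept


if __name__ == "__main__":
    # Hypothetical records; real values would be parsed from the hmmscan output table.
    example_hits = [
        {"qseqid_noFrame": "cons_A", "fseq_evalue": 1e-20, "fseq_bitscore": 120.0},
        {"qseqid_noFrame": "cons_B", "fseq_evalue": 0.5, "fseq_bitscore": 80.0},
        {"qseqid_noFrame": "cons_C", "fseq_evalue": 1e-10, "fseq_bitscore": 30.0},
        {"qseqid_noFrame": "cons_D", "fseq_evalue": 1e-30, "fseq_bitscore": 200.0},
    ]
    retained = filter_hmmscan_hits(example_hits, refseq_queries={"cons_D"})
    print([hit["qseqid_noFrame"] for hit in retained])
```

In this toy input only cons_A survives: cons_B fails the e-value cut, cons_C fails the bit-score cut, and cons_D is dropped because its query also matched a RefSeq protein.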

A total of 24,571 consensus sequences were identified as putative host genes and removed from the database, resulting in a potential TE consensus set containing 329,744 sequences. This set was further filtered to remove all putative TE consensus sequences <120bp in length, as these are likely to be poor quality and incomplete. In addition, the base composition of each consensus was calculated using seqtk comp (https://github.com/lh3/seqtk) and all sequences with an N content >=5% were removed due to being poor quality, reducing the final MycoMobilome library to 276,641 sequences.
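The two quality filters described above (minimum length and maximum N content) amount to a simple predicate per sequence. The Python sketch below is an illustration, not the authors' code: they report using seqtk comp for base composition, whereas this version counts N characters directly, treats lower-case n as N (an assumption), and uses hypothetical names throughout.

```python
MIN_LENGTH_BP = 120      # sequences shorter than this are removed
MAX_N_PERCENT = 5        # sequences with N content >= 5% are removed


def passes_quality_filters(seq: str) -> bool:
    """Return True if seq is at least 120 bp long and has less than 5% N content."""
    if len(seq) < MIN_LENGTH_BP:
        return False
    n_count = seq.upper().count("N")
    # Integer comparison avoids floating-point edge cases at exactly 5%.
    return 100 * n_count < MAX_N_PERCENT * len(seq)


if __name__ == "__main__":
    # The host-gene removal step quoted above is consistent with the reported counts.
    print(354_315 - 24_571)  # size of the potential TE consensus set before these filters
    examples = {
        "long_clean": "ACGT" * 30,             # 120 bp, no N
        "too_short": "ACGT" * 29 + "ACG",      # 119 bp
        "too_many_N": "N" * 6 + "A" * 114,     # 120 bp, exactly 5% N
    }
    for name, seq in examples.items():
        print(name, passes_quality_filters(seq))
```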

For each consensus sequence, if there are hits to known TE protein domains, the sequences were labelled as "supported". Following this, the identity of each protein domain hit was evaluated to determine whether the consensus sequence classification is supported by protein hits from the REPET profiles bank or RepeatMasker RepeatPeps. If the identified domains support the consensus classification, the consensus sequence is labelled with _PE for protein evidence. If the identified domains conflict with the consensus classification, the consensus sequence is labelled with _DA for disagreement. If there are no identified domains, the consensus sequence is labelled with _NE for no evidence. The appropriate domains for each classification are defined in the table below:

High level TE classification	Appropriate Domain Hits from REPET	RepeatMasker RepeatPeps
DNA	Tase,Tase*,DDE,HTH,[ATP,INT,AP for crypton,maverick]	DNA
RC	HEL,EN,RPA	RC
LTR	RT,INT,RH,GAG,AP,VirusRelated,LTRrelated,Caulimovirus,ClassIrelated,ENV	LTR
LINE	RT,EN,RH,GAG,ClassIrelated,LINErelated	LINE
PLE	RT,EN,ClassIrelated	PLE
Retroposon	RT,INT,RH,GAG,AP,VirusRelated,LTRrelated,Caulimovirus,ClassIrelated,ENV,EN,LINErelated	Retroposon

Sequences are named with the convention MycMob1.0_family-[n]-[six digit species code]_[protein evidence]#[high level classification]/[sub level classification] @[genus species]. Protein hits to known TE proteins are provided with MycoMobilome to support further investigation in specific use cases. No changes were made to classifications assigned during automated curation, therefore this database should be treated as uncurated and caution should be used to check important or interesting TE loci on a case-by-case basis. Please note that all nonautonomous elements will have the label _NE as they do not contain any intact protein domains. This does not mean they are not real TEs. As such, for most use cases we suggest using the complete MycoMobilome v1.0 dataset, unless you are specifically interested in autonomous TEs only.
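The naming convention above lends itself to a regular expression. The following Python sketch parses a header into its parts; the example header is invented for illustration, and the assumptions that the species code is six alphanumeric characters and that the evidence label is one of PE, DA, or NE should be checked against the real FASTA headers before relying on it.

```python
import re

# Pattern follows the stated convention:
# MycMob1.0_family-[n]-[species code]_[protein evidence]#[high level]/[sub level] @[genus species]
HEADER_PATTERN = re.compile(
    r"^MycMob1\.0_family-(?P<family>\d+)"
    r"-(?P<species_code>[A-Za-z0-9]{6})"      # assumption: six alphanumeric characters
    r"_(?P<evidence>PE|DA|NE)"
    r"#(?P<high_level>[^/\s]+)"
    r"/(?P<sub_level>\S+)"
    r"\s+@(?P<species>.+)$"
)


def parse_header(header: str):
    """Return a dict of header fields, or None if the header does not match."""
    match = HEADER_PATTERN.match(header.lstrip(">").strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    # Hypothetical header, not taken from the database.
    example = ">MycMob1.0_family-12-abc123_PE#LTR/Gypsy @Examplus fungus"
    print(parse_header(example))
```

Parsed fields make it easy to, for example, select only sequences labelled PE, or to group consensus sequences by high-level classification.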

Bibliography

Baril T, Galbraith J, Hayward A. Earl Grey: a fully automated user-friendly transposable element annotation and analysis pipeline. Molecular Biology and Evolution. 2024 Apr;41(4):msae068.
Hubley R, Finn RD, Clements J, Eddy SR, Jones TA, Bao W, Smit AF, Wheeler TJ. The Dfam database of repetitive DNA families. Nucleic acids research. 2016 Jan 4;44(D1):D81-9.

Fungal strains and genotypes.
plos.figshare.com
datasetcatalog.nlm.nih.gov
xls
Updated Jun 7, 2023
Cite
Josep V. Forment; Michel Flipphi; Luisa Ventura; Ramón González; Daniel Ramón; Andrew P. MacCabe (2023). Fungal strains and genotypes. [Dataset]. http://doi.org/10.1371/journal.pone.0094662.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0094662.t001
Dataset updated
Jun 7, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Josep V. Forment; Michel Flipphi; Luisa Ventura; Ramón González; Daniel Ramón; Andrew P. MacCabe
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Markers not separated by semi-colons are located on the same linkage group. The ∼ symbol indicates the presence of the allele in the genome at an unknown location and/or copy number.
Supplementary data related to draft genome of the ascomycotal fungal species...
data.niaid.nih.gov
Updated Jul 15, 2024
Cite
Arumugam, Krithika; Ho, Sherilyn; Bessarab, Irina; Goh, Falicia; Haryono, Mindia; Santillan, Ezequiel; Wuertz, Stefan; Chow, Yvonne; Williams, Rohan (2024). Supplementary data related to draft genome of the ascomycotal fungal species Pseudopithomyces maydicus (family Didymosphaeriaceae) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7374665
Explore at:
Dataset updated
Jul 15, 2024
Dataset provided by
National University of Singapore
Singapore Institute of Food and Biotechnology Innovation (SIFBI)
Nanyang Technological University
Authors
Arumugam, Krithika; Ho, Sherilyn; Bessarab, Irina; Goh, Falicia; Haryono, Mindia; Santillan, Ezequiel; Wuertz, Stefan; Chow, Yvonne; Williams, Rohan
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In a recent manuscript, we report a draft genome of the ascomycotal fungal species Pseudopithomyces maydicus (isolate name SBW1) obtained using a culture isolate from brewery wastewater. From a 22-contig assembly, we predict 13,502 protein-coding gene models, of which 4,389 (32.5%) were annotated to KEGG Orthology, and identify 39 biosynthetic gene clusters. Here we provide supplementary data from our analysis:

Supplementary Figure 1 Sequence alignment between Sanger-sequenced partial 28S LSU-rRNA sequence and the top ranked BLASTN hit from NCBI nr/nt database.

Supplementary Figure 2 Pairs plot for contig GC-content, contig coverage and contig length from the P. maydicus assembly.

Supplementary Data File 1 Table listing properties of contigs from the P. maydicus assembly.

Supplementary Data File 2 Summary of taxonomic classification analysis of recovered 18S SSU-rRNA sequences to the SILVA 138 database.

Supplementary Data File 3 Alignment of Sanger-sequenced partial 28S LSU-rRNA sequence against three 28S LSU-rRNA gene sequences recovered from the P. maydicus long read genome assembly and a set of 62 28S LSU-rRNA sequences from members of genus Pseudopithomyces (NCBI Nucleotide searched for "Pseudopithomyces AND 28S" on 30th May 2022).

Supplementary Data File 4 MASH similarity statistics obtained by comparing the P. maydicus long read genome assembly sequence to 9563 fungal genomes obtained from NCBI. The reference genomes from NCBI were downloaded using the NCBI ‘datasets’ (version 13.6.0) command line tool (datasets_13.6.0 download genome taxon 4751 --filename fungi.zip --assembly-level complete_genome,chromosome,scaffold,contig --exclude-gff3 --exclude-protein --exclude-rna).

Supplementary Data File 5 BlastKOALA annotation data for all proteins predicted from P. maydicus long read assembly.

Supplementary Results Complete output from the antiSMASH6 analysis of the P. maydicus long read assembly.
MIPS Ustilago maydis Database
rrid.site
dknet.org
+1more
Updated Jan 29, 2022
+ more versions
Cite
(2022). MIPS Ustilago maydis Database [Dataset]. http://identifiers.org/RRID:SCR_007563
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_007563 https://identifiers.org/RRID:SCR_007563/resolver?q=*&i=rrid
Dataset updated
Jan 29, 2022
Description
The MIPS Ustilago maydis Genome Database aims to present information on the molecular structure and functional network of the entirely sequenced, filamentous fungus Ustilago maydis. The underlying sequence is the initial release of the high quality draft sequence of the Broad Institute. The goal of the MIPS database is to provide a comprehensive genome database in the Genome Research Environment in parallel with other fungal genomes to enable in-depth fungal comparative analysis. The specific aims are to:
1. Generate and assemble Whole Genome Shotgun sequence reads yielding 10X coverage of the U. maydis genome
2. Integrate the genomic sequence assembly with physical maps generated by Bayer CropScience
3. Perform automated annotation of the sequence assembly
4. Align the strain 521 assembly with the FB1 assembly provided by Exelixis
5. Release the sequence assembly and results of our annotation and analysis to the public

Ustilago maydis is a basidiomycete fungal pathogen of maize and teosinte. The genome size is approximately 20 Mb. The fungus induces tumors on host plants and forms masses of diploid teliospores. These spores germinate and form haploid meiotic products that can be propagated in culture as yeast-like cells. Haploid strains of opposite mating type fuse and form a filamentous, dikaryotic cell type that invades plant tissue to reinitiate infection. Ustilago maydis is an important model system for studying pathogen-host interactions and has been studied for more than 100 years by plant pathologists. Molecular genetic research with U. maydis focuses on recombination, the role of mating in pathogenesis, and signaling pathways that influence virulence. Recently, the fungus has emerged as an excellent experimental model for the molecular genetic analysis of phytopathogenesis, particularly in the characterization of infection-specific morphogenesis in response to signals from host plants.

Ustilago maydis also serves as an important model for other basidiomycete plant pathogens that are more difficult to work with in the laboratory, such as the rust and bunt fungi. Genomic sequence of U. maydis will also be valuable for comparative analysis of other fungal genomes, especially with respect to understanding the host range of fungal phytopathogens. The analysis of U. maydis would provide a framework for studying the hundreds of other Ustilago species that attack important crops, such as barley, wheat, sorghum, and sugarcane. Comparisons would also be possible with other basidiomycete fungi, such as the important human pathogen C. neoformans. Commercially, U. maydis is an excellent model for the discovery of antifungal drugs. In addition, maize tumors caused by U. maydis are prized in Hispanic cuisine and there is interest in improving commercial production.

The complete putative gene set of the Broad Institute's second release is loaded into the database; in addition, all deviating putative genes from a putative gene set produced by MIPS with different gene prediction parameters are also loaded. The complete dataset will then be analysed, and gene predictions will be manually corrected based on combined information derived from different gene prediction algorithms and, more importantly, protein and EST comparisons. Gene prediction will be restricted to ORFs larger than 50 codons; smaller ORFs will be included only if similarities to other proteins or EST matches confirm their existence or if a coding region was postulated by all prediction programs used. The resulting proteins will be annotated. They will be classified according to the MIPS classification catalogue, receiving appropriate descriptions. All proteins with a known, characterized homolog will be automatically assigned to functional categories using the MIPS functional catalog. All extracted proteins are in addition automatically analysed and annotated by the PEDANT suite.
RemEff: sequences used for clustering
figshare.com
application/gzip
Updated Aug 21, 2020
Cite
Darcy Jones (2020). RemEff: sequences used for clustering [Dataset]. http://doi.org/10.6084/m9.figshare.12833678.v1
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.12833678.v1
Dataset updated
Aug 21, 2020
Dataset provided by
figshare
Figshare (http://figshare.com/)
Authors
Darcy Jones
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data were collected from fungal sequences in the NCBI IPG and the UniParc databases, and from numerous curated published fungal genomes.
fungal_seqs.tsv.gz contains mappings of the non-redundant sequence names to identifiers in original databases.
fungal_seqs_curated_genomes.tsv.gz contains details of which sequences came from the curated genomes.
fungal_seqs_uniparc_signatures.tsv.gz gives details about InterPro terms present for sequences in the UniParc dataset.
fungal_seqs_uniparc_xrefs.tsv.gz maps ids from UniParc to references in other databases.
fungal_seqs_ipg.tsv gives details about sequences taken from the IPG database, mapping to other database identifiers.
Data from Species Distribution, Phylogenetic Structure, and Functional Roles...
databank.illinois.edu
aws-databank-alb.library.illinois.edu
Updated Nov 5, 2020
Cite
Andrew Miller; Daniel Raudabaugh (2020). Data from Species Distribution, Phylogenetic Structure, and Functional Roles of Detritius Inhabiting Fungi Across Contrasting Aquatic Environments. [Dataset]. http://doi.org/10.13012/B2IDB-6862941_V2
Explore at:
Unique identifier
https://doi.org/10.13012/B2IDB-6862941_V2
Dataset updated
Nov 5, 2020
Authors
Andrew Miller; Daniel Raudabaugh
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This version 2 dataset contains 34 files in total, including one (1) additional file called "Culture-dependent Isolate table with taxonomic determination and sequence data.csv". The remaining 33 files are identical to version 1. Information about the new file and its variables follows.

Culture-dependent Isolate table with taxonomic determination and sequence data.csv: culture table with assigned taxonomy from NCBI. A single-direction sequence for each isolate is included if one could be obtained. Sequences are derived from ITS1F-ITS4 PCR amplicons, with Sanger sequencing in one direction using ITS5. The file contains 20 variables, explained below:

IsolateNumber: unique number identifying each isolate cultured
Time: season in which the sample was collected
Location: the specific name of the location
Habitat: type of habitat, either stream or peatland
State: state in the USA in which the specific location is located
Incubation_pH ID: pH of the medium during isolation of fungal cultures
Genus: phylogenetic genus of the fungal isolates (determined by sequence similarity)
Sequence_quality: base call quality of the entire sequence used for blast analysis, if known
%_coverage: sequence coverage reported from GenBank
%_ID: sequence similarity reported from GenBank
Life_style: ecological life style, if known
Phylum: phylogenetic phylum as indicated by Index Fungorum
Subphylum: phylogenetic subphylum as indicated by Index Fungorum
Class: phylogenetic class as indicated by Index Fungorum
Subclass: phylogenetic subclass as indicated by Index Fungorum
Order: phylogenetic order as indicated by Index Fungorum
Family: phylogenetic family as indicated by Index Fungorum
ITS5_Sequence: single-direction sequence used for sequence similarity match using blastn (primer ITS5)
Fasta: sequence with nomenclature in a fasta format for easy cut and paste into phylogenetic software

Note: blank cells mean no data is available or unknown.
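
As a rough illustration of how the isolate table might be read, the sketch below loads rows with Python's standard `csv` module and skips isolates whose sequence cell is blank, as the note on blank cells suggests. The column names `IsolateNumber`, `Genus` and `ITS5_Sequence` come from the variable list; the function name, the header format it produces and the sample rows are hypothetical.

```python
import csv
import io


def isolates_to_fasta(csv_text):
    """Build FASTA text from rows that have a non-blank ITS5_Sequence."""
    records = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        seq = (row.get("ITS5_Sequence") or "").strip()
        if not seq:
            continue  # blank cell: no sequence could be obtained
        genus = (row.get("Genus") or "").strip() or "unknown"
        # Hypothetical header scheme: isolate number plus assigned genus.
        records.append(f">isolate_{row['IsolateNumber']}_{genus}\n{seq}")
    return "\n".join(records)


# Made-up rows using a subset of the documented columns.
sample = (
    "IsolateNumber,Habitat,Genus,ITS5_Sequence\n"
    "1,stream,Penicillium,ACGTACGT\n"
    "2,peatland,,\n"
)
print(isolates_to_fasta(sample))
```

For the real file, the text would come from opening the CSV on disk rather than from an in-memory string.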
Data from:...
osdr.nasa.gov
s.cnmilf.com
+1more
Updated May 26, 2022
+ more versions
Cite
Kasthuri Venkateswaran; Atul Chander (2022). Fungal-Diversity-associated-with-space-craft-assembly-facility-SAF [Dataset]. https://osdr.nasa.gov/bio/repo/data/studies/OSD-497
Explore at:
Dataset updated
May 26, 2022
Dataset provided by
NASA (http://nasa.gov/)
Authors
Kasthuri Venkateswaran; Atul Chander
License
Attribution 1.0 (CC BY 1.0): https://creativecommons.org/licenses/by/1.0/
License information was derived automatically
Description
Draft genome sequences of Fungi isolated from Mars 2020 Spacecraft assembly facility are reported. The fungal strains were isolated from samples collected from cleanroom surfaces of Kennedy Space Center-Payload Hazardous Servicing Facility and Jet Propulsion Laboratory-Spacecraft Assembly Facility. Whole genome sequencing (WGS) of these isolates was carried out.
Fungal Genome Initiative
rrid.site
scicrunch.org
+2more
Updated Jun 24, 2005
+ more versions
Cite
(2005). Fungal Genome Initiative [Dataset]. http://identifiers.org/RRID:SCR_003169
Explore at:
Unique identifier
https://identifiers.org/RRID:SCR_003169
Dataset updated
Jun 24, 2005
Description
Produces and analyzes sequence data from fungal organisms that are important to medicine, agriculture and industry. The FGI is a partnership between the Broad Institute and the wider fungal research community, with the selection of target genomes governed by a steering committee of fungal scientists. Organisms are selected for sequencing as part of a cohesive strategy that considers the value of data from each organism, given their role in basic research, health, agriculture and industry, as well as their value in comparative genomics.
Fungal 28S Ribosomal RNA (LSU) RefSeq Targeted Loci Project.
gbif.org
canadensys.net
Updated Oct 18, 2025
Cite
Barbara Robbertse; Barbara Robbertse (2025). Fungal 28S Ribosomal RNA (LSU) RefSeq Targeted Loci Project. [Dataset]. http://doi.org/10.15468/jzfdew
Explore at:
Unique identifier
https://doi.org/10.15468/jzfdew
Dataset updated
Oct 18, 2025
Dataset provided by
National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/)
Global Biodiversity Information Facility (https://www.gbif.org/)
Authors
Barbara Robbertse; Barbara Robbertse
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The 28S ribosomal RNA targeted loci project is a RefSeq curated data set sourced from INSDC records. At a minimum, each sequence contains the hypervariable D1/D2 region, and each record contains a collection identifier (predominantly type material) from a public collection. The presence of the 28S signature has been verified by the ribovore pipeline (https://github.com/nawrockie/ribovore) using hidden Markov and covariance models. Other verification steps, for example checking for vector sequences, too many ambiguous nucleotides, and misassembled sequences, are also included. LSU RefSeq accessions (NG_ ) include sequences mostly obtained from type specimens and a few from reference specimens. Type and reference identifiers are curated by NCBI Taxonomy. The collection source of type material is indicated in each record, and collection acronyms follow the collection codes maintained at https://www.ncbi.nlm.nih.gov/biocollections/. All sequences will have the same project ID and can be found as such. Database URL: http://www.ncbi.nlm.nih.gov/bioproject/PRJNA51803.

Cite

Elizabeth McKenzie (2019). ITS1-5.8s_ITS2 Fungal Sequences and Search Results .xlsx [Dataset]. http://doi.org/10.17608/k6.auckland.8142947.v1

ITS1-5.8s_ITS2 Fungal Sequences and Search Results .xlsx

Explore at:

Available download formats: xlsx

Unique identifier

https://doi.org/10.17608/k6.auckland.8142947.v1

Dataset updated

May 17, 2019

Dataset provided by

The University of Auckland

Authors

Elizabeth McKenzie

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

DNA sequences used to identify fungi cultured from human faeces. The ITS1‑5.8s‑ITS2 region of the extracted rDNA of fungal isolates was chosen for amplification based on its success in identifying a wide range of fungal species [53]. For DNA amplification, 10.0 µL of REDExtract-N-Amp™ PCR Ready Mix; 7.8 µL of PCR-grade H2O; 0.8 µL of 10 µM forward primer (ITS1, sequence TCCGTAGGTGAACCTGCGG); 0.8 µL of 10 µM reverse primer (ITS4, sequence TCCTCCGCTTATTGATATGC); and 1.0 µL of extracted fungal DNA sample were added to a 200 µL Eppendorf PCR tube. The same method was used to prepare the negative control. PCR amplification was performed with a preliminary step of polymerase activation at 94 °C for 2 minutes; 35 cycles of denaturation at 94 °C for 30 seconds, annealing at 51 °C for 20 seconds, and extension at 77 °C for 1 minute; and a final extension step at 72 °C for 8 minutes, using the Eppendorf vapo.protect™ Mastercycler® Pro S.

To confirm a successful fungal DNA extraction and amplification, 4 µL of the amplified fungal rDNA product of the PCR reaction was loaded onto a 1 % (w/v) agarose gel in 1x Tris/Borate/EDTA (TBE) buffer, and 1 µL of the cyanine dye SYBR® DNA gel stain was added for visualisation purposes. One kilobase (1kb) plus DNA ladder (5 µL) and 5 µL of the negative control were also loaded onto the agarose gel. Following the completion of gel electrophoresis, PCR products were visualised with the GelDoc™ XR Plus System (BIO‑RAD, USA). The 1kb plus DNA ladder was used to determine the size of the amplified fungal DNA fragments using the GelAnalyzer 2010a quantification programme. The fungal rDNA fragments of the ITS1‑5.8s‑ITS2 region obtained from PCR were then transferred to the Centre of Genomics, Proteomics and Metabolomics DNA sequencing facility for sequencing.

Capillary Electrophoresis DNA Sequencing (Sanger Sequencing) was used to obtain the DNA sequences of the amplified ITS1‑5.8s‑ITS2 region. Two reactions were performed for each sample containing fungal DNA template, one for each primer, and each was mixed with the ABI PRISM™ BigDye™ Terminator Sequencing Kit version 3.1 (ThermoFisher Scientific), which contains DNA polymerase enzyme, a buffer, four DNA nucleotides and four chain-terminating dideoxy nucleotides with fluorescent dyes. The samples were then subjected to cycle sequencing on the Applied Biosystems GeneAmp® PCR System 9700 thermal cycler using standard cycling conditions: a preliminary step of polymerase activation at 96 °C for 1 minute; 25 cycles of denaturation at 96 °C for 10 seconds, annealing at 50 °C for 5 seconds, and extension at 60 °C for 4 minutes. Following cycle sequencing, the samples were purified using Agencourt® CleanSEQ® magnetic beads to remove excess fluorescent dyes, nucleotides, salts and other contaminants. The purified DNA samples were then separated by size by capillary electrophoresis with the ABI PRISM™ 3130XL Genetic Analyzer using 50 cm capillaries and POP7 polymer. The final data output of the ITS1‑5.8s‑ITS2 region DNA sequences was based on the detection of the attached fluorescent dyes excited by a laser.

Geneious programme version 11.1.5 (www.geneious.com) was used to analyse the raw data [54]. The data included both forward and reverse rDNA sequences for each fungal isolate. These sequences were aligned, and ends showing poor quality reads were trimmed, to obtain a consensus sequence. A tool within the Geneious programme, BLAST (Basic Local Alignment Search Tool) developed by Altschul et al. [55], optimised for fast and high similarity search (MegaBLAST version), was used to compare the consensus query sequence with known DNA sequences in GenBank (NCBI genetic sequence database), EMBL (European Molecular Biology Laboratory), DDBJ (DNA DataBank of Japan) and PDB (Protein Data Bank, Worldwide). The search results included: a grade percentage score combining the query input sequence coverage, expectation-value (e-value) and identity value for each hit against the database; an identities match and percentage score indicating the extent to which the query DNA sequence matched the database nucleotide sequence; and a bit-score showing the quality of alignment and measuring sequence similarity [56]. The higher the score of each result, the higher the certainty of identification of the fungal species. A grade percentage score of >98 % was considered a correct genomic identification.
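
The acceptance rule in the last sentence (grade percentage score above 98 %) can be sketched as a small filter over search results. This is a sketch under stated assumptions, not the authors' workflow: the `Hit` record, the function name and the example hits are hypothetical, and taking the highest-grade hit per isolate is an assumption about how a single identification would be chosen.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    isolate: str   # isolate identifier (hypothetical)
    species: str   # species name of the database hit
    grade: float   # grade percentage score, 0-100


GRADE_THRESHOLD = 98.0  # from the description: grade > 98 % is accepted


def accepted_identifications(hits, threshold=GRADE_THRESHOLD):
    """Keep the best-graded hit per isolate if its grade exceeds the threshold."""
    best = {}
    for hit in hits:
        current = best.get(hit.isolate)
        if current is None or hit.grade > current.grade:
            best[hit.isolate] = hit
    return {iso: h.species for iso, h in best.items() if h.grade > threshold}


# Made-up search results for two isolates.
hits = [
    Hit("A", "Candida albicans", 99.5),
    Hit("A", "Candida dubliniensis", 97.0),
    Hit("B", "Penicillium sp.", 95.0),
]
print(accepted_identifications(hits))
```

Isolates whose best hit does not exceed the threshold are simply absent from the result and would need manual review.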
