NCBI Datasets is a one-stop shop for finding, browsing, and downloading genomic data. Find and download taxonomy, genome, gene, transcript, and protein data, and install the NCBI Datasets command-line tools.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In a recent manuscript, we report a draft genome of the ascomycete fungal species Pseudopithomyces maydicus (isolate name SBW1), obtained using a culture isolate from brewery wastewater. From a 22-contig assembly, we predict 13502 protein-coding gene models, of which 4389 (32.5%) were annotated to KEGG Orthology, and we identify 39 biosynthetic gene clusters. Here we provide supplementary data from our analysis:
Supplementary Figure 1
Sequence alignment between the Sanger-sequenced partial 28S LSU-rRNA sequence and the top-ranked BLASTN hit from the NCBI nr/nt database.
Supplementary Figure 2
Pairs plot for contig GC-content, contig coverage and contig length from the P. maydicus assembly.
Supplementary Data File 1
Table listing properties of contigs from the P. maydicus assembly.
Supplementary Data File 2
Summary of the taxonomic classification of recovered 18S SSU-rRNA sequences against the SILVA 138 database.
Supplementary Data File 3
Alignment of the Sanger-sequenced partial 28S LSU-rRNA sequence against three 28S LSU-rRNA gene sequences recovered from the P. maydicus long-read genome assembly and a set of 62 28S LSU-rRNA sequences from members of the genus Pseudopithomyces (NCBI Nucleotide searched for "Pseudopithomyces AND 28S" on 30th May 2022).
Supplementary Data File 4
MASH similarity statistics obtained by comparing the P. maydicus long-read genome assembly sequence to 9563 fungal genomes obtained from NCBI. The reference genomes were downloaded using the NCBI ‘datasets’ command-line tool, version 13.6.0 (datasets_13.6.0 download genome taxon 4751 --filename fungi.zip --assembly-level complete_genome,chromosome,scaffold,contig --exclude-gff3 --exclude-protein --exclude-rna).
Supplementary Data File 5
BlastKOALA annotation data for all proteins predicted from the P. maydicus long-read assembly.
Supplementary Results
Complete output from the antiSMASH6 analysis of the P. maydicus long read assembly.
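As an illustration of the MASH similarity statistics listed under Supplementary Data File 4, the following is a minimal Python sketch of how a Mash distance can be derived from the fraction of shared sketch hashes (the last column of `mash dist` output, e.g. `950/1000`). It assumes Mash's default k-mer size of 21. The function names are hypothetical, and this is not the authors' analysis code.

```python
import math


def jaccard_from_shared_hashes(shared: str) -> float:
    """Turn a 'shared/total' string such as '950/1000' into a Jaccard estimate."""
    matched, total = (int(part) for part in shared.split("/"))
    if total <= 0 or not 0 <= matched <= total:
        raise ValueError(f"invalid shared-hashes value: {shared!r}")
    return matched / total


def mash_distance(jaccard: float, k: int = 21) -> float:
    """Mash distance D = -(1/k) * ln(2j / (1 + j)) for a Jaccard estimate j."""
    if not 0.0 <= jaccard <= 1.0:
        raise ValueError("jaccard must be in [0, 1]")
    if jaccard == 0.0:
        return 1.0  # no shared hashes: treat as maximally distant
    return max(0.0, -math.log(2.0 * jaccard / (1.0 + jaccard)) / k)


if __name__ == "__main__":
    j = jaccard_from_shared_hashes("950/1000")
    print(f"Jaccard estimate: {j:.3f}, Mash distance: {mash_distance(j):.6f}")
```

Smaller distances indicate more similar genomes, so ranking the reference genomes by this value gives the closest matches first.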
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Updated database to identify Mycobacteriaceae using mashID (https://github.com/duceppemo/mashID). 26,101 genomes were downloaded using the NCBI datasets CLI tool (https://www.ncbi.nlm.nih.gov/datasets/docs/v2/command-line-tools/download-and-install/). Downloaded genomes were renamed and binned by species (https://github.com/duceppemo/ncbi/blob/master/rename_and_bin_fasta.py). Genomes were dereplicated by species using a 0.1% similarity threshold (https://github.com/rrwick/Assembly-Dereplicator).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Updated database to identify Listeria using mashID (https://github.com/duceppemo/mashID). 75,767 genomes were downloaded using the NCBI datasets CLI tool (https://www.ncbi.nlm.nih.gov/datasets/docs/v2/command-line-tools/download-and-install/). Downloaded genomes were renamed and binned by species (https://github.com/duceppemo/ncbi/blob/master/rename_and_bin_fasta.py). Genomes were dereplicated by species using a 0.1% similarity threshold (https://github.com/rrwick/Assembly-Dereplicator).
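The per-species dereplication step mentioned in the two mashID database entries can be pictured with a small Python sketch. This is a simple greedy illustration, not Assembly-Dereplicator's actual algorithm. It assumes the 0.1% threshold is interpreted as a pairwise distance of 0.001, and the function name and toy distances are hypothetical.

```python
from typing import Callable, Sequence


def dereplicate(
    names: Sequence[str],
    distance: Callable[[str, str], float],
    threshold: float = 0.001,  # assumption: 0.1% read as a distance of 0.001
) -> list[str]:
    """Keep a genome only if it is at least `threshold` away from every kept genome."""
    kept: list[str] = []
    for name in names:
        if all(distance(name, rep) >= threshold for rep in kept):
            kept.append(name)
    return kept


# Toy pairwise distances for three hypothetical genomes of one species.
TOY_DISTANCES = {
    frozenset(("a", "b")): 0.0005,  # near-identical pair
    frozenset(("a", "c")): 0.01,
    frozenset(("b", "c")): 0.01,
}


def toy_distance(x: str, y: str) -> float:
    """Look up the symmetric toy distance between two genome names."""
    return TOY_DISTANCES[frozenset((x, y))]


if __name__ == "__main__":
    print(dereplicate(["a", "b", "c"], toy_distance))
```

In this toy case genome "b" is dropped because it lies within the threshold of "a", while "c" is retained as a distinct representative.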