4 datasets found
  1. NCBI Datasets

    • catalog.data.gov
    • data.virginia.gov
    • +2 more
    Updated Jun 19, 2025
    Cite
    National Library of Medicine (2025). NCBI Datasets [Dataset]. https://catalog.data.gov/dataset/ncbi-datasets-beta
    Dataset updated
    Jun 19, 2025
    Dataset provided by
    National Library of Medicine
    Description

    NCBI Datasets is a one-stop shop for finding, browsing, and downloading genomic data. Find and download taxonomy, genome, gene, transcript, and protein data, and find installation instructions for the NCBI Datasets command-line tools.

  2. Supplementary data for draft genome of a member of the ascomycotal fungal...

    • zenodo.org
    • data.niaid.nih.gov
    bin, pdf, zip
    Updated May 21, 2025
    Cite
    Krithika Arumugam; Sherilyn Ho; Irina Bessarab; Falicia Goh; Mindia Haryono; Ezequiel Santillan; Stefan Wuertz; Yvonne Chow; Rohan Williams (2025). Supplementary data for draft genome of a member of the ascomycotal fungal genus Pseudopithomyces (family Didymosphaeriaceae) [Dataset]. http://doi.org/10.5281/zenodo.7374666
    Available download formats: pdf, bin, zip
    Dataset updated
    May 21, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Krithika Arumugam; Sherilyn Ho; Irina Bessarab; Falicia Goh; Mindia Haryono; Ezequiel Santillan; Stefan Wuertz; Yvonne Chow; Rohan Williams
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In a recent manuscript, we report a draft genome of the ascomycotal fungal species Pseudopithomyces maydicus (isolate name SBW1) obtained using a culture isolate from brewery wastewater. From a 22-contig assembly, we predict 13,502 protein-coding gene models, of which 4,389 (32.5%) were annotated to KEGG Orthology, and identify 39 biosynthetic gene clusters. Here we provide supplementary data from our analysis:

    Supplementary Figure 1
    Sequence alignment between the Sanger-sequenced partial 28S LSU-rRNA sequence and the top-ranked BLASTN hit from the NCBI nr/nt database.

    Supplementary Figure 2
    Pairs plot for contig GC-content, contig coverage and contig length from the P. maydicus assembly.

    Supplementary Data File 1
    Table listing properties of contigs from the P. maydicus assembly.

    Supplementary Data File 2
    Summary of taxonomic classification analysis of recovered 18S SSU-rRNA sequences to the SILVA 138 database.

    Supplementary Data File 3
    Alignment of the Sanger-sequenced partial 28S LSU-rRNA sequence against three 28S LSU-rRNA gene sequences recovered from the P. maydicus long read genome assembly and a set of 62 28S LSU-rRNA sequences from members of the genus Pseudopithomyces (NCBI Nucleotide searched for “Pseudopithomyces AND 28S” on 30th May 2022).

    Supplementary Data File 4
    MASH similarity statistics obtained by comparing the P. maydicus long read genome assembly sequence to 9563 fungal genomes obtained from NCBI. The reference genomes from NCBI were downloaded using the NCBI ‘datasets’ (version 13.6.0) command-line tool (datasets_13.6.0 download genome taxon 4751 --filename fungi.zip --assembly-level complete_genome,chromosome,scaffold,contig --exclude-gff3 --exclude-protein --exclude-rna).
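
    As a rough illustration of what a MASH similarity statistic measures, the sketch below applies the Mash distance formula, D = -(1/k) ln(2j / (1 + j)), to a Jaccard index j computed over k-mer sets. It makes two assumptions: it uses exact k-mer sets rather than the MinHash sketches that Mash itself relies on, and the function names and the value of k are hypothetical choices for illustration. The entry does not state which Mash parameters the authors used.

```python
import math


def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all k-mers in seq (case-insensitive)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: set[str], b: set[str]) -> float:
    """Exact Jaccard index of two k-mer sets (Mash estimates this with MinHash)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mash_distance(j: float, k: int) -> float:
    """Mash distance for Jaccard index j and k-mer size k.

    A Jaccard index of 0 is reported as the maximum distance, 1.0.
    """
    if j <= 0.0:
        return 1.0
    return max(0.0, -math.log(2.0 * j / (1.0 + j)) / k)


if __name__ == "__main__":
    # Toy sequences and k=4 are illustrative only, not taken from the dataset.
    a = kmers("ACGTACGTGGCATTAGC", 4)
    b = kmers("ACGTACGTGGCATTGGC", 4)
    print(mash_distance(jaccard(a, b), 4))
```

    Identical sequences give a distance of 0.0, and sequences sharing no k-mers give 1.0.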

    Supplementary Data File 5
    BlastKOALA annotation data for all proteins predicted from P. maydicus long read assembly.

    Supplementary Results
    Complete output from the antiSMASH6 analysis of the P. maydicus long read assembly.

  3. Mycobacteriaceae database for mashID - 2025-02-20 update

    • figshare.com
    bin
    Updated Feb 25, 2025
    Cite
    Marc-Olivier Duceppe (2025). Mycobacteriaceae database for mashID - 2025-02-20 update [Dataset]. http://doi.org/10.6084/m9.figshare.28489304.v1
    Available download formats: bin
    Dataset updated
    Feb 25, 2025
    Dataset provided by
    figshare
    Authors
    Marc-Olivier Duceppe
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Updated database to identify Mycobacteriaceae using mashID (https://github.com/duceppemo/mashID). 26,101 genomes were downloaded using the NCBI datasets CLI tool (https://www.ncbi.nlm.nih.gov/datasets/docs/v2/command-line-tools/download-and-install/). Downloaded genomes were renamed and binned by species (https://github.com/duceppemo/ncbi/blob/master/rename_and_bin_fasta.py). Genomes were dereplicated by species using a 0.1% similarity threshold (https://github.com/rrwick/Assembly-Dereplicator).
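
    The dereplication step can be pictured with a minimal greedy sketch: within one species bin, a genome is kept only if it is at least a threshold distance from every genome already kept. This is an illustration of the idea, not Assembly-Dereplicator's actual algorithm. The function name, the caller-supplied `distance` function, and the reading of the 0.1% threshold as a pairwise distance cutoff of 0.001 are all assumptions.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def dereplicate(
    items: Sequence[T],
    distance: Callable[[T, T], float],
    threshold: float = 0.001,  # assumed reading of the "0.1%" threshold
) -> list[T]:
    """Greedily keep items that are >= threshold away from all kept items.

    The result depends on input order: earlier items win ties.
    """
    kept: list[T] = []
    for item in items:
        if all(distance(item, rep) >= threshold for rep in kept):
            kept.append(item)
    return kept


if __name__ == "__main__":
    # Toy stand-in: each "genome" is a number and distance is the absolute gap.
    toy = [0.0, 0.0005, 0.002, 0.0021, 0.01]
    print(dereplicate(toy, lambda a, b: abs(a - b)))
```

    In practice the distance function would be a genome-to-genome distance such as a Mash distance, and the sketch would be run once per species bin.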

  4. Listeria database for mashID - 2025-02-18 update

    • figshare.com
    bin
    Updated Feb 25, 2025
    Cite
    Marc-Olivier Duceppe (2025). Listeria database for mashID - 2025-02-18 update [Dataset]. http://doi.org/10.6084/m9.figshare.28489262.v1
    Available download formats: bin
    Dataset updated
    Feb 25, 2025
    Dataset provided by
    figshare
    Authors
    Marc-Olivier Duceppe
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Updated database to identify Listeria using mashID (https://github.com/duceppemo/mashID). 75,767 genomes were downloaded using the NCBI datasets CLI tool (https://www.ncbi.nlm.nih.gov/datasets/docs/v2/command-line-tools/download-and-install/). Downloaded genomes were renamed and binned by species (https://github.com/duceppemo/ncbi/blob/master/rename_and_bin_fasta.py). Genomes were dereplicated by species using a 0.1% similarity threshold (https://github.com/rrwick/Assembly-Dereplicator).
