The Imperfect SSR Finder is an online tool to help geneticists find Simple Sequence Repeats (SSRs), also known as microsatellites or Short Tandem Repeats (STRs), in uploaded FASTA sequences. It is an interactive website that finds both imperfect and perfect SSRs. You can test small snippets or upload large files, change the lengths and types of the SSRs you are looking for, and create output with SSRs in inverted case and/or color highlights. A tabular information file is also created in .CSV format, for easy import into any spreadsheet program. Resources in this dataset: Resource Title: Imperfect SSR Finder. File Name: Web Page, url: https://ssr.nwisrl.ars.usda.gov/
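As a rough illustration of the kind of output described above (inverted-case highlighting plus a CSV table), the following Python sketch finds perfect SSRs only. It is not the Imperfect SSR Finder's own algorithm, which is not described here: the function names, the minimum-repeat thresholds, and the CSV column names are all assumptions made for this example, and imperfect repeats are not handled.

```python
import csv
import io
import re

# Hypothetical thresholds: minimum number of repeats per motif length (1-6 nt).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_perfect_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, end, motif, repeats) tuples, 0-based half-open."""
    upper = seq.upper()
    hits = []
    for motif_len, n in sorted(min_repeats.items()):
        # A motif followed by at least n-1 further exact copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, n - 1))
        for m in pattern.finditer(upper):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "ACAC").
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            repeats = (m.end() - m.start()) // motif_len
            hits.append((m.start(), m.end(), motif, repeats))
    return sorted(hits)


def invert_case_at(seq, hits):
    """Swap the case of every base covered by at least one SSR hit."""
    covered = set()
    for start, end, _motif, _repeats in hits:
        covered.update(range(start, end))
    return "".join(
        ch.swapcase() if i in covered else ch for i, ch in enumerate(seq)
    )


def ssrs_to_csv(hits):
    """Serialize hits as CSV text with a header row (assumed column names)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["start", "end", "motif", "repeats"])
    writer.writerows(hits)
    return buf.getvalue()


if __name__ == "__main__":
    example = "TTGGACACACACACACGGTC"
    found = find_perfect_ssrs(example)
    print(invert_case_at(example, found))
    print(ssrs_to_csv(found), end="")
```

A regular expression with a backreference keeps the sketch short, but it only reports non-overlapping matches per motif length, so it should be read as a teaching aid rather than a substitute for the tool.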
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Microsatellites, known as simple sequence repeats (SSRs), are short tandem repeats of 1 to 6 nucleotide motifs found in all genomes, particularly eukaryotes. They are widely used as co-dominant markers in genetic analyses and molecular breeding. Triticeae, a tribe of grasses, includes major cereal crops such as bread wheat, barley, and rye, as well as abundant forage and lawn grasses, playing a crucial role in global food production and agriculture. To enhance genetic work and expedite the improvement of Triticeae crops, we have developed TriticeaeSSRdb, an integrated and user-friendly database. It contains 3,891,705 SSRs from 21 species and offers browsing options based on genomic regions, chromosomes, motif types, and repeat motif sequences. Advanced search functions allow personalized searches based on chromosome location and length of SSR. Users can also explore the genes associated with SSRs, design customized primer pairs for PCR validation, and utilize practical tools for whole-genome browsing, sequence alignment, and in silico SSR prediction from local sequences. We continually update TriticeaeSSRdb with additional species and practical utilities. We anticipate that this database will greatly facilitate trait genetic analyses and enhance molecular breeding strategies for Triticeae crops. Researchers can freely access the database at http://triticeaessrdb.com/.
This dataset contains the KML files of the various SRRs (search and rescue regions) of the French MRCC (maritime rescue co-ordination centre).
The extraction was carried out on 2018-10-15 from the IMO site dedicated to French RRS: https://gisis.imo.org/Public/COMSAR/RCC.aspx?CID=FRA.
This dataset consists of transcribed speech in Wolof, Pulaar and Sereer. The recordings are about agriculture. The speakers recorded include farmers, agricultural advisers, and agri-food business managers. The types of recordings comprise interactive radio programmes, focus groups, voice messages, push messages and interviews, so spontaneous speech prevails. Audio quality may vary depending on the type of programme. Content description: speech_dataset_wol.tar.gz: Wolof (ISO Code 639-2: wol) speech dataset contains 55 hours of transcribed speech, including almost 13 hours of content validated by an expert. It also contains an XSAMPA lexicon (49,132 phonetised entries) and a text corpus (1,140,508 words). speech_dataset_fuc.tar.gz: Pulaar (ISO Code 639-2: fuc) speech dataset contains nearly 32 hours of transcribed speech, including around 11 hours of content validated by an expert. It also contains a text corpus (742,024 words). speech_dataset_srr.tar.gz: Sereer (ISO Code 639-2: srr) speech dataset contains 38 hours of transcribed speech, including nearly 11 hours of content validated by an expert. In total, these resources provide 125 hours of transcribed speech in the 3 most widely spoken languages in Senegal, including 35 hours of checked transcriptions. This work is a result of the Kallaama project, funded by Lacuna Fund for 1 year, in 2023. See the GitHub repository for more details about the dataset.
C_atl_RADseq_de_novo_assembly
Species: Cedrus atlantica Manetti
Number of individuals: one single adult individual coming from the Luberon forest (43°47' N / 5°12' E, France)
Plant tissue: needles (diploid)
Sequencing method: Restriction site Associated DNA sequencing, paired-end sequencing using the Illumina HiSeq2000 platform (Illumina Inc., San Diego, CA) (2 x 101 bp)
Data description: 66,656 contigs generated by de novo assembly
Assembly tool: Velvet 1.2.06 (Zerbino & Birney 2008)
C_atl_mRNAseq_de_novo_assembly
Species: Cedrus atlantica Manetti
Number of individuals: one single 2-year-old seedling that originated from the Luberon forest (43°47' N / 5°12' E, France) and was grown in a pot in a greenhouse
Plant tissues: needles and roots (diploid)
Sequencing method: transcriptome sequencing, paired-end sequencing using the Illumina HiSeq2000 platform (Illumina Inc., San Diego, CA) (2 x 101 bp)
Data description: 130,184 contigs generated by de novo assembly
Assembly tool...
BACKGROUND: The isolation of microsatellite markers remains laborious and expensive. For some taxa, such as Lepidoptera, development of microsatellite markers has been particularly difficult, as many markers appear to be located in repetitive DNA and have nearly identical flanking regions. We attempted to circumvent this problem by bioinformatic mining of microsatellite sequences from a de novo-sequenced transcriptome of a butterfly (Euphydryas editha). PRINCIPAL FINDINGS: By searching the assembled sequence data for perfect microsatellite repeats we found 10 polymorphic loci. Although, like many expressed sequence tag-derived microsatellites, our markers show strong deviations from Hardy-Weinberg equilibrium in many populations, and, in some cases, a high incidence of null alleles, we show that they nonetheless provide measures of population differentiation consistent with those obtained by amplified fragment length polymorphism analysis. Estimates of pairwise population differentiatio...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
As an important nut crop species, macadamia continues to gain increased amounts of attention worldwide. Nevertheless, with the vast increase in macadamia omic data, it is becoming difficult for researchers to effectively process and utilize the information. In this work, we developed the first integrated germplasm and genomic database for macadamia (MacadamiaGGD), which includes five genomes of four species; three chloroplast and mitochondrial genomes; genome annotations; transcriptomic data for three macadamia varieties, germplasm data for four species and 262 main varieties; nine genetic linkage maps; and 35 single-nucleotide polymorphisms (SNPs). The database serves as a valuable collection of simple sequence repeat (SSR) markers, including both markers that are based on macadamia genomic sequences and developed in this study and markers developed previously. MacadamiaGGD is also integrated with multiple bioinformatic tools, such as search, JBrowse, BLAST, primer designer, sequence fetch, enrichment analysis, multiple sequence alignment, genome alignment, and gene homology annotation, which allows users to conveniently analyze their data of interest. MacadamiaGGD is freely available online (http://MacadamiaGGD.net). We believe that the database and additional information of the SSR markers can help scientists better understand the genomic sequence information of macadamia and further facilitate molecular breeding efforts of this species.
https://spdx.org/licenses/CC0-1.0.html
Microsatellite DNA families (MDFs) are stretches of DNA that share similar or identical sequences beside nuclear simple-sequence repeat (nSSR) motifs, potentially causing problems during nSSR marker development. Primers positioned within MDFs can bind several times within the genome and might result in multiple banding patterns. It is therefore common practice to exclude MDF loci in the course of marker development. Here, we propose an approach to deal with multiple primer binding sites by purposefully positioning primers within the detected repetitive element. We developed a new protocol to determine the family type and the primer position in relation to MDFs using the software packages RepARK and RepeatMasker together with an in-house R script. We re-evaluated newly developed nSSR markers for the lepidopteran Marbled White (Melanargia galathea) and explored the implications of our results with regard to published data sets of the butterfly Euphydryas aurinia, the grasshopper Stethophyma grossum, the conifer Pinus cembra, and the crucifer Arabis alpina. For M. galathea, we show that it is not only possible to develop reliable nSSR markers for MDF loci, but even to benefit from their presence in some cases: we used one unlabeled primer, successfully binding within an MDF, for two different loci in a multiplex PCR, combining this family primer with uniquely binding and fluorescently labeled primers outside of MDFs, respectively. As MDFs are abundant in many taxa, we propose to consider these during nSSR marker development in the taxa concerned. Our new approach might help in reducing the number of tested primers during nSSR marker development.
East African highland bananas (EAHB) are a staple food crop in Uganda, Tanzania, Burundi and other countries in the African Great Lakes region. Even though several morphologically different types exist, all EAHB are triploid and display minimal genetic variation. To provide more insight into the genetic variation within EAHB, we performed genotyping using simple sequence repeat (SSR) markers, molecular analysis of the ITS1-5.8S-ITS2 region of the ribosomal DNA locus, and analysis of the chromosomal distribution of ribosomal DNA sequences. A total of 40 triploid EAHB accessions available in the Musa germplasm collection (International Transit Centre, Leuven, Belgium) were characterized. Six diploid accessions of Musa acuminata ssp. zebrina, ssp. banksii and ssp. malaccensis representing putative parents of EAHB were included in the study. Flow cytometric estimation of 2C nuclear DNA content revealed small differences (max ~ 6.5 %) in genome size among the EAHB clones. While no differences in the numbe...