Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Genome annotation data for the Qiancha 1 (QC1) tea cultivar. The genome sequencing data for this cultivar are available from the National Genomics Data Center (https://ngdc.cncb.ac.cn/) under project number PRJCA028918. A related paper has been published in Horticulture Research (DOI: 10.1093/hr/uhaf064).
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Genome sequences and metadata for the accessions in the .tsv.gz (gzip-compressed tab-separated text) files are freely available from their corresponding sources:
GISAID data are subject to restrictions on sharing described in https://gisaid.org/terms-of-use/. Genome sequences and metadata are available to registered GISAID users as part of EPI_SET_231106ax at https://doi.org/10.55876/gis8.231106ax (7,718,061 accessions used on 2023-08-01).
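The .tsv.gz files mentioned above can be read without decompressing them to disk first. The following is a minimal sketch in Python using only the standard library; the column names (`accession`, `date`) and the example values are hypothetical placeholders, since the actual headers of the files are not described here.

```python
import csv
import gzip
import io


def read_tsv_gz(source):
    """Yield one dict per data row from a gzip-compressed TSV with a header row.

    `source` may be a file path or a binary file-like object.
    """
    with gzip.open(source, mode="rt", encoding="utf-8", newline="") as handle:
        yield from csv.DictReader(handle, delimiter="\t")


# Self-contained demo: build a tiny gzip-compressed table in memory.
# The column names and values below are made up for illustration.
raw = "accession\tdate\nEXAMPLE_1\t2023-08-01\nEXAMPLE_2\t2023-07-15\n"
buffer = io.BytesIO(gzip.compress(raw.encode("utf-8")))

rows = list(read_tsv_gz(buffer))
print(len(rows), rows[0]["accession"])
```

For the real files, pass the file path instead of the in-memory buffer; because the function is a generator, rows are streamed rather than loaded all at once.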
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains the genome and annotation files of Niphotrichum japonicum. The raw data have been deposited in the National Genomics Data Center (NGDC; https://ngdc.cncb.ac.cn/bioproject/) under the BioProject accession number PRJCA017860.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
chdgene_table.csv -->
Cotney_CirRes_316709_online_table_v.xlsx -->
E-MTAB-6814.sdrf.txt -->
LncBook_id_conversion.csv and LncBookv1.9_GENCODEv33_GRCh38.gtf.gz -->
lncExpDB_E-MTABGeneTPM.tsv -->
RNACentral_ensembl.tsv and RNACentral_lncbook.tsv -->
UCSC_hg19ToHg38.over.chain -->
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The annotation includes protein-coding genes, ncRNAs, and repeat elements, which are stored in three separate files in gff format. The corresponding genome assembly has been deposited in the GWH database under accession number GWHEUVB00000000.1 and is publicly available at https://ngdc.cncb.ac.cn/gwh.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Platycodon grandiflorus genome annotation. The file pasa2.longest.filter-update-v2.gff3 corresponds to the reference genome publicly available at https://ngdc.cncb.ac.cn/gwh (GWH: GWHARYT00000000.1).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Genome annotation of the assembly for Fulvetta ruficapilla (Fruf_v1). This annotation covers protein-coding genes, ncRNAs, and repetitive elements in three separate files in GFF format. The associated genome assembly has been deposited in the GWH database under the accession number GWHETLV00000000.1, publicly available at https://ngdc.cncb.ac.cn/gwh/Assembly/85202/show, as well as in GenBank under the accession JBGGOM000000000, publicly available at https://identifiers.org/ncbi/insdc.gca:GCA_042477295.1. This version of the dataset uses a photograph of Fulvetta ruficapilla, taken by the authors of the dataset, as the cover image.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Due to the explosion of cancer genome data and the urgent need for cancer treatment, it is becoming increasingly important to analyze and annotate cancer genomes easily and in a timely manner. However, tumor heterogeneity is recognized as a serious barrier to annotating cancer genomes at the individual patient level. In addition, the interpretation and analysis of cancer multi-omics data rely heavily on existing database resources that are often located in different data centers or research institutions, which poses a huge challenge for data parsing. Here we present CCAS (Cancer genome Consensus Annotation System, https://ngdc.cncb.ac.cn/ccas/#/home), a comprehensive one-stop annotation system for individual patients at the multi-omics level. CCAS integrates 20 widely recognized resources in the field to support data annotation of 10 categories of cancers covering 395 subtypes. Data from each resource are manually curated and standardized using ontology frameworks. CCAS accepts data on single nucleotide variants/insertions or deletions, expression, copy number variation, and methylation level as input files to build a consensus annotation. Outputs are arranged as tables or figures and can be searched, sorted, and downloaded. Expandable panels hold additional information to keep the display concise, and most figures are interactive. Moreover, CCAS offers multidimensional annotation information, including mutation signature patterns, gene set enrichment analysis, pathways, and clinical trial-related information. These are helpful for intuitively understanding the molecular mechanisms of tumors and discovering key functional genes.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data repository contains the gene annotation files that we generated for five cephalochordate species:
The Alu genome was sequenced by our team (www.evomicslab.org) in collaboration with Dr. Jr-Kai Yu, Dr. Sung-Jin Cho, and Dr. Linda Holland. Two haplotype-decoupled assemblies were produced, corresponding to the reference (filename: Asymmetron_lucayanum.ref.*) and the alternative (filename: Asymmetron_lucayanum.alt.*). The assembly files for these two assemblies have been deposited in the Genome Warehouse of the National Genomics Data Center (https://ngdc.cncb.ac.cn/gwh) under the accession numbers GWHFWAS00000000.1 and GWHFWAV00000000.1, respectively.
The Bbe, Bfl, Bja, and Bla genome assemblies were generated by previous studies:
Species | NCBI GenBank accession number | Reference |
Bbe | GCA_019207075.1 | Huang et al. (2023) Three amphioxus reference genomes reveal gene and chromosome evolution of chordates. PNAS, 120 (10), e2201504120 |
Bfl | GCA_019207045.1 | Huang et al. (2023) Three amphioxus reference genomes reveal gene and chromosome evolution of chordates. PNAS, 120 (10), e2201504120 |
Bja | GCA_013266295.2 | Huang et al. (2023) Three amphioxus reference genomes reveal gene and chromosome evolution of chordates. PNAS, 120 (10), e2201504120 |
Bla | GCA_927797965.1 | Brasó-Vives et al. (2022) Parallel evolution of amphioxus and vertebrate small-scale gene duplications. Genome Biology, 23 (1), 243 |
All six genome assemblies were annotated with the same pipeline in our study: RepeatMasker (v4.1.1) and EDTA (v2.0.1) for repeat/TE annotation, FunAnnotate (v1.8.15) for protein-coding gene and tRNA gene annotation, and RNAmmer (v1.2) for rRNA gene annotation.
Explanation of each file:
Species_name.gff3.gz # protein-coding gene and tRNA gene annotation in GFF3 format (compressed by gzip)
Species_name.cds-transcripts.fa.gz # CDS/transcript sequences of annotated protein-coding genes in FASTA format (compressed by gzip)
Species_name.proteins.fa.gz # protein sequences of annotated protein-coding genes in FASTA format (compressed by gzip)
Species_name.rRNA.gff2.gz # rRNA gene annotation in GFF2 format (compressed by gzip)
Species_name.TEanno.gff3.gz # TE annotation in GFF3 format (compressed by gzip)
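As a quick sanity check on the gzip-compressed GFF3 files listed above, one can tally how many features of each type they contain. The sketch below relies only on the standard GFF3 layout (nine tab-separated columns, with the feature type in the third); the demo records and the `count_feature_types` helper are invented for illustration and are not taken from the dataset.

```python
import gzip
import io
from collections import Counter


def count_feature_types(source):
    """Count features per type (column 3) in a gzip-compressed GFF3 file.

    `source` may be a file path or a binary file-like object.
    """
    counts = Counter()
    with gzip.open(source, mode="rt", encoding="utf-8") as handle:
        for line in handle:
            if line.startswith("##FASTA"):
                break  # an embedded sequence section follows; no more features
            if line.startswith("#") or not line.strip():
                continue  # skip directives, comments, and blank lines
            fields = line.rstrip("\n").split("\t")
            if len(fields) == 9:
                counts[fields[2]] += 1
    return counts


# Self-contained demo with made-up records (not real annotation data).
demo = (
    "##gff-version 3\n"
    "chr1\tdemo\tgene\t100\t900\t.\t+\t.\tID=gene1\n"
    "chr1\tdemo\tmRNA\t100\t900\t.\t+\t.\tID=mrna1;Parent=gene1\n"
    "chr1\tdemo\tCDS\t150\t800\t.\t+\t0\tID=cds1;Parent=mrna1\n"
    "chr1\tdemo\ttRNA\t2000\t2072\t.\t-\t.\tID=trna1\n"
)
feature_counts = count_feature_types(io.BytesIO(gzip.compress(demo.encode("utf-8"))))
print(dict(feature_counts))
```

Passing a path such as `Species_name.gff3.gz` in place of the in-memory buffer should work the same way, assuming the files follow the standard nine-column layout.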
In addition, for this project, we curated multi-stage bulk RNA-seq datasets (both from this study and from published studies) for the five cephalochordate species and mapped them to the corresponding annotated genomes described above. The gene expression profiles were summarized as TPM and are provided in the following file: cephalochordate_TPM.xlsx
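For readers unfamiliar with TPM (transcripts per million), the sketch below shows the usual definition: each gene's read count is divided by its length, and the resulting rates are rescaled so that they sum to one million within a sample. This illustrates the general formula only; the tool and settings used to produce cephalochordate_TPM.xlsx are not stated here, and the function name and demo numbers are made up.

```python
def tpm(counts, lengths_bp):
    """Convert per-gene read counts to TPM for a single sample.

    counts     -- read counts per gene
    lengths_bp -- gene (or effective transcript) lengths in base pairs
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")
    if any(length <= 0 for length in lengths_bp):
        raise ValueError("all lengths must be positive")

    # Length-normalized rate for each gene.
    rates = [count / length for count, length in zip(counts, lengths_bp)]
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)  # no reads at all in this sample

    # Rescale so the values sum to one million.
    return [rate / total * 1_000_000 for rate in rates]


# Made-up example: three genes in one sample.
demo_counts = [100, 200, 300]
demo_lengths = [1000, 2000, 1500]
values = tpm(demo_counts, demo_lengths)
print(values)
```

Because TPM values always sum to one million within a sample, they describe relative abundance and are easier to compare across samples than raw counts.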
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We performed integrative analyses in a Chinese cohort of peri-/post-menopausal women (n = 517) with metagenomics (n = 499), targeted metabolomics (n = 500) and whole-genome sequencing (n = 500) to identify novel microbiome-related biomarkers for bone health. We also performed metagenomics in a US White cohort for validation (n = 59). This dataset contains the characteristics of the cohorts, genome-wide association study (GWAS) data, concentrations of serum short-chain fatty acids (SCFAs), relative abundances of gut microbiota, and gut microbiota-associated functional KEGG modules. The sequencing data of the Chinese cohort (whole-genome sequencing and metagenomics) can be found in the “Genome Sequence Archive for Human” (https://ngdc.cncb.ac.cn/gsa-human, accession No. HRA004900). The metagenomic sequencing data can be found in the “Sequence Read Archive” (https://www.ncbi.nlm.nih.gov/sra, accession Nos. PRJNA986283 and PRJNA1011937).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset includes the expression profiles of RpLoap1-10 in different parts of the female reproductive tract, RNAi efficiency, the effects of RpLoap1-10 RNAi on the fecundity of female insects (egg numbers and hatchability), and daily counts of shriveled and hatching eggs under 20% relative humidity. The original data for the proteomic analysis of the secretions located in the lateral oviduct lumen and the secretions adhering to the egg surface are available at https://ngdc.cncb.ac.cn/omix, under BioProject PRJCA028368, accession numbers OMIX006943 and OMIX006948, respectively.