Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description:
A set of transcriptomics studies on the Gene Expression Omnibus (GEO) platform somehow related to infectious or neurodegenerative diseases.
Columns:
Columns whose names end in "_QID" hold the Wikidata IDs for the corresponding column.
Google Sheets:
https://docs.google.com/spreadsheets/d/1LjF4h8n6Sy4PgTJoC-fJ7mGnqGCvqmWiatf2zD-5RM8
Funding:
This curation and release were supported by the grants #2018/10257-2 and #2019/26284-1 from the São Paulo Research Foundation (FAPESP).
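The "_QID" naming convention described above can be used to pair each descriptive column with its Wikidata ID column. The following is a minimal sketch using only the Python standard library. The column names, accession placeholders and rows in the sample are hypothetical illustrations, not taken from the actual sheet.

```python
import csv
import io

QID_SUFFIX = "_QID"


def qid_columns(fieldnames):
    """Return the column names that end with the "_QID" suffix."""
    return [name for name in fieldnames if name.endswith(QID_SUFFIX)]


def label_to_qid_column(fieldnames):
    """Map each base column name to its companion "_QID" column."""
    return {name[: -len(QID_SUFFIX)]: name for name in qid_columns(fieldnames)}


# Hypothetical two-row export; the real sheet's columns and values may differ.
SAMPLE_CSV = """accession,disease,disease_QID,organism,organism_QID
GSE_EXAMPLE_1,Alzheimer's disease,Q11081,Homo sapiens,Q15978631
GSE_EXAMPLE_2,tuberculosis,Q12204,Mus musculus,Q83310
"""

reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
pairs = label_to_qid_column(reader.fieldnames)
rows = list(reader)

for row in rows:
    for label_col, qid_col in pairs.items():
        print(row["accession"], label_col, row[label_col], row[qid_col])
```

For a real export, the same functions could be applied to a file handle opened on a CSV file downloaded from the sheet.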
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The PubMed database offers an extensive set of publication data that can be useful, yet inherently complex to use without automated computational techniques. Data repositories such as the Genomic Data Commons (GDC) and the Gene Expression Omnibus (GEO) offer experimental data storage and retrieval as well as curated gene expression profiles. Genetic interaction databases, including Reactome and Ingenuity Pathway Analysis, offer pathway and experiment data analysis using data curated from these publications and data repositories. We have created a method to generate and analyze consensus networks, inferring potential gene interactions, using large numbers of Bayesian networks generated by data mining publications in the PubMed database. Through the concept of network resolution, these consensus networks can be tailored to represent possible genetic interactions. We designed a set of experiments to confirm that our method is stable across variation in both sample and topological input sizes. Using gene product interactions from the KEGG pathway database and data mining PubMed publication abstracts, we verify that regardless of the network resolution or the inferred consensus network, our method is capable of inferring meaningful gene interactions through consensus Bayesian network generation with multiple, randomized topological orderings. Our method can not only confirm the existence of currently accepted interactions, but also has the potential to hypothesize new ones. We show that our method confirms the existence of known gene interactions such as JAK-STAT-PI3K-AKT-mTOR, infers novel gene interactions such as RAS-Bcl-2 and RAS-AKT, and finds significant pathway-pathway interactions between the JAK-STAT signaling and Cardiac Muscle Contraction KEGG pathways.
AITL, angioimmunoblastic T-cell lymphoma; ALCL, anaplastic large cell lymphoma; ATLL, adult T-cell leukemia/lymphoma; HSTL, hepatosplenic T-cell lymphoma; PTCL-NOS, peripheral T-cell lymphoma, not otherwise specified; PMID, PubMed identifier.
Publicly available, chip-matched GEO DataSets of mature T-cell lymphomas and healthy CD4+ and CD8+ T cells utilized for gene expression profiling.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Funding agencies are reluctant to support data archiving, even though large research funders such as the National Science Foundation (NSF) and the National Institutes of Health acknowledge its importance for scientific progress. Our quantitative estimates of data reuse indicate that ongoing financial investment in data-archiving infrastructure provides a high scientific return.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Protein-Protein, Genetic, and Chemical Interactions for Nguyen DD (2018): Integrative Bioinformatics and Functional Analyses of GEO, ENCODE, and TCGA Reveal FADD as a Direct Target of the Tumor Suppressor BRCA1. Curated by BioGRID (https://thebiogrid.org); ABSTRACT: BRCA1 is a multifunctional tumor suppressor involved in several essential cellular processes. Although many of these functions are driven by or related to its transcriptional/epigenetic regulator activity, there has been no genome-wide study to reveal the transcriptional/epigenetic targets of BRCA1. Therefore, we conducted a comprehensive analysis of genomics/transcriptomics data to identify novel BRCA1 target genes. We first analyzed ENCODE data with BRCA1 chromatin immunoprecipitation (ChIP)-sequencing results and identified a set of genes with a promoter occupied by BRCA1. We collected 3085 loci with a BRCA1 ChIP signal from four cell lines and calculated the distance between the loci and the nearest gene transcription start site (TSS). Overall, 66.5% of the BRCA1-bound loci fell into a 2-kb region around the TSS, suggesting a role in transcriptional regulation. We selected 45 candidate genes based on gene expression correlation data, obtained from two GEO (Gene Expression Omnibus) datasets and TCGA data of human breast cancer, compared to BRCA1 expression levels. Among them, we further tested three genes (MEIS2, CKS1B and FADD) and verified FADD as a novel direct target of BRCA1 by ChIP, RT-PCR, and a luciferase reporter assay. Collectively, our data demonstrate genome-wide transcriptional regulation by BRCA1 and suggest target genes as biomarker candidates for BRCA1-associated breast cancer.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Abstract
Funding agencies are reluctant to support data archiving, even though large research funders such as the National Science Foundation (NSF) and the National Institutes of Health acknowledge its importance for scientific progress. Our quantitative estimates of data reuse indicate that ongoing financial investment in data-archiving infrastructure provides a high scientific return.
Usage notes
PubMed Central reuse of GEO datasets deposited in 2007
This is the raw data behind the analysis. It contains one row for every mention of a 2007 GEO dataset in PubMed Central. Each row identifies the mentioned GEO dataset, the PubMed Central article that mentions the dataset's accession number, whether the authors of the dataset and the attributing article overlap, and whether this is considered an instance of third-party data reuse.
PMC_reuse_of_2007_GEO_datasets.csv
Aggregate Table Data
Aggregate table data behind the figures and results in the README associated with the main dataset. Includes Baseline metrics used for extrapolating PubMed Central (PMC) results to PubMed, Number of mentions of a 2007 GEO dataset by authors who submitted the dataset, and Number of mentions of a dataset by authors who DID NOT submit the dataset across 2007-2010.
tables.csv
Background
Existing microarray studies of bone mineral density (BMD) have been critical for understanding the pathophysiology of osteoporosis, and have identified a number of candidate genes. However, these studies were limited by their relatively small sample sizes and were usually analyzed individually. Here, we propose a novel network-based meta-analysis approach that combines data across six microarray studies to identify functional modules from human protein-protein interaction (PPI) data, and highlight several differentially expressed genes (DEGs) and a functional module that may play an important role in BMD regulation in women.
Methods
Expression profiling studies were identified by searching PubMed, Gene Expression Omnibus (GEO) and ArrayExpress. Two meta-analysis methods were applied across different gene expression profiling studies. The first, a nonparametric Fisher’s method, combined p-values from individual experiments to identify genes with large effect sizes. The second method combined effect sizes from individual datasets into a meta-effect size to gain a higher precision of effect size estimation across all datasets. Genes with Q test’s p-values < 0.05 or I² values > 50% were assessed by a random effects model and the remainder by a fixed effects model. Using Fisher’s combined p-values, functional modules were identified through an integrated analysis of microarray data in the context of large protein–protein interaction (PPI) networks. Two previously published meta-analysis studies of genome-wide association (GWA) datasets were used to determine whether these module genes were genetically associated with BMD. Pathway enrichment analysis was performed with a hypergeometric test.
Results
Six gene expression datasets were identified, which included a total of 249 (129 high BMD and 120 low BMD) female subjects. Using a network-based meta-analysis, a consensus module containing 58 genes (nodes) and 83 edges was detected. Pathway enrichment analysis of the 58 module genes revealed that these genes were enriched in several important KEGG pathways including Osteoclast differentiation, B cell receptor signaling pathway, MAPK signaling pathway, Chemokine signaling pathway and Insulin signaling pathway. The importance of module genes was replicated by demonstrating that most module genes were genetically associated with BMD in the GWAS data sets. Meta-analyses were performed at the individual gene level by combining p-values and effect sizes. Five candidate genes (ESR1, MAP3K3, PYGM, RAC1 and SYK) were identified based on gene expression meta-analysis, and their associations with BMD were also replicated by two BMD meta-analysis studies.
Conclusions
In summary, our network-based meta-analysis not only identified important differentially expressed genes but also discovered biologically meaningful functional modules for BMD determination. Our study may provide novel therapeutic targets for osteoporosis in women.
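The abstract above relies on Fisher's method to combine p-values across studies. As a minimal sketch of the standard technique (not the authors' code), the statistic is X² = -2 Σ ln(p_i), which follows a chi-square distribution with 2k degrees of freedom under the null hypothesis, assuming the k p-values are independent. Because the degrees of freedom are always even, the tail probability has a closed form and needs only the standard library. The function name and the input values below are illustrative assumptions.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent p-values with Fisher's method.

    Returns (statistic, combined_p), where statistic = -2 * sum(ln p_i)
    and combined_p is the upper tail of a chi-square distribution with
    2 * len(pvalues) degrees of freedom.
    """
    if not pvalues:
        raise ValueError("at least one p-value is required")
    if any(not (0.0 < p <= 1.0) for p in pvalues):
        raise ValueError("p-values must lie in the interval (0, 1]")

    k = len(pvalues)
    statistic = -2.0 * sum(math.log(p) for p in pvalues)

    # For even degrees of freedom 2k, the chi-square survival function is
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    half = statistic / 2.0
    term = 1.0   # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical p-values for one gene across three studies.
stat, combined = fisher_combined_pvalue([0.04, 0.03, 0.06])
print(f"X^2 = {stat:.3f}, combined p = {combined:.4f}")
```

With a single input the combined p-value equals that input, which is a quick sanity check on the closed-form tail.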
https://choosealicense.com/licenses/cc0-1.0/
OmicIDX Data
A comprehensive collection of biological and medical research metadata from major public databases, providing structured access to bioproject, biosample, GEO (Gene Expression Omnibus), and PubMed data.
Dataset Summary
This dataset aggregates metadata from key biological research repositories to enable large-scale analysis and discovery of genomics and biomedical research. It includes standardized information from NCBI's BioProject, BioSample, Gene Expression… See the full description on the dataset page: https://huggingface.co/datasets/seandavis/omicidx-data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Epigenome-wide association studies (EWAS) of 40 traits were conducted using publicly available data from the Gene Expression Omnibus (GEO) online repository. These results were uploaded to The EWAS Catalog: http://www.ewascatalog.org/. The table presented here represents the underlying data in The EWAS Catalog manuscript and contains the traits along with the corresponding GEO accession numbers and PubMed IDs.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Promoter hypermethylation in the death-associated protein kinase 1 (DAPK1) gene has long been linked to cervical neoplasia, but the established results remained controversial. Here, we performed a meta-analysis to assess the associations of DAPK1 promoter hypermethylation with low-grade intra-epithelial lesion (LSIL), high-grade intra-epithelial lesion (HSIL), cervical cancer (CC), and clinicopathological features of CC.
Methods: Published studies with qualitative methylation data were initially searched from PubMed, Web of Science, EMBASE, and China National Knowledge Infrastructure databases (up to March 2018). Then, quantitative methylation datasets, retrieved from the Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO) databases, were pooled to validate the results of published studies.
Results: In a meta-analysis of 37 published studies, DAPK1 promoter hypermethylation progressively increased the risk of LSIL by 2.41-fold (P = 0.012), HSIL by 7.62-fold (P < 0.001), and CC by 23.17-fold (P < 0.001). Summary receiver operating characteristic curves suggested a potential diagnostic value of DAPK1 promoter hypermethylation in CC, with a large area-under-the-curve of 0.83, a high specificity of 97%, and a moderate sensitivity of 59%. There were significant impacts of DAPK1 promoter hypermethylation on histological type (odds ratio (OR) = 3.53, P < 0.001) and FIGO stage of CC (OR = 2.15, P = 0.003). Then, a pooled analysis of nine TCGA and GEO datasets, covering 13 CpG sites within the DAPK1 promoter, identified eight CC-associated sites, six sites with diagnostic values for CC (pooled specificities: 74–90%; pooled sensitivities: 70–81%), nine loci associated with the histological type of CC, and all 13 loci with down-regulated effects on DAPK1 mRNA expression.
Conclusion: The meta-analysis suggests that DAPK1 promoter hypermethylation is significantly associated with the disease severity of cervical neoplasia. DAPK1 methylation detection exhibits a promising ability to discriminate CC from cancer-free controls.
A virtual database of annotations made by 50 database providers (April 2014) - and growing (see below), that map data to publication information. All NIF Data Federation sources can be part of this virtual database as long as they indicate the publications that correspond to data records. The format that NIF accepts is the PubMed Identifier, category or type of data that is being linked to, and a data record identifier. A subset of this data is passed to NCBI as LinkOuts (links at the bottom of PubMed abstracts); however, due to NCBI policies, the full data records are not currently associated with PubMed records. Database providers can use this mechanism to link to other NCBI databases including gene and protein; however, these are not included in the current data set at this time. (To view databases available for linking, see http://www.ncbi.nlm.nih.gov/books/NBK3807/#files.Databases_Available_for_Linking )
The categories that NIF uses have been standardized to the following types:
* Resource: Registry
* Resource: Software
* Reagent: Plasmid
* Reagent: Antibodies
* Data: Clinical Trials
* Data: Gene Expression
* Data: Drugs
* Data: Taxonomy
* Data: Images
* Data: Animal Model
* Data: Microarray
* Data: Brain connectivity
* Data: Volumetric observation
* Data: Value observation
* Data: Activation Foci
* Data: Neuronal properties
* Data: Neuronal reconstruction
* Data: Chemosensory receptor
* Data: Electrophysiology
* Data: Computational model
* Data: Brain anatomy
* Data: Gene annotation
* Data: Disease annotation
* Data: Cell Model
* Data: Chemical
* Data: Pathways
For more information refer to Create a LinkOut file, http://neuinfo.org/nif_components/disco/interoperation.shtm
Participating resources ( http://disco.neuinfo.org/webportal/discoLinkoutServiceSummary.do?id=4 ):
* Addgene http://www.addgene.org/pgvec1
* Animal Imaging Database http://aidb.crbs.ucsd.edu
* Antibody Registry http://www.neuinfo.org/products/antibodyregistry/
* Avian Brain Circuitry Database http://www.behav.org/abcd/abcd.php
* BAMS Connectivity http://brancusi.usc.edu/
* Beta Cell Biology Consortium http://www.betacell.org/
* bioDBcore http://biodbcore.org/
* BioGRID http://thebiogrid.org/
* BioNumbers http://bionumbers.hms.harvard.edu/
* Brain Architecture Management System http://brancusi.usc.edu/bkms/
* Brede Database http://hendrix.imm.dtu.dk/services/jerne/brede/
* Cell Centered Database http://ccdb.ucsd.edu
* CellML Model Repository http://www.cellml.org/models
* CHEBI http://www.ebi.ac.uk/chebi/
* Clinical Trials Network (CTN) Data Share http://www.ctndatashare.org/
* Comparative Toxicogenomics Database http://ctdbase.org/
* Coriell Cell Repositories http://ccr.coriell.org/
* CRCNS - Collaborative Research in Computational Neuroscience - Data sharing http://crcns.org
* Drug Related Gene Database https://confluence.crbs.ucsd.edu/display/NIF/DRG
* DrugBank http://www.drugbank.ca/
* FLYBASE http://flybase.org/
* Gene Expression Omnibus http://www.ncbi.nlm.nih.gov/geo/
* Gene Ontology Tools http://www.geneontology.org/GO.tools.shtml
* Gene Weaver http://www.GeneWeaver.org
* GeneDB http://www.genedb.org/Homepage
* Glomerular Activity Response Archive http://gara.bio.uci.edu
* GO http://www.geneontology.org/
* Internet Brain Volume Database http://www.cma.mgh.harvard.edu/ibvd/
* ModelDB http://senselab.med.yale.edu/modeldb/
* Mouse Genome Informatics Transgenes ftp://ftp.informatics.jax.org/pub/reports/MGI_PhenotypicAllele.rpt
* NCBI Taxonomy Browser http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html
* NeuroMorpho.Org http://neuromorpho.org/neuroMorpho
* NeuronDB http://senselab.med.yale.edu/neurondb
* SciCrunch Registry http://neuinfo.org/nif/nifgwt.html?tab=registry
* NIF Registry Automated Crawl Data http://lucene1.neuinfo.org/nif_resource/current/
* NITRC http://www.nitrc.org/
* Nuclear Receptor Signaling Atlas http://www.nursa.org
* Olfactory Receptor DataBase http://senselab.med.yale.edu/ordb/
* OMIM http://omim.org
* OpenfMRI http://openfmri.org
* PeptideAtlas http://www.peptideatlas.org
* RGD http://rgd.mcw.edu
* SFARI Gene: AutDB https://gene.sfari.org/autdb/Welcome.do
* SumsDB http://sumsdb.wustl.edu/sums/
* Temporal-Lobe: Hippocampal - Parahippocampal Neuroanatomy of the Rat http://www.temporal-lobe.com/
* The Cell: An Image Library http://www.cellimagelibrary.org/
* Visiome Platform http://platform.visiome.neuroinf.jp/
* WormBase http://www.wormbase.org
* YPED http://medicine.yale.edu/keck/nida/yped.aspx
* ZFIN http://zfin.org
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Raw sequencing data for "Comparative Analysis of Single-Cell RNA Sequencing Methods".
https://www.ncbi.nlm.nih.gov/pubmed/28212749
In addition to the GEO submission https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE75790, you can find here raw BAM files for the UMI methods, tagged with cell barcode and UMI sequences.
MD5 checksum: f10825509952fffd9c4dc0c1dcb9eb8e
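A published MD5 checksum lets you confirm that a download arrived intact. The sketch below uses Python's standard `hashlib` module and reads the file in chunks so that large archives do not need to fit in memory. The source does not state which file the checksum covers, so the path passed in is up to the reader, and the function names are illustrative.

```python
import hashlib

# Checksum quoted in the dataset description above.
EXPECTED_MD5 = "f10825509952fffd9c4dc0c1dcb9eb8e"


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hexadecimal MD5 digest of the file at `path`."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns an empty bytes object.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected=EXPECTED_MD5):
    """Return True if the file's MD5 digest equals `expected` (case-insensitive)."""
    return md5_of_file(path) == expected.lower()
```

A call such as `matches_checksum("downloaded_file")` (a hypothetical path) returns `False` if the file is truncated or corrupted. MD5 is suitable for detecting accidental corruption but not deliberate tampering.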
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The FooDrugs database was developed by the Computational Biology Group at IMDEA Food Institute (Madrid, Spain), in the context of the Food Nutrition Security Cloud (FNS-Cloud) project. Food Nutrition Security Cloud (FNS-Cloud) has received funding from the European Union's Horizon 2020 Research and Innovation programme (H2020-EU.3.2.2.3. – A sustainable and competitive agri-food industry) under Grant Agreement No. 863059 – www.fns-cloud.eu (see more details about FNS-Cloud below).
FooDrugs stores information extracted from transcriptomics and text documents for food-drug interactions, and it is part of a demonstrator to be developed in the FNS-Cloud project. The database was built using MySQL, an open source relational database management system. FooDrugs hosts information for a total of 161 transcriptomics GEO series with 585 conditions for food or bioactive compounds. Each condition is defined as a food/biocomponent per time point, per concentration, per cell line, primary culture or biopsy per study. FooDrugs includes information about a bipartite network with 510 nodes and their similarity scores (tau score; https://clue.io/connectopedia/connectivity_scores) related to possible interactions with drugs assayed in the Connectivity Map (https://www.broadinstitute.org/connectivity-map-cmap). The information is stored in eight tables:
Table “study”: This table contains basic information about study identifiers from GEO, PubMed or platform, study type, title and abstract.
Table “sample”: This table contains basic information about the different experiments in a study, like the identifier of the sample, treatment, origin type, time point or concentration.
Table “misc_study”: This table contains additional information about different attributes of the study.
Table “misc_sample”: This table contains additional information about different attributes of the sample.
Table “cmap”: This table contains information about 70895 nodes, comprising drugs, foods or bioactives, overexpressed and knockdown genes (see section 3.4). The information includes cell line, compound and perturbation type.
Table “cmap_foodrugs”: This table contains information about the tau score (see section 3.4) that relates food with drugs or genes and the node identifier in the FooDrugs network.
Table “topTable”: This table contains information about 150 over- and underexpressed genes from each GEO study condition, used to calculate the tau score (see section 3.4). The information stored is the logarithmic fold change, average expression, t-statistic, p-value, adjusted p-value and whether the gene is up- or downregulated.
Table “nodes”: This table stores the information about the identification of the sample and the node in the bipartite network connecting the tables “sample”, “cmap_foodrugs” and “topTable”.
In addition, the FooDrugs database stores a total of 6422 food/drug interactions from 2849 text documents, obtained from three different sources: 2312 documents from PubMed, 285 from DrugBank, and 252 from drugs.com. These documents describe potential interactions between 1464 food/bioactive compounds and 3009 drugs. The information is stored in two tables:
Table “texts”: This table contains all the documents, with their identifiers, in which interactions have been identified with the strategy described in section 4.
Table “TM_interactions”: This table contains information about interaction identifiers, the food and drug entities, and the start and end positions of the context for the interaction in the document.
FNS-Cloud will overcome fragmentation problems by integrating existing FNS data, which is essential for high-end, pan-European FNS research, addressing FNS, diet, health, and consumer behaviours as well as sustainable agriculture and the bio-economy. Current fragmented FNS resources not only result in knowledge gaps that inhibit public health and agricultural policy, and the food industry from developing effective solutions, making production sustainable and consumption healthier, but also do not enable exploitation of FNS knowledge for the benefit of European citizens. FNS-Cloud will, through three Demonstrators (Agri-Food, Nutrition & Lifestyle, and NCDs & the Microbiome), facilitate: (1) Analyses of regional and country-specific differences in diet including nutrition, (epi)genetics, microbiota, consumer behaviours, culture and lifestyle and their effects on health (obesity, NCDs, ethnic and traditional foods), which are essential for public health and agri-food and health policies; (2) Improved understanding of agricultural differences within Europe and what these mean in terms of creating sustainable, resilient food systems for healthy diets; and (3) Clear definitions of boundaries and how these affect the compositions of foods and consumer choices and, ultimately, personal and public health in the future. Long-term sustainability of the FNS-Cloud will be based on Services that have the capacity to link with new resources and enable cross
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A total of 11 expression profiles comparing RA, OA and NTs samples were collected in this study. Their GEO accession numbers, PubMed IDs, publication dates, tissue types, expression platforms and numbers of samples are listed.
Remark 1: for cell cycle analysis, see the paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" by Alexander Chervov and Andrei Zinovyev.
Remark 2: the same data are available at https://www.kaggle.com/datasets/alexandervc/scrnaseq-exposed-to-multiple-compounds as pieces extracted from the huge file here, which are easier to load and work with.
Data: results of single cell RNA sequencing, i.e. rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics
Data: scRNA expression for several cell lines exposed to drugs at different doses and durations.
The data are from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE139944
Status: Public on Dec 05, 2019
Title: Massively multiplex chemical transcriptomics at single cell resolution
Organisms: Homo sapiens; Mus musculus
Experiment type: Expression profiling by high throughput sequencing
Summary: Single-cell RNA-seq libraries were generated using two and three level single-cell combinatorial indexing RNA sequencing (sci-RNA-seq) of untreated or small molecule inhibitor exposed HEK293T, NIH3T3, A549, MCF7 and K562 cells. Different cells and different treatment were hashed and pooled prior to sci-RNA-seq using a nuclear barcoding strategy. This nuclear barcoding strategy relies on fixation of barcode containing well-specific oligos that are specific to a given cell type, replicate or treatment condition.
The corresponding paper is here: https://pubmed.ncbi.nlm.nih.gov/31806696/ Science. 2020 Jan 3;367(6473):45-51 "Massively multiplex chemical transcriptomics at single-cell resolution" Sanjay R Srivatsan, ... , Cole Trapnell
The authors split the data into 4 subdatasets; see sciPlex1, sciPlex2, sciPlex3 and sciPlex4 in the filenames. The main dataset is sciPlex3, which contains about 600K cells.
The data split into small parts, each of which can be easily loaded into memory, can be found at https://www.kaggle.com/alexandervc/scrnaseq-exposed-to-multiple-compounds
Other single cell RNA seq datasets can be found on kaggle: Look here: https://www.kaggle.com/alexandervc/datasets Or search kaggle for "scRNA-seq"
A collection of some bioinformatics related resources on kaggle: https://www.kaggle.com/general/203136
Single cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6
Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x
A Google Scholar search for "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart returns many interesting and highly cited articles, for example:
(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833
Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages273–282 (2019)
Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)
Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12
Raw data obtained from the GEO database. Research article: Identification of rheumatoid arthritis and osteoarthritis patients by transcriptome-based rule set generation. Link: https://pubmed.ncbi.nlm.nih.gov/24690414/
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background
In this research, a meta-analysis was performed to assess the associations of O6-methylguanine-DNA methyltransferase (MGMT) promoter hypermethylation with low-grade intraepithelial lesion (LSIL), high-grade intraepithelial lesion (HSIL), cervical cancer (CC), and clinicopathological characteristics of CC.
Methods
Literature was selected by searching PubMed, Web of Science, EMBASE, China National Knowledge Infrastructure and Wanfang databases (up to November 2018). Associations between MGMT methylation and LSIL, HSIL, CC risk and clinicopathological characteristics were assessed through pooled odds ratios (ORs) with relevant 95% confidence intervals (CIs). Subgroup analyses, meta-regressions and Galbraith plots were used to explore the possible sources of heterogeneity. Genome-wide DNA methylation array studies were extracted from the Gene Expression Omnibus (GEO) database to validate these outcomes.
Results
In this meta-analysis of 25 published articles, the rate of MGMT hypermethylation gradually increased from the control group (12.16%) through LSIL (20.92%) and HSIL (36.33%) to CC (41.50%) specimens. MGMT promoter methylation was significantly associated with the increased risk of LSIL by 1.74-fold (P
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Protein-Protein, Genetic, and Chemical Interactions for Lu T (2023): Bioinformatics analysis and single-cell RNA sequencing: elucidating the ubiquitination pathways and key enzymes in lung adenocarcinoma. Curated by BioGRID (https://thebiogrid.org); ABSTRACT: Lung adenocarcinoma (LUAD) is a prevalent subtype of lung cancer associated with high mortality rates. We aimed to utilize single-cell multiomics analysis to identify the key molecules involved in ubiquitination modification, which plays a role in LUAD development and progression. We use a systematic approach to analyze LUAD-related single-cell and bulk transcriptome datasets from Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases. Single-cell RNA sequencing (scRNA-seq) data were normalized, clustered, and annotated with the Seurat package in R. InferCNV was used to distinguish malignant from epithelial cells, and AUCell evaluated the area under the curve (AUC) score of ubiquitination-related enzymes. Survival and differential analyses identified significant molecular markers associated with ubiquitination. PSMD14 expression was confirmed using reverse-transcription quantitative polymerase chain reaction (RT-qPCR) and Western blot assays, and its knockdown cell lines were assessed for effects on cellular processes and tumor formation in mice. PSMD14's interacting proteins were predicted, and its impact on AGR2 protein half-life and ubiquitination was evaluated. Rescue experiments involving PSMD14 overexpression and AGR2 silencing assessed their impact on malignant behaviors. By means of single-cell sequencing analysis, we probed the ubiquitination modification landscape in the LUAD microenvironment. Malignant cells had elevated scores for enzymes and ubiquitin-binding domains compared to normal epithelial cells, with 53 ubiquitination-related molecules showing prognostic disparities. FGR, PSMD14, and ZBTB16 were identified as genes with prognostic significance, with PSMD14 showing higher expression in epithelial and malignant cells. Two missense mutation sites were identified in PSMD14, which had a high copy number amplification ratio and positive correlation with messenger RNA (mRNA) expression. PSMD14 expression and tumor stage were found to be independent prognostic factors, and interfering with PSMD14 expression reduced the malignant behavior of LUAD cells. PSMD14 was found to bind to AGR2 protein and reduce its ubiquitination, leading to increased AGR2 stability. Knockdown of AGR2 inhibited the enhancement of cell viability, invasion, and migration resulting from PSMD14 overexpression. This study examined ubiquitination modifications in LUAD using sequencing data, identifying PSMD14's critical role in malignancy regulation and its potential as a prognostic and therapeutic biomarker. These insights enhance understanding of LUAD mechanisms and treatment.
Data: results of single cell RNA sequencing, i.e. rows correspond to cells and columns to genes (in the csv file it is the other way around). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics
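Since the csv file stores genes as rows and cells as columns, it usually needs to be transposed before use with tools that expect one row per cell. The following is a minimal standard-library sketch of that step. It assumes the first column holds gene identifiers and the header row holds cell identifiers, which is a guess about the layout rather than a documented fact, and the sample values are made up.

```python
import csv
import io


def read_cells_by_genes(handle):
    """Read a genes-by-cells CSV and return (cell_ids, gene_ids, matrix).

    `matrix[i][j]` is the expression of gene j in cell i, i.e. the
    transpose of the layout stored in the file.
    """
    reader = csv.reader(handle)
    header = next(reader)
    cell_ids = header[1:]          # first header cell labels the gene column
    gene_ids = []
    per_gene_values = []           # one list of per-cell values for each gene
    for row in reader:
        gene_ids.append(row[0])
        per_gene_values.append([float(value) for value in row[1:]])
    # zip(*rows) regroups the values by cell, which transposes the matrix.
    matrix = [list(values) for values in zip(*per_gene_values)]
    return cell_ids, gene_ids, matrix


# Hypothetical miniature file: 2 genes (rows) by 3 cells (columns).
SAMPLE = """gene,cell_1,cell_2,cell_3
GeneA,0,5,2
GeneB,3,0,1
"""

cells, genes, matrix = read_cells_by_genes(io.StringIO(SAMPLE))
for cell, values in zip(cells, matrix):
    print(cell, dict(zip(genes, values)))
```

For the full dataset a dense list of lists may be too large for memory; a sparse matrix library would be a more practical choice there, but that is outside this sketch.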
Paper: "Droplet barcoding for single cell transcriptomics applied to embryonic stem cells" Cell. 2015 May 21;161(5):1187-1201. doi: 10.1016/j.cell.2015.04.044. Allon M Klein 1, Linas Mazutis 2, Ilke Akartuna 3, Naren Tallapragada 1, Adrian Veres 4, Victor Li 1, Leonid Peshkin 1, David A Weitz 5, Marc W Kirschner https://pubmed.ncbi.nlm.nih.gov/26000487/ Data: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE65525 Or: https://hemberg-lab.github.io/scRNA.seq.datasets/mouse/esc/
Single cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6
Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x