Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
One of the key challenges for transcriptomics-based research is not only the processing of large data but also modeling the complexity of features that are sources of variation across samples, which is required for an accurate statistical analysis. Therefore, our goal is to foster access for wet lab researchers to bioinformatics tools, in order to enhance their ability to explore biological aspects and validate hypotheses with robust analysis. In this context, user-friendly interfaces can enable researchers to apply computational biology methods without requiring bioinformatics expertise. Such bespoke platforms can improve the quality of the findings by allowing the researcher to freely explore the data and test new hypotheses independently. Simplicity DiffExpress is a data-driven software platform dedicated to enabling non-bioinformaticians to take ownership of the differential expression analysis (DEA) step in a transcriptomics experiment while presenting the results in a comprehensible layout, which supports efficient exploration of results, information storage, and reproducibility. Simplicity DiffExpress’ key component is the bespoke statistical model validation that guides the user through any necessary alteration in the dataset or model, tackling the challenges behind complex data analysis. The software utilizes edgeR, and it is implemented as part of the Simplicity™ platform, providing a dynamic interface with well-organized results that are easy to navigate and share. Computational biologists and bioinformaticians can also benefit from its use since the data validation is more informative than the usual DEA resources. Wet-lab collaborators can benefit from receiving their results in an organized interface. Simplicity DiffExpress is freely available for academic use, and it is cloud-based (https://simplicity.nsilico.com/dea).
https://www.immport.org/agreement
A number of factors influence vaccination effectiveness, including age, sex, and comorbidities. A transcriptome analysis was performed via RNA sequencing. Genes with immunological functions show increased expression in individuals with high pre-existing immunity. Based on the transcriptome analysis, this set of genes can be used to predict vaccine response.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
EdgeR results from MMGs. Differential expression results calculated by edgeR for MMG counts produced by the stage 2 analysis. Can be downloaded from [43]. (XLSX 428 kb)
Figure S1, Venn diagram showing the number of differentially expressed genes identified by two versions of Cuffdiff2.
Figure S2, The effects of biological replicates on the differential expression analysis for Cuffdiff v2.0.2.
Figure S3, The detected fold changes of all the differentially expressed genes identified by three tools were compared and shown, including DESeq vs. edgeR (top panel), DESeq vs. Cuffdiff2 (middle panel) and edgeR vs. Cuffdiff2 (bottom panel).
File S1, Analysis pipelines, methods and examples of commands for differential expression analysis, subsampling fastq files and generating SAM/BAM files based on simulated count values.
File S2, The raw count values for genes with high fold changes that were picked up by edgeR but not by DESeq. Genes with high fold changes (the absolute value of log2 fold changes larger than 2) identified as DEGs by edgeR but not by DESeq are listed in the file. The gene ID, the log2 fold changes (logFC) and FDR from DESeq, the logFC and FDR from edgeR, the raw count values for the four replicates of sample K (K1–K4) and sample N (N1–N4) are shown in each of the columns.
Table S1, Numbers of reads for the human hbr and uhr samples from the MAQC dataset.
Table S2, Numbers of reads for the mouse neurosphere samples for treatment groups of K and N (the K_N dataset).
Table S3, The number of reads for each individual sample of the LCL3 dataset.
Table S4, The definition for TP, FP, TN, FN, TPR and FPR.
Table S5, The false positive rate for Cuffdiff2, DESeq and edgeR based on the LCL1 dataset.
(ZIP)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Warden and Wu Preprint: v1
In general, this primarily focuses on the following types of comparisons:
Cell line experiments with over-expression or knock-down to define a known causal gene, with processing starting with public reads.
Processed TCGA (The Cancer Genome Atlas) data for breast cancer (BRCA) to compare gene expression by immunohistochemistry status (ER/ESR1, PR/PGR, or HER2/ERBB2).
Differential expression methods include the following:
edgeR (GLM)
edgeR-robust (GLM)
edgeR (QL)
edgeR-robust (QL)
DESeq1
DESeq2
limma-voom
limma-trend (CPM)
limma-trend (FPKM/RPKM)
ANOVA (log2 FPKM/RPKM)
The most common preprocessing strategies include STAR, TopHat2, and Salmon. However, a limited amount of additional processing with HISAT2, kallisto, Bowtie2 (+eXpress), and Bowtie1 (+RSEM) is also provided.
Most STAR and TopHat2 alignments use htseq-count for quantification, as well as running cuffdiff (for single variable 2-group comparisons). However, a limited amount of additional processing with featureCounts is also provided.
Most STAR and TopHat2 alignments start with the public forward reads, even if paired-end data was available.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Supplementary Tables Overview
The supplementary tables provide detailed results from our scRNA-seq analysis. Table S1 outlines cluster membership and cell type annotations across three major immune and non-immune subsets. Table S2 lists top marker genes per cluster based on distinctiveness of gene expression profile. Table S3 summarizes the distribution of biological replicates across conditions and comparisons. Table S4 presents differential expression results (edgeR) for each cluster within four comparison pairs. Table S5 reports significantly enriched KEGG pathways identified via GSEA using ranked gene lists. Together, these tables support cell type classification, condition-specific expression changes, and functional interpretation of the dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data set 1. Transcript expression across human RNA-Seq samples: estimated read counts. The file contains estimated read counts, generated by kallisto (https://pachterlab.github.io/kallisto/), for human transcripts and RNA-Seq samples used in this study (see Additional file 2 of the accompanying publication). The format is a compressed (GZIP) tab-separated transcript-by-sample matrix. Ensembl transcript identifiers and a combined Sequence Read Archive study/sample name identifier serve as row and column names, respectively.
Data set 2. Transcript expression across murine RNA-Seq samples: estimated read counts. As in Data set 1, but for mouse transcripts.
Data set 3. Transcript expression across simian RNA-Seq samples: estimated read counts. As in Data set 1, but for chimpanzee transcripts.
Data set 4. Transcript expression across human RNA-Seq samples: estimated transcript abundances. As in Data set 1, but instead of read counts, transcript abundances in transcripts per million (TPM), as estimated by kallisto (https://pachterlab.github.io/kallisto/), are listed. Format, column and row names as in Data set 1.
Data set 5. Transcript expression across murine RNA-Seq samples: estimated transcript abundances. As in Data set 4, but for mouse transcripts.
Data set 6. Transcript expression across simian RNA-Seq samples: estimated transcript abundances. As in Data set 4, but for chimpanzee transcripts.
Data set 7. Differential expression analyses across human RNA-Seq sample groups: log fold changes. The file contains log fold changes, inferred by edgeR (http://bioconductor.org/packages/release/bioc/html/edgeR.html), for human genes and the RNA-Seq sample group contrasts listed in Additional file 3 of the accompanying publication in a compressed (GZIP) TSV gene-by-comparison matrix. Ensembl gene identifiers and a descriptive contrast identifier serve as row and column names, respectively.
Data set 8. Differential expression analyses across murine RNA-Seq sample groups: log fold changes. As in Data set 7, but for mouse genes.
Data set 9. Differential expression analyses across simian RNA-Seq sample groups: log fold changes. As in Data set 7, but for chimpanzee genes.
Data set 10. Differential expression analyses across human RNA-Seq sample groups: false discovery rates. The file contains false discovery rates (FDR) for the differential expression analyses summarized in Data set 7. Format, column and row names as in Data set 7.
Data set 11. Differential expression analyses across murine RNA-Seq sample groups: false discovery rates. As in Data set 10, but for mouse genes.
Data set 12. Differential expression analyses across simian RNA-Seq sample groups: false discovery rates. As in Data set 10, but for chimpanzee genes.
Data set 13. Quantification of alternative splicing events across human RNA-Seq samples. The file contains ‘percent spliced in’ (PSI) values computed by SUPPA (https://github.com/comprna/SUPPA) for annotated alternative splicing events (inferred from the transcript annotation of the human genome, Ensembl release 84; http://www.ensembl.org/). The format is a compressed (GZIP) tab-separated transcript-by-sample matrix. SUPPA-provided event identifiers and a combined Sequence Read Archive study/sample name identifier serve as row and column names, respectively.
Data set 14. Quantification of alternative splicing events across murine RNA-Seq samples. As in Data set 13, but for mouse alternative splicing events.
Data set 15. Differential splicing analyses across human RNA-Seq sample groups: differences in ‘percent spliced in’ (ΔPSI). The file contains ΔPSI values for human alternative splicing events (as in Data set 13). The RNA-Seq sample group contrasts are listed in Additional file 3 of the accompanying publication. Values were inferred by SUPPA’s diffSplice functionality (https://github.com/comprna/SUPPA). The format is a compressed (GZIP) tab-separated gene-by-comparison matrix. SUPPA event identifiers and a descriptive contrast identifier serve as row and column names, respectively.
Data set 16. Differential splicing analyses across murine RNA-Seq sample groups: differences in ‘percent spliced in’ (ΔPSI). As in Data set 15, but for mouse alternative splicing events.
Data set 17. Differential splicing analyses across human RNA-Seq sample groups: P values. The file contains P values for the differential splicing analysis of human alternative splicing events summarized in Data set 15. Format, column and row names as in Data set 15.
Data set 18. Differential splicing analyses across murine RNA-Seq sample groups: P values. The file contains P values for the differential splicing analysis of mouse alternative splicing events summarized in Data set 16. Format, column and row names as in Data set 15.
Data set 19. Transcript expression across murine RNA-Seq time course data: estimated read counts. As in Data set 2, but for the time course data generated for the accompanying publication.
Data set 20. Trans
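The matrices described above share one layout: a GZIP-compressed, tab-separated table with identifiers as row and column names. A minimal Python sketch for loading such a file is shown here. The file name, the identifiers in the demo matrix, and the handling of `NA` cells are assumptions made for illustration and are not taken from the data sets themselves.

```python
import csv
import gzip
import os
import tempfile


def to_float(text):
    """Convert a cell to float; empty or NA-like cells become NaN (assumption)."""
    return float("nan") if text in ("", "NA", "NaN") else float(text)


def read_matrix(path):
    """Read a gzip-compressed TSV matrix.

    Returns (column_names, {row_name: [values...]}).
    """
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        records = [record for record in reader if record]
    # Tables written by R may omit the header cell above the row names,
    # so the header can be one cell shorter than the data rows.
    if records and len(header) == len(records[0]) - 1:
        columns = header
    else:
        columns = header[1:]
    rows = {record[0]: [to_float(v) for v in record[1:]] for record in records}
    return columns, rows


def write_demo_matrix(path):
    """Write a tiny made-up matrix so the reader can be tried out."""
    with gzip.open(path, "wt", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["transcript_id", "STUDY1_sampleA", "STUDY1_sampleB"])
        writer.writerow(["TX0001", "12", "0"])
        writer.writerow(["TX0002", "3.5", "NA"])


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo_counts.tsv.gz")
    write_demo_matrix(demo_path)
    print(read_matrix(demo_path))
```

Estimated read counts from kallisto can be fractional, which is why the values are parsed as floats rather than integers.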
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Becker muscular dystrophy (BMD) is a rare X-linked recessive neuromuscular disorder, frequently caused by in-frame deletions in the DMD gene that result in the production of a truncated, yet functional, dystrophin protein. The consequences of BMD-causing in-frame deletions on the organism are difficult to predict, especially in regard to long-term prognosis. Here, we used CRISPR-Cas9 to generate a new Dmd Δ52-55 mouse model by deleting exons 52-55 in the Dmd gene, resulting in a BMD-like in-frame deletion. To delineate the long-term effects of this deletion, we studied these mice over 52 weeks by performing histology and echocardiography analyses and assessing motor functions. To further delineate the effects of the exons 52-55 in-frame deletion, we performed RNA-Seq pre- and post-exercise and identified several differentially expressed pathways that could explain the abnormal muscle phenotype observed at 52 weeks in the BMD model.
This dataset contains the results and raw data of the RNA-sequencing and transcriptomic analysis for 52-week-old exercised and non-exercised mice (4 BMD, 4 WT and 4 DMD, as indicated in the name of each file).
1. Due to size restrictions, this RNA-Seq dataset will be published on Zenodo in 3 parts. This first part contains the data for the exercised mice, including the fastq (R1 and R2) and associated (md5) files for the 4 BMD mice (15315-15318) and 2 DMD mice (15319 and 15320), all the raw gene counts (txt files), and all the differentially expressed genes (tsv files).
Workflow (performed by TCAG at SickKids):
2. RNA-Seq Library and Reference Genome Information
Type of library: stranded, paired end
Genome reference sequence: GRCm39, M31 Gencode gene models.
3. Read Pre-processing, Alignment and Obtaining Gene Counts
3.1 Read Pre-processing
The sequencing data is in FASTQ format. The quality of the data is assessed using FastQC v.0.11.5 (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/).
Adaptors are trimmed using Trim Galore (http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/) v. 0.5.0. Trim Galore is running Cutadapt (https://cutadapt.readthedocs.org/en/stable/) v. 1.10. Trim Galore is run with the following parameters:
-q 25 – the reads are trimmed from the 3' end base by base, trimming stops if the quality of the base is greater than 25;
--clip_R1 6, --clip_R2 6 – clip the first 6 nucleotides from the 5' ends of read 1 and read 2;
--stringency 5 – an overlap of at least 5 nucleotides with the Illumina primer sequence is needed for trimming;
--length 40 – any read that is shorter than 40 nucleotides as a result of trimming is discarded;
--paired – only pairs of reads are retained (for paired-end reads only, not for single reads).
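The parameters listed above can be assembled into a single Trim Galore call. The sketch below only builds and prints the command line in Python; it does not run it. The FASTQ file names are hypothetical, and the executable name `trim_galore` is assumed to be on the PATH.

```python
import shlex


def trim_galore_command(read1, read2, executable="trim_galore"):
    """Build the argument list for a paired-end Trim Galore run.

    The option values mirror the parameter list in the workflow description;
    the input file names are supplied by the caller.
    """
    return [
        executable,
        "-q", "25",            # quality trimming threshold at the 3' end
        "--clip_R1", "6",      # clip 6 nt from the 5' end of read 1
        "--clip_R2", "6",      # clip 6 nt from the 5' end of read 2
        "--stringency", "5",   # minimum adaptor overlap required for trimming
        "--length", "40",      # discard reads shorter than 40 nt after trimming
        "--paired",            # keep only intact read pairs
        read1,
        read2,
    ]


# Hypothetical input files, for illustration only.
cmd = trim_galore_command("sample_R1.fastq.gz", "sample_R2.fastq.gz")
print(shlex.join(cmd))
```

Keeping the command as a list makes it straightforward to pass to `subprocess.run(cmd, check=True)` if execution is desired.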
The type of adaptor is automatically detected by screening the first 1 million sequences of the first specified file for the first 12/13 nucleotides of the standard Illumina or Nextera primers and the sequence from the start of the primer to the 3' end of the read is trimmed.
The quality of the trimmed reads is re-assessed with FastQC.
The trimmed reads are also screened for presence of rRNA and mtRNA sequences using FastQ-Screen v.0.10.0 (http://www.bioinformatics.babraham.ac.uk/projects/fastq_screen/).
To assess the read distribution and positional read duplication and to confirm the strandedness of the alignments, we use the RSeQC package (http://rseqc.sourceforge.net/), v. 2.6.2. The distribution of reads across exonic, intronic and intergenic sequences is assessed by the read_distribution.py program, infer_experiment.py is used for confirming strandedness, and read_duplication.py is used to obtain the positional read duplication (percentage of reads mapping to exactly the same genomic location). A sufficient proportion of reads should map to the exonic sequences (ideally > 70-80%). A large number of reads mapping to intronic sequences in a poly-A mRNA library suggests a significant presence of pre-mRNA or other issues with RNA preparation. For stranded RNA-seq experiments the majority of the reads should map exclusively to one strand, same or opposite to the transcript, depending on the library preparation method. For non-stranded experiments the reads should be equally distributed between both strands.
3.2. Read Alignment
The raw trimmed reads are aligned to the reference genome using the STAR aligner, v.2.6.0c. (https://github.com/alexdobin/STAR, https://academic.oup.com/bioinformatics/article/29/1/15/272537). The alignments are contained in the .bam files. The “.bam” together with the “.bai” files can be used for viewing of the alignments in the Integrative Genomics Viewer (IGV, http://software.broadinstitute.org/software/igv/).
3.3. Obtaining Gene Counts
The filtered STAR alignments are processed to extract raw read counts for genes using htseq-count v.0.6.1p2 (HTSeq, http://www-huber.embl.de/users/anders/HTSeq/doc/overview.html). Assigning reads to genes by htseq-count is done in the mode “intersection-nonempty”, i.e. if a read overlaps with two overlapping genes and the overlap to gene A is greater than the overlap to gene B, the read is counted towards gene A, while if a read overlaps equally with gene A and gene B, then it is not counted towards either gene. htseq-count does not count reads with multiple alignments, to avoid introducing bias in the expression results. Only uniquely mapping reads are counted.
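The assignment rule described above can be illustrated with a simplified Python sketch. It follows the general idea documented for HTSeq's intersection-nonempty mode (intersect the non-empty gene sets found at each position of the read), but it is not HTSeq's code: it ignores strand, spliced alignments and multi-mapping, and the gene names and coordinates are made up. Returning `None` for every read that is not counted is a simplification of this sketch.

```python
def assign_read(read_start, read_end, genes):
    """Assign a read to a gene using an intersection-of-non-empty-sets rule.

    read_start, read_end: half-open interval [start, end) covered by the read.
    genes: dict mapping gene name -> (start, end), half-open, same chromosome.
    Returns the gene name if exactly one gene remains, otherwise None.
    """
    nonempty_sets = []
    for position in range(read_start, read_end):
        overlapping = {
            name for name, (start, end) in genes.items() if start <= position < end
        }
        if overlapping:  # positions outside any gene are skipped
            nonempty_sets.append(overlapping)
    if not nonempty_sets:
        return None  # read does not overlap any gene
    common = set.intersection(*nonempty_sets)
    if len(common) == 1:
        return next(iter(common))
    return None  # ambiguous between genes, or no gene shared by all positions


# Hypothetical overlapping genes, for illustration only.
genes = {"geneA": (100, 200), "geneB": (150, 250)}
print(assign_read(120, 170, genes))  # part of the read lies in geneA only
print(assign_read(150, 200, genes))  # read lies entirely in both genes
```

In the first call some positions are covered by geneA alone, so only geneA survives the intersection; in the second call every position is covered by both genes, so the read is not counted.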
4. Pre-processing, Alignment and Gene Counts QC
MultiQC (https://multiqc.info/) is a reporting tool that aggregates statistics generated by bioinformatics analyses across multiple samples. MultiQC v. 1.14 was used to generate a consolidated report from FastQC screening of both untrimmed and trimmed reads, and from RSeQC, FastQ Screen, STAR and htseq-count results. The MultiQC report is contained in MultiQC_Report_*.html file.
5. DGE Analysis with edgeR
Differential expression analysis was performed with the edgeR R package v.3.28.1, using R v.3.6.1 (http://www.bioconductor.org/packages/release/bioc/html/edgeR.html). The data set was filtered to retain only genes whose gene counts were >50 in at least 3 samples. This is intended to remove genes that are not expressed, or expressed at a very low level.
The method used for normalizing the data was TMM, implemented by the calcNormFactors(y) function. All samples were normalized and filtered together. The glmLRT functionality in edgeR was used for the differential expression tests, with sample group taken into account.
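The filtering rule stated above (counts >50 in at least 3 samples) can be sketched in a few lines of Python. This is an illustration of the rule only, not the edgeR code used by the authors; the gene names and counts are invented, and TMM normalization is deliberately left out.

```python
def filter_genes(counts, min_count=50, min_samples=3):
    """Keep genes whose count exceeds min_count in at least min_samples samples.

    counts: dict mapping gene name -> list of raw counts, one per sample.
    """
    kept = {}
    for gene, row in counts.items():
        samples_passing = sum(1 for value in row if value > min_count)
        if samples_passing >= min_samples:
            kept[gene] = row
    return kept


# Made-up counts for three genes across four samples.
example_counts = {
    "gene1": [120, 75, 60, 10],   # above 50 in three samples: kept
    "gene2": [51, 50, 50, 400],   # above 50 in only two samples: removed
    "gene3": [0, 0, 1, 2],        # essentially unexpressed: removed
}
print(sorted(filter_genes(example_counts)))
```

The comparison is strict (`>`), matching the ">50" wording, so a count of exactly 50 does not pass.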
EdgeR Results Legend:
· GeneID – Ensembl Gene ID;
· Chr.Start.End - gene coordinates;
· GeneName, GeneType, etc. – Gene attributes, derived from the genome annotation;
· logFC - Log2 Fold Change (use this column for selection of DEGs);
· logCPM - Log2 Counts Per Million, average for all libraries;
· LR – Statistic calculated by the LR-Test;
· PValue - Differential expression P value;
· FDR – Differential expression False Discovery Rate, calculated by the Benjamini-Hochberg method (use this column for selection of DEGs);
· (columns labeled with sample names) – Fragments Per Kilobase of transcript per Million mapped reads (FPKMs) for the given samples.
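The FDR column in the legend above is described as Benjamini-Hochberg adjusted. A minimal Python sketch of that adjustment is given here for readers who want to see how such values relate to the raw P values; it is a generic implementation of the procedure, not the code that produced the result files, and the P values in the example are invented.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the original order."""
    m = len(pvalues)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Invented P values, for illustration only.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Each P value is scaled by the number of tests divided by its rank, and the running minimum keeps the adjusted values from decreasing as the raw P values increase.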
Supplementary table 2, containing full results from edgeR RNA-sequencing differential expression analysis of B. malayi microfilariae controls compared to treatment with tetracyclines, used in the associated manuscript for drawing conclusions. Full details on data generation are available in the related manuscript.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
EdgeR results from unique counts. Differential expression results calculated by edgeR for gene counts produced by the stage 1 analysis. Can be downloaded from [43]. (XLSX 2159 kb)
Mitochondrial functions are intimately reliant on proteins and RNAs encoded in both the nuclear and mitochondrial genomes, leading to inter-genomic coevolution within taxa. Hybridization can break apart coevolved mitonuclear genotypes, resulting in decreased mitochondrial performance and reduced fitness. This hybrid breakdown is an important component of outbreeding depression and early-stage reproductive isolation. However, the mechanisms contributing to mitonuclear interactions remain poorly resolved. Here we scored variation in developmental rate (a proxy for fitness) among reciprocal F2 inter-population hybrids of the intertidal copepod Tigriopus californicus, and used RNA sequencing to assess differences in gene expression between fast- and slow-developing hybrids. In total, differences in expression associated with developmental rate were detected for 2,925 genes, whereas only 135 genes were differentially expressed as a result of differences in mitochondrial genotype. Up-regulate...
Developmental rate data - collected by daily monitoring of naupliar (larval) development of individual Tigriopus californicus until stage 1 copepodid metamorphosis was observed.
RNA-seq count data - collected by isolating RNA from pools of fast- or slow-developing Tigriopus californicus copepodids from reciprocal hybrid lines; RNA was sequenced on a NovaSeq 6000; reads were mapped to a hybrid reference genome and counted using STAR and featureCounts; differences in gene expression were tested with edgeR; gene ontology enrichments among differentially expressed genes were assessed with goseq.
All files can be opened in text editors, spreadsheet programs or the statistical software R.
Sample type: SRA
Source name: bone marrow derived
Organism: Mus musculus
Characteristics
strain: C57BL/6
Sex: male
age: 8 to 10 weeks
Growth protocol: After 1 day in 0.6ng/ml CSF1 in alpha+ MEM /15% FCS, non-adherent cells were incubated for two days in a fresh dish containing 12ng/ml CSF1 in alpha+ MEM /10% FCS and then for 7 days in 120ng/ml CSF1 in alpha+ MEM /10% FCS. Cells were incubated for a further two days in fresh alpha+ MEM /10% FCS containing 120ng/ml CSF1 and 20ng/ml IL4.
Extracted molecule: total RNA
Extraction protocol: mRNA was harvested using the RNeasy kit (QIAGEN) with on-column DNase treatment. 1 µg of total RNA was used for the construction of sequencing libraries.
RNA libraries were prepared for sequencing using standard Ion Torrent protocols.
Library strategy: RNA-Seq
Library source: transcriptomic
Library selection: cDNA
Instrument model: Ion Torrent S5
Description: IL4_48h_M26_BMM
Wt_vs_IL4_allprobes_reads.txt
Wt_vs_IL4_allprobes_log2_RPM.txt
Data processing: Torrent Suite Software 5.10 was used for basecalling, and sequenced reads were trimmed for adaptor sequence and masked for low-complexity or low-quality sequence, returning a fastq file (raw data).
Reads were then mapped to the GRCm38.p6 genome using the open-source Hisat2-2.0.5 aligner.
The Hisat2-generated BAM files were uploaded into SeqMonk (version 1.42) with minimum mapping quality set to 60.
The edgeR platform, within SeqMonk, was used to generate lists of differential gene expression from the raw reads, as is required in analysis of negative binomial distributions.
Tab-delimited text files of all genes and differentially expressed genes (at p<0.05, p<0.01 and p<0.001) showing raw reads or log2 RPM were output (processed files).
Ampliseq: Torrent Suite Software 5.10 was used for basecalling, and sequenced reads were trimmed for adaptor sequence and masked for low-complexity or low-quality sequence, returning a fastq file (raw data) of reads associated with each of the 16000 barcoded primer pairs. The remaining steps (mapping to GRCm38.p6 with Hisat2-2.0.5, import into SeqMonk version 1.42 with minimum mapping quality 60, edgeR analysis, and output of tab-delimited text files) were identical to those listed above.
Genome_build: Genome Reference Consortium mouse genome (GRCm38.p6)
Supplementary_files_format_and_content: tab-delimited text files include reads or log2 RPM for each sample showing all genes or differential expression between conditions.
https://spdx.org/licenses/etalab-2.0.html
Meloidogyne incognita protein-coding genes differentially expressed at four stages during the parasitic life cycle:
- eggs
- pre-parasitic second stage juvenile (J2)
- mix of parasitic second, third and fourth stage juveniles (J3)
- adult females
The genes show significantly differential expression between the stages according to three methods: EBSeq + edgeR + DESeq2. The following thresholds were used to consider differential expression as significant:
- Log2 fold change >2
- false discovery rate (FDR) <0.05
The first six files correspond to the six comparisons between the four developmental life stages as follows: eggs--(1)-->J2--(2)-->J3--(3)-->female--(4)-->eggs, with the two remaining comparisons being: J2--(5)-->female and eggs--(6)-->J3.
Each of these files is in tab-separated values (TSV) format and presents expression values and statistics of DESeq2, edgeR and EBSeq (from RSEM). A seventh file in XLSX Excel format groups all the above information and adds some statistics on genes known to be specifically expressed in dorsal gland (DG) or subventral gland (SvG) cells. Finally, the last files provide a graphical representation of the genes differentially expressed during the M. incognita developmental life cycle as well as the gene ontology (GO) 'Molecular Function' terms overrepresented at the different transitions.
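The selection rule described above, where a gene must pass the fold-change and FDR thresholds in all three methods, can be sketched in Python as follows. Two points are assumptions of this sketch rather than statements from the dataset: the fold-change threshold is applied to the absolute value of the log2 fold change, and agreement in direction between methods is not checked. The gene names and values are invented.

```python
def is_significant(log2fc, fdr, min_abs_log2fc=2.0, max_fdr=0.05):
    """Apply the stated thresholds (absolute log2FC is an assumption)."""
    return abs(log2fc) > min_abs_log2fc and fdr < max_fdr


def consensus_genes(results_by_method):
    """Return genes that are significant in every method.

    results_by_method: {method_name: {gene: (log2fc, fdr)}}
    """
    significant_sets = [
        {gene for gene, (log2fc, fdr) in results.items() if is_significant(log2fc, fdr)}
        for results in results_by_method.values()
    ]
    if not significant_sets:
        return set()
    return set.intersection(*significant_sets)


# Invented results for three genes, for illustration only.
example = {
    "DESeq2": {"g1": (3.1, 0.001), "g2": (2.5, 0.04), "g3": (-4.0, 0.01)},
    "edgeR":  {"g1": (2.9, 0.002), "g2": (1.8, 0.03), "g3": (-3.7, 0.02)},
    "EBSeq":  {"g1": (3.3, 0.010), "g2": (2.2, 0.01), "g3": (-3.9, 0.20)},
}
print(sorted(consensus_genes(example)))
```

Taking the intersection is the conservative choice: a gene that fails either threshold in any one method is excluded.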
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This page includes the data and code necessary to reproduce the results of the following paper: Yang Liao, Dinesh Raghu, Bhupinder Pal, Lisa Mielke and Wei Shi. cellCounts: fast and accurate quantification of 10x Chromium single-cell RNA sequencing data. Under review. A Linux computer running an operating system of CentOS 7 (or later) or Ubuntu 20.04 (or later) is recommended for running this analysis. The computer should have >2 TB of disk space and >64 GB of RAM. The following software packages need to be installed before running the analysis. Software executables generated after installation should be included in the $PATH environment variable.
R (v4.0.0 or newer) https://www.r-project.org/
Rsubread (v2.12.2 or newer) http://bioconductor.org/packages/3.16/bioc/html/Rsubread.html
CellRanger (v6.0.1) https://support.10xgenomics.com/single-cell-gene-expression/software/overview/welcome
STARsolo (v2.7.10a) https://github.com/alexdobin/STAR
sra-tools (v2.10.0 or newer) https://github.com/ncbi/sra-tools
Seurat (v3.0.0 or newer) https://satijalab.org/seurat/
edgeR (v3.30.0 or newer) https://bioconductor.org/packages/edgeR/
limma (v3.44.0 or newer) https://bioconductor.org/packages/limma/
mltools (v0.3.5 or newer) https://cran.r-project.org/web/packages/mltools/index.html
Reference packages generated by 10x Genomics are also required for this analysis and they can be downloaded from the following link (2020-A version for individual human and mouse reference packages should be selected): https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/latest After all these are done, you can simply run the shell script ‘test-all-new.bash’ to perform all the analyses carried out in the paper. This script will automatically download the mixture scRNA-seq data from the SRA database, and it will output a text file called ‘test-all.log’ that contains all the screen outputs and speed/accuracy results of CellRanger, STARsolo and cellCounts.
Determining the mechanistic and genetic basis of animal coloration is essential to understand the costs and constraints on colour production, and the evolution and maintenance of phenotypic variation. However, genes underlying structural colour and widespread pigment classes apart from melanin remain largely uncharacterised, in part due to restricted taxonomic focus. We combined liquid chromatography-mass spectrometry and RNA-seq gene expression analyses to characterise the pigments and genes associated with skin colour in the polymorphic lizard, Ctenophorus decresii. Throat coloration in male C. decresii may be a combination of orange, yellow, grey or ultra-violet blue. We confirmed the presence of two biochemically different pigment classes, pteridines (self-synthesised) and carotenoids (acquired through the diet), in all skin colours. Orange skin had the highest levels of pteridine pigments while yellow skin tended to have higher levels of carotenoids, of which the vitamin A precurso...
Transcriptome Shotgun Sequencing (RNA-seq) has been readily embraced by geneticists and molecular ecologists alike. As with all high-throughput technologies, it is critical to understand which analytic strategies are best suited and which parameters may bias the interpretation of the data. Here we use a comprehensive simulation approach to explore how various features of the transcriptome (complexity, degree of polymorphism π, alternative splicing), technological processing (sequencing error ε, library normalization) and bioinformatic workflow (de novo vs. mapping assembly, reference genome quality) impact transcriptome quality and inference of differential gene expression (DE). We find that transcriptome assembly and gene expression profiling (edgeR vs. baySeq software) works well even in the absence of a reference genome, and is robust across a broad range of parameters. We advise against library normalization, and in most situations advocate mapping assemblies to an annotated genome ...
Sample type: SRA
Source name: bone marrow derived
Organism: Mus musculus
Characteristics
strain: C57BL/6
Sex: male
age: 8 to 10 weeks
Growth protocol: After 1 day in 0.6ng/ml CSF1 in alpha+ MEM /15% FCS, non-adherent cells were incubated for two days in a fresh dish containing 12ng/ml CSF1 in alpha+ MEM /10% FCS and then for 7 days in 120ng/ml CSF1 in alpha+ MEM /10% FCS. Cells were incubated for a further two days in fresh alpha+ MEM /10% FCS containing 120ng/ml CSF1 and 20ng/ml IL4.
Extracted molecule: total RNA
Extraction protocol: mRNA was harvested using the RNeasy kit (QIAGEN) with on-column DNase treatment. 1 µg of total RNA was used for the construction of sequencing libraries.
RNA libraries were prepared for sequencing using standard Ion Torrent protocols.
Library strategy: RNA-Seq
Library source: transcriptomic
Library selection: cDNA
Instrument model: Ion Torrent S5
Description: IL4_48h_M10_BMM
Wt_vs_IL4_allprobes_reads.txt
Wt_vs_IL4_allprobes_log2_RPM.txt
Data processing: Torrent Suite Software 5.10 was used for basecalling, and sequenced reads were trimmed for adaptor sequence and masked for low-complexity or low-quality sequence, returning a fastq file (raw data).
Reads were then mapped to the GRCm38.p6 genome using the open-source Hisat2-2.0.5 aligner.
The Hisat2-generated BAM files were uploaded into SeqMonk (version 1.42) with minimum mapping quality set to 60.
The edgeR platform, within SeqMonk, was used to generate lists of differential gene expression from the raw reads, as is required in analysis of negative binomial distributions.
Tab-delimited text files of all genes and differentially expressed genes (at p<0.05, p<0.01 and p<0.001) showing raw reads or log2 RPM were output (processed files).
Ampliseq: Torrent Suite Software 5.10 was used for basecalling, and sequenced reads were trimmed for adaptor sequence and masked for low-complexity or low-quality sequence, returning a fastq file (raw data) of reads associated with each of the 16000 barcoded primer pairs. The remaining steps (mapping to GRCm38.p6 with Hisat2-2.0.5, import into SeqMonk version 1.42 with minimum mapping quality 60, edgeR analysis, and output of tab-delimited text files) were identical to those listed above.
Genome_build: Genome Reference Consortium mouse genome (GRCm38.p6)
Supplementary_files_format_and_content: tab-delimited text files include reads or log2 RPM for each sample showing all genes or differential expression between conditions.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Differential gene expression analysis between HFD and LFD mice for pancreatic alpha cells using EdgeR.
Positive: higher expression in second condition. Negative: lower expression in second condition. Differential expression analysis was conducted in edgeR [70].
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains data and R script for analysis on life history and transcriptomic responses to single dimensional changes in resource (chickpea-27°C) and temperature (cowpea-35°C) and multi-dimensional environmental changes in resource and temperature (chickpea-35°C) in a pest beetle, Callosobruchus maculatus (control treatment = cowpea-27°C). Dataset contains life history data collected in laboratory conditions (tab 1), logFC data (RNA-sequencing; Novogene Co. Ltd.) for Spearman rank correlation tests between treatments (tabs 2-4), read count data (RNA-sequencing; Novogene Co. Ltd.) for differential expression analysis using edgeR (R1-5 = Four samples at cowpea-27°C; R7-12 = Four samples at cowpea-35°C; R13-17 = Four samples at chickpea-27°C; R25-28 = Four samples at chickpea-35°C; tab 5) and edgeR output data for plotting in R (tab 6).