Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A collection of similar but different presentations I've made aimed at introducing bioinformatics to bench biologists.
“Bioinformatics: Introduction and Methods,” a Bilingual Massive Open Online Course (MOOC) as a New Example for Global Bioinformatics Education
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
"Synthetic protein dataset with sequences, physical properties, and functional classification for machine learning tasks."
This synthetic dataset was created to explore and develop machine learning models in bioinformatics. It contains 20,000 synthetic proteins, each with an amino acid sequence, calculated physicochemical properties, and a functional classification.
While this is a simulated dataset, it was inspired by patterns observed in real protein datasets, such as:
- UniProt: A comprehensive database of protein sequences and annotations.
- Kyte-Doolittle Scale: Calculations of hydrophobicity.
- Biopython: A tool for analyzing biological sequences.
This dataset is ideal for:
- Training classification models for proteins.
- Exploratory analysis of physicochemical properties of proteins.
- Building machine learning pipelines in bioinformatics.
The dataset is divided into two subsets:
- Training: 16,000 samples (proteinas_train.csv).
- Testing: 4,000 samples (proteinas_test.csv).
This dataset was inspired by real bioinformatics challenges and designed to help researchers and developers explore machine learning applications in protein analysis.
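As an illustration of the kind of physicochemical property mentioned above, the following is a minimal sketch of a Kyte-Doolittle hydrophobicity calculation (the mean hydropathy of a sequence, often called GRAVY). The function name `gravy` is hypothetical, and the record does not say which properties or formulas the dataset actually uses, so this should be read as an assumption rather than the dataset's method.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the mean Kyte-Doolittle hydropathy of a protein sequence.

    Hypothetical helper for illustration; raises ValueError on an empty
    sequence or on residues outside the 20 standard amino acids.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence is empty")
    unknown = set(seq) - set(KYTE_DOOLITTLE)
    if unknown:
        raise ValueError(f"unsupported residues: {sorted(unknown)}")
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


if __name__ == "__main__":
    # Positive values indicate a more hydrophobic sequence on average.
    print(round(gravy("MKTAYIAKQR"), 3))
```

A value like this could serve as one numeric feature alongside the sequence when training a classifier on the two CSV files.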
Modules showing how the NCBI database classifies and organizes information on DNA sequences, evolutionary relationships, and scientific publications, plus a module on identifying a nucleotide sequence from an insect endosymbiont using BLAST.
Presentation on teaching introductory bioinformatics with Jupyter notebook-based active learning at the 2019 Great Lakes Bioinformatics Conference
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Syllabus of the MOOC “Bioinformatics: Introduction and Methods.”
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data and conda software environment file for the chapter 'Introduction to R and the Tidyverse' of the SPAAM Community's textbook: Introduction to Ancient Metagenomics (https://www.spaam-community.org/intro-to-ancient-metagenomics-book).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Familiarity with genome-scale data and the bioinformatic skills to analyze it have become essential for understanding and advancing modern biology and human health, yet many undergraduate biology majors are never exposed to hands-on bioinformatics. This paper presents a module that introduces students to applied bioinformatic analysis within the context of a research-based microbiology lab course. One of the most commonly used genomic analyses in biology is resequencing: determining the sequence of DNA bases in a derived strain of some organism, and comparing it to the known ancestral genome of that organism to better understand the phenotypic differences between them. Many existing CUREs — Course Based Undergraduate Research Experiences — evolve or select new strains of bacteria and compare them phenotypically to ancestral strains. This paper covers standardized strategies and procedures, accessible to undergraduates, for preparing and analyzing microbial whole-genome resequencing data to examine the genotypic differences between such strains. Wet-lab protocols and computational tutorials are provided, along with additional guidelines for educators, providing instructors without a next-generation sequencing or bioinformatics background the necessary information to incorporate whole-genome sequencing and command-line analysis into their class. This module introduces novice students to running software at the command-line, giving them exposure and familiarity with the types of tools that make up the vast majority of open-source scientific software used in contemporary biology. Completion of the module improves student attitudes toward computing, which may make them more likely to pursue further bioinformatics study.
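The core idea of resequencing, comparing a derived strain to its ancestral genome, can be shown with a toy sketch. This is not the module's pipeline, which works from whole-genome sequencing reads with command-line tools; it only lists point differences between two already-aligned sequences of equal length. The function name and the example sequences are hypothetical.

```python
def point_differences(ancestral, derived):
    """List (position, ancestral_base, derived_base) for each mismatch.

    Positions are 1-based. Both sequences must already be aligned and of
    equal length; insertions and deletions are not handled in this sketch.
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must have the same length")
    return [
        (i, a, d)
        for i, (a, d) in enumerate(zip(ancestral.upper(), derived.upper()), start=1)
        if a != d
    ]


if __name__ == "__main__":
    ancestor = "ATGGCTAAGCTT"  # made-up ancestral fragment
    evolved = "ATGGCTAGGCTT"   # made-up derived fragment with one substitution
    for position, old, new in point_differences(ancestor, evolved):
        print(f"position {position}: {old} -> {new}")
```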
Contemporary biology is moving towards heavy reliance on computational methods to manage, find patterns, and derive meaning from large-scale data, such as genomic sequences. Biology teachers are increasingly compelled to prepare students with skills to meet these challenges. However, introducing biology students to more abstract concepts associated with computational thinking remains a major challenge. Analogies have long been used in science classrooms to help students comprehend complex concepts by relating them to familiar processes. Here I present a multi-step procedure for introducing students to large-scale data analysis (bioinformatics workflows) by asking them to describe a common daily task: making toast. First, students describe the main steps associated with this procedure. Next, students are presented with alternative scenarios for materials and equipment and are asked to extend the analogy to accommodate them. Finally, students are led through examples of how the analogy breaks down, or fails to accurately represent, a bioinformatics analysis. This structured approach to student exploration of analogies related to computational biology capitalizes on diverse student experiences to both clarify concepts and ameliorate possible misconceptions. Similar methods can be used to introduce many abstract concepts in both biology and computer science.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data and conda software environment file for the chapter 'Introduction to the Command Line' of the SPAAM Community's textbook: Introduction to Ancient Metagenomics (https://www.spaam-community.org/intro-to-ancient-metagenomics-book).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is about book subjects. It has 1 row and is filtered to the book "Statistical methods in bioinformatics : an introduction". It features 10 columns, including number of authors, number of books, earliest publication date, and latest publication date.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introductory slides for the UCSC Genome Browser. Part of a set of materials available for training on the UCSC tools. Also available is a recording of the same material as a video. Exercises to practice additional skills can also be used for the training. The full training suite is available: http://openhelix.com/ucsc and there is an additional set of materials with more advanced topics: http://www.openhelix.com/ucscadv . BTW: there is a full script in the "notes" area of the slides, but that is not visible in the viewer.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The key role of bioinformatics in explaining biological phenomena calls for rethinking didactic approaches in high school to align them with a new scientific reality. Despite several initiatives to introduce bioinformatics in the classroom, there is still a lack of knowledge on their impact on students’ learning gains, engagement, and motivation. In this study, we detail the effects of four bioinformatics laboratories tailored for high school biology classes, named “Mining the Genome: Using Bioinformatics Tools in the Classroom to Support Student Discovery of Genes,” on the literacy, interest, and attitudes of 387 high school students. By exploring these laboratories, students get acquainted with bioinformatics and acknowledge that many bioinformatics tools can be intuitive for beginners. Furthermore, introducing comparative genomics in their learning practices contributed to a better understanding of curricular contents regarding the identification of genes, their regulation, and how to make evolutionary assumptions. Following the intervention, students were able to pinpoint the bioinformatics tools required to identify genes in a genomic sequence and, most importantly, they were able to resolve genomics-related misconceptions. Overall, students revealed a positive attitude regarding the integration of bioinformatics-based approaches in their learning practices, reinforcing their added value in educational approaches.
The table provides a short description of the major components of the model employed by each course, highlighting any differences between the two (deviations are indicated by an asterisk (*)).
Introductory bioinformatics exercises often walk students through the use of computational tools, but often provide little understanding of what a computational tool does "under the hood." A solid understanding of how a bioinformatics computational algorithm functions, including its limitations, is key for interpreting the output in a biologically relevant context. This introductory bioinformatics exercise integrates an introduction to web-based sequence alignment algorithms with models to facilitate student reflection and appreciation for how computational tools provide similarity output data. The exercise concludes with a set of inquiry-based questions in which students may apply computational tools to solve a real biological problem.
In the module, students first define sequence similarity and then investigate how similarity can be quantitatively compared between two similar length proteins using a Blocks Substitution Matrix (BLOSUM) scoring matrix. Students then look for local regions of similarity between a sequence query and subjects within a large database using Basic Local Alignment Search Tool (BLAST). Lastly, students access text-based FASTA-formatted sequence information via National Center for Biotechnology Information (NCBI) databases as they collect sequences for a multiple sequence alignment using Clustal Omega to generate a phylogram and evaluate evolutionary relationships. The combination of diverse, inquiry-based questions, paper models, and web-based computational resources provides students with a solid basis for more advanced bioinformatics topics and an appreciation for the importance of bioinformatics tools across the discipline of biology.
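The first step described above, quantitatively comparing two similar-length proteins with a BLOSUM matrix, can be sketched as a position-by-position sum of substitution scores. The code below is an illustration only, not the exercise's own material: it assumes an ungapped comparison of equal-length sequences, the function names are hypothetical, and the dictionary holds just a handful of entries transcribed from BLOSUM62, which should be checked against the published matrix before any real use.

```python
# A small subset of BLOSUM62 scores (assumed transcription; verify against
# the full published matrix). The matrix is symmetric, so each unordered
# pair is stored once.
BLOSUM62_SUBSET = {
    ("A", "A"): 4, ("I", "I"): 4, ("L", "L"): 4, ("V", "V"): 4,
    ("D", "D"): 6, ("E", "E"): 5, ("K", "K"): 5, ("R", "R"): 5,
    ("I", "L"): 2, ("I", "V"): 3, ("L", "V"): 1,
    ("D", "E"): 2, ("K", "R"): 2,
}


def pair_score(a, b):
    """Look up the score for one aligned residue pair in either order."""
    if (a, b) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(a, b)]
    if (b, a) in BLOSUM62_SUBSET:
        return BLOSUM62_SUBSET[(b, a)]
    raise KeyError(f"pair {a}/{b} is not in this illustrative subset")


def ungapped_score(seq1, seq2):
    """Sum substitution scores over two equal-length, ungapped sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(pair_score(a, b) for a, b in zip(seq1.upper(), seq2.upper()))


if __name__ == "__main__":
    print(ungapped_score("KIVED", "KIVED"))  # identical sequences
    print(ungapped_score("KIVED", "RLVDD"))  # conservative substitutions
```

Identical sequences score highest, and conservative substitutions such as I/L or D/E still contribute positive scores, which is the intuition the paper models in the exercise are meant to build.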
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Vampirium setup testing
This archive contains materials (datasets, exercises, slides, etc.) used for the Introduction to bulk RNAseq analysis workshop taught at the University of Copenhagen by the Center for Health Data Science (HeaDS). The course repo can be found on GitHub:
Assignments.zip contains exercises for the preprocessing part of the course, such as FastQC and MultiQC examples of bulk RNAseq experiments.
Data.zip contains count matrices (both traditional counts and salmon pseudocounts), as well as sample metadata (samplesheet.csv) and backup results from the preprocessing pipeline.
Notes.zip contains supplementary materials such as extra pdfs for more information on bulk RNAseq technology.
Slides and raw_reads will be released in a later version.
Slides.zip contains all the slides used in the workshop.
Raw_reads.zip contains the raw reads from the bulk RNAseq experiment (10.1016/j.celrep.2014.10.054) used in this course.
In order to introduce students to the concept of molecular diversity, we developed a short, engaging online lesson using basic bioinformatics techniques. Students were introduced to basic bioinformatics while learning about local on-campus species diversity by 1) identifying species based on a given sequence (performing Basic Local Alignment Search Tool [BLAST] analysis) and 2) researching and documenting the natural history of each species identified in a concise write-up. To assess students’ perceptions of this lesson, we surveyed students using a Likert scale and asked them to elaborate in a written reflection on the activity. When combined, student responses indicated that 94% of students agreed this lesson helped them understand DNA barcoding and how it is used to identify species. The majority of students, 89.5%, reported they enjoyed the lesson and mainly provided positive feedback, including “It really opened my eyes to different species on campus by looking at DNA sequences”, “I loved searching information and discovering all this new information from a DNA sequence”, and finally, “the database was fun to navigate and identifying species felt like a cool puzzle.” Our results indicate this lesson both engaged and informed students on the use of DNA barcoding as a tool to identify local species biodiversity.
Primary Image: DNA Barcoded Specimens. Crane fly, dragonfly, ant, and spider identified using DNA barcoding.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This record contains the data files used in exercises in the NBIS course "Introduction to Data Management Practices".
Prognostic biomarkers are of great significance for predicting the outcome of patients with cancer, guiding clinical treatment, elucidating tumorigenesis mechanisms, and identifying therapeutic targets. To screen and develop prognostic biomarkers, high-throughput profiling methods, including gene microarrays and next-generation sequencing, have been widely applied with great success. However, due to the lack of independent validation, very few prognostic biomarkers have been applied in clinical practice. To cross-validate the reliability of potential prognostic biomarkers, some groups have collected omics datasets (i.e., epigenetics/transcriptome/proteome) with associated follow-up data (such as OS/DSS/PFS) for clinical samples from different cohorts, and have developed easy-to-use online bioinformatics tools and web servers to assist biomarker screening and validation. These tools and web servers greatly facilitate the development of prognostic biomarkers, the study of the molecular mechanisms of tumorigenesis and progression, and even the discovery of important therapeutic targets. To help researchers quickly learn and understand the function of these tools, this review introduces the usage, characteristics, and algorithms of tools and web servers such as LOGpc, KM plotter, GEPIA, TCPA, OncoLnc, PrognoScan, MethSurv, SurvExpress, UALCAN, etc., and helps researchers select the most suitable tools for their own research. In addition, all the tools introduced in this review can be reached at http://bioinfo.henu.edu.cn/WebServiceList.html.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introductory curriculum for high school students (grades 9-12) that explores genetic research and bioinformatics. Posted on-line October 2012. Funded by NSF grant DRL-0833779