License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/ (license information was derived automatically)
Implementation of my idea on Biostar, "Where/how to assess which bioinformatics tools/databases are most used/accessed?": http://www.biostars.org/p/60334/#60341. The script retrieves the projects tagged "bioinformatics" on Google Code and gets the number of downloads for each. Of course, that doesn't give the number of checkouts of each project, etc.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Scientific advancement is hindered without proper genome annotation because biologists lack a complete understanding of cellular protein functions. In bacterial cells, hypothetical proteins (HPs) are open reading frames with unknown functions. HPs result from either an outdated database or insufficient experimental evidence (i.e., indeterminate annotation). While automated annotation reviews help keep genome annotation up to date, often manual reviews are needed to verify proper annotation. Students can provide the manual review necessary to improve genome annotation. This paper outlines an innovative classroom project that determines if HPs have outdated or indeterminate annotation. The Hypothetical Protein Characterization Project uses multiple well-documented, freely available, web-based, bioinformatics resources that analyze an amino acid sequence to (1) detect sequence similarities to other proteins, (2) identify domains, (3) predict tertiary structure including active site characterization and potential binding ligands, and (4) determine cellular location. Enough evidence can be generated from these analyses to support re-annotation of HPs or prioritize HPs for experimental examinations such as structural determination via X-ray crystallography. Additionally, this paper details several approaches for selecting HPs to characterize using the Hypothetical Protein Characterization Project. These approaches include student- and instructor-directed random selection, selection using differential gene expression from mRNA expression data, and selection based on phylogenetic relations. This paper also provides additional resources to support instructional use of the Hypothetical Protein Characterization Project, such as example assignment instructions with grading rubrics, links to training videos in YouTube, and several step-by-step example projects to demonstrate and interpret the range of achievable results that students might encounter. 
Educational use of the Hypothetical Protein Characterization Project provides students with an opportunity to learn and apply knowledge of bioinformatic programs to address scientific questions. The project is highly customizable in that HP selection and analysis can be specifically formulated based on the scope and purpose of each student’s investigations. Programs used for HP analysis can be easily adapted to course learning objectives. The project can be used in both online and in-seat instruction for a wide variety of undergraduate and graduate classes as well as undergraduate capstone, honor’s, and experiential learning projects.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Paired Omics Data Platform project of MSV000079284 metabolome with 48 (Meta)Genome - Metabolome links and 2 BGC - MS/MS links
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Data corresponding to the scientific initiation project in bioinformatics.
The data in the parcial_report_input folder comprise raw counts, metadata, and FPKM values of genes from the analysis performed during the first six months of the project. The counts and metadata come from a previous study of renal cell carcinoma.
The Final_report_input folder also contains counts and metadata, but from a previous study of osteosarcoma. The metadata and raw data files can be found under the accession number hs000699.v1.p1 in dbGaP.
The scripts written to perform pre-processing of samples, differential expression analysis, network analysis, and functional annotation can be found in the GitHub repository.
The project contains raw and result files from a comparative proteomic analysis of malignant [primary breast tumor (PT) and axillary metastatic lymph nodes (LN)] and non-tumor [contralateral (NCT) and adjacent breast (ANT)] tissues of patients diagnosed with invasive ductal carcinoma. Label-free mass spectrometry was performed using nano-liquid chromatography coupled to electrospray ionization–mass spectrometry (LC-ESI-MS/MS), followed by functional annotation to reveal differentially expressed proteins and their predicted impacts on pathways and cellular functions in breast cancer. A total of 462 differentially expressed proteins (DEPs) were observed among the groups of samples analyzed. Ingenuity Pathway Analysis (IPA) software version 2.3 (QIAGEN Inc.) was employed to identify the most relevant signaling and metabolic pathways, diseases, biological functions, and interaction networks affected by the deregulated proteins. Upstream regulator and biomarker analyses were also performed with IPA's tools. Altogether, our findings revealed differential proteomic profiles that affected the associated and interconnected cancer signaling processes.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Paired Omics Data Platform project of MSV000078836 metabolome with 360 (Meta)Genome - Metabolome links and 9 BGC - MS/MS links
THIS RESOURCE IS NO LONGER IN SERVICE, documented on 8/12/13. An expanded version of the Alternative Splicing Annotation Project (ASAP) database with a new interface and integration of comparative features using UCSC BLASTZ multiple alignments. It supports 9 vertebrate species, 4 insects, and nematodes, and provides extensive alternative splicing analysis of genes and their splice variants. For human alternative splicing data, newly added EST libraries were classified and included in the previous tissue and cancer classification, and the lists of tissue- and cancer (normal)-specific alternatively spliced genes were recalculated and updated. The authors have created novel orthologous exon and intron databases, with their splice variants, based on multiple alignments among several species. These orthologous exon and intron databases can give more comprehensive homologous gene information than protein similarity-based methods. Furthermore, splice junction and exon identity among species can be valuable resources to elucidate species-specific genes. The ASAP II database can be easily integrated with pygr (unpublished, the Python Graph Database Framework for Bioinformatics) and its powerful features, such as graph query and multi-genome alignment query. ASAP II can be searched by several different criteria, such as gene symbol, gene name, and ID (UniGene, GenBank, etc.). The web interface provides 7 different kinds of views: (I) user query, UniGene annotation, orthologous genes and genome browsers; (II) genome alignment; (III) exons and orthologous exons; (IV) introns and orthologous introns; (V) alternative splicing; (VI) isoform and protein sequences; (VII) tissue and cancer vs. normal specificity. ASAP II shows genome alignments of isoforms, exons, and introns in a UCSC-like genome browser. All alternative splicing relationships with supporting evidence information, types of alternative splicing patterns, and inclusion rates for skipped exons are listed in separate tables.
Users can also search human data for tissue- and cancer-specific splice forms at the bottom of the gene summary page. P-values for tissue specificity are reported as log-odds (LOD) scores, and results with LOD >= 3 and at least 3 EST sequences are highlighted.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
The application of project-based learning in bioinformatics training
The Paired Omics Data Platform is a community-based initiative standardizing links between genomic and metabolomics data in a computer-readable format to further the field of natural products discovery. The goals are to link molecules to their producers, find large-scale genome-metabolome associations, use genomic data to assist in structural elucidation of molecules, and provide a centralized database for paired datasets. This dataset contains the projects in http://pairedomicsdata.bioinformatics.nl/. The JSON documents adhere to the http://pairedomicsdata.bioinformatics.nl/schema.json JSON schema.
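Because each project is a JSON document, link counts such as those quoted in the per-project entries can in principle be recomputed from the document itself. The following is a minimal Python sketch under stated assumptions: the two key names are hypothetical stand-ins and should be checked against schema.json before use.

```python
import json

# Hypothetical key names; verify them against schema.json before relying on them.
GENOME_METABOLOME_KEY = "genome_metabolome_links"
BGC_MS2_KEY = "BGC_MS2_links"


def count_links(project: dict) -> tuple[int, int]:
    """Return (genome-metabolome link count, BGC-MS/MS link count) for one project."""
    genome_links = project.get(GENOME_METABOLOME_KEY) or []
    bgc_links = project.get(BGC_MS2_KEY) or []
    return len(genome_links), len(bgc_links)


def summarize(project_json: str) -> str:
    """Parse one project document (JSON text) and format a one-line summary."""
    n_genome, n_bgc = count_links(json.loads(project_json))
    return f"{n_genome} (Meta)Genome - Metabolome links and {n_bgc} BGC - MS/MS links"
```

Missing or null link lists are counted as zero, so a project without BGC - MS/MS links still yields a complete summary line.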
License: MIT License, https://opensource.org/licenses/MIT
All VCF files that I was able to provide for the BLG348 Intro to Bioinformatics course term project. The Mutect variant caller didn't work properly, so its output is not included. The NotFıltered VCFs are the previous versions of the VCFs, which contain variants with different filter values (not only PASS). You can also check my profile to see the plots that I used for my project report and presentation.
License: https://dataintelo.com/privacy-and-policy
According to our latest research, the global bioinformatics in healthcare market size reached USD 12.4 billion in 2024, reflecting robust adoption across clinical, research, and pharmaceutical domains. The market is expected to expand at a CAGR of 13.2% from 2025 to 2033, reaching a projected value of USD 36.6 billion by 2033. This impressive growth trajectory is fueled by escalating investments in genomics, rising demand for personalized medicine, and the integration of advanced computational tools in healthcare. The bioinformatics in healthcare market is witnessing a paradigm shift as organizations increasingly leverage data-driven insights to accelerate drug discovery, improve diagnostics, and enhance patient outcomes.
A primary driver for the rapid expansion of the bioinformatics in healthcare market is the surging volume of biological and clinical data being generated worldwide. The proliferation of next-generation sequencing (NGS) technologies, coupled with decreasing costs of genome sequencing, has resulted in an unprecedented influx of genetic information. This wealth of data demands sophisticated bioinformatics solutions to manage, analyze, and interpret complex datasets efficiently. As a result, healthcare institutions, research centers, and pharmaceutical companies are investing heavily in advanced bioinformatics platforms and software to unlock actionable insights from vast genomic and proteomic repositories. This trend is further amplified by the growing recognition of the pivotal role bioinformatics plays in bridging the gap between raw biological data and clinical application.
Another significant growth factor is the expanding application of bioinformatics in personalized medicine and targeted therapeutics. With the healthcare industry shifting towards precision medicine, there is an urgent need for tools that can integrate and analyze multi-omics data—spanning genomics, transcriptomics, proteomics, and metabolomics. Bioinformatics enables the identification of disease biomarkers, prediction of drug responses, and customization of treatment regimens based on individual patient profiles. This has not only improved patient outcomes but has also optimized healthcare resource utilization. The increasing prevalence of chronic diseases, rising cancer incidence, and the demand for tailored therapies are propelling the adoption of bioinformatics in clinical diagnostics and drug development, thus driving overall market growth.
Strategic collaborations and investments by government agencies, academic institutions, and private enterprises are further catalyzing the bioinformatics in healthcare market. Initiatives such as the Human Genome Project and various national genomics programs have laid the foundation for large-scale data generation and sharing. Governments across North America, Europe, and Asia Pacific are launching funding programs to support bioinformatics infrastructure, skill development, and research. These efforts are enhancing data interoperability, standardization, and integration, thereby fostering innovation in the field. Moreover, the emergence of cloud-based bioinformatics platforms is democratizing access to computational resources, enabling smaller organizations and developing regions to participate in cutting-edge research and clinical applications.
From a regional perspective, North America continues to dominate the bioinformatics in healthcare market, accounting for the largest revenue share in 2024. This leadership position is attributed to the presence of advanced healthcare infrastructure, significant R&D investments, and a strong ecosystem of academic and commercial players. Europe follows closely, driven by robust government support and a vibrant biotech sector. Meanwhile, Asia Pacific is emerging as the fastest-growing region, fueled by expanding healthcare expenditure, increasing adoption of genomic medicine, and a burgeoning talent pool in computational biology. Latin America and the Middle East & Africa are also experiencing steady growth, supported by improving healthcare systems and international collaborations.
The bioinformatics in healthcare market is segmented by solution into software, services, and platforms, each playing a critical role in the ecosystem. Bioinformatics software forms the backbone of data analysis, enabling researchers and clinicians to process and interpret complex biologi
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Bioinformatic analyses and field data associated with Morris et al. 2016, JPR
Served as links to the Morris contributed package and to GitHub
According to our latest research, the global translational bioinformatics market size reached USD 4.2 billion in 2024, driven by the increasing integration of computational technologies in biomedical research and healthcare. The market is exhibiting robust growth with a compound annual growth rate (CAGR) of 11.6% from 2025 to 2033. By 2033, the market is forecasted to reach USD 11.4 billion, reflecting the rising demand for data-driven solutions in drug discovery, clinical diagnostics, and personalized medicine. This surge is primarily fueled by the growing adoption of genomics and proteomics in clinical settings, the expansion of precision medicine initiatives, and the escalating need for advanced bioinformatics platforms to handle complex biological datasets.
One of the primary growth factors for the translational bioinformatics market is the exponential increase in biomedical data generated from next-generation sequencing (NGS), genomics, and proteomics research. The need to analyze, interpret, and translate this vast amount of data into clinically actionable insights has made translational bioinformatics solutions indispensable. Healthcare providers and research institutions are increasingly leveraging sophisticated bioinformatics software and platforms to accelerate drug discovery, identify novel biomarkers, and develop targeted therapies. The integration of artificial intelligence (AI) and machine learning (ML) algorithms into bioinformatics tools further enhances the ability to extract meaningful patterns from multidimensional datasets, thereby supporting the precision medicine paradigm and improving patient outcomes.
Another critical driver for the translational bioinformatics market is the growing emphasis on personalized medicine and tailored therapeutics. With the advent of genomics and proteomics, there is a heightened focus on individualized treatment strategies that consider a patient's genetic makeup, lifestyle, and environmental factors. Translational bioinformatics bridges the gap between basic research and clinical application by providing the computational infrastructure necessary to translate omics data into personalized diagnostics and therapies. The market is also benefiting from increased investments in biomedical research, government initiatives promoting precision healthcare, and strategic collaborations between pharmaceutical companies, academic institutions, and technology providers. These collaborations are fostering innovation and accelerating the adoption of translational bioinformatics solutions across the healthcare ecosystem.
The translational bioinformatics market is also witnessing significant growth due to the rising prevalence of chronic diseases and the urgent need for innovative diagnostic and therapeutic approaches. Chronic conditions such as cancer, cardiovascular diseases, and neurological disorders require comprehensive molecular profiling to inform treatment decisions. Translational bioinformatics enables the integration of diverse data sources, including genomics, proteomics, clinical records, and imaging data, to facilitate a holistic understanding of disease mechanisms. This integrative approach supports the development of novel biomarkers, enhances the efficiency of clinical trials, and expedites the translation of research findings into clinical practice. As a result, healthcare organizations are increasingly adopting translational bioinformatics solutions to improve disease management and patient care.
As the translational bioinformatics market continues to evolve, the concept of Bioinformatics Pipelines as a Service is gaining traction. These pipelines provide a comprehensive framework for processing and analyzing biological data, offering a seamless integration of various bioinformatics tools and resources. By leveraging cloud-based infrastructures, these services enable researchers to automate complex workflows, enhance data reproducibility, and scale their analyses according to project needs. The flexibility and efficiency of Bioinformatics Pipelines as a Service are particularly beneficial for organizations with limited in-house bioinformatics expertise, allowing them to focus on their core research objectives while accessing cutting-edge computational resources. This approach not only accelerates the pace of discovery but also democratizes access to advanced bioinformatics capabilities
The Genome Solver was an NSF-funded project developed as a way to train undergraduate life science faculty in basic web-based tools for bioinformatics. As part of the project we developed a one-day workshop consisting of bioinformatics modules on the theme of bacterial genomics, which we delivered to faculty at colleges and universities around the country. All of our workshop material can be accessed on the QUBESHub website: https://qubeshub.org/community/groups/genomesolver/
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Paired Omics Data Platform project of MSV000084771 metabolome with 11 (Meta)Genome - Metabolome links and 9 BGC - MS/MS links
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
The Reproducibility Project: Cancer Biology (https://osf.io/e81xl/wiki/home/) aims to reproduce the key experiments from 50 landmark papers in cancer research. As a follow-up to the previously published study, which showed a lack of identifiability of research resources in the published biomedical literature (Vasilevsky, et al. 2014, PeerJ 1:e148), we analyzed 6 resource types reported in these papers to determine the identifiability of these resources. The resource types included antibodies, cell lines, constructs, knockdown reagents, model organisms, and software. The results showed that, on average, 85% of the resources were identifiable, and that the ability to identify the resources varied among the resource types.
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
Paired Omics Data Platform project of MSV000085179 metabolome with 9 (Meta)Genome - Metabolome links and 0 BGC - MS/MS links
A dataset containing the full genomic sequence of 1,700 individuals, freely available for research use. The 1000 Genomes Project is an international research effort coordinated by a consortium of 75 companies and organizations to establish the most detailed catalogue of human genetic variation. The project has grown to 200 terabytes of genomic data, including DNA sequenced from more than 1,700 individuals, that researchers can now access on AWS for use in disease research free of charge. The dataset containing the full genomic sequence of 1,700 individuals is now available to all via Amazon S3. The data can be found at: http://s3.amazonaws.com/1000genomes The 1000 Genomes Project aims to include the genomes of more than 2,662 individuals from 26 populations around the world, and the NIH will continue to add the remaining genome samples to the data collection this year. Public Data Sets on AWS provide a centralized repository of public data hosted on Amazon Simple Storage Service (Amazon S3). The data can be seamlessly accessed from AWS services such as Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Elastic MapReduce (Amazon EMR), which provide organizations with the highly scalable compute resources needed to take advantage of these large data collections. AWS is storing the public data sets at no charge to the community. Researchers pay only for the additional AWS resources they need for further processing or analysis of the data. All 200 TB of the latest 1000 Genomes Project data is available in a publicly available Amazon S3 bucket. You can access the data via simple HTTP requests, or take advantage of the AWS SDKs in languages such as Ruby, Java, Python, .NET and PHP. Researchers can use the Amazon EC2 utility computing service to dive into this data without the usual capital investment required to work with data at this scale.
AWS also provides a number of orchestration and automation services to help teams make their research available to others to remix and reuse. Making the data available via a bucket in Amazon S3 also means that customers can crunch the information using Hadoop via Amazon Elastic MapReduce, and take advantage of the growing collection of tools for running bioinformatics job flows, such as CloudBurst and Crossbow.
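The "simple HTTP requests" route can be sketched in Python with the standard library alone. The snippet builds object URLs under the bucket address quoted in the description and parses an S3 bucket-listing XML document into (key, size) pairs. It makes no network call itself: fetching the listing (for example with urllib.request.urlopen) is left to the caller, and whether the bucket still permits anonymous listing at that address is an assumption.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

BUCKET_URL = "http://s3.amazonaws.com/1000genomes"  # address quoted in the description
S3_NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}  # namespace of S3 listing XML


def object_url(key: str) -> str:
    """Build the plain-HTTP URL of one object in the bucket."""
    return f"{BUCKET_URL}/{quote(key)}"


def parse_listing(xml_text: str) -> list[tuple[str, int]]:
    """Extract (key, size in bytes) pairs from an S3 bucket-listing XML document."""
    root = ET.fromstring(xml_text)
    return [
        (
            item.findtext("s3:Key", namespaces=S3_NS),
            int(item.findtext("s3:Size", namespaces=S3_NS)),
        )
        for item in root.findall("s3:Contents", S3_NS)
    ]
```

Keeping the parsing separate from the download makes the function easy to check against a saved listing before pointing it at a 200 TB bucket.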
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
This page includes the data and code necessary to reproduce the results of the following paper: Yang Liao, Dinesh Raghu, Bhupinder Pal, Lisa Mielke and Wei Shi. cellCounts: fast and accurate quantification of 10x Chromium single-cell RNA sequencing data. Under review. A Linux computer running an operating system of CentOS 7 (or later) or Ubuntu 20.04 (or later) is recommended for running this analysis. The computer should have >2 TB of disk space and >64 GB of RAM. The following software packages need to be installed before running the analysis. Software executables generated after installation should be included in the $PATH environment variable.
R (v4.0.0 or newer): https://www.r-project.org/
Rsubread (v2.12.2 or newer): http://bioconductor.org/packages/3.16/bioc/html/Rsubread.html
CellRanger (v6.0.1): https://support.10xgenomics.com/single-cell-gene-expression/software/overview/welcome
STARsolo (v2.7.10a): https://github.com/alexdobin/STAR
sra-tools (v2.10.0 or newer): https://github.com/ncbi/sra-tools
Seurat (v3.0.0 or newer): https://satijalab.org/seurat/
edgeR (v3.30.0 or newer): https://bioconductor.org/packages/edgeR/
limma (v3.44.0 or newer): https://bioconductor.org/packages/limma/
mltools (v0.3.5 or newer): https://cran.r-project.org/web/packages/mltools/index.html
Reference packages generated by 10x Genomics are also required for this analysis and they can be downloaded from the following link (2020-A version for individual human and mouse reference packages should be selected): https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/latest After all these are done, you can simply run the shell script ‘test-all-new.bash’ to perform all the analyses carried out in the paper. This script will automatically download the mixture scRNA-seq data from the SRA database, and it will output a text file called ‘test-all.log’ that contains all the screen outputs and speed/accuracy results of CellRanger, STARsolo and cellCounts.
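Before launching the full analysis, it may help to confirm that the required executables really are on $PATH. The Python sketch below does only that. The command names are assumptions, since the description lists software packages rather than the executables they install, and the R packages (Rsubread, Seurat, edgeR, limma, mltools) are not checked here.

```python
import shutil

# Assumed executable names; the description names packages, not commands.
REQUIRED_COMMANDS = ["R", "cellranger", "STAR", "fastq-dump"]


def missing_commands(commands, which=shutil.which):
    """Return the commands that cannot be found on $PATH."""
    return [cmd for cmd in commands if which(cmd) is None]


if __name__ == "__main__":
    missing = missing_commands(REQUIRED_COMMANDS)
    if missing:
        print("Not found on $PATH:", ", ".join(missing))
    else:
        print("All required executables were found.")
```

The lookup function is passed in as a parameter so the check can be exercised without the tools installed.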
License: https://www.bco-dmo.org/dataset/813173/license
Supplementary Table 4C: Metatranscriptome data summary for cellular activities presented and statistics on sequencing and removal of potential contaminant sequences: Statistics of reads retained through bioinformatic processing of iTAG data for the 11 samples and control samples and metatranscriptome data. Samples taken on board of the R/V JOIDES Resolution between November 30, 2015 and January 30, 2016
access_formats=.htmlTable,.csv,.json,.mat,.nc,.tsv,.esriCsv,.geoJson
acquisition_description=Rock material was crushed while still frozen in a Progressive Exploration Jaw Crusher (Model 150) whose surfaces were sterilized with 70% ethanol and RNase AWAY (Thermo Fisher Scientific, USA) inside a laminar flow hood. Powdered rock material was returned to the -80°C freezer until extraction.
DNA was extracted from 20, 30, or 40 grams of powdered rock material, depending on the quantity of rock available. A DNeasy PowerMax Soil Kit (Qiagen, USA) was used following the manufacturer’s protocol, modified to include three freeze/thaw treatments prior to the addition of Soil Kit solution C1. Each treatment consisted of 1 minute in liquid nitrogen followed by 5 minutes at 65°C. DNA extracts were concentrated by isopropanol precipitation overnight at 4°C.
The low biomass in our samples required whole genome amplification (WGA) prior to PCR amplification of marker genes. Genomic DNA was amplified by Multiple Displacement Amplification (MDA) using the REPLI-g Single Cell Kit (Qiagen) as directed. MDA bias was minimized by splitting each WGA sample into triplicate 16 μL reactions after 1 hr of amplification and then resuming amplification for the manufacturer-specified 7 hrs (8 hrs total).
DNA was also recovered from samples of drilling mud and drilling fluid (surface water collected during the coring process) for negative controls, as well as two “kit control” samples, in which no sample was added, to account for any contaminants originating from either the DNeasy PowerMax Soil Kit or the REPLI-g Single Cell Kit.
Bacterial SSU rRNA gene fragments were PCR amplified from MDA samples and sequenced at Georgia Genomics and Bioinformatics Core (Univ. of Georgia). The primers used were: Bac515-Y and Bac926R. Dual-indexed libraries were prepared with (HT) iTruS (Kappa Biosystems) chemistry and sequencing was performed on an Illumina MiSeq 2 x 300 bp system with all samples combined equally on a single flow cell.
Raw sequence reads were processed through Trim Galore [http://www.bioinformatics.babraham.ac.uk/projects/trim_galore/], FLASH (ccb.jhu.edu/software/FLASH/) and FASTX Toolkit [http://hannonlab.cshl.edu/fastx_toolkit/] for trimming and removal of low quality/short reads.
Quality filtering included requiring a minimum average quality of 25 and rejection of paired reads less than 250 nucleotides.
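The two thresholds above can be illustrated with a short Python sketch. This is not the Trim Galore/FLASH/FASTX pipeline itself: it assumes Phred+33 quality encoding and applies the length cutoff to a single (for example merged) read, neither of which is stated in the description.

```python
MIN_MEAN_QUALITY = 25  # minimum average quality stated in the text
MIN_LENGTH = 250       # minimum read length stated in the text, in nucleotides


def mean_phred(quality_line: str, offset: int = 33) -> float:
    """Mean Phred score of a FASTQ quality string (Phred+33 assumed)."""
    return sum(ord(ch) - offset for ch in quality_line) / len(quality_line)


def passes_filter(sequence: str, quality_line: str) -> bool:
    """True if the read meets both the length and the mean-quality threshold."""
    if len(sequence) < MIN_LENGTH:
        return False
    return mean_phred(quality_line) >= MIN_MEAN_QUALITY
```

The length test runs first, so an empty read is rejected before any average is computed.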
Operational Taxonomic Unit (OTU) clusters were constructed at 99% similarity
with the script pick_otus.py within the Quantitative Insights Into Microbial
Ecology (QIIME) v.1.9.1 software and ‘uclust’. Any OTU that matched
an OTU in one of our control samples (drilling fluids, drilling mud,
extraction and WGA controls) was removed (using filter_otus_from_otu_table.py)
along with any sequences of land plants and human pathogens that may have
survived the control filtering due to clustering at 99%
(filter_taxa_from_otu_table.py). As an additional quality control measure,
genera that are commonly identified as PCR contaminants were removed.
Unclassified OTUs were queried using BLAST against the GenBank nr database and
further information about these OTUs is provided in the Supplementary
Discussion text under the section “Taxonomic diversity information from
iTAGs.” OTUs that could not be assigned to Bacteria or Archaea were
removed from further analysis. For downstream analyses, any OTUs not
representing more than 0.01% of relative abundance of sequences overall were
removed as those are unlikely to contribute significantly to in situ
communities. The OTU data table was transformed to a presence/absence table
and the Jaccard method was used to generate a distance matrix using the
dist.binary() function in the R package ade4.
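The last three steps (abundance cutoff, presence/absence transformation, Jaccard distance) can be sketched in Python as follows. This is an illustration rather than the authors' R code: the table layout (OTU id mapped to per-sample counts) is an assumption, and the distance is computed as sqrt(1 - s), the form the ade4 documentation gives for dist.binary, where s is the Jaccard similarity.

```python
from math import sqrt

MIN_RELATIVE_ABUNDANCE = 0.0001  # 0.01% of all sequences, as stated in the text


def filter_rare_otus(table):
    """Keep OTUs whose total count exceeds 0.01% of all sequences.

    `table` maps OTU id -> {sample id -> read count} (assumed layout).
    """
    total = sum(sum(counts.values()) for counts in table.values())
    if total == 0:
        return {}
    return {
        otu: counts
        for otu, counts in table.items()
        if sum(counts.values()) / total > MIN_RELATIVE_ABUNDANCE
    }


def present_otus(table, sample):
    """Set of OTUs with a non-zero count in `sample` (presence/absence view)."""
    return {otu for otu, counts in table.items() if counts.get(sample, 0) > 0}


def jaccard_distance(a, b):
    """sqrt(1 - |a & b| / |a | b|) for two sets of present OTUs."""
    union = a | b
    if not union:
        return 0.0
    return sqrt(1 - len(a & b) / len(union))
```

Calling jaccard_distance on every pair of samples yields the full distance matrix; two samples that share no OTUs are at distance 1, and identical samples are at distance 0.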
awards_0_award_nid=709555
awards_0_award_number=OCE-1658031
awards_0_data_url=http://www.nsf.gov/awardsearch/showAward.do?AwardNumber=1658031
awards_0_funder_name=NSF Division of Ocean Sciences
awards_0_funding_acronym=NSF OCE
awards_0_funding_source_nid=355
awards_0_program_manager=David L. Garrison
awards_0_program_manager_nid=50534
cdm_data_type=Other
comment=Supplementary Table 4C: iTAG
PI: Virginia Edgcomb
Data Version 1: 2020-05-28
Conventions=COARDS, CF-1.6, ACDD-1.3
data_source=extract_data_as_tsv version 2.3 19 Dec 2019
dataset_current_state=Final and no updates
defaultDataQuery=&time<now
doi=10.26008/1912/bco-dmo.813173.1
Easternmost_Easting=57.278183
geospatial_lat_max=-32.70567
geospatial_lat_min=-32.70567
geospatial_lat_units=degrees_north
geospatial_lon_max=57.278183
geospatial_lon_min=57.278183
geospatial_lon_units=degrees_east
geospatial_vertical_max=747.7
geospatial_vertical_min=10.7
geospatial_vertical_positive=down
geospatial_vertical_units=m
infoUrl=https://www.bco-dmo.org/dataset/813173
institution=BCO-DMO
instruments_0_acronym=Automated Sequencer
instruments_0_dataset_instrument_description=DNA sequencing performed using the Illumina MiSeq 2 x 300 bp platform (Univ. of Georgia)
instruments_0_dataset_instrument_nid=813183
instruments_0_description=General term for a laboratory instrument used for deciphering the order of bases in a strand of DNA. Sanger sequencers detect fluorescence from different dyes that are used to identify the A, C, G, and T extension reactions. Contemporary or Pyrosequencer methods are based on detecting the activity of DNA polymerase (a DNA synthesizing enzyme) with another chemoluminescent enzyme. Essentially, the method allows sequencing of a single strand of DNA by synthesizing the complementary strand along it, one base pair at a time, and detecting which base was actually added at each step.
instruments_0_instrument_name=Automated DNA Sequencer
instruments_0_instrument_nid=649
instruments_0_supplied_name=Illumina MiSeq 2 x 300 bp platform
metadata_source=https://www.bco-dmo.org/api/dataset/813173
Northernmost_Northing=-32.70567
param_mapping={'813173': {'Latitude': 'flag - latitude', 'Depth': 'flag - depth', 'Longitude': 'flag - longitude'}}
parameter_source=https://www.bco-dmo.org/mapserver/dataset/813173/parameters
people_0_affiliation=Woods Hole Oceanographic Institution
people_0_affiliation_acronym=WHOI
people_0_person_name=Virginia P. Edgcomb
people_0_person_nid=51284
people_0_role=Principal Investigator
people_0_role_type=originator
people_1_affiliation=Woods Hole Oceanographic Institution
people_1_affiliation_acronym=WHOI
people_1_person_name=Virginia P. Edgcomb
people_1_person_nid=51284
people_1_role=Contact
people_1_role_type=related
people_2_affiliation=Woods Hole Oceanographic Institution
people_2_affiliation_acronym=WHOI BCO-DMO
people_2_person_name=Karen Soenen
people_2_person_nid=748773
people_2_role=BCO-DMO Data Manager
people_2_role_type=related
project=Subseafloor Lower Crust Microbiology
projects_0_acronym=Subseafloor Lower Crust Microbiology
projects_0_description=NSF abstract:
The lower ocean crust has remained largely unexplored and represents one of the last frontiers for biological exploration on Earth. Preliminary data indicate an active subsurface biosphere in samples of the lower oceanic crust collected from Atlantis Bank in the SW Indian Ocean as deep as 790 m below the seafloor. Even if life exists in only a fraction of the habitable volume where temperatures permit and fluid flow can deliver carbon and energy sources, an active lower oceanic crust biosphere would have implications for deep carbon budgets and yield insights into microbiota that may have existed on early Earth. This is all of great interest to other research disciplines, educators, and students alike. A K-12 education program will capitalize on groundwork laid by outreach collaborator, A. Martinez, a 7th grade teacher in Eagle Pass, TX, who sailed as outreach expert on Drilling Expedition 360. Martinez works at a Title 1 school with ~98% Hispanic and ~2% Native American students and a high number of English Language Learners and migrants. Annual school visits occur during which the project investigators present hands on-activities introducing students to microbiology, and talks on marine microbiology, the project, and how to pursue science related careers. In addition, monthly Skype meetings with students and PIs update them on project progress. Students travel to the University of Texas Marine Science Institute annually, where they get a campus tour and a 3-hour cruise on the R/V Katy, during which they learn about and help with different oceanographic sampling approaches. The project partially supports two graduate students, a Woods Hole undergraduate summer student, the participation of multiple Texas A+M undergraduate students, and 3 principal investigators at two institutions, including one early career researcher who has not previously received NSF support of his own.
Given the dearth of knowledge of the lower oceanic crust, this project is poised to transform our understanding of life in this vast environment. The project assesses metabolic functions within all three domains of life in this crustal biosphere, with a focus on nutrient cycling and evaluation of connections to other deep marine microbial habitats. The lower ocean crust represents a potentially vast biosphere whose microbial constituents and the biogeochemical cycles they mediate are likely linked to deep ocean processes through faulting and subsurface fluid flow. Atlantis Bank represents a tectonic