Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
New concepts for parallel object-relational query processing is a book. It was written by Michael Jaedicke and published by Springer in 2001.
Attribution-NonCommercial 3.0 (CC BY-NC 3.0) https://creativecommons.org/licenses/by-nc/3.0/
This codebase is used in the ADCS 2017 paper "Early Termination Heuristics for Score-at-a-Time Index Traversal".
You first need to build ATIRE, and then point to the ATIRE directory in the GNUMakefile. The invocation is the same as with the regular Jass program, but you may wish to specify the number of threads to use when building the code.
Abstract
Score-at-a-Time index traversal is a query processing approach that supports early termination in order to balance efficiency against effectiveness. In this work, we explore new techniques which extend a modern Score-at-a-Time traversal algorithm to allow for parallel postings traversal. We show that careful integration of parallel traversal can improve both efficiency and effectiveness when compared with current single-threaded early termination approaches. In addition, we explore the various trade-offs for differing early termination heuristics, and propose hybrid systems which parallelize long-running queries, while processing short-running queries with only a single thread.
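To make the idea of Score-at-a-Time traversal with early termination concrete, here is a minimal single-threaded sketch. It is not the code from the paper or from JASS. The index layout (a dict mapping each term to a list of `(impact, doc_ids)` segments), the function name `saat_top_k`, and the stopping rule (a hypothetical budget `max_postings`, checked at segment boundaries) are all assumptions chosen for illustration.

```python
import heapq
from collections import defaultdict


def saat_top_k(index, query_terms, k, max_postings):
    """Illustrative Score-at-a-Time traversal with a postings budget.

    index: dict term -> list of (impact, [doc_id, ...]) segments (assumed layout).
    max_postings: hypothetical early-termination budget on postings processed.
    """
    # Collect the segments of every query term, then visit them from the
    # highest impact score to the lowest, regardless of which term they belong to.
    segments = [seg for term in query_terms for seg in index.get(term, [])]
    segments.sort(key=lambda seg: seg[0], reverse=True)

    accumulators = defaultdict(int)
    processed = 0
    for impact, doc_ids in segments:
        # Early termination: stop once the budget has been used up.
        if processed >= max_postings:
            break
        for doc_id in doc_ids:
            accumulators[doc_id] += impact
        processed += len(doc_ids)

    # Rank by score, breaking ties in favour of the smaller document id.
    return heapq.nlargest(k, accumulators.items(), key=lambda kv: (kv[1], -kv[0]))


# Toy impact-ordered index (made-up data).
toy_index = {
    "parallel": [(3, [1, 4]), (1, [2, 7])],
    "query": [(2, [1, 2]), (1, [4])],
}
exhaustive = saat_top_k(toy_index, ["parallel", "query"], k=2, max_postings=100)
truncated = saat_top_k(toy_index, ["parallel", "query"], k=2, max_postings=4)
```

With a generous budget every segment is processed. With a budget of 4 postings only the two highest-impact segments contribute, so document 4 ends with a lower score than in the exhaustive run. This is the efficiency versus effectiveness trade-off the abstract refers to.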
Dataset Card for "msmarco-query-en-id-parallel-sentences"
More Information needed
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Algorithm configuration and parameters.
https://spdx.org/licenses/CC0-1.0.html
The bowerbirds in New Guinea and Australia include species that build the largest and perhaps most elaborately decorated constructions outside of humans. The males use these courtship bowers, along with their displays, to attract females. In these species, the mating system is polygynous and the females alone incubate and feed the nestlings. The bowerbirds also include 10 species of the socially monogamous catbirds in which the male participates in most aspects of raising the young. How the bower-building behavior evolved has remained poorly understood, as no comprehensive phylogeny exists for the family. It has been assumed that the monogamous catbird clade is sister to all polygynous species. We here test this hypothesis using a newly developed pipeline for obtaining homologous alignments of thousands of exonic and intronic regions from genomic data to build a phylogeny. Our well-supported species tree shows that the polygynous, bower-building species are not monophyletic. The result suggests either that bower-building behavior is an ancestral condition in the family that was secondarily lost in the catbirds, or that it has arisen in parallel in two lineages of bowerbirds. We favor the latter hypothesis based on an ancestral character reconstruction showing that polygyny but not bower-building is ancestral in bowerbirds, and on the observation that Scenopoeetes dentirostris, the sister species to one of the bower-building clades, does not build a proper bower but constructs a court for male display. This species is also sexually monomorphic in plumage despite having a polygynous mating system. We argue that the relatively stable tropical and subtropical forest environment in combination with low predator pressure and rich food access (mostly fruit) facilitated the evolution of these unique life-history traits.
Methods: This is supplementary material to the manuscript "Parallel evolution of bower-building behavior and polygyny in two groups of bowerbirds suggested by phylogenomics". We used the Birdscanner pipeline (available at github.com/Naturhistoriska/birdscanner.git) to obtain homologous alignments of 5653 exonic and 7020 intronic regions from whole-genome sequence data. The pipeline uses probabilistic queries based on hidden Markov models to probe the mapped bowerbird genomes and find where each query fits best. For each query and taxon we obtained genomic coordinates for the best hits, which were then ranked according to their “sequence E-values”, i.e. the expected number of false positives (non-homologous sequences) that scored this well or better. For each query and taxon the sequences for the hits with the lowest values were parsed out using the genomic coordinates. These were then aligned in separate files for exonic and intronic loci. Poorly aligned sequences were identified, based on a calculated distance matrix using OD-Seq (github.com/PeterJehl/OD-Seq), and excluded from further analyses. We also checked the alignments manually and removed those that included non-homologous sequences for some taxa (indicated by an extreme proportion of variable positions in the alignment) and those that contained no phylogenetic information. Individual gene trees were constructed using IQ-TREE, which automatically selects the best substitution model for each locus alignment. We used ASTRAL-III to construct species trees from the gene trees, both for the exonic and intronic loci separately and for all loci combined. ASTRAL estimates a species tree given a set of unrooted gene trees, and branch support is calculated using local posterior probabilities. We assembled mitochondrial genomes from the resequenced data for each individual using MITObim, and used 12 of the 13 protein-coding genes to infer the phylogenetic tree.
The aligned mitochondrial data set used in the analyses consists of 10,560 bp (3,520 codons). The phylogenetic analysis of the mitogenomic data set was performed with MEGA X. We estimated the maximum-likelihood tree for the mitochondrial data using 100 bootstrap replicates to assess the reliability of the branches. The data set was analyzed both with all codon positions present and with the third codon positions excluded.
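The hit-selection step described in the methods (keeping, for each query and taxon, the hit with the lowest sequence E-value) can be sketched as follows. This is not Birdscanner code. The record fields (`query`, `taxon`, `start`, `end`, `evalue`), the function name `best_hits`, and the sample values are hypothetical and only illustrate the selection logic.

```python
def best_hits(hits):
    """Keep the lowest-E-value hit for every (query, taxon) pair.

    hits: iterable of dicts with hypothetical keys
          'query', 'taxon', 'start', 'end', 'evalue'.
    """
    best = {}
    for hit in hits:
        key = (hit["query"], hit["taxon"])
        # A lower E-value means fewer expected false positives at this score.
        if key not in best or hit["evalue"] < best[key]["evalue"]:
            best[key] = hit
    return best


# Made-up example records.
example_hits = [
    {"query": "exon_0001", "taxon": "taxonA", "start": 100, "end": 400, "evalue": 1e-30},
    {"query": "exon_0001", "taxon": "taxonA", "start": 9000, "end": 9300, "evalue": 1e-5},
    {"query": "exon_0001", "taxon": "taxonB", "start": 50, "end": 350, "evalue": 1e-12},
]
selected = best_hits(example_hits)
```

The coordinates of each selected hit (`start`, `end`) would then be used to extract the sequence for alignment, as the methods describe.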
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Speed-up ratio comparison on OHSUMED dataset.
Reporter genes integrated into the genome are a powerful tool to reveal effects of regulatory elements and local chromatin context on gene expression. However, so far such reporter assays have been of low throughput. Here we describe a multiplexing approach for the parallel monitoring of transcriptional activity of thousands of randomly integrated reporters. More than 27,000 distinct reporter integrations in mouse embryonic stem cells, obtained with two different promoters, show ~1,000-fold variation in expression levels. Data analysis indicates that lamina-associated domains act as attenuators of transcription, likely by reducing access of transcription factors to binding sites. Furthermore, chromatin compaction is predictive of reporter activity. We also found evidence for cross-talk between neighboring genes, and estimate that enhancers can influence gene expression on average over ~20 kb. The multiplexed reporter assay is highly flexible in design and can be modified to query a wide range of aspects of gene regulation.
TRIP assay of mPGK promoter; 11 clonal cell lines, 1 replicate each.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Comparison of existing studies.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Additional file 1: Table S1. The basic compression performance comparison between GBC and alternative tools. Table S2. The comparison of GBC’s compression and decompression speed under multiple threads in the 1000GP3 dataset. Table S3. The data query performance comparison between GBC and alternative tools. Table S4. The comparison of LD calculation speed between GBC and alternative tools in the 1000GP3 and SG10K datasets. Table S5. The file management performance comparison between GBC and alternative tools. Table S6. BEG and MBEG coding tables for genotypes of diploid species.