The dataset used in this paper for elastic source imaging with very sparse data, both in space and time.
https://cdla.io/sharing-1-0/
This dataset was created by Wei Xie
Released under Community Data License Agreement - Sharing - Version 1.0
Sparse machine learning has recently emerged as a powerful tool for obtaining models of high-dimensional data with a high degree of interpretability, at low computational cost. This paper posits that these methods can be extremely useful for understanding large collections of text documents, without requiring user expertise in machine learning. Our approach relies on three main ingredients: (a) multi-document text summarization and (b) comparative summarization of two corpora, both using sparse regression or classification; (c) sparse principal components and sparse graphical models for unsupervised analysis and visualization of large text corpora. We validate our approach using a corpus of Aviation Safety Reporting System (ASRS) reports and demonstrate that the methods can reveal causal and contributing factors in runway incursions. Furthermore, we show that the methods automatically discover four main tasks that pilots perform during flight, which can aid in further understanding the causal and contributing factors to runway incursions and other drivers for aviation safety incidents. Citation: L. El Ghaoui, G. C. Li, V. Duong, V. Pham, A. N. Srivastava, and K. Bhaduri, “Sparse Machine Learning Methods for Understanding Large Text Corpora,” Proceedings of the Conference on Intelligent Data Understanding, 2011.
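As a concrete illustration of ingredient (b), here is a hedged toy sketch of comparative summarization via sparse classification (illustrative only, not the authors' pipeline; the example documents are invented): an L1-regularized classifier is fit to separate two corpora, and the few terms receiving nonzero weights act as a comparative summary of what distinguishes them.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy corpora standing in for two report collections (invented examples).
docs_a = ["runway incursion during taxi", "tower cleared us to cross the runway"]
docs_b = ["turbulence at cruise altitude", "autopilot disengaged in the climb"]

vec = TfidfVectorizer().fit(docs_a + docs_b)
X = vec.transform(docs_a + docs_b)
y = [0] * len(docs_a) + [1] * len(docs_b)

# L1 penalty drives most term weights to exactly zero.
clf = LogisticRegression(penalty="l1", solver="liblinear", C=10.0).fit(X, y)

# The surviving nonzero-weight terms summarize what separates the corpora.
keywords = [t for t, w in zip(vec.get_feature_names_out(), clf.coef_[0]) if w != 0]
print(keywords)
```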
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The first capture of the area North of the Floreat Surf Life Saving Club: these sand dunes were captured by UAV imagery on 17 Aug 2021 for the Cambridge Coastcare beach dune modelling and monitoring project. It was created as part of an initiative to innovatively monitor coastal dune erosion and visualize these changes over time for future management and mitigation. The data includes an Orthomosaic, DSM, DTM, Elevation Contours, 3D Mesh, 3D Point Cloud and LiDAR constructed from over 500 images captured from a UAV (drone) and processed in Pix4D. All datasets can be freely accessed through DataWA.
Link to Animated video fly-through of this 3D data model
Link to the Sketchfab visualisation of the 3D textured mesh
The dataset is a Sparse 3D Point Cloud (i.e. a 3D set of points): the X,Y,Z position and colour information is stored for each point of the point cloud. This dataset is of the area North of Floreat SLSC (2021 Flight-2 project area).
https://cdla.io/sharing-1-0/
Sparse matrices of raw counts data for the open-problems-multimodal competition. The script for generating the sparse matrices was shared by Wei Xie and can be found here.
A similar dataset with normalized and log1p-transformed counts for the same cells can be found here.
Each h5 file contains 5 arrays:
axis0 (row index from the original h5 file)
axis1 (column index from the original h5 file)
value_i (attribute i in dgCMatrix in R, or the indices attribute of csc_array in scipy.sparse)
value_p (attribute p in dgCMatrix in R, or the indptr attribute of csc_array in scipy.sparse)
value_x (attribute x in dgCMatrix in R, or the data attribute of csc_array in scipy.sparse)
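A minimal reading sketch, assuming the five arrays sit at the top level of the file (the filename below is a placeholder): together they are exactly the compressed-sparse-column triplet plus the row and column labels.

```python
import h5py
from scipy.sparse import csc_matrix

# Hypothetical filename; adjust to the actual competition file layout.
with h5py.File("train_multi_inputs_sparse.h5", "r") as f:
    rows = f["axis0"][:]        # row labels from the original h5 file
    cols = f["axis1"][:]        # column labels from the original h5 file
    indices = f["value_i"][:]   # row index of each nonzero (i / indices)
    indptr = f["value_p"][:]    # offset of each column's first nonzero (p / indptr)
    data = f["value_x"][:]      # the nonzero values themselves (x / data)

# Assemble the CSC matrix directly from the three dense arrays.
X = csc_matrix((data, indices, indptr), shape=(len(rows), len(cols)))
```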
https://dataverse.lib.nycu.edu.tw/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.57770/CNNFTP
We propose a novel plug-and-play (PnP) module for improving depth prediction that takes arbitrary patterns of sparse depths as input. Given any pre-trained depth prediction model, our PnP module updates the intermediate feature map such that the model outputs new depths consistent with the given sparse depths. Our method requires no additional training and can be applied to practical applications such as leveraging both RGB and sparse LiDAR points to robustly estimate a dense depth map. Our approach achieves consistent improvements on various state-of-the-art methods on indoor (i.e., NYU-v2) and outdoor (i.e., KITTI) datasets. Various types of LiDARs are also synthesized in our experiments to verify the general applicability of our PnP module in practice.
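A hedged sketch of the plug-and-play idea, not the authors' implementation: assuming the frozen network can be split into an encoder and a decoder, the intermediate feature map is nudged by a few gradient steps until the predicted depth agrees with the observed sparse depths. The encoder/decoder split, step count, and learning rate are all assumptions.

```python
import torch

def pnp_refine(encoder, decoder, rgb, sparse_depth, mask, steps=5, lr=0.01):
    """Refine a frozen model's prediction toward given sparse depths.

    mask is a boolean tensor marking pixels where sparse depth is observed.
    """
    # Intermediate feature map becomes the optimization variable.
    z = encoder(rgb).detach().requires_grad_(True)
    for _ in range(steps):
        pred = decoder(z)
        # Penalize disagreement only at the observed sparse-depth pixels.
        loss = ((pred - sparse_depth)[mask] ** 2).mean()
        (grad,) = torch.autograd.grad(loss, z)
        z = (z - lr * grad).detach().requires_grad_(True)
    return decoder(z)
```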
In this paper we propose an innovative learning algorithm: a variation of the one-class ν-Support Vector Machines (SVMs) learning algorithm that produces sparser solutions with much reduced computational complexity. The proposed technique returns an approximate solution, nearly as good as the solution obtained by the classical approach, by minimizing the original risk function along with a regularization term. We introduce a bi-criterion optimization that helps guide the search towards the optimal set in much reduced time. The outcome of the proposed learning technique was compared with the benchmark one-class Support Vector Machines algorithm, which more often leads to solutions with redundant support vectors. Throughout the analysis, the problem size for both optimization routines was kept consistent. We have tested the proposed algorithm on a variety of data sources under different conditions to demonstrate its effectiveness. In all cases the proposed algorithm closely preserves the accuracy of standard one-class ν-SVMs while reducing both training time and test time by several factors.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data used for the paper SparsePoser: Real-time Full-body Motion Reconstruction from Sparse Data
It contains over 1GB of high-quality motion capture data recorded with an Xsens Awinda system while using a variety of VR applications in Meta Quest devices.
Visit the paper website!
If you find our data useful, please cite our paper:
@article{10.1145/3625264,
  author    = {Ponton, Jose Luis and Yun, Haoran and Aristidou, Andreas and Andujar, Carlos and Pelechano, Nuria},
  title     = {SparsePoser: Real-Time Full-Body Motion Reconstruction from Sparse Data},
  year      = {2023},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  issn      = {0730-0301},
  url       = {https://doi.org/10.1145/3625264},
  doi       = {10.1145/3625264},
  journal   = {ACM Trans. Graph.},
  month     = {oct}
}
This paper considers a large class of problems where we seek to recover a low rank matrix and/or sparse vector from some set of measurements.
Usage of high-level intermediate representations promises the generation of fast code from a high-level description, improving the productivity of developers while achieving the performance traditionally only reached with low-level programming approaches.
High-level IRs come in two flavors: 1) domain-specific IRs designed to express programs only for a specific application area; or 2) generic high-level IRs that can be used to generate high-performance code across many domains. Developing generic IRs is more challenging but offers the advantage of reusing a common compiler infrastructure across various applications.
In this paper, we extend a generic high-level IR to enable efficient computation with sparse data structures. Crucially, we encode the sparse representation using reusable dense building blocks already present in the high-level IR. We use a form of dependent types to model sparse matrices in CSR format by expressing the relationship between multiple dense arrays explicitly, separately storing ...
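A minimal sketch of the point being made, illustrating the format rather than the paper's IR: CSR encodes a sparse matrix as three dense arrays whose lengths are dependently related, so a sparse computation such as SpMV can be assembled entirely from dense building blocks.

```python
import numpy as np

# CSR as three dense arrays (a small 3x3 example matrix).
values = np.array([10.0, 20.0, 30.0, 40.0])  # nonzeros, stored row by row
col_idx = np.array([0, 2, 1, 2])             # column of each nonzero
row_ptr = np.array([0, 2, 3, 4])             # where each row starts in values
# The dependent relationship the text alludes to:
# len(values) == len(col_idx) and row_ptr[-1] == len(values).

def spmv_csr(values, col_idx, row_ptr, x):
    """Sparse matrix-vector product built from the three dense arrays."""
    y = np.zeros(len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y

print(spmv_csr(values, col_idx, row_ptr, np.array([1.0, 1.0, 1.0])))
```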
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Large-scale quantitative analyses of biological systems are often performed with few replicate experiments, leading to multiple nonidentical data sets due to missing values. For example, mass spectrometry-driven proteomics experiments are frequently performed with few biological or technical replicates due to sample scarcity, duty-cycle or sensitivity constraints, or limited capacity of the available instrumentation, leading to incomplete results where detection of significant feature changes becomes a challenge. This problem is further exacerbated for the detection of significant changes on the peptide level, for example, in phospho-proteomics experiments. In order to assess the extent of this problem and the implications for large-scale proteome analysis, we investigated and optimized the performance of three statistical approaches by using simulated and experimental data sets with varying numbers of missing values. We applied three tools, the standard t test, the moderated t test (also known as limma), and rank products, for the detection of significantly changing features in simulated and experimental proteomics data sets with missing values. The rank product method was improved to work with data sets containing missing values. Extensive analysis of simulated and experimental data sets revealed that the performance of the statistical analysis tools depended on simple properties of the data sets. High-confidence results were obtained by using the limma and rank products methods for analyses of triplicate data sets that exhibited more than 1000 features and more than 50% missing values. The maximum number of differentially represented features was identified by using the limma and rank products methods in a complementary manner. We therefore recommend combined usage of these methods as a novel and optimal way to detect significantly changing features in these data sets. This approach is suitable for large quantitative data sets from stable isotope labeling and mass spectrometry experiments and should be applicable to large data sets of any type. An R script that implements the improved rank products algorithm and the combined analysis is available.
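A toy sketch of a rank product statistic tolerant of missing values, in the spirit of the approach described above (not the authors' R script): features are ranked by fold change within each replicate, and the geometric mean is taken over only the replicates in which each feature was actually observed.

```python
import numpy as np

def rank_product(fold_changes):
    """fold_changes: (n_features, n_replicates) array, NaN where missing."""
    n_feat, n_rep = fold_changes.shape
    ranks = np.full((n_feat, n_rep), np.nan)
    for j in range(n_rep):
        col = fold_changes[:, j]
        observed = ~np.isnan(col)
        # Rank 1 = strongest increase within this replicate.
        order = np.argsort(-col[observed])
        r = np.empty(observed.sum())
        r[order] = np.arange(1, observed.sum() + 1)
        ranks[observed, j] = r
    # Geometric mean over the replicates observed for each feature.
    return np.exp(np.nanmean(np.log(ranks), axis=1))
```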
Sparse matrices from the UF Sparse Matrix Collection, used for testing 274-node systems.
https://www.shibatadb.com/license/data/proprietary/v1.0/license.txt
Yearly citation counts for the publication titled "Effect of data representation on cost of sparse matrix operations".
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A non-deterministic dataset augmented through MCMC and the auxiliary SWAP model.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
This dataset is related to our ICCAD work "Optimized Data Reuse via Reordering for Sparse Matrix-Vector Multiplication on FPGAs".
Calculated beam pattern in Fourier space of a unitary input given two sparsely sampled synthetic aperture arrays: 1. a regularly spaced array sampled at 2*lambda, where lambda is the wavelength of the 40 GHz signal, and 2. the regularly spaced array with random perturbations (of order ~<lambda) to the (x,y) spatial location of each sample point. This dataset is published in "An Overview of Advances in Signal Processing Techniques for Classical and Quantum Wideband Synthetic Apertures" by Vouras, et al. in IEEE Selected Topics in Signal Processing, special issue on Recent Advances in Wideband Signal Processing for Classical and Quantum Synthetic Apertures.
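A hedged sketch of the described setup, using a 1-D cut for brevity; the element count, perturbation scale, and random seed are assumptions, not values from the dataset. The beam pattern of a unitary input is the coherent sum of element responses over look angle, i.e. the Fourier transform of the aperture sampling.

```python
import numpy as np

c, f = 3e8, 40e9
lam = c / f                                   # wavelength of the 40 GHz signal
n = np.arange(64)                             # hypothetical element count
x_reg = 2 * lam * n                           # regular sampling at 2*lambda
rng = np.random.default_rng(0)
x_pert = x_reg + rng.uniform(-0.5, 0.5, n.size) * lam  # perturbations ~< lambda

theta = np.linspace(-np.pi / 2, np.pi / 2, 4096)
k = 2 * np.pi / lam
# Unit-amplitude ("unitary") input: coherently sum the element phases.
beam_reg = np.abs(np.exp(1j * k * np.outer(np.sin(theta), x_reg)).sum(axis=1))
beam_pert = np.abs(np.exp(1j * k * np.outer(np.sin(theta), x_pert)).sum(axis=1))
# The 2*lambda spacing produces grating lobes; the perturbed array trades
# them for a raised sidelobe floor.
```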
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Many problems in classification involve huge numbers of irrelevant features. Variable selection reveals the crucial features, reduces the dimensionality of feature space, and improves model interpretation. In the support vector machine literature, variable selection is achieved by $\ell_1$ penalties. These convex relaxations seriously bias parameter estimates toward 0 and tend to admit too many irrelevant features. The current article presents an alternative that replaces penalties by sparse-set constraints. Penalties still appear, but serve a different purpose. The proximal distance principle takes a loss function $L(\beta)$ and adds the penalty $\frac{\rho}{2}\,\mathrm{dist}(\beta, S_k)^2$ capturing the squared Euclidean distance of the parameter vector $\beta$ to the sparsity set $S_k$, where at most $k$ components of $\beta$ are nonzero. If $\beta_\rho$ represents the minimum of the objective $f_\rho(\beta) = L(\beta) + \frac{\rho}{2}\,\mathrm{dist}(\beta, S_k)^2$, then $\beta_\rho$ tends to the constrained minimum of $L(\beta)$ over $S_k$ as $\rho \to \infty$. We derive two closely related algorithms to carry out this strategy. Our simulated and real examples vividly demonstrate how the algorithms achieve better sparsity without loss of classification power. Supplementary materials for this article are available online.
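A minimal sketch of the proximal distance idea stated above (illustrative, not the authors' two algorithms): since the gradient of $\frac{\rho}{2}\,\mathrm{dist}(\beta, S_k)^2$ is $\rho(\beta - P_{S_k}(\beta))$, where $P_{S_k}$ projects onto $S_k$ by keeping the $k$ largest-magnitude components, one descent step on $f_\rho$ is straightforward to write down.

```python
import numpy as np

def project_sparse(beta, k):
    """Euclidean projection onto S_k: keep the k largest-magnitude entries."""
    out = np.zeros_like(beta)
    keep = np.argsort(np.abs(beta))[-k:]
    out[keep] = beta[keep]
    return out

def proximal_distance_step(beta, grad_L, rho, k, lr=1e-2):
    """One gradient step on f_rho(beta) = L(beta) + (rho/2) dist(beta, S_k)^2."""
    grad = grad_L(beta) + rho * (beta - project_sparse(beta, k))
    return beta - lr * grad
```

In practice $\rho$ is typically increased across iterations so the iterates are driven into $S_k$, matching the limit described in the abstract.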
Abstract: It is generally assumed that high-resolution movement data are needed to extract meaningful decision-making patterns of animals on the move. Here we propose a modified version of force matching (referred to here as direction matching), whereby sparse movement data (i.e., collected over minutes instead of seconds) can be used to test hypothesized forces acting on a focal animal based on their ability to explain observed movement. We first test the direction matching approach using simulated data from an agent-based model, and then go on to apply it to a sparse movement data set collected on a troop of baboons in the De Hoop Nature Reserve, South Africa. We use the baboon data set to test the hypothesis that an individual’s motion is influenced by the group as a whole or, alternatively, whether it is influenced by the location of specific individuals within the group. Our data provide support for both hypotheses, with stronger support for the latter. The focal animal showed consistent patterns of movement toward particular individuals when distance from these individuals increased beyond 5.6 m. Although the focal animal was also sensitive to the group movement on those occasions when the group as a whole was highly clustered, these conditions of isolation occurred infrequently. We suggest that specific social interactions may thus drive overall group cohesion. The results of the direction matching approach suggest that relatively sparse data, with low technical and economic costs, can be used to test between hypotheses on the factors driving movement decisions.
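A toy sketch of the direction matching idea (all names here are illustrative, not the authors' code): a hypothesized force is scored by how well its direction explains each observed displacement of the focal animal between sparse fixes.

```python
import numpy as np

def direction_matching_score(positions, forces):
    """Mean cosine between hypothesized force and observed displacement.

    positions: (T, 2) sparse fixes of the focal animal (nonzero steps assumed);
    forces: (T-1, 2) hypothesized force over each interval.
    Returns a value in [-1, 1]; 1 means perfect directional agreement.
    """
    steps = np.diff(positions, axis=0)               # observed displacements
    u = steps / np.linalg.norm(steps, axis=1, keepdims=True)
    v = forces / np.linalg.norm(forces, axis=1, keepdims=True)
    return float(np.mean(np.sum(u * v, axis=1)))
```

Competing hypotheses (attraction to the group centroid versus attraction to specific individuals) can then be compared by which force field yields the higher score on the same trajectory.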
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The code for the paper: Efficient FPGA-based sparse matrix-vector multiplication with data reuse-aware compression
MATLAB code + demo to reproduce results for "Sparse Principal Component Analysis with Preserved Sparsity". This code calculates the principal loading vectors for any given high-dimensional data matrix. The advantage of this method over existing sparse-PCA methods is that it can produce principal loading vectors with the same sparsity pattern for any number of principal components. Please see Readme.md for more information.