The dataset used in this paper for elastic source imaging with very sparse data, both in space and time.
https://cdla.io/sharing-1-0/
Sparse matrices of raw count data for the open-problems-multimodal competition. The script for generating the sparse matrices was shared by Wei Xie and can be found here.
A similar dataset with normalized and log1p-transformed counts for the same cells can be found here.
Each h5 file contains 5 arrays:
axis0 (row index from the original h5 file)
axis1 (column index from the original h5 file)
value_i (slot i of a dgCMatrix in R, or the indices array of a csc_array in scipy.sparse)
value_p (slot p of a dgCMatrix in R, or the indptr array of a csc_array in scipy.sparse)
value_x (slot x of a dgCMatrix in R, or the data array of a csc_array in scipy.sparse)
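The three value_* arrays listed above follow the compressed sparse column (CSC) layout. The following is a minimal pure-Python sketch of how such a triplet decodes into a dense matrix. The function name `csc_to_dense` and the toy arrays are illustrative assumptions, not part of the dataset, and the number of rows is assumed to be known separately (for example from the length of axis0).

```python
def csc_to_dense(value_x, value_i, value_p, n_rows):
    """Expand CSC arrays (data, row indices, column pointers) into a dense matrix.

    Hypothetical helper for illustration only; it is not shipped with the dataset.
    """
    n_cols = len(value_p) - 1  # the pointer array has one entry per column, plus one
    dense = [[0] * n_cols for _ in range(n_rows)]
    for col in range(n_cols):
        # Non-zeros of column `col` sit at positions value_p[col] .. value_p[col + 1] - 1.
        for k in range(value_p[col], value_p[col + 1]):
            dense[value_i[k]][col] = value_x[k]
    return dense


# Toy 3x3 matrix (made-up values):
# [[1, 0, 2],
#  [0, 0, 3],
#  [4, 0, 0]]
value_x = [1, 4, 2, 3]   # non-zero values, column by column
value_i = [0, 2, 0, 1]   # row index of each non-zero value
value_p = [0, 2, 2, 4]   # start offset of each column in value_x / value_i

dense = csc_to_dense(value_x, value_i, value_p, n_rows=3)
for row in dense:
    print(row)
```

With the real files one would more likely read the arrays with an HDF5 reader and pass `(value_x, value_i, value_p)` together with a shape to `scipy.sparse.csc_array`. Whether axis0 corresponds to the rows of the stored matrix is an assumption that should be checked against the generating script.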
In this paper we propose an innovative learning algorithm, a variation of the one-class ν-Support Vector Machine (SVM) learning algorithm, that produces sparser solutions at much reduced computational complexity. The proposed technique returns an approximate solution, nearly as good as the solution set obtained by the classical approach, by minimizing the original risk function along with a regularization term. We introduce a bi-criterion optimization that helps guide the search towards the optimal set in much reduced time. The outcome of the proposed learning technique was compared with the benchmark one-class Support Vector Machines algorithm, which more often leads to solutions with redundant support vectors. Throughout the analysis, the problem size for both optimization routines was kept consistent. We have tested the proposed algorithm on a variety of data sources under different conditions to demonstrate its effectiveness. In all cases the proposed algorithm closely preserves the accuracy of standard one-class ν-SVMs while reducing both training time and test time by several factors.
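For reference, the classical one-class ν-SVM used as the benchmark above is usually written as the quadratic program below. This is the standard formulation from the literature, not the modified bi-criterion objective proposed in the paper, which the description does not spell out. The symbols w, ξ_i, ρ, Φ and n are notation assumed here.

```latex
\begin{aligned}
\min_{w,\,\xi,\,\rho}\quad & \frac{1}{2}\lVert w\rVert^{2} + \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i - \rho \\
\text{s.t.}\quad & \langle w, \Phi(x_i)\rangle \ge \rho - \xi_i, \qquad \xi_i \ge 0, \quad i = 1,\dots,n
\end{aligned}
```

Training points with non-zero dual coefficients are the support vectors. The parameter ν in (0, 1] upper-bounds the fraction of outliers and lower-bounds the fraction of support vectors, which is why the classical solution can carry many redundant support vectors.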
This dataset is a raster-format GeoTIFF representing the percentage density of sparse vegetation in each pixel. It includes any geographic areas where the cover of natural vegetation is between 2% and 10%, including permanently or regularly flooded areas. The sparse vegetation dataset is part of the Global Land Cover-SHARE (GLC-SHARE) database at the global level, created by the FAO Land and Water Division in partnership with, and with contributions from, various partners and institutions.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is the first capture of the area north of the Floreat Surf Life Saving Club. The sand dunes were captured by UAV imagery on 17th Aug 2021 for the Cambridge Coastcare beach dune modelling and monitoring project, as part of an initiative to innovatively monitor coastal dune erosion and visualize these changes over time for future management and mitigation. The data include an orthomosaic, DSM, DTM, elevation contours, 3D mesh, 3D point cloud and LiDAR, constructed from over 500 images captured by UAV (drone) and processed in Pix4D. All datasets can be freely accessed through DataWA. Link to animated video fly-through of this 3D data model. Link to the Sketchfab visualisation of the 3D textured mesh. This dataset is a sparse 3D point cloud (i.e. a 3D set of points): the X, Y, Z position and colour information are stored for each point of the point cloud. It covers the area north of Floreat SLSC (2021 Flight-2 project area).
https://www.apache.org/licenses/LICENSE-2.0
This dataset contains supplementary code for the paper Fast Sparse Grid Operations using the Unidirectional Principle: A Generalized and Unified Framework. The code is also provided on GitHub. Here, we additionally provide the runtime measurement data generated by the code, which was used to generate the runtime plot in the paper. For more details, we refer to the file README.md.
Sparse machine learning has recently emerged as a powerful tool to obtain models of high-dimensional data with a high degree of interpretability, at low computational cost. This paper posits that these methods can be extremely useful for understanding large collections of text documents, without requiring user expertise in machine learning. Our approach relies on three main ingredients: (a) multi-document text summarization and (b) comparative summarization of two corpora, both using sparse regression or classification; (c) sparse principal components and sparse graphical models for unsupervised analysis and visualization of large text corpora. We validate our approach using a corpus of Aviation Safety Reporting System (ASRS) reports and demonstrate that the methods can reveal causal and contributing factors in runway incursions. Furthermore, we show that the methods automatically discover four main tasks that pilots perform during flight, which can aid in further understanding the causal and contributing factors to runway incursions and other drivers for aviation safety incidents. Citation: L. El Ghaoui, G. C. Li, V. Duong, V. Pham, A. N. Srivastava, and K. Bhaduri, “Sparse Machine Learning Methods for Understanding Large Text Corpora,” Proceedings of the Conference on Intelligent Data Understanding, 2011.
https://dataverse.lib.nycu.edu.tw/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=doi:10.57770/CNNFTP
We propose a novel plug-and-play (PnP) module for improving depth prediction by taking arbitrary patterns of sparse depths as input. Given any pre-trained depth prediction model, our PnP module updates the intermediate feature map such that the model outputs new depths consistent with the given sparse depths. Our method requires no additional training and can be applied to practical applications such as leveraging both RGB and sparse LiDAR points to robustly estimate a dense depth map. Our approach achieves consistent improvements on various state-of-the-art methods on indoor (i.e., NYU-v2) and outdoor (i.e., KITTI) datasets. Various types of LiDARs are also synthesized in our experiments to verify the general applicability of our PnP module in practice.
https://cdla.io/sharing-1-0/
This dataset was created by Wei Xie
Released under Community Data License Agreement - Sharing - Version 1.0
This paper considers a large class of problems where we seek to recover a low-rank matrix and/or sparse vector from some set of measurements.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Tables S1-S13 and Text S1. Each table reports the average recognition rates (%) and the corresponding standard deviations (%) of LCJDSRC under different parameters on the validation set of the face database named below:
Table S1. ORL face database (sub-image size is 32×32).
Table S2. ORL face database (sub-image size is 21×32).
Table S3. ORL face database (sub-image size is 16×32).
Table S4. ORL face database (sub-image size is 16×21).
Table S5. Extended YaleB face database (sub-image size is 32×32).
Table S6. Extended YaleB face database (sub-image size is 21×32).
Table S7. AR face database (sub-image size is 32×32).
Table S8. AR face database (sub-image size is 21×32).
Table S9. AR face database with sunglasses occlusion (sub-image size is 32×32).
Table S10. AR face database with scarf occlusion (sub-image size is 32×32).
Table S11. AR face database with sunglasses occlusion (sub-image size is 21×32).
Table S12. AR face database with scarf occlusion (sub-image size is 21×32).
Table S13. LFW face database (sub-image size is 32×32).
Text S1. The derivation process of Equation (16). (DOC)
This dataset contains CSV files for the figures in the paper titled "Multi-frequency Antenna Metrology with Sparse Measurements", to be submitted to IEEE Transactions on Antennas and Propagation or IEEE Transactions on Signal Processing. In this paper, we derive and experiment with approaches to use compressive sensing for multifrequency antenna radiation pattern measurements when samples are taken on a spherical domain. In particular, we develop sparsity and low-rank compressive sensing approaches and compare them for a simulated horn antenna. This work has applications in antenna metrology.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Chile Land Cover: Sparse Vegetation: Total data was reported at 16.932 % in 2019. This records an increase from the previous number of 16.184 % for 2018. Chile Land Cover: Sparse Vegetation: Total data is updated yearly, averaging 15.865 % from Dec 1992 (Median) to 2019, with 5 observations. The data reached an all-time high of 16.932 % in 2019 and a record low of 15.502 % in 2004. Chile Land Cover: Sparse Vegetation: Total data remains in active status in CEIC and is reported by the Organisation for Economic Co-operation and Development. The data is categorized under Global Database’s Chile – Table CL.OECD.ESG: Environmental: Land Cover: OECD Member: Annual.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Replication materials for Ratkovic and Tingley (2016), “Sparse Estimation and Uncertainty with Application to Subgroup Analysis.” All files, data, and scripts needed to generate the figures and results in the paper are in this archive. The zip file contains two sets of files: one for the Bechtel and Scheve (2013) replication and one for replicating the simulation study.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States Land Cover: Sparse Vegetation: Sparse Shrub: Less than 15% data was reported at 4.016 sq km th in 2019. This records a decrease from the previous number of 4.028 sq km th for 2018. United States Land Cover: Sparse Vegetation: Sparse Shrub: Less than 15% data is updated yearly, averaging 4.034 sq km th from Dec 1992 (Median) to 2019, with 5 observations. The data reached an all-time high of 4.061 sq km th in 2015 and a record low of 4.016 sq km th in 2019. United States Land Cover: Sparse Vegetation: Sparse Shrub: Less than 15% data remains in active status in CEIC and is reported by the Organisation for Economic Co-operation and Development. The data is categorized under Global Database’s United States – Table US.OECD.ESG: Environmental: Land Cover: OECD Member: Annual.
https://www.nist.gov/open/license
In this paper we describe an enhanced three-antenna gain extrapolation technique that allows one to determine antenna gain with significantly fewer data points and at closer distances than with the well-established traditional three-antenna gain extrapolation technique that has been in use for over five decades. As opposed to the traditional gain extrapolation technique, where high-order scattering is purposely ignored so as to isolate only the direct antenna-to-antenna coupling, we show that by incorporating third-order scattering the enhanced gain extrapolation technique can be obtained. The theoretical foundation using third-order scattering is developed and experimental results are presented comparing the enhanced technique and traditional technique for two sets of internationally recognized NIST reference standard gain horn antennas at X-band and Ku-band. We show that with the enhanced technique gain values for these antennas are readily obtained to within stated uncertainties of ±0.07 dB using as few as 10 data points per antenna pair, as opposed to the approximately 4000 to 8000 data points per antenna pair that are needed with the traditional technique. Furthermore, with the described enhanced technique, antenna-to-antenna distances can be reduced by a factor of three, and up to a factor of six in some cases, compared to the traditional technique, a significant reduction in the overall size requirement of facilities used to perform gain extrapolation measurements.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data used for the paper SparsePoser: Real-time Full-body Motion Reconstruction from Sparse Data
It contains over 1GB of high-quality motion capture data recorded with an Xsens Awinda system while using a variety of VR applications in Meta Quest devices.
Visit the paper website!
If you find our data useful, please cite our paper:
@article{10.1145/3625264,
  author = {Ponton, Jose Luis and Yun, Haoran and Aristidou, Andreas and Andujar, Carlos and Pelechano, Nuria},
  title = {SparsePoser: Real-Time Full-Body Motion Reconstruction from Sparse Data},
  year = {2023},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  issn = {0730-0301},
  url = {https://doi.org/10.1145/3625264},
  doi = {10.1145/3625264},
  journal = {ACM Trans. Graph.},
  month = {oct}
}
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract: These are the experimental data for the paper Bach, Jakob. "Using Constraints to Discover Sparse and Alternative Subgroup Descriptions" published on arXiv in 2024. You can find the paper here and the code here. See the README for details. The datasets used in our study (which we also provide here) originate from PMLB. The corresponding GitHub repository is MIT-licensed ((c) 2016 Epistasis Lab at UPenn). Please see the file LICENSE in the folder datasets/ for the license text.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The code for the paper: Efficient FPGA-based sparse matrix-vector multiplication with data reuse-aware compression
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Large-scale quantitative analyses of biological systems are often performed with few replicate experiments, leading to multiple nonidentical data sets due to missing values. For example, mass spectrometry driven proteomics experiments are frequently performed with few biological or technical replicates due to sample-scarcity or due to duty-cycle or sensitivity constraints, or limited capacity of the available instrumentation, leading to incomplete results where detection of significant feature changes becomes a challenge. This problem is further exacerbated for the detection of significant changes on the peptide level, for example, in phospho-proteomics experiments. In order to assess the extent of this problem and the implications for large-scale proteome analysis, we investigated and optimized the performance of three statistical approaches by using simulated and experimental data sets with varying numbers of missing values. We applied three tools, including standard t test, moderated t test, also known as limma, and rank products for the detection of significantly changing features in simulated and experimental proteomics data sets with missing values. The rank product method was improved to work with data sets containing missing values. Extensive analysis of simulated and experimental data sets revealed that the performance of the statistical analysis tools depended on simple properties of the data sets. High-confidence results were obtained by using the limma and rank products methods for analyses of triplicate data sets that exhibited more than 1000 features and more than 50% missing values. The maximum number of differentially represented features was identified by using limma and rank products methods in a complementary manner. We therefore recommend combined usage of these methods as a novel and optimal way to detect significantly changing features in these data sets. 
This approach is suitable for large quantitative data sets from stable isotope labeling and mass spectrometry experiments and should be applicable to large data sets of any type. An R script that implements the improved rank products algorithm and the combined analysis is available.
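To make the rank product idea concrete, here is a minimal Python sketch of a rank product that tolerates missing values. It is not the authors' R script or their improved algorithm: ranking each replicate over the observed features only and taking the geometric mean over the replicates in which a feature was seen is one simple handling assumed here, and the function name, input layout, and toy values are all hypothetical.

```python
import math


def rank_products(fold_changes):
    """Geometric mean of per-replicate ranks, skipping missing values.

    fold_changes maps a feature name to a list of per-replicate values,
    with None marking a missing value (hypothetical input layout).
    """
    features = list(fold_changes)
    n_reps = len(next(iter(fold_changes.values())))
    ranks = {f: [] for f in features}
    for r in range(n_reps):
        # Rank only the features observed in this replicate; rank 1 = largest value.
        observed = [f for f in features if fold_changes[f][r] is not None]
        observed.sort(key=lambda f: fold_changes[f][r], reverse=True)
        for rank, f in enumerate(observed, start=1):
            ranks[f].append(rank)
    # Features never observed get no score at all.
    return {
        f: math.exp(sum(math.log(x) for x in rs) / len(rs))
        for f, rs in ranks.items()
        if rs
    }


# Toy data: three features, three replicates, some values missing (made up).
fold_changes = {
    "A": [2.0, 1.8, None],
    "B": [0.5, None, 0.7],
    "C": [1.2, 1.1, 1.5],
}
rp = rank_products(fold_changes)
for feature in sorted(rp, key=rp.get):
    print(feature, round(rp[feature], 3))
```

A small rank product means a feature sits consistently near the top of the replicates in which it was observed. Assessing significance would still need a permutation or analytic null distribution, which this sketch leaves out.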