1 dataset found

c
Data from: Sparse Machine Learning Methods for Understanding Large Text...
s.cnmilf.com
data.staging.idas-ds1.appdat.jsc.nasa.gov
+3more
Updated Apr 10, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Dashlink (2025). Sparse Machine Learning Methods for Understanding Large Text Corpora [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/sparse-machine-learning-methods-for-understanding-large-text-corpora
Explore at:
Dataset updated
Apr 10, 2025
Dataset provided by
Dashlink
Description
Sparse machine learning has recently emerged as powerful tool to obtain models of high-dimensional data with high degree of interpretability, at low computational cost. This paper posits that these methods can be extremely useful for understanding large collections of text documents, without requiring user expertise in machine learning. Our approach relies on three main ingredients: (a) multi-document text summarization and (b) comparative summarization of two corpora, both using parse regression or classifi?cation; (c) sparse principal components and sparse graphical models for unsupervised analysis and visualization of large text corpora. We validate our approach using a corpus of Aviation Safety Reporting System (ASRS) reports and demonstrate that the methods can reveal causal and contributing factors in runway incursions. Furthermore, we show that the methods automatically discover four main tasks that pilots perform during flight, which can aid in further understanding the causal and contributing factors to runway incursions and other drivers for aviation safety incidents. Citation: L. El Ghaoui, G. C. Li, V. Duong, V. Pham, A. N. Srivastava, and K. Bhaduri, “Sparse Machine Learning Methods for Understanding Large Text Corpora,” Proceedings of the Conference on Intelligent Data Understanding, 2011.
Not seeing a result you expected?
Learn how you can add new datasets to our index.

Facebook

Twitter

Click to copy link

Link copied

Cite

Dashlink (2025). Sparse Machine Learning Methods for Understanding Large Text Corpora [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/sparse-machine-learning-methods-for-understanding-large-text-corpora

Data from: Sparse Machine Learning Methods for Understanding Large Text Corpora

Explore at:

Dataset updated

Apr 10, 2025

Dataset provided by

Dashlink

Description

Sparse machine learning has recently emerged as powerful tool to obtain models of high-dimensional data with high degree of interpretability, at low computational cost. This paper posits that these methods can be extremely useful for understanding large collections of text documents, without requiring user expertise in machine learning. Our approach relies on three main ingredients: (a) multi-document text summarization and (b) comparative summarization of two corpora, both using parse regression or classifi?cation; (c) sparse principal components and sparse graphical models for unsupervised analysis and visualization of large text corpora. We validate our approach using a corpus of Aviation Safety Reporting System (ASRS) reports and demonstrate that the methods can reveal causal and contributing factors in runway incursions. Furthermore, we show that the methods automatically discover four main tasks that pilots perform during flight, which can aid in further understanding the causal and contributing factors to runway incursions and other drivers for aviation safety incidents. Citation: L. El Ghaoui, G. C. Li, V. Duong, V. Pham, A. N. Srivastava, and K. Bhaduri, “Sparse Machine Learning Methods for Understanding Large Text Corpora,” Proceedings of the Conference on Intelligent Data Understanding, 2011.

Clear search

Close search

Google apps

Main menu

Data from: Sparse Machine Learning Methods for Understanding Large Text...

Data from: Sparse Machine Learning Methods for Understanding Large Text Corpora