2 datasets found

Feature statistical analysis.
plos.figshare.com
xls
Updated Jul 10, 2024
Cite
Akib Mohi Ud Din Khanday; Mudasir Ahmad Wani; Syed Tanzeel Rabani; Qamar Rayees Khan; Ahmed A. Abd El-Latif (2024). Feature statistical analysis. [Dataset]. http://doi.org/10.1371/journal.pone.0302583.t002
Available download formats
xls
Unique identifier
https://doi.org/10.1371/journal.pone.0302583.t002
Dataset updated
Jul 10, 2024
Dataset provided by
PLOS ONE
Authors
Akib Mohi Ud Din Khanday; Mudasir Ahmad Wani; Syed Tanzeel Rabani; Qamar Rayees Khan; Ahmed A. Abd El-Latif
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Social media platforms serve as communication tools where users freely share information regardless of its accuracy. Propaganda on these platforms is the dissemination of biased or deceptive information aimed at influencing public opinion; it takes forms such as political campaigns, fake news, and conspiracy theories. This study introduces a Hybrid Feature Engineering Approach for Propaganda Identification (HAPI), designed to detect propaganda in text-based content such as news articles and social media posts. HAPI combines conventional feature engineering methods with machine learning techniques to achieve high accuracy in propaganda detection. The study uses data collected from Twitter via its API and proposes an annotation scheme that assigns each tweet to one of two classes (propaganda or non-propaganda). Hybrid feature engineering combines several feature types, including Term Frequency-Inverse Document Frequency (TF-IDF), Bag of Words (BoW), sentiment features, and tweet length, among others. Multiple machine learning classifiers are trained and evaluated with the proposed methodology, using 40 pertinent features identified through the hybrid feature selection technique. All the selected algorithms, namely Multinomial Naive Bayes (MNB), Support Vector Machine (SVM), Decision Tree (DT), and Logistic Regression (LR), achieved promising results. The SVM-based HaPi (SVM-HaPi) performed best among the traditional algorithms, achieving precision, recall, F-measure, and overall accuracy of 0.69, 0.69, 0.69, and 69.2%, respectively. The proposed approach is also compared with well-known existing approaches, and it outperformed most of them on several evaluation metrics. This research contributes to the development of a comprehensive system for propaganda identification in textual content.
Nonetheless, propaganda detection extends beyond textual data. Deep learning algorithms such as Artificial Neural Networks (ANN) can handle multimodal data, incorporating text, images, audio, and video, and can thereby consider not only the content itself but also its presentation and the context of its dissemination.
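As a rough illustration of what combining feature types can look like, the sketch below concatenates Bag of Words counts, TF-IDF weights, and a length feature into one vector per text. It is not the authors' implementation: the function names, the whitespace tokenizer, the smoothed IDF variant, the use of token count as "tweet length", and the sample texts are all assumptions made for this example, and sentiment features, feature selection, and the classifiers are left out.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: lowercase whitespace tokenization stands in for real tweet cleaning.
    return text.lower().split()


def hybrid_features(docs):
    """Return (vocab, rows); each row is BoW counts + TF-IDF weights + [token count]."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of texts in which each term appears.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed IDF, one common variant: ln((1 + n) / (1 + df)) + 1.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty texts
        bow = [counts[t] for t in vocab]
        tfidf = [(counts[t] / total) * idf[t] for t in vocab]
        # Length is measured in tokens here; characters would work equally well.
        rows.append(bow + tfidf + [len(toks)])
    return vocab, rows


# Hypothetical sample texts, not taken from the dataset.
sample_texts = ["vote now vote", "read the news now"]
vocab, rows = hybrid_features(sample_texts)
print(vocab)
print(len(rows[0]))
```

Each row has twice the vocabulary size plus one entry, so a feature selection step (the study keeps 40 features) would normally follow before training a classifier.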
Recent work related to propaganda.
plos.figshare.com
xls
Updated Jul 10, 2024
Cite
Akib Mohi Ud Din Khanday; Mudasir Ahmad Wani; Syed Tanzeel Rabani; Qamar Rayees Khan; Ahmed A. Abd El-Latif (2024). Recent work related to propaganda. [Dataset]. http://doi.org/10.1371/journal.pone.0302583.t001
Available download formats
xls
Unique identifier
https://doi.org/10.1371/journal.pone.0302583.t001
Dataset updated
Jul 10, 2024
Dataset provided by
PLOS ONE
Authors
Akib Mohi Ud Din Khanday; Mudasir Ahmad Wani; Syed Tanzeel Rabani; Qamar Rayees Khan; Ahmed A. Abd El-Latif
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Feature statistical analysis.

332 scholarly articles cite this dataset (View in Google Scholar)
