Fake News Detection Using Machine Learning Techniques

Authors:
DPID: 811

Abstract

The rapid spread of misinformation online influences public perception and decision-making. This project develops an automated Fake News Detection System that uses machine learning to distinguish genuine news articles from misleading ones. The system follows a three-stage pipeline: data preprocessing, feature extraction, and classification with multiple machine learning algorithms. The dataset consists of news articles labeled as either real or fake. Preprocessing applies text normalization, stop-word removal, punctuation removal, and lemmatization to reduce noise in the text. Term Frequency-Inverse Document Frequency (TF-IDF) then converts the text into numerical vectors, giving higher weight to terms that are frequent in an article but rare across the corpus. Four models are implemented and compared: Logistic Regression, Passive Aggressive Classifier, Multinomial Naïve Bayes, and Support Vector Machine (SVM). Each model is evaluated using accuracy, precision, recall, and confusion matrix analysis. Among these models, the Passive Aggressive Classifier achieves the highest accuracy, making it a suitable candidate for real-time fake news identification. The system has potential applications in journalism, social media platforms, and news verification tools. Future enhancements could include deep learning models, real-time web scraping, and multilingual support to further improve detection capabilities. The project is a step toward mitigating the impact of false information in the digital landscape.
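
The preprocessing, TF-IDF, and evaluation stages described above can be illustrated with a minimal, standard-library-only sketch. It is not the project's implementation. The stop-word list, the function names (`preprocess`, `tfidf`, `evaluate`), and the smoothed IDF variant are assumptions made for illustration, and lemmatization is omitted because it requires an external NLP library.

```python
import math
import string
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the project's list is not given).
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "in", "to", "and", "on"}


def preprocess(text):
    """Lowercase, strip ASCII punctuation, split on whitespace, drop stop words."""
    text = text.lower().translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per document.

    Assumed variant: tf = count / document length,
    idf = ln((1 + N) / (1 + df)) + 1 (a common smoothed form).
    """
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty documents
        vectors.append({
            term: (c / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, c in counts.items()
        })
    return vectors


def evaluate(y_true, y_pred, positive="fake"):
    """Confusion-matrix counts plus accuracy, precision, and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return {
        "confusion": {"tp": tp, "fp": fp, "fn": fn, "tn": tn},
        "accuracy": (tp + tn) / len(pairs) if pairs else 0.0,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }
```

In this sketch, a term that appears in every document (such as a city name shared by all headlines) receives a lower weight than a term unique to one document, which is the property that lets a downstream classifier focus on distinctive vocabulary. In practice, a library vectorizer and classifier would likely replace these hand-written functions.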