Ensemble Feature Selection for Network Intrusion Detection Systems Using Explainable AI: A Frequency-Based Approach
Primary Investigator: Mustafa Abdallah
Ismail Bibers and Mustafa Abdallah
Abstract
Feature selection is a crucial step in improving the performance, efficiency, and interpretability of machine learning models, especially for high-dimensional datasets such as those in network security. This study introduces an ensemble-based feature selection framework that leverages
Explainable AI (XAI) methods, including SHAP, LOCO, Profiled Weighting (ProfWeight), PFI, and DALEX, to rank
features by importance. A frequency-based aggregation mechanism then identifies the most critical features, prioritizing those consistently ranked high across
methods. The proposed framework was evaluated using the CICIDS-2017 dataset, a benchmark for intrusion detection
research, and tested with multiple independent classifiers, including Random Forest, Logistic Regression, KNN, and AdaBoost. Evaluation results demonstrate significant improvements in classification accuracy, precision, and
computational efficiency. This work highlights the potential of integrating XAI with feature selection to tackle the
evolving challenges of network intrusion detection. It also paves the way for more accurate and interpretable intrusion detection systems. We have made the implementation of the proposed ensemble feature selection framework available to the network security research community.
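
To make the frequency-based aggregation idea concrete, the following is a minimal sketch, not the released implementation. It assumes each XAI method has already produced a list of features ordered from most to least important, counts how often each feature appears in a method's top-k, and keeps the features that reach a vote threshold. The function name `frequency_select`, the parameters `top_k` and `min_votes`, and the feature names are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def frequency_select(rankings, top_k, min_votes):
    """Select features that appear in the top-k of at least `min_votes` methods.

    rankings: dict mapping a method name to a list of unique feature names,
              ordered from most to least important (assumed precomputed).
    Returns (selected_features, votes), where votes is a Counter of how many
    methods placed each feature in their top-k.
    """
    votes = Counter()
    for ranked_features in rankings.values():
        # Each method casts one vote for every feature in its top-k.
        votes.update(ranked_features[:top_k])

    selected = [name for name, count in votes.items() if count >= min_votes]
    # Order by vote count (descending), then by name for a deterministic result.
    selected.sort(key=lambda name: (-votes[name], name))
    return selected, votes


# Hypothetical rankings; these feature names and orderings are illustrative only.
rankings = {
    "SHAP": ["flow_duration", "dst_port", "pkt_len_mean", "fwd_iat"],
    "LOCO": ["dst_port", "flow_duration", "fwd_iat", "pkt_len_mean"],
    "PFI": ["pkt_len_mean", "dst_port", "syn_count", "flow_duration"],
}

selected, votes = frequency_select(rankings, top_k=2, min_votes=2)
print(selected)
```

In this toy input, only features that two or more methods rank in their top two survive. The threshold and the value of k are tuning choices that this sketch leaves open.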