Network Intrusion Data Analysis via Consistency Subset Evaluator with ID3, C4.5 and Best-First Trees


Shih Yin Ooi, Yew Meng Leong, Meng Foh Lim, Hong Kuan Tiew, Ying Han Pang


Vol. 13  No. 2  pp. 7-13


An Intrusion Detection System (IDS) is widely used to determine whether incoming traffic is malicious or benign. A traditional IDS, however, requires considerable human effort and incurs a large computational overhead to build the set of rules needed to pick out intruders' connections from suspicious traffic. In view of this limitation, many researchers are investigating data mining and machine learning techniques that can perform these tasks more quickly and in a semi-automated manner. One popular statistical model is the decision tree, which builds a simple, straightforward tree model from an existing database of pre-classified network traffic. By generating the tree and extracting rules from it (rules that classify traffic as normal or malicious), it can predict unknown network anomalies. This prediction is a useful supplement to honeypot analysis. In this paper, ID3, C4.5 and Best-First trees are tested and compared on the NSL-KDD dataset. Data engineering (including data preprocessing and feature selection) is very important in data mining: relevant data must be retained for building the hypothesis, while meaningless data should be removed. Thus, numerous feature selection techniques are explored, tested and compared in this paper. Performance is reported using Receiver Operating Characteristic (ROC) curves and compared through McNemar tests.
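
As an illustration of the comparison step mentioned above, the following is a minimal sketch of a McNemar test between two classifiers evaluated on the same labelled test set. It is not taken from the paper: the function name `mcnemar_test`, the use of the continuity-corrected chi-square form, and the toy labels in the usage note are all assumptions made for this example.

```python
import math

def mcnemar_test(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar test for two classifiers (illustrative sketch).

    b counts instances that classifier A gets right and B gets wrong;
    c counts instances that A gets wrong and B gets right.
    Returns (b, c, statistic, p_value).
    """
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    if b + c == 0:
        # The classifiers never disagree on correctness: no evidence of a difference.
        return b, c, 0.0, 1.0
    # Chi-square statistic with 1 degree of freedom and continuity correction.
    stat = max(abs(b - c) - 1, 0) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return b, c, stat, p_value
```

For example, `mcnemar_test(labels, tree_a_predictions, tree_b_predictions)` could be called with the predictions of two trees on the same test instances; a small p-value would suggest that their error rates differ. With few disagreements (small b + c), an exact binomial version of the test is usually preferred over this approximation.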


Network Intrusion Analysis, Data Mining, Decision Tree, IDS, C4.5, Best-First Tree, Consistency Subset Evaluator