Title

Batch-Incremental Classification of Stream Data Using Storage

Author

Parita Ponkiya, Rohit Srivastava

Citation

Vol. 15  No. 4  pp. 95-99

Abstract

Data mining is a technique used to extract useful knowledge from large amounts of data, and classification is its most important task. Nowadays, stream data is a very important real-world source of knowledge. Stream data is data that arrives continuously over time, so the volume of data grows ever faster. Traditional classification algorithms are not suitable for such data: the continuous growth of the data makes a previously constructed classification tree outdated, and the tree must be reconstructed from scratch, which is very time-consuming. Another major issue is the data type, since each type must be treated separately; continuous data poses the major challenge in tree building because it needs to be discretized. Among the many classification algorithms, ID3 is a well-known tree-based classification algorithm that handles only categorical data and uses information gain for attribute selection. In this paper, a tree-based batch-incremental classification algorithm is proposed for stream data that outputs the same tree as ID3. It uses CAIM-based discretization for continuous attributes and various attribute selection criteria, along with a storage structure that holds the strategic information of every node and the historical data needed to rebuild the decision tree. CAIR, CAIU, and CAIM are used as attribute selection criteria, and a comparison of these attribute selection measures is also provided.
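
The two measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (entropy, information_gain, caim) are hypothetical, the information gain follows the usual ID3 definition for a categorical attribute, and the CAIM value follows the standard definition of the criterion, (1/n) * sum over intervals of max_r^2 / M_r, computed on a quanta matrix whose rows are classes and whose columns are intervals.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())


def information_gain(values, labels):
    """Entropy reduction obtained by splitting `labels` on a categorical attribute.

    `values` holds the attribute value of each example, aligned with `labels`.
    """
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted entropy of the partitions induced by the attribute.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def caim(quanta):
    """CAIM value of a quanta matrix (rows = classes, columns = intervals).

    Assumes the standard CAIM definition; empty intervals contribute zero.
    """
    n_intervals = len(quanta[0])
    score = 0.0
    for r in range(n_intervals):
        column = [row[r] for row in quanta]
        column_total = sum(column)
        if column_total:
            score += max(column) ** 2 / column_total
    return score / n_intervals
```

In this sketch, an attribute that separates the classes perfectly gets the highest information gain, and a discretization whose intervals each contain a single class gets a higher CAIM value than one whose intervals mix classes evenly.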

Keywords

Classification, CAIR, CAIM, CAIU, Information Gain, Batch Incremental Classification

URL

http://paper.ijcsns.org/07_book/201504/20150416.pdf