A Fast Parallel Association Rule Mining Algorithm Based on The Probability of Frequent Itemsets


Marghny H. Mohamed, Hosam E. Refaat


Vol. 11  No. 5  pp. 152-162


Finding frequent itemsets is the most costly processing step in analyzing large transactional databases. Each stage of frequent itemset discovery produces a huge number of candidate itemsets. If we can predict which candidate itemsets will be frequent and which will not, we can reduce the time wasted processing infrequent itemsets. In this paper we propose a new parallel algorithm for frequent itemset mining, called the probability of frequent itemset (PFI) mining algorithm. The PFI algorithm predicts whether a candidate will be frequent based on the probability of its subsets, and prioritizes candidate itemsets by their probability. Moreover, the PFI algorithm scans the database only once, dividing it horizontally and distributing it over the system nodes. Also, while finding the k-itemsets, the algorithm can start a new stage (finding the (k+1)-itemsets) from the frequent k-itemsets already discovered, while other itemsets in the same stage are still being processed. In addition, we introduce a method for reducing the number of transactions. We present performance results for our algorithm on various datasets and compare it against well-known algorithms.
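
The two central ideas of the abstract, horizontal partitioning and probability-based candidate ordering, can be illustrated with a minimal Python sketch. It is not the PFI algorithm itself: the abstract does not give the probability formula, so the sketch assumes a stand-in estimate, the minimum support among a candidate's k-subsets (an upper bound on the candidate's own support). The function names `partition`, `local_counts`, `support`, and `rank_candidates` are hypothetical, and the partitions are processed sequentially here rather than on separate nodes.

```python
from collections import Counter
from itertools import combinations


def partition(transactions, n_parts):
    """Split the transaction list horizontally into n_parts slices."""
    return [transactions[i::n_parts] for i in range(n_parts)]


def local_counts(part, k):
    """Count every k-item subset in one partition (one node's share)."""
    counts = Counter()
    for t in part:
        counts.update(combinations(sorted(set(t)), k))
    return counts


def support(transactions, n_parts, k):
    """Merge per-partition counts into global relative supports."""
    total = Counter()
    for part in partition(transactions, n_parts):
        total.update(local_counts(part, k))
    n = len(transactions)
    return {itemset: c / n for itemset, c in total.items()}


def rank_candidates(supports_k, min_support):
    """Build (k+1)-candidates from frequent k-itemsets, most promising first.

    Assumption: a candidate's priority is the minimum support of its
    k-subsets. This is a stand-in for the paper's probability estimate.
    """
    frequent = {s: p for s, p in supports_k.items() if p >= min_support}
    candidates = {}
    for a, b in combinations(sorted(frequent), 2):
        cand = tuple(sorted(set(a) | set(b)))
        if len(cand) != len(a) + 1:
            continue  # the two itemsets differ in more than one item
        subsets = list(combinations(cand, len(a)))
        if all(s in frequent for s in subsets):
            candidates[cand] = min(frequent[s] for s in subsets)
    # Highest estimate first; ties broken by itemset order.
    return sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))
```

Calling `rank_candidates(support(transactions, 3, 1), 0.3)` returns 2-itemset candidates ordered by the assumed estimate, so a scheduler could count the most promising ones first. Because per-partition counts are simply summed, the merged supports do not depend on how many partitions are used.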


Parallel systems, Distributed shared memory, Data mining, Association rule, Linda system, Tuple-space, Jini, JavaSpace