Word Similarity for Document Grouping using Soft Computing


Masrah Azrifah Azmi Murad, Trevor Martin


Vol. 7, No. 8, pp. 20-28


Advances in technology have provided a quicker and more efficient way of accessing information through the web and through databases in organizations that implement information systems to gain a competitive edge. The simplest way of filtering information is to extract keywords and use them to measure a document's relevance. Nonetheless, retrieving the right document is often a problem. Synonymy, i.e. two words with the same meaning (for example, taxi and cab), is a major problem in information searching. This work applies soft computing techniques, encompassing both fuzzy set theory and probability theory, to information retrieval. We propose an algorithm for computing asymmetric word similarities (AWS) to overcome the synonymy problem. The similarities are computed using mass assignment based on fuzzy sets of words. A key feature of our algorithm is that it is incremental, i.e. words (and documents) can be added or removed without extensive re-computation. AWS produced similarity measures consistently 10% higher than those of the tf.idf algorithm and grouped documents successfully.
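
The abstract does not spell out how AWS is computed, so the following Python sketch is only an illustration of the general idea, not the authors' implementation. It assumes that each word is described by the (previous word, next word) contexts it occurs in, that context frequencies are turned into fuzzy memberships by a standard mass assignment conversion, and that the similarity of one word given another is the mass-weighted overlap of their nested focal sets. All function names are hypothetical.

```python
from collections import Counter, defaultdict

def context_counts(tokens):
    """Count (previous word, next word) contexts for every interior token."""
    contexts = defaultdict(Counter)
    for prev, word, nxt in zip(tokens, tokens[1:], tokens[2:]):
        contexts[word][(prev, nxt)] += 1
    return contexts

def to_fuzzy_set(counter):
    """Turn context frequencies into memberships (assumed conversion).

    With probabilities sorted p1 >= p2 >= ... >= pn, the membership of the
    i-th context is i * p_i plus the sum of all smaller probabilities.
    """
    total = sum(counter.values())
    ranked = sorted(counter.items(), key=lambda kv: -kv[1])
    probs = [count / total for _, count in ranked]
    return {
        ctx: (i + 1) * probs[i] + sum(probs[i + 1:])
        for i, (ctx, _) in enumerate(ranked)
    }

def mass_assignment(fuzzy):
    """Nested focal sets; each gets the drop in membership to the next level."""
    ranked = sorted(fuzzy.items(), key=lambda kv: -kv[1])
    masses = []
    for i, (_, mu) in enumerate(ranked):
        next_mu = ranked[i + 1][1] if i + 1 < len(ranked) else 0.0
        if mu - next_mu > 1e-12:  # skip empty steps caused by tied memberships
            focal = frozenset(ctx for ctx, _ in ranked[: i + 1])
            masses.append((focal, mu - next_mu))
    return masses

def similarity(fuzzy_a, fuzzy_b):
    """Pr(a | b): focal-set overlap weighted by mass, normalised by b's set size."""
    return sum(
        mass_a * mass_b * len(set_a & set_b) / len(set_b)
        for set_a, mass_a in mass_assignment(fuzzy_a)
        for set_b, mass_b in mass_assignment(fuzzy_b)
    )

tokens = "the taxi stopped the cab stopped the taxi left".split()
counts = context_counts(tokens)
taxi = to_fuzzy_set(counts["taxi"])
cab = to_fuzzy_set(counts["cab"])
print(similarity(taxi, cab), similarity(cab, taxi))
```

In this toy corpus every context of cab is also a context of taxi, but not the other way round, so the two directions give different values; that is the sense in which the measure is asymmetric. Under these assumptions only the context counters are stored, so adding or removing text amounts to incrementing or decrementing counts before the fuzzy sets are rebuilt for the affected words.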


fuzzy set, soft computing, asymmetric word similarity, information retrieval