Mining semantic distance between corpus terms

ACM first Ph.D. workshop in CIKM, 2007
Pages: 49-54
DOI: 10.1145/1316874.1316883



In this paper, we address two problems with classical semantic similarity measures. The first is the context-dependency problem of knowledge-based measures: none of them takes into account the context of the target domain. To address it, we present a multisource context-dependent approach. The second is the coverage problem: with these measures, similarities can only be calculated between concepts included in a taxonomy. Moreover, "pure" corpus-based measures are still far from achieving the performance reached by knowledge-based measures. We therefore present a more complex corpus-based approach that uses a taxonomy and data mining techniques to compute semantic distances between terms not covered by the taxonomy. Experiments clearly show the effectiveness of both proposed approaches.