Topic detection and tracking (TDT) algorithms have long been developed for discovering topics. However, most existing TDT algorithms pay too little attention to (1) the temporal distance between a pair of topics and (2) the mutual effect between highly correlated topic terms. In this paper, we propose a novel topic detection approach that applies hierarchical clustering on a constructed concept graph (HCCG) and addresses both shortcomings simultaneously. In this approach, the concept and the concept behavior curve are first defined. Then, a temporal graph is constructed with concepts as vertices and with edges connecting concepts that share the same topic terms. By performing hierarchical clustering on this concept graph, highly correlated concept behavior curves are grouped together as topics. The proposed approach is evaluated on a number of datasets, and the experimental results show that it outperforms K-means, the agglomerative hierarchical clustering algorithm (AGH), and LDA with respect to precision, recall, and F-measure. Moreover, the proposed concept behavior curves can be used to track the trend of topic change by monitoring the peak frequency of the concept behavior curves.
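The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the formal definitions, so the sketch assumes that a concept is a set of topic terms paired with a behavior curve (term frequency per time step), that an edge is weighted by the Pearson correlation of the two curves, and that clustering is a simple single-linkage merge with a hypothetical threshold parameter. All function names and the toy data are illustrative.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length curves (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def build_concept_graph(concepts):
    """Connect two concepts only if they share at least one topic term.

    Assumption: the edge weight is the correlation of the behavior curves.
    """
    edges = {}
    for a, b in combinations(sorted(concepts), 2):
        terms_a, curve_a = concepts[a]
        terms_b, curve_b = concepts[b]
        if terms_a & terms_b:
            edges[(a, b)] = pearson(curve_a, curve_b)
    return edges


def cluster_topics(concepts, threshold=0.8):
    """Single-linkage agglomeration over graph edges, strongest edge first.

    Assumption: merging stops once the edge weight drops below `threshold`.
    """
    edges = build_concept_graph(concepts)
    cluster_of = {name: {name} for name in concepts}
    for (a, b), weight in sorted(edges.items(), key=lambda kv: -kv[1]):
        if weight < threshold:
            break
        if cluster_of[a] is not cluster_of[b]:
            merged = cluster_of[a] | cluster_of[b]
            for name in merged:
                cluster_of[name] = merged
    unique = {frozenset(c) for c in cluster_of.values()}
    return sorted(sorted(c) for c in unique)


def peak_time(curve):
    """Index of the time step at which a behavior curve peaks."""
    return max(range(len(curve)), key=curve.__getitem__)


# Toy data (invented for illustration): name -> (topic terms, behavior curve)
concepts = {
    "quake": ({"earthquake", "japan"}, [1, 8, 9, 2, 1, 0]),
    "tsunami": ({"japan", "tsunami"}, [0, 7, 9, 3, 1, 0]),
    "election": ({"election", "vote"}, [0, 1, 1, 2, 8, 9]),
    "ballot": ({"vote", "ballot"}, [1, 0, 1, 3, 7, 9]),
}
topics = cluster_topics(concepts)
```

In this toy example, concepts that share a term and rise and fall together end up in the same topic, while concepts with no shared term are never linked regardless of how similar their curves are. Tracking a topic's trend could then amount to watching how `peak_time` shifts as new time steps are appended to the curves.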
Huang, Xiaohui; Zhang, Xiaofeng; Ye, Yunming; Deng, Shengchun; and Li, Xutao
"A Topic Detection Approach Through Hierarchical Clustering on Concept Graph,"
Applied Mathematics & Information Sciences: Vol. 07, Article 19.
Available at: https://dc.naturalspublishing.com/amis/vol07/iss6/19