The main role of a Focused Crawler (FC) is to retrieve pages relevant to a specific domain while keeping unrelated links out of the crawling queue. In this paper, we improve the focused crawler by introducing a domain distiller. The domain distiller checks each link that is a candidate for insertion into the distiller queues. The proposed distiller is built on an Optimized Hidden Naïve Bayes (OHNB) classification algorithm, which combines the existing Hidden Naïve Bayes (HNB) classifier with Enhanced Multiclass Support Vector Machines (EMSVM). A Genetic Algorithm (GA) optimizes the soft margins of the EMSVM, and the resulting optimized MSVM removes outliers from the training dataset. Experimental results show that the proposed model is more efficient than other existing techniques.
Ramachandran, A. and A. Sahaaya Arul Mary, S.
"An Intelligently-Focused Crawling for Filtering the e-Learning Documents Using Optimized Hidden Naïve Bayes Classifier,"
Applied Mathematics & Information Sciences: Vol. 13, Article 12.
Available at: https://dc.naturalspublishing.com/amis/vol13/iss4/12