Supervised classification is a well-known task in data mining and is widely used in many real-world domains. Classifiers are automatic prediction systems used to predict the class label of items described by a set of features. In many areas, it is important to take into account extra knowledge and constraints in addition to what is learnt or encoded by the classifier. In this paper, we propose an approach that combines the available domain knowledge with the predictions of a classifier. More precisely, we propose to post-process the predictions of a classifier so that they take this domain knowledge into account. The approach can be applied to any classifier, whether probabilistic or not. We propose post-processing criteria and methods to encode and exploit different kinds of domain knowledge. Finally, the paper provides extensive experimental studies on a representative set of benchmarks and classification problems, including imbalanced datasets. We also provide a case study on two crucial problems in computer security: intrusion detection and alert correlation. Interestingly, the results show that using only some available knowledge about the training datasets or the performance of the classifiers used can improve these classifiers’ efficiency while fitting the available domain knowledge.
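The abstract does not spell out the post-processing criteria themselves, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a probabilistic classifier and one simple form of domain knowledge: the class proportions expected in deployment differ from those in the training set. The function names, labels, and numbers are all hypothetical.

```python
def adjust_to_known_priors(probs, train_priors, domain_priors):
    """Rescale a classifier's class probabilities toward known class priors.

    Assumption (not from the paper): the domain knowledge is a set of
    expected class proportions, and train_priors are all strictly positive.
    Each probability is multiplied by domain_prior / train_prior and the
    result is renormalised so it sums to 1.
    """
    weighted = [p * d / t for p, t, d in zip(probs, train_priors, domain_priors)]
    total = sum(weighted)
    if total == 0:
        # Nothing to rescale; fall back to the original prediction.
        return list(probs)
    return [w / total for w in weighted]


def predict_label(probs, labels):
    """Return the label with the highest probability."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best]


# Hypothetical intrusion-detection example with an imbalanced training set.
labels = ["normal", "attack"]
raw = [0.7, 0.3]                # classifier output for one item
train_priors = [0.95, 0.05]     # class proportions in the training data
domain_priors = [0.5, 0.5]      # proportions the analyst expects in practice

adjusted = adjust_to_known_priors(raw, train_priors, domain_priors)
print(predict_label(raw, labels), "->", predict_label(adjusted, labels))
```

In this sketch the classifier itself is left untouched; only its output is revised, which is what makes this kind of correction applicable after training. A non-probabilistic classifier would need a different criterion, which this example does not cover.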
Kezih, Mouaad; Taibi, Mahmoud; Benferhat, Salem; and Tabia, Karim
"On Post-Processing the Outputs of Prediction Systems: Strategies, Empirical Evaluations and a Case Study in Computer Security,"
Applied Mathematics & Information Sciences: Vol. 10, Article 42.
Available at: https://dc.naturalspublishing.com/amis/vol10/iss4/42