The performance of its indexing system is critical to a search engine. Indexing systems that run on large-scale clusters usually provide high search efficiency, but they incur high hardware costs. These costs could be greatly reduced if a distributed indexing system ran on small-scale clusters connected over the Internet. The two prevailing inverted-file partitioning schemes, document partitioning and term partitioning, each have their own merits. We implement a two-tier distributed full-text indexing system that uses document partitioning among the clusters and term partitioning inside each cluster. Our experiments show that the system performs well in terms of search efficiency, resource consumption, and load balancing.
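The two-tier layout can be illustrated with a minimal in-process sketch. It is not the authors' implementation: the class names `Cluster` and `TwoTierIndex`, the modulo and CRC32 routing rules, and whitespace tokenization are all assumptions made for illustration, and no networking is modeled. Each document is assigned to exactly one cluster (document partitioning), and inside a cluster each term's posting list lives on exactly one node (term partitioning).

```python
import zlib
from collections import defaultdict


def stable_hash(text: str) -> int:
    # CRC32 gives the same value on every run, unlike Python's salted hash().
    return zlib.crc32(text.encode("utf-8"))


class Cluster:
    """One cluster; its nodes split the vocabulary (term partitioning)."""

    def __init__(self, num_nodes: int):
        # One inverted-index shard per node: term -> set of document ids.
        self.nodes = [defaultdict(set) for _ in range(num_nodes)]

    def node_for(self, term: str) -> int:
        # Assumed routing rule: hash the term onto a node.
        return stable_hash(term) % len(self.nodes)

    def add(self, doc_id: int, text: str) -> None:
        for term in text.lower().split():
            self.nodes[self.node_for(term)][term].add(doc_id)

    def search(self, terms):
        # Fetch each term's postings from the node that owns the term,
        # then intersect them (conjunctive query).
        postings = [self.nodes[self.node_for(t)].get(t, set()) for t in terms]
        return set.intersection(*postings) if postings else set()


class TwoTierIndex:
    """Clusters split the collection (document partitioning)."""

    def __init__(self, num_clusters: int, nodes_per_cluster: int):
        self.clusters = [Cluster(nodes_per_cluster) for _ in range(num_clusters)]

    def add(self, doc_id: int, text: str) -> None:
        # Assumed routing rule: document id modulo the number of clusters.
        self.clusters[doc_id % len(self.clusters)].add(doc_id, text)

    def search(self, query: str):
        # Broadcast the query to every cluster and merge the partial results.
        terms = query.lower().split()
        hits = set()
        for cluster in self.clusters:
            hits |= cluster.search(terms)
        return sorted(hits)


index = TwoTierIndex(num_clusters=2, nodes_per_cluster=3)
index.add(1, "distributed full-text indexing")
index.add(2, "distributed search engine")
index.add(3, "full-text search")
print(index.search("distributed"))       # [1, 2]
print(index.search("full-text search"))  # [3]
```

In this sketch a query touches every cluster but, within a cluster, only the nodes that own its terms. That is the trade-off the two tiers aim for: inter-cluster traffic over the Internet is limited to the query and the merged partial results, while posting-list exchange stays inside a cluster.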
Zhang, Wei-Zhe; Chen, Hui-Xiang; He, Hui; and Chen, Gui
"A Two-Tier Distributed Full-Text Indexing System,"
Applied Mathematics & Information Sciences: Vol. 08, Article 39.
Available at: https://dc.naturalspublishing.com/amis/vol08/iss1/39