AusWeb 03

Distributed search using subject-specific Web search engines

Mikhail Bessonov, Wilhelm-Schickard-Institute for Informatics, Department for Computer Engineering (Prof. Dr. W. Rosenstiel), University of Tübingen, Sand 13, Tübingen 72076, Germany. Email: bessonov@informatik.uni-tuebingen.de


Keywords

search engine, distributed information retrieval, IR, collection selection, networked IR


Abstract

Today's Web offers thousands of small topic-oriented or regional search engines. Established ones persist and new ones keep appearing despite the progress of their generic Web-wide competitors, because they produce better results in their areas of specialisation. However, finding and choosing the best specialised search engines for a particular information need is difficult. This paper proposes an approach to building statistical descriptions of topic-specific document collections that can be stored in, and efficiently retrieved from, a directory service such as an LDAP directory. An algorithmic framework is presented for automatically propagating queries in parallel to the several most suitable collections. Two statistical models are compared experimentally within the framework. They differ in the amount of data they need to store and in the resulting precision of collection selection. The conclusions apply to distributed information retrieval in general.
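As a rough illustration of the idea of collection selection followed by parallel query propagation, the following Python sketch may help. It is not the paper's framework or either of its two statistical models: the collection descriptions, the scoring rule (the summed fraction of documents containing each query term), and the names `DESCRIPTIONS`, `score`, `select_collections`, `search_collection` and `distributed_search` are all assumptions made for this example. In a real deployment the descriptions would presumably be fetched from the directory service rather than hard-coded.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical collection descriptions (not data from the paper): the number
# of documents in each collection and per-term document frequencies.
DESCRIPTIONS = {
    "medicine": {"size": 1000, "df": {"heart": 400, "surgery": 250, "engine": 5}},
    "automotive": {"size": 800, "df": {"engine": 500, "brake": 300, "heart": 2}},
    "cooking": {"size": 500, "df": {"recipe": 450, "heart": 10}},
}


def score(description, query_terms):
    """Sum, over the query terms, of the fraction of documents containing the term."""
    size = description["size"]
    return sum(description["df"].get(term, 0) / size for term in query_terms)


def select_collections(descriptions, query_terms, k=2):
    """Return the names of the k highest-scoring collections with a non-zero score."""
    scores = {name: score(desc, query_terms) for name, desc in descriptions.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return [name for name in ranked if scores[name] > 0][:k]


def search_collection(name, query_terms):
    """Placeholder for a remote call to one specialised search engine."""
    return f"results from {name} for {' '.join(query_terms)}"


def distributed_search(descriptions, query_terms, k=2):
    """Send the query in parallel to the selected collections and collect the answers."""
    selected = select_collections(descriptions, query_terms, k)
    if not selected:
        return {}
    with ThreadPoolExecutor(max_workers=len(selected)) as pool:
        results = pool.map(lambda name: search_collection(name, query_terms), selected)
        return dict(zip(selected, results))
```

Calling `distributed_search(DESCRIPTIONS, ["engine", "brake"])` would query only the two collections whose descriptions best match the terms. The size of each stored description trades off against how precisely collections can be ranked, which is the kind of trade-off the abstract says the two models differ on.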




