AusWeb 03

Designing a Regional Crawler for Distributed and Centralized Search Engines

Milad Shokouhi, Department of Computer Engineering, Bu-Ali Sina University, Hamedan, Iran. Email: shokouhi@ce.basu.ac.ir

Pirooz Chubak, Department of Computer Engineering, Sharif University of Technology, Tehran, Iran. Email: chubak@ce.sharif.edu


Keywords

Regional Crawler, Web Crawler Architecture, Multi-agent Systems


Abstract

With the growth of the World Wide Web, the significance and popularity of search engines are increasing day by day. However, today's web crawlers cannot update their huge search engine indexes at the pace at which information on the web grows. They often download unimportant pages and ignore pages that have a noticeable probability of being searched. As a result, search engines are sometimes unable to give users up-to-date information. The Regional Crawler, which we introduce in this paper as a new crawling strategy, alleviates the problem of updating and finding new pages to some extent by gathering users' common needs and interests in a certain domain, which can be as small as a LAN in a university department or as large as a country. It crawls the pages containing information about these interests first, instead of crawling the web without any predefined order. In this paper, we design the Regional Crawler architecture and introduce its application in centralized and distributed search engines.
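
The core idea of crawling interest-related pages first can be illustrated with a small priority-ordering sketch. This is not the architecture described in the full paper; it only assumes that each candidate page is summarized by a set of terms and that the region's interests are available as weighted terms. The names interest_score, crawl_order, candidates and regional_interests are hypothetical and chosen for illustration.

```python
import heapq
from itertools import count


def interest_score(page_terms, regional_interests):
    """Sum the weights of the regional interests that occur among the page's terms."""
    return sum(w for term, w in regional_interests.items() if term in page_terms)


def crawl_order(candidates, regional_interests):
    """Return URLs so that pages matching regional interests come first.

    candidates: dict mapping URL -> set of terms describing the page (assumed input).
    regional_interests: dict mapping term -> weight for the region (assumed input).
    """
    tie_breaker = count()  # keeps insertion order among equally scored pages
    heap = []
    for url, terms in candidates.items():
        score = interest_score(terms, regional_interests)
        # heapq is a min-heap, so the score is negated to pop the best page first
        heapq.heappush(heap, (-score, next(tie_breaker), url))

    order = []
    while heap:
        _, _, url = heapq.heappop(heap)
        order.append(url)
    return order


if __name__ == "__main__":
    # Hypothetical interests of a university department
    interests = {"exam": 3, "football": 1}
    pages = {
        "http://example.org/a": {"football"},
        "http://example.org/b": {"weather", "exam"},
        "http://example.org/c": {"cooking"},
    }
    for url in crawl_order(pages, interests):
        print(url)
```

Pages with no matching interest are still visited, only later, which matches the abstract's description of prioritizing rather than restricting the crawl.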





AusWeb04. The Tenth Australian World Wide Web Conference, Seaworld Nara Resort, Gold Coast, from 3rd to 7th July 2004 Contact: Norsearch Conference Services +61 2 66 20 3932 (from outside Australia) (02) 6620 3932 (from inside Australia) Fax (02) 6626 9317