Milad Shokouhi, Department of Computer Engineering, Bu-Ali Sina University, Hamedan, Iran. Email: shokouhi@ce.basu.ac.ir
Pirooz Chubak, Department of Computer Engineering, Sharif University of Technology, Tehran, Iran. Email: chubak@ce.sharif.edu
Regional Crawler, Web Crawler Architecture, Multi-agent Systems
With the growth of the WWW, search engines are becoming more important and more popular every day. However, today's web crawlers cannot update their huge search engine indexes as fast as the information available on the web grows. They often download unimportant pages and ignore pages that are likely to be searched for. As a result, search engines sometimes fail to give users up-to-date information. The Regional Crawler, which we introduce in this paper as a new crawling strategy, alleviates the problem of updating and finding new pages to some extent by gathering the common needs and interests of users in a certain domain, which can be as small as a LAN in a university department or as large as a country. It crawls pages containing information about these interests first, instead of crawling the web in no predefined order. In this paper, we design the Regional Crawler architecture and describe its application in centralized and distributed search engines.
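The idea of crawling interest-related pages first can be illustrated with a small sketch. This is not the architecture designed in the paper; it only shows one plausible way to order a crawl frontier by regional interest. The names `interest_score` and `crawl_order`, the term-overlap scoring rule, and the input format (a mapping from URL to descriptive terms) are all assumptions made for illustration.

```python
import heapq
from itertools import count


def interest_score(page_terms, regional_interests):
    """Fraction of the region's interest terms that appear among the page's terms.

    Assumption: a simple term-overlap ratio stands in for whatever relevance
    measure a real Regional Crawler would use.
    """
    interests = set(regional_interests)
    if not interests:
        return 0.0
    return len(set(page_terms) & interests) / len(interests)


def crawl_order(candidates, regional_interests):
    """Return candidate URLs with the best matches to regional interests first.

    candidates: dict mapping URL -> iterable of descriptive terms
                (hypothetical input, e.g. anchor text or known keywords).
    """
    tie_breaker = count()  # keeps insertion order among equal scores
    frontier = []
    for url, terms in candidates.items():
        score = interest_score(terms, regional_interests)
        # heapq is a min-heap, so negate the score to pop the highest first.
        heapq.heappush(frontier, (-score, next(tie_breaker), url))
    return [heapq.heappop(frontier)[2] for _ in range(len(frontier))]
```

For example, given interests gathered from a university department such as `{"ai", "robotics"}`, a page described by both terms would be scheduled before a page described by only one, and both before a page that matches neither.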
[ Full Paper ] [ Presentation ] [ Proceedings ] [ AusWeb Home Page ]