Indexing the Net - A Review of Indexing Tools


Tony Barry, Head, Centre for Networked Access to Scholarly Information, Australian National University Library, Canberra A.C.T. 0200, Australia. Phone +61 6 249 4632. Fax +61 6 279 8120. Email: tony@info.anu.edu.au Home Page: http://snazzy.anu.edu.au/People/TonyB.html [HREF1]

Joanna Richardson, Information Technology Librarian, Bond University, Gold Coast, QLD 4229, Australia. Phone: +617 5595 1401 Fax: + 617 5595 1480 Email: richardj@bond.edu.au Home Page: http://www.bond.edu.au/Bond/Library/People/jpr/ [HREF2]


Keywords: Automated indexing, Indexing, Search engines

Introduction

The whole area of indexing the Internet is one of rapid evolution. This paper represents a snapshot in time of our thinking on this topic. A more recent version [HREF3] can be accessed remotely.

In the heady period after World War 2, buoyed by the success that research and development had achieved during the war, a number of very ambitious research projects were begun which continued and expanded with the funds made available for strategic R&D during the cold war. Some of these succeeded; some, being wildly overambitious but supported by post war confidence and lack of knowledge, did not. In the physical sciences these could be typified by the success achieved in the invention of semiconductor electronics as a replacement for clumsy vacuum tubes and the failure of the attempts to achieve controlled thermonuclear fusion.

In the field of natural language processing [1], a big failure thus far has been automatic translation, despite the continual injection of funds by the intelligence community. Two stories--probably apocryphal--illustrate the problem: language is complex, and our theoretical understanding of it is still incomplete. Both concern attempts at automated translation from English to Russian and back again. "Out of sight, out of mind" is supposed to have come back as "Invisible idiot", and "The spirit is willing but the flesh is weak" as "The vodka is good but the meat is rotten". Even if these stories are not true, one feels nevertheless that they should have been, since they clearly show that simple word matching as a method of translation will not work.

Indexers have long felt the same about their field and are well acquainted with the slippery nature of language. A classic example: the words "blind" and "venetian" will occur together just as readily in works about ophthalmology in Venice as in those about coverings for windows. Conversely, "longshoremen", "stevedores" and "wharfies" all occur singly in documents about materials handling at ports, but rarely together. Similarly, in the database Psychlit [HREF4], a WAIS keyword search on "blind" will retrieve items about psychological testing.

These problems have certainly generated a large theoretical literature in the manual indexing community, and a certain smugness about the prospects for automatic indexing which, as we will see, has been misplaced. A quarter of a century ago the late [HREF5] Gerard Salton [HREF6] and others showed that automatic methods of generating indexes, based on statistical matching algorithms, performed as well as manual indexing when applied within a restricted group of similar documents [2].
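The flavour of such statistical matching can be conveyed with a small sketch (our own toy illustration, not the actual algorithms of [2]): documents and a query are reduced to term-frequency vectors and ranked by the cosine of the angle between them. Note how such crude matching fares on the "venetian blind" ambiguity discussed above.

```python
import math
from collections import Counter

def vector(text):
    # Crude tokenisation: lowercase words become term-frequency counts.
    return Counter(text.lower().split())

def cosine(a, b):
    # Cosine of the angle between two sparse term-frequency vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

docs = ["materials handling by stevedores at ports",
        "venetian blind repair and window coverings",
        "ophthalmology clinics in venice treat the blind"]

query = vector("venetian blind")
ranked = sorted(docs, key=lambda d: cosine(vector(d), query), reverse=True)
# The window-coverings document ranks first, but the Venice ophthalmology
# document still scores on its shared term "blind".
```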

Research into automated indexing was not possible until an evaluation methodology was available; hence the importance of the work of Cyril Cleverdon, begun in 1957 with the ASLIB Cranfield projects [3] and later extended [4]. While a variety of evaluative measures have been tried, the two introduced in these studies have remained central: recall, the proportion of all the material in the database relevant to the query that is retrieved, and precision, the proportion of the retrieved set that is relevant. How relevance itself is to be judged is a further thorny problem; work in this area has recently been reviewed by Linda Schamber [7].
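The two measures can be stated in a few lines of code; the following sketch (the document identifiers are illustrative) computes them over sets of retrieved and relevant documents.

```python
def recall(retrieved, relevant):
    """Proportion of all relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant)

def precision(retrieved, relevant):
    """Proportion of retrieved documents that are relevant."""
    return len(retrieved & relevant) / len(retrieved)

# A query retrieves 4 documents; 6 documents in the database are relevant.
retrieved = {"d1", "d2", "d3", "d4"}
relevant = {"d2", "d3", "d5", "d6", "d7", "d8"}

print(recall(retrieved, relevant))     # 2 of 6 relevant found -> 0.333...
print(precision(retrieved, relevant))  # 2 of 4 retrieved are relevant -> 0.5
```

The trade-off the Cranfield studies made famous is visible even here: retrieving more documents can only raise recall, but usually at the cost of precision.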

Salton based his work upon these evaluative measures, which are best summarised in his last monograph [5]. His approach was largely algorithmic and statistical; while he did not neglect the complexities of language, that emphasis is more evident in the work of Karen Sparck Jones [6].

There is also an Australian link with Salton. In about 1970 he was sponsored by the Special Libraries Association to give a one-week intensive course on automated information retrieval. One of the authors of this paper (Barry) attended that course, and a number of other attendees have, over the years, taken up significant positions in the Australian library community. For some of them, Salton's course has had a lasting effect on their thinking and their approach to services on the internet.

These techniques remained an academic curiosity until computers became powerful enough to handle large bodies of information and word processing made large bodies of text available for processing. They then began to be picked up by commercial firms for application in large text search engines - and, one assumes, by the intelligence community. A review of automatic indexing work by Kantor [8], written in 1993, was just too early to pick up these rapid developments on the internet.

Late in the 1980s, Thinking Machines Corporation, Apple, and Dow Jones began a project to develop search engines under the leadership of Brewster Kahle. The influence of these techniques, and in particular of Salton's work, is evident in Kahle's early work on Wide Area Information Servers (WAIS) [HREF7]. WAIS server technology provides much of the underpinning for the search engines used on the network today: as a search engine, WAIS has become a de facto standard for text databases on the network and has also provided the impetus for further developments.

The rise of the internet provided an environment in which this type of technology could be deployed since suddenly there was wide availability of large bodies of text and simple access to it across the network. Some interesting class notes and links to references [HREF8] have been provided by the University of Southern California in their "CS586: Database Systems" unit [HREF9] which gives access to further information. As far as access to these services themselves are concerned, most university campus based web services provide gateways to these indexes such as the gateway at ANU [HREF10].

Attempts to provide global indexing across the internet began with archie, developed by Peter Deutsch while at McGill University. This service was designed to index file names on anonymous ftp archives and is still an essential tool for those looking for public domain and shareware software. It was followed by a succession of search tools for the gopher protocol, principally veronica and jughead. Veronica initially retrieved and indexed every word in text documents retrieved via the gopher protocol. This soon revealed the limitations of full-text searching: without some thought in the choice of search terms, the precision of the retrieved set (the proportion of retrieved items that are relevant) was often low. On the assumption that gopher menus might be more selective in their choice of words, veronica was subsequently changed to index only menu words, and did appear to succeed in this aim, although detailed evaluative evidence seems to be lacking. All this, however, now seems of only historical interest as gopher usage declines. An excellent summary of the older and current tools has been provided by John December [HREF11], particularly the section on networked information retrieval [HREF12]. That section is continuously updated, so it makes an instructive contrast with the static IETF RFC 1689 [HREF13].

With the advent of the world wide web, the opportunity arose to improve the quality of indexing, both because of the richer information provided by the tagging used in hypertext markup language and because of the implicit information about relationships between documents that can be gleaned from the links between them. During the last year there has been an explosion of indexing services on the network - one which, unfortunately, has continued throughout the preparation of this paper! Initially such services were developed as research projects, but more recently there is clear commercial intent: Lycos, for example, not only offers a commercial service but also uses advertising to fund its non-commercial services, and Alta Vista is used by DEC to promote its technology.

Comparison and Evaluation of Services

Two of the most commonly used types of service are browsing through subject trees and keyword searching using search engines. A subject tree or directory is an organised hierarchy of categories for browsing by subject. Many subject trees also have their own indexes, searchable by keyword. Yahoo [HREF14], EINet Galaxy [HREF15] and the WWW Virtual Library [HREF16] offer links with brief annotations. Others, such as Magellan [HREF17] and GNN's Whole Internet Catalog [HREF18], provide commentaries and ratings.

It is the second type of tool on which we focus in this paper: search engines. These use automated programs (robots, spiders, crawlers, wanderers, worms) to examine documents and index them, i.e. enter them into a database, principally on the basis of title, URL, and/or text.
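At its core, the database such robots build is an inverted index mapping each term to the documents that contain it. A minimal sketch (the URLs and page data are hypothetical, and we assume each fetched page has already been reduced to a url/title/body triple):

```python
from collections import defaultdict

def build_index(pages):
    """Map each term to the set of URLs whose title or body contains it."""
    index = defaultdict(set)
    for url, title, body in pages:
        for term in (title + " " + body).lower().split():
            index[term].add(url)
    return index

pages = [
    ("http://example.edu/a", "Port logistics", "stevedores and wharfies at work"),
    ("http://example.edu/b", "Window fittings", "venetian blind installation"),
]

index = build_index(pages)
print(sorted(index["venetian"]))  # URLs of pages containing the term
```

Real engines differ chiefly in what they feed into such an index (titles only, full text, link text) and in how they rank the matching URLs, which is much of what the evaluations below are measuring.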

Given the number of search engines, ca. 20 at the time of writing this paper, it has become fashionable for writers/researchers to attempt to answer the questions: Which tool is really the best? Where should I go first? Many of these tests/evaluations have been listed in this paper's annotated bibliography (Appendix A). The bibliography is certainly not intended to be comprehensive; however, it does point to several resources which attempt to keep track of all the current research in this area.

Needless to say, the above-mentioned tests have varied as to methodology. In an effort to compare, i.e. evaluate the performance of, search engines, researchers have used a single keyword (ferromagnetism, EXAFS), a phrase (freedom of information, to be or not to be) and/or a string (recipe wheat beer, water quality agriculture nitrates). In addition, some of the tests have been performed prior to the launch of the newer tools such as Alta Vista [HREF19] and Excite [HREF20]. At the time of writing, none have included HotBot [HREF21].

The table below indicates how the various search tools have fared:

Top Scoring Search Tools

Author | Criteria | Top Score | Other High Scores | Comments
CNet | Education content | SavvySearch | Yahoo, Open Text | pre-dates Alta Vista, Excite, etc.
Lebedev | Total hits | Alta Vista | Lycos, Inktomi | focus on scientific info
Leighton | Relevance and precision | Lycos | InfoSeek | pre-dates Alta Vista, Excite, etc.
Leita | Large database, full-text indexing | Open Text | InfoSeek, Lycos | recommended for quick, pinpointed searches
Leonard | Search engines | Alta Vista | --- | MetaCrawler rated #1 overall
Leonard | Meta-search engines | MetaCrawler | --- | ---
Liu | A number of factors | Alta Vista | --- | our interpretation of ranking
Randall | Usability, speed, precision | InfoSeek | WebCrawler, WWWWorm | pre-dates Alta Vista, etc.
Scoville | Total no. of hits per query | Lycos | Open Text | no mention of Alta Vista
Scoville | Relevance of top 10 hits | Lycos | Excite, InfoSeek | ---
Steinberg | Not identified | Alta Vista | --- | our interpretation of ranking
Tillman | Not identified | InfoSeek | Alta Vista | ---
Tomaiuolo | Average no. of relevant hits | Alta Vista | InfoSeek, Lycos, Magellan | 200 actual Reference Desk questions
UMichigan | A number of factors | Yahoo | Alta Vista, Lycos | our interpretation of ranking
Venditto | Relevance | InfoSeek Guide | Excite | ---
Venditto | Comprehensiveness | Alta Vista | --- | ---
Winship | Content, features, output, no. of hits | Lycos | --- | pre-dates Alta Vista, etc.

In examining these evaluations/preferences as part of our literature survey, it wasn't really the results, i.e. the ranking, which proved to be of interest. After all, comparing some of these tests would be like comparing apples and oranges. It was the challenges identified in using the various tools that proved most relevant to our investigation. The more important limitations are listed below:

Echoing some of these concerns, librarians and others have observed that to obtain a truly exhaustive set of results on a topic, the user generally needs to grapple with concepts such as the implied "or", nested Boolean statements, relevance ranking and term weighting. In some cases, therefore, they simply point users towards subject directories/trees instead.
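The effect of that implied "or" is easy to demonstrate with a toy sketch (our own illustration, not any particular engine's actual query parser): combining terms with "or" inflates the retrieved set, while "and" narrows it.

```python
# A toy inverted index: term -> set of matching document identifiers.
index = {
    "water":    {"d1", "d2", "d3"},
    "quality":  {"d2", "d3"},
    "nitrates": {"d3", "d4"},
}

def matches(index, terms, mode="or"):
    """Combine the posting sets for each term with union ('or') or
    intersection ('and')."""
    sets = [index.get(t, set()) for t in terms]
    result = set(sets[0]) if sets else set()
    for s in sets[1:]:
        result = result | s if mode == "or" else result & s
    return result

print(matches(index, ["water", "quality", "nitrates"], "or"))   # four documents
print(matches(index, ["water", "quality", "nitrates"], "and"))  # only d3
```

An engine that defaults to "or" thus returns every document containing any query term, which is precisely why naive multi-word queries drown the user in low-precision hits.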

At the 5th International WWW Conference held in May 1996, in a panel session entitled "Efficiency of Internet Indexing", Nick Arnett and Darren Hardy [HREF22] noted the following issues:



In light of these challenges, it is not surprising that writers repeatedly urge clients to use more than one search tool--as evidenced in the table below:


Different Search Engines/Different Results

Author | Comments
Conte | Unfortunately, no single guide is familiar with every resource. What you need is a comprehensive set of tools for searching the Net.
Eagan | ...because these search engines search in different ways and search different parts of the Internet, doing the same search using different search engines will often give you wildly differing results. ...try out a number of the search engines, and understand that the Internet and the search engines are changing daily.
Felt | Because each robot is programmed to search the Web in a different way, the information stored in each database can be very different.
Koster | In the longer term complete Web-wide traversal by robots will become prohibitively slow, expensive, and ineffective for resource discovery.
Leita | ...you should try other search engines, too. Each has its own strengths and weaknesses, and each has a chance of delivering just what you're looking for.
Randall | ...one size doesn't fit all and needs vary widely ... [Search engines] all have their strengths and weaknesses, and your best bet is to learn how to use an entire arsenal of them.
Scoville | A directory is great if you're simply interested in a general topic. ... as your questions become more specific, ... you need a search engine. ... Use more than one search engine.
Venditto | The most striking conclusion we drew from our tests was that all these engines had a long way to go before they could be relied upon to deliver consistently accurate search findings. ... no two search engines yielded the same results on a search during our entire testing period. ... different search engines are suitable for different types of tasks.
Webster | [Speaking of Webcrawler and Lycos] These differences contribute to different result sets that are returned by different search engines for the same query. ... No single search tool can be relied upon to satisfy every query.
Weiss | There is no one ultimate search tool for the Web. Because of its nature, various search engines use different search techniques and yield different "views" of the Web.
Winship | Since [searching tools] start from different base documents and work in different ways, none of the resulting indexes are comprehensive and nor are the resources listed completely duplicated.

Issues

The future of these global indexes is uncertain. As the network grows, attempts at comprehensive indexing will produce increasingly unwieldy results: not only will the size of the databases become a problem, but retrieval precision at the same level of recall will drop, due in part to the decreased quality control discussed by Barry [HREF23]. In the world of paper, the solution has been the creation of specialised abstracting and indexing services within restricted domains, where the filters of refereeing apply to what is produced and a second stage of quality filtering takes place in the decision about what to index. At the moment we are seeing the emergence of two separate systems for access to information, one based on paper and the other on internet services. The Integrated Document Access Project (IDA) [HREF24], for which one of the authors (Barry) acts as an adviser, has uncovered a great deal of interesting information. The sponsors of this research - the Council of Australian University Librarians (CAUL) [HREF25] and the National Library of Australia - are currently considering how it can be taken further now that the final report [HREF26] has been submitted.

One scalable solution which offers a selective approach to indexing is the Harvest system [HREF27]. ADFA (Australian Defence Force Academy), the ANU, the National Library of Australia, and Charles Sturt University are involved in a joint project [HREF28], with ADFA as the lead agency, to create specialised indexes within Australia. The project is proceeding slowly. Currently each of the participating institutions has deployed Harvest over a number of its internal sites; Charles Sturt has done some indexing of external sites, and the National Library is discussing indexing all Commonwealth Government agencies. The above agencies are also seeking funding to improve this indexing approach through the use of metadata embedded in documents, probably as articulated in the Dublin Core metadata proposal [HREF29].

Conclusion

As long as users continue to treat these systems as if they were capable of reading natural language, and ignore the complexities underneath, there will be an inherent conflict between education and interface. Making the interface genuinely natural-language is currently beyond our capabilities.

Appendix A

Bibliography of Search Tools

Arents, Hans C. (1996). "A Selection of Internet Search Tools". Available HTTP: http://www.mtm.kuleuven.ac.be/Services/search.html

A rated list, selected and updated for the Belgian research community. Search tools are rated for their usability and effectiveness. Categories include search and meta-search engines, white and yellow pages, newsgroups, software archives, geographical and business directories.

Berkeley Digital Library SunSITE (1996). "Internet Search Tool Details". Available HTTP: http://sunsite.berkeley.edu/Help/searchdetails.html

Information about the most popular search tools, as gleaned from the tools themselves. Updated regularly.

CNet. Community Access (1995?). "Web Search Tools: An Educational Evaluation". Available HTTP: http://cnet.unb.ca/cabox/learning/win/webserch.html

Rates 18 search engines on the basis of interest to educators. Probably an example of how not to structure an evaluation exercise! SavvySearch scored highest, followed closely by Yahoo and Open Text.

Conte, Jr., Roy (1996). "Guiding Lights". Internet World, (7:5), 40-44. Available HTTP: http://pubs.iworld.com/iw-online/May96/guiding.html

Brief, informative summaries of tools, which include search and metasearch engines, directories, gopher archives, and newsgroups. The author concludes that "the perfect search tool doesn't yet exist.... users will require more sophisticated search tools."

Eagan, Ann and L. Bender (1996). "Spiders and Worms and Crawlers: Oh My: Searching on the World Wide Web". Untangling the Web. Proceedings of the Conference, April 26, 1996, University of California, Santa Barbara. Available HTTP: http://www.library.ucsb.edu/untangle/eagan.html

Divides search engines into four categories: Classics (World Wide Web Worm, WebCrawler, Yahoo!, EINet Galaxy), Leaders (InfoSeek Guide, Lycos, OpenText), Newer Kids on the Block (Magellan, Inktomi, Alta Vista, Excite) and Meta Search Engines (MetaCrawler, Savvy Search). Short bibliography.

Felt, Elizabeth and Jane Scales (1996). "Web Robots". Available HTTP: http://www.wsulibs.wsu.edu/general/robots.htm

A brief description of fourteen "robot-compiled" search engines.

Fillmore, Laura (1995). "Beyond the Back of the Book: Indexing in a Shrinking World". Paper presented at American Society of Indexers, Inc. and Societe Canadienne pour l'analyse des documents, Montreal, June 10, 1995. Available HTTP: http://www.obs-us.com/obs/english/papers/mont1.htm

Discusses the new role of indexers in the web environment to become "link editors" and the creation of "bookbots", which would empower the reader to create a "book" from the available information on the net.

Koch, Traugott (1996). "Literature about Search Services". Available HTTP: http://www.ub2.lu.se//desire/radar/lit-about-search-services.html

Covers search service comparisons, search services and retrieval, indexing the Internet, and collections/bibliographies. Entries are in inverted chronological order. Excellent, continually updated resource.

Koster, Martijn (1995?). "Robots in the Web: Threat or Treat?". ConneXions, (9:4), April 1995. Available HTTP: http://info.webcrawler.com/mak/projects/robots/threat-or-treat.html

"In the longer term complete Web-wide traversal by robots will become prohibitively slow, expensive, and ineffective for resource discovery.... Alternative strategies such as ALIWEB and Harvest are more efficient." Includes 25 references.

Lebedev, Alexander (1996). "Best Search Engines in Finding Scientific Information in the Net". Available HTTP: http://www.chem.msu.su/eng/comparison.html

The author ran eight different terms (single keywords) in physics and chemistry against eleven search engines, which were then scored on the basis of the total number of documents retrieved. Alta Vista scored highest, followed by Lycos, and then Inktomi.

Leighton, H. Vernon (1995). "Performance of Four World Wide Web (WWW) Index Services: Infoseek, Lycos, Webcrawler and WWWWorm". Available HTTP: http://www.winona.msus.edu/services-f/library-f/webind.htm

A project for a postgraduate Computer Science course which studies the relevance and precision of query results from the four services. Lycos and Infoseek scored significantly better than the other two. Bibliography of three references.

Leita, Carole (1996). "Chapter 4: Locating Information". Web Search Strategies. Available HTTP: http://www.mispress.com/websearch/websch4.html

Discusses the basics of using a search engine, finding the right keywords, and other useful tips. Recommends Open Text for quick, "pinpointed" searches.

Leonard, Andrew J. (1996). "Search Engines: Where to Find Anything on the Net". c/net. Available HTTP: http://www.cnet.com/Content/Reviews/Compare/Search/index.html

An evaluation of nineteen search tools based on their ease of use, power, and accuracy of results. Includes features tables for the individual search engines and the metasearch engines. Alta Vista scored highest among the single-search engines reviewed; MetaCrawler rated highest of the metasearch tools. For those users who know what they are looking for on the Net, Metacrawler was recommended; Yahoo was recommended for those not quite sure what they are trying to find. Honourable mention went to Alta Vista.

Liu, Jian (1996). "Understanding WWW Search Tools". Available HTTP: http://www.indiana.edu/~librcsd/search/

A brief comparison of major search tools. Discusses their features and offers search tips.

Maire, Gilles (1996). "Chapitre 2: World Wide Web". Un nouveau guide d'Internet. Available HTTP: http://www.imaginet.fr/~gmaire/web.htm

Description of web terminology, followed by an overview of search engines. In French.

Mitchell, Steve (1996). "General Internet Resource Finding Tools: A Review and List of Those used to Build INFOMINE". Available HTTP: http://lib-www.ucr.edu/pubs/navigato.html

INFOMINE is a virtual library of ca. 5000 links to scholarly and educational resources. Tools are divided into the following basic categories: virtual libraries, subject guides, internet navigators (search engines and the like), academic library web collections, and directories of university web sites. Tools are rated on the basis of (1) usefulness for novice and/or experienced searchers and (2) relevance to small or large subject collections. Live links are provided for tools as well as for the bibliography. Despite US university focus, a valuable resource.

Notess, Greg (1996). "Internet Search Engines & Finding Aids: Capabilities & Features". Available HTTP: http://www.imt.net/~notess/compeng.html

Very useful table listing ten search tools and their features.

Randall, Neil (1995). "The Search Engine That Could". PC Computing, September 1995. Available HTTP: http://www.zdnet.com/pccomp/features/internet/search/index.html

Reviews and rates 14 search engines. InfoSeek scored highest, followed by Lycos, WebCrawler, and WWWWorm.

Scoville, Richard (1996). "Find it on the Net". PC World, (January 1996), 125-130. Available HTTP: http://www.pcworld.com/reprints/lycos.htm

General tips for searching. Eleven search engines were tested for total number of hits per query. Lycos scored highest, followed distantly by Open Text, Aliweb and InfoSeek. The same engines were then judged as to how many of the top ten hits were relevant. Once again Lycos scored best, followed closely by Excite and InfoSeek.

Stanley, Tracey (1996). "Alta Vista vs. Lycos". Ariadne on the Web, (Issue 2), March 1996. Available HTTP: http://ukoln.bath.ac.uk/ariadne/issue2/engines/

Among the largest databases, the two search engines are compared according to interface, query language, and search results. Both score fairly equally overall. However, the author notes that neither site really encourages clients to use the advanced or enhanced query options, which will result in more efficient searches.

Steinberg, Steve G. (1996). "Seek and Ye Shall Find". Wired, (4.05), May 1996. Available HTTP: http://www.hotwired.com/wired/4.05/indexing/index.html

Brief assessment of nine services which index the Web. Stresses the importance of understanding how people think when searching on the Internet.

Tillman, Hope N. (1996). "Evaluating Quality on the Net". Paper presented at Computers in Libraries, Arlington, VA, February 26, 1996. Available HTTP: http://challenge.tiac.net/users/hope/findqual.html

Examines relevance of existing criteria for other formats, the continuum of information on the net, and current state of evaluation tools on the net.

Tomaiuolo, Nicholas G. and Joan G. Packer (1996). "Quantitative Analysis of Five WWW 'Search Engines'". Computers in Libraries, (16:6). Available HTTP: http://neal.ctstateu.edu:2001/htdocs/websearch.html

Search results were taken from actual questions asked at a university library Reference Desk. Based on the average number of relevant hits for the first ten hits retrieved, Alta Vista scored highest, with InfoSeek, Lycos and Magellan grouped closely behind, and Point scoring very poorly.

University of Leeds. Computing Service (1995?). "Searching the World Wide Web with Lycos and InfoSeek". Available HTTP: http://www.leeds.ac.uk/ucs/docs/fur14/fur14.html

Discussion of features of both search engines and ways in which they could be improved.

University of Michigan. School of Information and Library Studies (1996). "Searching the World Wide Web with Lycos and InfoSeek". Available HTTP: http://www.sils.umich.edu/~fprefect/matrix/matrix.shtml

Explains difference between subject catalogues and search engines. Easy-to-read charts with rated checklist of features and attributes of 15 catalogues and indexes.

Venditto, Gus (1996). "Search Engine Showdown: IW Labs test seven Internet search tools". Internet World, (7:5), 79-86. Available HTTP: http://pubs.iworld.com/iw-online/May96/showdown.html

InfoSeek Guide was the best based on the criterion that a search engine "should be able to take a natural language phrase and find the most relevant information without expecting users to master Boolean or other structured logic". Alta Vista was scored highly for providing the most comprehensive search results. Well-written analysis of strengths and weaknesses of the seven tools.

Webster, Kathleen and K. Paul (1996). "Beyond Surfing: Tools and Techniques for Searching the Web". Information Technology, January 1996. Available HTTP: http://magi.com/~mmelick/it96jan.htm

Discusses browsing through subject trees vs. keyword searching using search engines. Two appendices provide brief evaluations. Includes a fairly comprehensive "Webliography".

Weiss, Aaron (1995). "Hop, Skip, and Jump: Navigating the World-Wide Web". Internet World, (6), April 1995. Available HTTP: http://pubs.iworld.com/iw-online/Apr95/feat41.htm

Discusses Jumpstation II, Webcrawler, Lycos, WWWWorm, RBSE, and CUI W3. Outlines differences between depth-first and breadth-first searching. Concludes that there is no single ultimate search engine for the Web.

Winship, Ian R. (1995). "World Wide Web Searching Tools - An Evaluation". VINE, (99), 49-54. Available HTTP: http://www.bubl.bath.ac.uk/BUBL/IWinship.html

An analysis of the major search engines available in June 1995. Concludes that record structure and search techniques appear to be more significant than retrieval performance. Brief bibliography.


References

1. Reviewed frequently in "Annual Review of Information Science and Technology", now in its 30th year.

2. Salton, Gerard. "Automatic information organisation and retrieval". N.Y., McGraw-Hill, 1968.

3. Cleverdon, Cyril William. "Report on the testing and analysis of an investigation into the comparative efficiency of indexing systems". Cranfield, England, College of Aeronautics, 1962. 305p. LC: 63-60414.

4. Cleverdon, Cyril William. "The Cranfield tests on index language devices" in "ASLIB Proceedings", June 1967, v.19, n.6, pp. 173-194.

5. Salton, Gerard; McGill, Michael J. "Introduction to modern information retrieval". New York, McGraw-Hill, 1983. 447p. ISBN 0-07-054484-0.

6. Sparck Jones, Karen; Kay, M. "Linguistics and information science". N.Y., Academic Press, 1973. 208p. ISBN 0126562504.

7. Schamber, Linda. "Relevance and information behaviour" in "Annual Review of Information Science and Technology". Medford, Learned Information, 1994, v.29, pp. 3-48.

8. Kantor, Paul B. "Information retrieval techniques" in "Annual Review of Information Science and Technology". Medford, Learned Information, 1994, v.29, pp. 53-90.


Hypertext References


HREF1
http://snazzy.anu.edu.au/People/TonyB.html
Tony Barry's Home Page.
HREF2
http://www.bond.edu.au/Bond/Library/People/jpr/
Joanna Richardson's Home Page.
HREF3
http://bond.edu.au/Bond/Library/People/jpr/ausweb96/
Latest version of this paper
HREF4
http://info.anu.edu.au/elisa/databases/dbfiles/psyclit.html
Psychlit description at ANU
HREF5
http://atlas.cs.virginia.edu/~clv2m/salton.txt
Obituary, IRLIST Digest September 4, 1995 Volume XII, Number 34 Issue 271 ISSN 1064-6965
HREF6
http://www.cs.cornell.edu/Info/Department/Annual94/Faculty/Salton.html
Gerald Salton's Home page
HREF7
http://dxsting.cern.ch/sting/wais/wais-concepts.txt
Kahle, Brewster. Wide Area Information Server Concepts Thinking Machines. 11/3/89, Version 4, Draft
HREF8
http://sol.usc.edu/~plobbes/class/cs586/presentation/papers.html
Class reference
HREF9
http://sol.usc.edu/~plobbes/class/cs586/
CS586: Database Systems unit
HREF10
http://elisa.anu.edu.au/elisa/elibrary/indexes1.html
Indexes to the Internet page
HREF11
http://www.december.com/net/tools/index.html
December, John. Internet Tools Summary
HREF12
http://www.december.com/net/tools/nir.html
NIR = Network Information Retrieval
HREF13
http://www.cis.ohio-state.edu/htbin/rfc/rfc1689.html
IETF Request for Comment 1689. A Status Report on Networked Information Retrieval: Tools and Groups.
HREF14
http://www.yahoo.com/ - Yahoo
HREF15
http://www.einet.net/ - EINet Galaxy
HREF16
http://www.w3.org/hypertext/DataSources/bySubject/Overview.html - The WWW Virtual Library
HREF17
http://www.mckinley.com/ - Magellan
HREF18
http://gnn.com/gnn/wic/index.html - GNN's Whole Internet Catalog
HREF19
http://www.altavista.digital.com/ - Alta Vista
HREF20
http://www.excite.com/ - Excite
HREF21
http://www.hotbot.com/ - HotBot
HREF22
http://www5conf.inria.fr/fich_html/slides/panels/Panel10/overview.htm
Slides from Panel Session: Efficiency of Internet Indexing at 5th WWW Conference, Paris, May 1996.
HREF23
http://snazzy.anu.edu.au/CNASI/pubs/Questnet95.html
Barry, Antony. NIR is not enough, Questnet '95, Bond University, 6-8 Sept. 1995.
HREF24
http://www.ida.unisa.edu.au
Integrated Document Access Project Home Page.
HREF25
http://online.anu.edu.au/caul/ida/updates.htm
Minutes of the IDA Advisory Committee
HREF26
http://www.ida.unisa.edu.au/finalreport.html
IDA Final Report.
HREF27
http://harvest.cs.colorado.edu/
Harvest System Home Page.
HREF28
http://www.gu.edu.au:80/alib/iii/successf.htm
National Priority (Reserve) Fund Projects for the Development of Library Infrastructure. Program 2(a) Improved Information Infrastructure - Network Information Support.
HREF29
http://www.oclc.org:5047/oclc/research/publications/weibel/metadata/dublin_core_report.html
OCLC/NCSA Metadata Workshop Report.

Copyright

Tony Barry, Joanna Richardson © 1996. The authors assign to Southern Cross University and other educational and non-profit institutions a non-exclusive licence to use this document for personal use and in courses of instruction provided that the article is used in full and this copyright statement is reproduced. The authors also grant a non-exclusive licence to Southern Cross University to publish this document in full on the World Wide Web and on CD-ROM and in printed form with the conference papers, and for the document to be published on mirrors on the World Wide Web. Any other usage is prohibited without the express permission of the author.

AusWeb96 The Second Australian WorldWideWeb Conference ausweb96@scu.edu.au