Scoped web site search: an Australian university case study

Sue Steele, Manager, Web Resources and Development, Information Technology Services [HREF1] , Building 203, Monash University [HREF2], Victoria, 3800. Sue.Steele@its.monash.edu.au

Abstract

The first part of this paper examines public web site searching at thirty-eight Australian universities with particular emphasis on the use of site-specific scoped search options. Google is the most common search engine, in use at eighteen universities. Twelve universities offer more than one search engine on all or part of their web site. Twenty-three universities are using scoped web searches on their web sites.

The second part examines scoped searching at Monash University in more detail by analysing the university's 2005 search engine logs. Monash University uses scoped searches on almost all of its sub-sites. Scoping parameters are inconsistently applied and there is no clear definition of a web-site for searching purposes. Initial analysis indicates a relatively high proportion of searchers choosing the broader search option. Some scoped search options result in very high zero-hit search rates. Further user-centred research is required.

1       Introduction

This study was conceived after the author examined the Monash web search engine logs for March 2005 to determine the most popular searches. It appeared, from a cursory examination of the query logs while extracting the popular queries, that a large number of scoped queries were more suited to a university-wide search or to another part of the web site.

A long-standing practice at Monash is to have scoped searches on faculty and divisional web sites and this has been facilitated in the current web site design. In general the search defaults to the local faculty or divisional web site and users can change this to the university-wide option if they select from a drop-down list.

This practice developed organically as a response to perceived issues with the enterprise level search engine results. It seemed like a good way to produce more meaningful results; it was possible, and so it was done. The effectiveness of a scoped-search approach had not been questioned or tested at any stage.

Search engine research and query log analysis tend to be performed on large search engines or on an institution-wide or enterprise-wide basis (Chau, Fang and Sheng 2005, Jansen and Spink 2005, Park, Lee and Bae 2005, Wang, Berry and Yang 2003). If it is performed on smaller scales it is less frequently published.

The first part of this study examines Australian university web search engines to see what search engines are in use, what search options and enhancements are offered, and to estimate the extent of scoped searching defaults on their faculty or divisional web sites.

The second part examines search logs from a number of Monash web sites to estimate the nature and extent of scoped searching, and the results obtained.

2       Public web search at Australian universities

An analysis of Australian university public web search options was performed.

All AVCC member universities [HREF3] were examined between February 11 and February 14 2006.

2.1.    University public web search engines

The search engine, or engines, used by each university was noted for comparison with a similar analysis performed in June 2004. The search engine used was determined by a combination of logos and other obvious references, performing searches, examining the results and their HTML source, and reading search help.

The 2004 examination noted five universities offering two public search engines. This figure may have been higher as the examination was not as comprehensive as the more recent one. In 2006, the number of universities offering two search engines had risen to twelve. Five universities have changed to a different search engine since 2004.

Table 1: Summary of University public search engine use by vendor
Search engine 2006 users 2004 users
Altavista 0 1
Freefind 1 1
Google 18 15
htdig 5 5
mngo 1 1
Oracle ultra 2 0
PanOptic 7 4
Teratext 1 1
Ultraseek 4 9
Verity 2 0

Use of Google and PanOptic has increased since 2004, while use of Ultraseek and Altavista has decreased. Google remains the most popular search offering, followed by PanOptic. Appendix 1 lists the search engine/s in use at each university in June 2004 and February 2006.

Google University Search is a popular and cost effective public web site search option for many universities in Australia and overseas. It is interesting that some Australian universities have chosen other free search options, such as htdig, in preference to Google. Google university search is limited to publicly accessible content and restricted to a single domain. Therefore universities must use a different option for their intranet content (not part of this study). A multi-domain institution such as Monash University cannot utilise Google University Search effectively. Locally hosted search engines allow universities more flexibility in what is included in their public search option as well as providing for intranet searches.

31% of Australian universities had two search engines on all or part of their web site. Ten of these universities offered Google and another search engine. The author was disturbed by this finding and its possible implications. Why offer more than one search option? Is one better than another? What do users think when presented with more than one option? How many users select the second or non-default search engine, and why? Why do some university libraries maintain separate search engines? An interesting followup to this study would be to find out why universities have chosen their search engine offering/s.

2.2     Search features

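A number of features can be implemented to improve end user searching (Rosenfeld 2005, Smith et al 2003, Boston, Rajapatarina and Missingham 2005). Those examined in this study, and summarised in Table 2, are default boolean AND searching, spell checking, best bets, word stemming, highlighting of query terms and an advanced search option.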

Nine universities using Google as their only search engine were not examined for these features because Google provides a standard range of search and result options that do include spell checking, and default AND searching. Where universities offered Google and another search engine, the second option was examined. Where a university offered more than one option (other than Google) only the primary search engine was further examined.

Several searches were performed at each site to test the site's search engine features. Search help, and the advanced search features provided, where available, were examined to gather more information. Searches performed at each site included: 'course fee', 'course fees', 'biology', 'biol', 'biog'. Searches were also deliberately misspelled to test spell-checking, for example 'bioligy'.

Table 2: University public web search engine features summary
Feature Universities offering
Default boolean AND 23
Spell check 6
Best bets 6
Word stemming 11
Highlight query terms 25
Advanced search 21

A number of universities are taking advantage of the advanced options provided by the search engine software they have installed. Some others are not. Six universities running PanOptic or Verity, for example, have the option of best bets and spell check but have apparently not yet implemented them. Appendix 2 lists the search engine features in use at each university studied in February 2006.

21% of universities examined offer default boolean "OR" searches. This option presents searchers with sub-optimal results. While this could be intentional, it is more likely a result of the search engine software used. Older versions of Ultraseek and HTDig for example, exhibit this behaviour. Again, an understanding of why universities offer particular search engines would be useful.

2.3     University wide or locally scoped search

A simple search, for example 'enrolment' was performed across each university surveyed. The search was at first performed from the search option on the top level university home page as a benchmark, and the results noted. Within each university the search was repeated on two faculty sites (Arts and Science or equivalents), on the library site and on a campus site. The default search interfaces contained in common navigational elements such as headers, footers, site-navigation bars were examined and search results were checked to determine if the sub-sites were returning site-specific (scoped) or university-wide results.

Of the 38 universities examined, 37 had a university-wide default search option on the top level home page. One university had a default "Courses" search option at the top level and across all other sites and pages examined; university-wide search could be selected from a drop-down list. Appendix 3 lists the search scoping options in use at each university in February 2006.

University libraries were more likely to use scoped search than other areas, and to use a 'local' search engine. Twelve universities examined used more than one search engine. In six of the twelve cases, the library was the source of the second search engine offering.

Seven of the 152 sub-sites examined lacked a search option. At one university this appeared to be part of one of at least two standard university web design options.

Table 3: Summary of scoped and university-wide search options at Australian university web sites
  Number Percentage
Total universities examined 38  
Universities with university-wide search on all sites examined, including top level homepage 13 34%
Universities with scoped search on all sub-sites examined. 2 5%
Universities with a mixture of site-scoped and university-wide searches on sub-sites examined. 23 61%
Universities having one or more sub-site with no search option 7 18%
University libraries with scoped search 20 53%
Universities with university-wide searches except for library site 9 24%

14 of the 38 universities examined showed a consistent approach to web search: the 13 where university-wide search is the only option, and the one university that offers 'Courses' as its default search.

At all of the universities offering scoped searches, effective search engine use may require a user to be certain of exactly where they are within a university web site, and to have an understanding of the university's structure, because scoped searches will return different results for the same search term (Nielsen 2005). Further research in this area is recommended to answer the questions arising. Why are libraries more likely to have scoped searching? Why is scoped searching offered? What benefits does it offer? Why do some sites have no search option? Are users confused by different search scope options within a single university web site? Have different search options been tested for ease of use/understanding and quality of results?

3.      Monash University web search analysis

Monash web sites are heavy users of scoped search options. The standard university web templates expect the search option to be customized for each web site.

Search engine and web server logs for 2005 were examined for the information provided in this section. During 2005 Monash used Ultraseek 3.1 as its search engine.

Ultraseek stores useful search log information in two files: query.log and access.log.

Web site managers can customize template headers so that the default search is localized to their site, with an option for the user to select the whole of Monash. The site-specific limits are stored as part of the raw query.log files and were used to isolate site-specific queries for the sites examined in this study. For example, a query received via the engineering faculty web site:

	2005/05/15	00:19:36  1314   '+url:http://www.eng.monash.edu.au/ || Materials Engineering

The "||" indicates this query was formed from hidden form elements that included the '+url...' and a query string entered by a user.
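As an illustration of how scoped queries might be isolated from such a log, the following is a minimal sketch that separates the hidden '+url:' scope prefix from the text the user typed. It assumes only what the example line shows (a '+url:' prefix, then "||", then the user's query); the names `ScopedQuery` and `split_scoped_query` are hypothetical and are not part of Ultraseek.

```python
import re
from typing import NamedTuple, Optional


class ScopedQuery(NamedTuple):
    scope_url: str   # site the hidden form element restricted the search to
    user_query: str  # text the user actually typed


# Assumed layout, taken from the single example line in the text:
# '+url:<site> || <user query>'
_SCOPED = re.compile(r"\+url:(?P<scope>\S+)\s*\|\|\s*(?P<query>.*)$")


def split_scoped_query(line: str) -> Optional[ScopedQuery]:
    """Return the scope and user query, or None if the line is not scoped."""
    match = _SCOPED.search(line)
    if match is None:
        return None
    return ScopedQuery(match.group("scope"), match.group("query").strip())


line = ("2005/05/15\t00:19:36  1314   "
        "'+url:http://www.eng.monash.edu.au/ || Materials Engineering")
print(split_scoped_query(line))
```

Grouping the parsed entries by `scope_url` would then give the per-site query sets used for the counts in this section.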

In addition, the Ultraseek access.log files were used to determine the proportion of default and non-default searches from individual sites. In this case the http_referer was used as the site-specific determinant.

Ultraseek access.log entries for each site examined were isolated based on the http_referer string. An entry was considered a default search if it contained a query prefix (qp) string representing the escaped form of the scoped site query determinant. For example, an access log entry where the referer is the Faculty of Law web site contains the string "qp=%2Burl%3Ahttp%3A%2F%2Fwww.law.monash.edu.au", indicating the search was scoped to law, the default. This is not an ideal method, but no other was available. It assumes that the same proportion of default and non-default queries for each site are cached.
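The classification described above can be sketched in a few lines. This is an illustration only: it assumes the escaped prefix appears in the log with upper-case percent escapes, exactly as in the Law example, and the sample request strings (including the `/query.html` path and the `qt` parameter) are hypothetical rather than copied from the Monash logs.

```python
from typing import Iterable
from urllib.parse import quote


def scope_marker(site_url: str) -> str:
    """Build the escaped 'qp=' string that marks a search scoped to site_url."""
    # safe="" forces '+', ':' and '/' to be percent-escaped as well.
    return "qp=" + quote("+url:" + site_url, safe="")


def is_default_search(log_entry: str, site_url: str) -> bool:
    """True when the entry carries the site's own scope prefix."""
    return scope_marker(site_url) in log_entry


def default_share(entries: Iterable[str], site_url: str) -> float:
    """Proportion of entries that used the default (site-scoped) search."""
    entries = list(entries)
    if not entries:
        return 0.0
    return sum(is_default_search(e, site_url) for e in entries) / len(entries)


law = "http://www.law.monash.edu.au"
sample = [  # hypothetical entries, already filtered by http_referer
    "GET /query.html?qp=%2Burl%3Ahttp%3A%2F%2Fwww.law.monash.edu.au&qt=torts",
    "GET /query.html?qt=torts",
]
print(scope_marker(law))
print(default_share(sample, law))
```

The proportion of non-default selections for a site is then one minus the default share.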

Web searches across the whole of Monash totalled 3,534,000 in 2005. 28% of searches were generated from the www.monash.edu.au domain with over half of those coming from the university home page. The remaining 72% of searches were generated from other university web sites.

The Monash Ultraseek search index contained a single 'collection' of documents with public and some intranet content in the same index. It attempted to index all Monash domains, not just monash.edu.au. Some of the other domains include monash.edu.my, monash.ac.uk, monash.org, monyx.com and many domains specific to research centres and Monash companies.

3.1 Further analysis of the four Monash web sites included in the Australian university examination

The four Monash sub-sites: Library, Arts Faculty, Science Faculty and a Campus web site studied in section 2.3 were examined in more detail. Each site's web server logs were analysed to determine the relative size of the sites and their levels of activity in 2005. The Ultraseek logs were analysed to determine the scoped searches for each site, the number of zero result scoped searches and the proportion of non-default searches (i.e. searches where the user has deliberately selected a different search option) emanating from each site.

Table 4: Search statistics for selected Monash web sites
Site Total pages served Distinct files served Total scoped searches Zero result scoped searches Proportion of zero result searches Proportion of non-default search selections
Library 19,206,527 N.A. 157,097 24,056 15% 20%
Arts 12,126,495 153,366 88,749 8,732 10% 17%
Science 968,356 12,199 14,168 3,175 22% 28%
Campus 250,845 489 10,745 6,439 60% 10%

Search represents a small proportion of each web site's total activity (represented by total pages served), generally less than 1%. On the campus site, the number of searches is just over 4% of the site's total activity. The campus site is very small by Monash standards and contains a limited amount of information. This may result in an increased proportion of search as users fail to find the desired information by browsing. However, as the default search is restricted to the campus site, this also results in a very large proportion of zero result searches in relation to the other sites examined, and in relation to the University of Tennessee (Wang, Berry and Yang 2003) where there was a 30% zero hit rate.
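The proportions discussed here follow directly from the counts in Table 4. The sketch below reproduces the arithmetic using the figures as printed in that table; the dictionary layout and the function name are illustrative only. Because only scoped searches are counted, the share shown leaves out searches made with a non-default option.

```python
# (total pages served, total scoped searches, zero-result scoped searches),
# copied from Table 4.
TABLE_4 = {
    "Library": (19_206_527, 157_097, 24_056),
    "Arts": (12_126_495, 88_749, 8_732),
    "Science": (968_356, 14_168, 3_175),
    "Campus": (250_845, 10_745, 6_439),
}


def percent(part: int, whole: int) -> float:
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


for site, (pages, searches, zero_hits) in TABLE_4.items():
    print(
        f"{site}: scoped searches are {percent(searches, pages):.1f}% of pages "
        f"served; {percent(zero_hits, searches):.1f}% of them returned no results"
    )
```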

The Monash University web presence is not quite a public site and not quite an intranet. Internal and public content are interspersed on the site at present, and Ultraseek is configured to index both in a single index. As a result, web site searching may be considered mostly 'known item' searching, especially for Monash students and staff. They are looking for something they have seen before, or been told of, or believe should be available on the site (Mukherjee and Mao 2004, Chau, Fang and Sheng 2005). The queries they enter often have a 'right' answer, or can be answered by a very small subset of the university's web offering (Fagin et al 2003, Chau, Fang and Sheng 2005). Users may get a false-negative for their search for one of two reasons:

  1. They may be searching in the 'wrong' part of the site and use the default scoped search. The search results screen did not assist users by clearly indicating search scope (Rosenfeld and Morville 2002).
  2. They may have entered their query in all upper case or in mixed case. If an Ultraseek "query is entered entirely in lower case, case is ignored in the resulting hits. However, if any capitalization is used, case is exactly matched." (Infoseek Corporation 1999b)
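The case rule quoted in the second item can be modelled for a single term as follows. This is a simplified sketch of the documented behaviour, not Ultraseek's implementation, and the function name is hypothetical.

```python
def case_rule_match(query_term: str, indexed_term: str) -> bool:
    """Model the quoted rule: lower-case queries ignore case, others match exactly."""
    if query_term == query_term.lower():
        # Query typed entirely in lower case: case is ignored.
        return query_term == indexed_term.lower()
    # Any capitalization in the query: case must match exactly.
    return query_term == indexed_term


print(case_rule_match("webct", "WebCT"))  # lower-case query, case ignored
print(case_rule_match("WEBCT", "WebCT"))  # capitalized query, exact match needed
```

Under this rule, a search form that lower-cased queries before submitting them would avoid this class of false-negative, at the cost of losing deliberate case-sensitive searches.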

The non-default search options were selected approximately 20% of the time. This is a relatively high proportion (Park, Lee, and Bae 2005, Chau, Fang and Sheng 2005) and may indicate a level of familiarity with local versus global Monash search options among regular users.

Query logs were examined to extract scoped queries for the selected web sites, so that query terms could be analysed. The most popular search terms were extracted for each site, as well as the most common zero-result search terms. The top ten search terms for each site are listed in Table 5. The top fifty search terms for each site are detailed in Appendix 4. Each extract was sorted alphabetically and manually examined in addition to extracting search term counts.

Table 5: Top 10 most popular search queries for selected Monash web sites
Library Arts Faculty Science faculty Campus
Frequency Search term Frequency Search term Frequency Search term Frequency Search term
2994 muso 1840 EMPTY QUERY 327 EMPTY QUERY 145 courses
2243 q manual 711 cover sheet 196 honours 105 map
1741 referencing 525 handbook 82 handbook 104 campus centre
1562 Endnote 463 timetable 81 psychology 85 sport
1488 proquest 409 journalism 78 chemistry 79 short courses
1429 EMPTY QUERY 408 psychology 60 green chemistry 75 bookshop
1105 Webct 356 japanese 59 summer semester 59 psychology
973 cover sheet 348 summer semester 57 biotechnology 58 book shop
850 assignment cover sheet 343 summer 52 units 56 open day
677 past exams 325 chinese 50 physics 56 shuttle bus
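A frequency list of this kind can be produced with a simple counter over the extracted query strings. The sketch below is one possible approach, not the method used for this study; it assumes the queries have already been separated from their scope prefixes, and it labels blank queries 'EMPTY QUERY' to mirror the table.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_queries(queries: Iterable[str], n: int = 10) -> List[Tuple[str, int]]:
    """Return the n most frequent queries, counting blank ones as 'EMPTY QUERY'."""
    counts = Counter((q.strip() or "EMPTY QUERY") for q in queries)
    return counts.most_common(n)


sample = ["muso", "muso ", "", "q manual", "muso"]  # hypothetical extract
for term, frequency in top_queries(sample):
    print(frequency, term)
```

Case is left untouched here, so 'Endnote' and 'endnote' would be counted separately; lower-casing before counting is an alternative choice.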

Examination of the site-specific search logs, and of the most popular search lists indicates that most users are aware of the place they are searching, in that the majority of searches entered are the kinds of things reasonably expected to be found in that part of the web site. This may be particularly true for the library and faculty sites as they are 'branded' sites, and have their own web domain addresses.

Image of Arts faculty sub-brand web page header

Figure 1: Monash Arts Faculty sub-brand header with Arts Faculty search as default

They may be more obvious because of this. The campus site is in 'masterbrand' style and, as can be seen from the example in Figure 2, the masterbrand header does not distinguish the site, except by breadcrumbs and headings. The URI of the campus site studied is a sub-directory of the main Monash web address. It may not be as readily distinguished as a separate site and separate search option.

Image of a campus's master-brand web page header

Figure 2: Monash masterbrand subsite header for Peninsula campus site, showing search options drop-down, with campus-specific search as default.

Some users may not be aware that they are in a specific sub-site and their searches may be 'inappropriate' as a result (Nielsen 2005). Even when users are aware of their site-location, they may need to know 'more' in order to successfully execute a search. Within the science faculty, for example, several schools have their own web domains. Their content is not contained within the main science web site, nor within the domain 'sci.monash.edu'. As a result, their content is not included in a 'science faculty' search. Nor is the content of 'scientific disciplines' such as biology and psychology that are part of the medical faculty. This may explain the relatively higher proportion of zero result searches and non-default search selections for the science site. Science is not the only area at Monash with a diverse web presence. Issues such as these should lead to a questioning of, and further study of, the value of scoped searches.

Variant spellings and variant terms are quite common, as are spelling errors and mistypes. Wang, Berry and Yang (2003) found up to 26% misspellings on a university web search engine. The exact level of misspelling at Monash was not calculated, but visual inspection confirms it is of that order, especially if variant spellings are taken into account. For example students are required to use a coversheet when submitting assignments. Variations on coversheet found in the Arts faculty search logs include:

Table 6: Examples of variant spelling and usage when searching for coversheet
Frequency Search Term Frequency Search Term
711 cover sheet 30 english cover sheet
151 coversheet 23 coversheets
127 cover sheets 21 communications cover sheet
108 assignment cover sheet 19 arts coversheet
76 essay cover sheet 17 cover
46 arts cover sheet 17 essay cover sheets
37 cover page 13 assignment cover
35 assignment cover sheets 13 english coversheet
34 assessment cover sheet 13 history cover sheet
31 assignment cover sheet 13 sociology cover sheet

Common search themes included unit codes such as 'ges1000', 'sci1020' and 'sci2010', unit code prefixes such as 'ges' and 'sci', department names, individuals' names, and Monash acronyms and abbreviations such as 'muso', 'wes' and 'webct'. These terms did not necessarily make it into the 'top' lists but there were many examples of them. Closer examination of the logs by subject matter experts from each area would result in a comprehensive list of recommended links for a best-bets service (Smith et al 2003).

A library example further illustrates the level of mistyping and misspelling. How many ways can you type 'reference'?:

Table 7: Example of mistyping and misspelling reference
Frequency Search Term Frequency Search Term
6 refencing 2 referance
2 referances 10 referancing
12 referecing 9 referenceing
2 Reference-Internet 3 referencin
5 REFERENCING  2 referencing & science
2 referencing powerpoint 3 referencng
4 referening 4 referensing
6 refernce 33 referncing
2 referneces 4 refernecing
4 refferencing 6 refrences
5 refrence 26 refrencing

Each of the searches in Table 7 would have resulted in a zero-result search. A spell-checker, depending on its configuration, could have suggested the correct term for many of these (Boston, Rajapatarina and Missingham 2005).
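To illustrate how a spell-checker might recover such queries, the following sketch uses fuzzy string matching from Python's standard library against a small vocabulary. It is not the spell-check feature of any search product discussed in this paper; the vocabulary, the cutoff value and the function name are assumptions chosen for the example.

```python
import difflib
from typing import List, Optional

# Hypothetical vocabulary; in practice this might be built from indexed terms
# or from the most popular successful queries.
VOCABULARY = ["referencing", "reference", "references", "coversheet", "enrolment"]


def suggest(query: str, vocabulary: List[str] = VOCABULARY,
            cutoff: float = 0.8) -> Optional[str]:
    """Return the closest vocabulary term, or None if nothing is similar enough."""
    matches = difflib.get_close_matches(query.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


for misspelling in ["referncing", "refrencing", "zzzz"]:
    print(misspelling, "->", suggest(misspelling))
```

The cutoff controls how aggressive the suggestions are: a lower value catches more mistypes but also proposes terms the user did not intend.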

The most common zero-result searches for each site are listed below:

Table 8: Most common zero-result search terms for selected web sites
Library Arts Faculty Science Faculty Campus
Number Search term Number Search term Number Search term Number Search term
2532 muso 1840 EMPTY QUERY 327 EMPTY QUERY 85 sport
1429 EMPTY QUERY 26 WebCT 28 chemstore 75 bookshop
216 coversheet 25 cover sheet 15 e-rat 59 psychology
150 Qmanual 15 monquest 12 stilwell 58 book shop
99 psychinfo 14 LSS 12 veterinary 56 open day
99 Wes 14 referencing 10 BIO2051 54 calendar
93 Textron.com 13 JRN 10 erat 51 EMPTY QUERY
73 mutts 13 mymonash 10 moresi 50 gym
60 Webct 12 monet 10 scishop 50 security
55 amh 12 yasumasa morimura 9 asthenosphere 48 sports
48 mymonash 11 enrolement 9 microstructures 47 food
46 mus1110 11 hpl1503 9 neuroscience 38 bank
42 assignment cover sheet 11 Morimura 8 entomology 38 music
41 punishment 11 PSS1711 8 mscourse 36 commerce
40 psycINFO 11 psycology 8 staff 36 pictures
39 legibook 11 SILL 7 casper 34 scholarships
36 gym 10 employment 7 ci1010 32 cafe
36 mgc1010 10 mail.monash.edu.au 7 forsyth 32 clubs
35 man11 9 apy1910 7 heatflow 32 hair
33 referncing 9 Com1010 7 overload 31 jobs

Zero-result searches appear to be caused by a number of factors:

Examination of the raw query logs indicates a certain level of frustration among some users who do not find what they are expecting. Presumably users often 'know' that Monash has information relevant to their search query, but they cannot find it. Studies have shown that the majority of web search sessions are short, with most users entering a single query and only looking at a single results page (Jansen and Spink 2005, Mat-Hassan and Levene 2005). This may also be the case at Monash, but a small group of users repeat the same search many times. Without interviewing the individuals who repeat searches one can only speculate as to the level of frustration of a person who repeats a search more than 10 times without realizing they are searching the 'wrong' part of the web site. For example, this extract from the library query log:

	24/08/2005 14:09	0		cse2309 
	24/08/2005 14:09	0	m0	cse2309 
	24/08/2005 14:09	0	m0	cse2309 
	24/08/2005 14:09	0	m0	cse2309 
	24/08/2005 14:09	0	m0	cse2309 
	24/08/2005 14:09	0	m0	cse2309 
	24/08/2005 14:09	0	m0	cse2309 
	24/08/2005 14:09	0	m0	cse2309 
	24/08/2005 14:09	0	m0	cse2309 
	24/08/2005 14:09	0	m0	cse2309 
	24/08/2005 14:09	0	m0	cse2309 
	24/08/2005 14:09	0	m0	cse2309 
	24/08/2005 14:09	0	m0	cse2309 
	24/08/2005 14:09	0	m0	cse2309 

Or this one from the Arts query log:

	21/11/2005 0:39	0		plaguariased 
	21/11/2005 0:39	0		plaguariasem 
	21/11/2005 0:39	0	m0	plaguarilism 
	21/11/2005 0:40	0	m0	plaguariase 
	21/11/2005 0:40	0	m0	plaguariase 
	21/11/2005 0:40	0	m0	plaguariasm 
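Runs of repeated zero-result queries like the library extract could be flagged automatically. The sketch below works on records that have already been parsed into (hit count, query) pairs in log order; that record shape and the function name are assumptions. The extracts shown carry no user or session identifier, so consecutive identical entries are only a proxy for one person repeating a search.

```python
from itertools import groupby
from typing import Iterable, List, Tuple


def repeated_zero_runs(records: Iterable[Tuple[int, str]],
                       min_repeats: int = 3) -> List[Tuple[str, int]]:
    """Find consecutive runs of the same zero-result query.

    Returns (normalized query, run length) for each qualifying run.
    """
    runs = []
    # Group adjacent records that share a hit count and a normalized query.
    for (hits, query), group in groupby(
        records, key=lambda r: (r[0], r[1].strip().lower())
    ):
        length = sum(1 for _ in group)
        if hits == 0 and length >= min_repeats:
            runs.append((query, length))
    return runs


# Modelled on the library extract: fourteen identical zero-hit entries.
records = [(0, "cse2309 ")] * 14 + [(385, "timetable")]
print(repeated_zero_runs(records))
```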

Best-bets and spell-checking would improve the results of these searches in the same way they would for all search results. In addition, offering a 'would you like to search the whole of Monash' option as part of the results page for site-specific searches could assist. Where the search returned zero hits, the 'whole of Monash' could be displayed by default, with a suitable explanation.
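The zero-hit fallback suggested above might look like the following sketch. The `search` callable stands in for whatever search engine interface is available and is hypothetical, as are the function name and the wording of the messages; a scope of None is taken to mean 'whole of Monash'.

```python
from typing import Callable, List, Optional, Tuple

# A search function takes a query and an optional scope URL (None = everything)
# and returns a list of result URLs. This signature is an assumption.
SearchFn = Callable[[str, Optional[str]], List[str]]


def search_with_fallback(search: SearchFn, query: str,
                         scope_url: str) -> Tuple[List[str], str]:
    """Run a scoped search; if it finds nothing, broaden to the whole site."""
    results = search(query, scope_url)
    if results:
        return results, f"Results from {scope_url}"
    broad = search(query, None)
    return broad, (f"No results found in {scope_url}. "
                   "Showing results from the whole of Monash instead.")


def fake_search(query: str, scope: Optional[str]) -> List[str]:
    """Toy index used only to demonstrate the fallback."""
    pages = {"http://www.monash.edu.au/sport/": "sport gym"}
    return [url for url, text in pages.items()
            if query in text and (scope is None or url.startswith(scope))]


print(search_with_fallback(fake_search, "gym", "http://www.lib.monash.edu.au/"))
```

Returning the explanatory message alongside the results keeps the change of scope visible to the user, which is the point of the 'suitable explanation'.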

3.2     Monash campus web sites

The poor search performance of the campus web site examined above led to an analysis of all campus web sites to see if this was a common problem.

The results of this examination are shown in Table 9. The campuses are not named. Two campuses are not represented as they do not have a scoped search option.

Table 9: 2005 Campus web site search summary
Site Total pages served Unique pages served Total site specific searches Zero result site specific searches Proportion of zero result searches Proportion of non default searches selected
Campus 1 250,845 489 10,745 6,439 60% 10%
Campus 2 106,075 369 2,996 929 30% 17%
Campus 3 121,933 4,562 718 718 100% 10%
Campus 4 1,140,302 5,654 11,860 2,070 17% 16%
Campus 5 320,551 7,154 3,086 498 16% 19%
Campus 6 N.A. N.A. 6,073 4,823 79% N.A.
Campus 7 192,795 968 480 147 30% 48%
Campus 8 N.A. N.A. 8,665 1,799 20% 13%

In addition to Campus 1, two other campus web sites showed very high zero-result searches. In both cases this was because of a syntax error in their sites' header codes. These errors had been fixed before the study was undertaken. The remaining campuses have zero-result proportions within the range of other Monash web sites and other university sites (Wang, Berry and Yang 2003). The default search option for Campus 1 was changed to 'whole of Monash' by the site owner before this paper was completed, after he was made aware of the high level of zero search results for the site. However, this also creates additional inconsistencies in that search is not operating the same way on all campus sites. The campus sites are not internally consistent in the kind and depth of information provided and are potentially confusing.

Further examination of the search results of 'small' Monash sites, especially sub-sites that are masterbrand, is required because it is unclear from this study whether there is a real issue with micro-level scoped search limiting.

3.3     Monash faculties

During the examination of the Arts faculty it was noted that there were additional scoping restrictions within the faculty web site - sub-sites within a sub-site so to speak. The remaining faculties were examined at a basic level, and also for sub-site scope restrictions. The results are shown below:

Table 10: 2005 faculty search summary. Search option 1 - Whole site is faculty-wide search. Search option 2 - Some parts of the site have further restricted searching to sub-sites
Faculty Total pages served Distinct files requested Total scoped searches Zero result scoped searches Proportion zero result scoped searches Proportion of non default options selected Search options
Art & Design 577,822 6,838 9,822 1,568 15% 11% 1
Arts 12,126,945 153,666 88,749 8,732 10% 17% 2
Business N.A. N.A. 120,732 24,151 20% 12% 2
Education 1,099,459 42,674 20,686 4,148 20% 16% 1
Engineering 2,040,613 62,797 42,398 10,630 13% 18% 2
IT 4,750,849 26,114 49,442 6,912 13% 24% 1
Law 2,253,262 20,430 24,044 2,523 10% 14% 1
Medicine 6,380,074 153,286 103,706 14,877 14% 15% 1
Pharmacy 1,140,302 5,654 11,865 2,073 17% 16% 1
Science 968,356 12,199 14,148 3,175 22% 28% 2

Four faculties have 'local' scoped searches that are more specific than 'whole of faculty'. Six faculties have faculty-level search across their whole site. Upon examination of the sub-faculty searches it was found that there are effectively nested sub-scope searches within some faculty sub-sites. It was also found that sub-site restriction is not uniform across a faculty. Neither the Monash web style guide [HREF4] nor the web site templates define a 'web site' for the purposes of search engine restriction. It is clear that there are a number of possible interpretations in current use. How useful and how apparent these subsite distinctions are to users is unclear at this stage. A clear definition of a 'web site' for the purposes of search scope would be a good first step.

Table 11: Summary of faculties with sub-site scoped searches
Faculty Subsites with local scoped search Total number of subsites at equivalent level Sub-sub sites with local scoped search
Arts 7 26 5
Business and Economics 26 28 7
Engineering 5 9 0
Science 3 12 2


Figure 3: Example of faculty sub-site header with three drop-down search options

Sub-site specific scoped searches for the Arts and Business faculties were examined further. The proportion of zero-result searches found varied from 8% to 100%; the median was 42% and the average 56%, which is more than twice as high as for the faculties as a whole. The proportion of non-default searches selected at sub-sub-site level varied from 12% to 48%, with a median of 22% and an average of 24%, somewhat higher than for the faculties as a whole.

Different default search options on faculty web sites cause differences in search results for the same search, depending on where in the site it is performed. The following log extracts, noticed during the analysis, show major differences in hit counts for common terms.

	2005/11/16 09:21:17 	385		timetable
	2005/11/16 09:27:11 	385		timetable
	2005/11/17 12:58:47 	12		timetable
	2005/11/17 12:59:04 	12		timetable
	
	2005/01/19 21:03:24 	63		alcohol
	2005/03/16 22:36:21 	63		alcohol
	2005/04/11 15:43:59 	0		alcohol
	2005/04/11 15:44:02 	0		alcohol
	2005/04/11 15:45:56 	65		alcohol
	2005/05/15 10:45:50 	64		alcohol 

This is also true across the university in general. A search performed on a faculty or divisional site will not produce the same results as one performed from the university home page or any other area defaulting to 'whole of Monash' search. It appears that most users can tell they are on a faculty or other uniquely branded site, but it is not clear whether they can distinguish when they are on a designated sub-site of one of those sites. It is not possible to tell from search logs which type of result set is more useful to a user, unless one of the result sets is zero. However it may be confusing to users that the same search can produce vastly different results (Nielsen 2005). Further research is required in this area.

4.      Conclusions and future work

Examination of public web search at Australian universities raises a number of questions such as:

Initially the author assumed that Monash University's reliance on scoped searching would be uncommon. This was not the case. 66% of Australian universities have some scoped searching. Have the search options and results been evaluated? Is scoped search offered because the enterprise search is seen as sub-optimal? Further research in these areas is recommended.

Within Monash, additional research is required in two areas. The first is to determine suitable best-bets for commonly performed searches, so that these can be offered as results for any relevant search, no matter where in the site it is performed. The second is to determine a suitable configuration level for the spell-check feature provided with the Verity search software currently in use. Usability testing should also be conducted to determine the optimal level, if any, for sub-site search default options, and to ascertain the usefulness of a 'broaden this search' option on result pages of scoped searches.

References

Alexander, D. (2005) How usable are university websites? In Proceedings of the Eleventh Australian World Wide Web Conference (AusWeb), Gold Coast, Australia [HREF5]

Boston, T, Rajapatarina, B and Missingham, R. (2005) Libraries Australia: Simplifying the search experience. Online 2005 Conference, Sydney, Australia [HREF6]

Chau, M, Fang, X and Sheng, O. (2005) Analysis of the query logs of a web site search engine. Journal of the American Society for Information Science and Technology, 56(13)

Fagin, R et al (2003) Searching the workplace web. WWW 2003, May 2003, Budapest, Hungary

Jansen, B and Spink, A. (2005) An analysis of web searching by European AlltheWeb.com users. Information Processing and Management, vol 41

Infoseek Corporation (1999a) Ultraseek Server 3.1 Administrator Guide

Infoseek Corporation (1999b) Ultraseek Server 3.1 Customization Guide

Mat-Hassan, M and Levene, M. (2005) Associating search and navigation behavior through log analysis. Journal of the American Society for Information Science and Technology, 56(9)

Mukherjee, R and Mao, J. (2004) Enterprise search: tough stuff. Queue, April 2004

Nielsen, J. (2001) Search: visible and simple. Alertbox. [HREF7]

Nielsen, J. (2005) Mental models for search are getting firmer. Alertbox [HREF8]

Park, S, Lee, J H and Bae, H J. (2005) End user searching: a web log analysis of NAVER, a Korean web search engine. Library and Information Science Research, 27

Rosenfeld, L. and Morville, P. (2002) Information Architecture for the World Wide Web, 2nd edition, Sebastopol, CA: O'Reilly.

Rosenfeld, L. (2005) Enterprise Information Architecture Seminar Presentation. Fall 2005 [HREF9]

Smith, J et al (2003) Enhancing end-user searching on HealthInsite. 10th Asia Pacific Special, Health and Law Librarians' Conference, Adelaide, Australia

Wang, P, Berry, M and Yang, Y. (2003) Mining longitudinal web queries: trends and patterns. Journal of the American Society for Information Science and Technology, 54(8)

Hypertext references

HREF1
http://www.its.monash.edu.au/
HREF2
http://www.monash.edu.au/
HREF3
http://www.avcc.edu.au/content.asp?page=/universities/memberUnis.htm
HREF4
http://www.monash.edu.au/staff/web/
HREF5
http://ausweb.scu.edu.au/aw05/papers/refereed/alexander/paper.html
HREF6
http://conferences.alia.org.au/online2005/papers/b3.pdf
HREF7
http://www.useit.com/alertbox/20010513.html
HREF8
http://www.useit.com/alertbox/20050509.html
HREF9
http://louisrosenfeld.com/presentations/1105-RosenfeldEIA.ppt

Appendices

Appendix 1: Search engine software in use at Australian universities
University | Search engine Feb 2006 | Search engine June 2004
University of Adelaide | Oracle ultra and Google (Google on library site) | htdig
Australian Catholic University | local (library uses Google) |
Australian National University | PanOptic | PanOptic
University of Ballarat | Oracle ultra |
Bond University | Freefind.com | Freefind.com
University of Canberra | PanOptic | PanOptic
Central Queensland University | Google | Google
Charles Darwin University | Google (site search) | Google
Charles Sturt University | Google | Google
Curtin University of Technology | Google (library uses htdig) | Google
Deakin University | Google | Google
Edith Cowan University | Google | Google
Flinders University | Google and htdig | Google and htdig
Griffith University | PanOptic and Google | Something local and Google
James Cook University | Google and htdig | Google and htdig
La Trobe University | Google | Google
Macquarie University | Google | htdig
University of Melbourne | Ultraseek 3.1 | Ultraseek 3.1
Monash University | Verity k2 | Ultraseek 3.0
Murdoch University | mnoGoSearch and Google | mnoGoSearch 3.2.15 and Google
University of New England | PanOptic | PanOptic
University of New South Wales | Google and Verity | Google
University of Newcastle | Verity k2 | Ultraseek 4.3.1
University of Queensland | Google (library uses htdig) |
Queensland University of Technology | PanOptic | AltaVista
RMIT University | Teratext | Teratext
Southern Cross University | htdig | htdig
University of South Australia | Local and Google | Local and Google
University of Southern Queensland | Ultraseek 4.5.0 | Ultraseek 4.0
University of the Sunshine Coast | undetermined |
Swinburne University of Technology | Google | Google
University of Sydney | PanOptic | PanOptic
University of Tasmania | Ultraseek 4.3.3 | Ultraseek 4.3.3
University of Technology Sydney | Ultraseek 4.1.1 (htdig on library site) | Ultraseek
Victoria University | Google |
University of Western Australia | Google (library uses its own local search engine) |
University of Western Sydney |  | Google
University of Wollongong | PanOptic | PanOptic

 

Appendix 2: University web search engine features
University | Search engine examined | Default boolean AND | Spell check | Best bets | Word stemming | Highlight query terms | Advanced search
The University of Adelaide | Oracle ultra | NO | NO | NO | YES (advanced) | YES | YES
Australian Catholic University | local | YES | NO | NO | NO | NO | YES
The Australian National University | PanOptic | YES | YES | YES | YES (advanced) | YES | YES
University of Ballarat | Oracle ultra | NO | NO | NO | YES (advanced) | YES | YES
Bond University | freefind.com | YES | NO | NO | YES (explicit with *) | YES | NO
University of Canberra | PanOptic | YES | NO | YES | NO | YES | YES
Curtin University of Technology | htdig (library) | YES | NO | NO | NO | YES | NO
Flinders University | htdig | YES | NO | NO | YES for plurals | YES | NO
Griffith University | PanOptic | YES | YES | YES | NO | YES | YES
James Cook University | htdig | YES | NO | NO | YES for plurals | YES | NO
The University of Melbourne | Ultraseek 3.1 | NO | NO | NO | NO | YES | YES
Monash University | Verity k2 | YES | NO | NO | NO | NO | YES
Murdoch University | mnoGoSearch | YES | NO | YES (default) | YES | YES | YES
The University of New England | PanOptic | YES | NO | NO | NO | YES | YES
The University of New South Wales | Verity | YES? | NO | NO | YES | YES | YES
The University of Newcastle | Verity k2 | YES? | YES | NO | YES for plurals | YES | YES
The University of Queensland | htdig (library) | YES | NO | NO | YES | YES | NO
Queensland University of Technology | PanOptic | YES | YES | YES | NO | YES | YES
RMIT University | Teratext | YES | NO | NO | NO | YES | YES
Southern Cross University | htdig | YES | NO | NO | NO | YES | NO
University of South Australia | Local | YES | NO | NO | NO | NO | NO
University of Southern Queensland | Ultraseek 4.5.0 | NO | NO | NO | NO | YES | YES
University of the Sunshine Coast | undetermined | YES | YES | NO | NO | YES | YES
The University of Sydney | PanOptic | YES | NO | NO | NO | YES | YES
University of Tasmania | Ultraseek 4.3.3 | YES | NO | NO | NO | YES | YES
University of Technology Sydney | Ultraseek 4.1.1 | NO | NO | NO | NO | YES | YES
The University of Western Australia | Local (library) | NO | NO | NO | NO | NO | NO
University of Western Sydney |  | YES | NO | NO | YES (advanced option) | YES | YES
University of Wollongong | PanOptic | YES | YES | YES | NO | YES | YES

 

Appendix 3: University sub-site search options
University | Library | Science | Arts | Campus | Notes
University of Adelaide | Scoped | University | University | University |
Australian Catholic University | University | University | University | University |
Australian National University | Scoped | University | University | Scoped |
University of Ballarat | University | University | University | University |
Bond University | Scoped | None | None | None |
University of Canberra | Scoped | None | University | University | Library default is scoped, with radio button options on the page for the whole site. No search option on the top level health sciences site; uni-wide search option on lower level pages.
Central Queensland University | Scoped | University | University | University |
Charles Darwin University | # | # | # | # | Default search option is 'courses' (this is the header default across the entire site). The web site search is towards the bottom of the options list.
Charles Sturt University | Scoped | University | None | University |
Curtin University of Technology | Scoped | University | University | Scoped and University | Humanities site search appears uni-wide. However, the search is broken and no search terms return results; a blank search page is returned. Campus web site offers two search options in the header.
Deakin University | University | University | University | University |
Edith Cowan University | None | Scoped | University | University | Radio button default restricts to site. Campus web sites were not apparent. Campus information page has uni-wide search.
Flinders University | Scoped | University | University | University | Library search offers local htdig. Top level site search offers Google and htdig. Faculty of Science and Engineering has no search option on home page; Faculty of Health Sciences has default university search. Campus sites not available; campus information page has uni-wide search.
Griffith University | University | University | University | University |
James Cook University | University | University | University | University | Default is Google, but can select 'local search engine'. No campus web sites. Campus maps and campus locations pages have uni-wide search.
La Trobe University | Scoped | University | University | University |
Macquarie University | University | University | University | None |
University of Melbourne | Scoped | University | University | University | Does not have separate campus sites; campus information page has uni-wide search.
Monash University | Scoped | Scoped | Scoped | Scoped |
Murdoch University | Scoped | University | University | University | Default on uni homepage is 'search a-z index' (a form of best bets). Arts and Science search defaults to a-z index search; user must then select the 'search' button from the index page to get a web search option.
University of New England | University | Scoped | None | University | Initial default on library search box (radio button) is staff directory search. On the science faculty search page, users must explicitly select the type of search; faculty-wide is first choice. It is also the default option if a user elects to search the site via Google.
University of New South Wales | Scoped | University | Scoped | Scoped and University | Top level site search defaults to Google. Library search is not Google. Only ADFA campus appears to have its own site; it is a scoped search. Campus maps page has university-wide search.
University of Newcastle | University | University | University | University |
University of Queensland | Scoped | Scoped | Scoped | Scoped | Campus searches are part of the 'About' site, which has a site-specific search.
Queensland University of Technology | Scoped | Scoped | Scoped | University |
RMIT University | University | University | University | University |
Southern Cross University | University | University | Scoped | University | Library site says it is scoped, but this does not seem to be the case. Arts faculty has scoped search in left hand navigation box and uni-wide one in header.
University of South Australia | Scoped | University | University | University |
University of Southern Queensland | University | University | University | University |
University of the Sunshine Coast | University | University | University | University |
Swinburne University of Technology | University | University | University | University |
University of Sydney | Scoped | Scoped | Scoped | University |
University of Tasmania | University | Scoped | Scoped | University | Arts and Science faculty sites offer two search boxes: the default university-wide one in the top page header and a scoped one beneath the faculty header. This appears to be default behaviour for some utas sites.
University of Technology Sydney | Scoped | Scoped | Scoped | None | No 'search' option on campus pages; must go to 'find' link in footer, and this includes a link to the university search page.
Victoria University | University | University | University | University |
University of Western Australia | Scoped | University | University |  |
University of Western Sydney | Scoped | University | University | University | Library search uses Google; uncertain about top level site engine.
University of Wollongong | University | University | University | University |

 

Appendix 4: Top 50 most popular search queries for selected Monash web sites
Frequency | Library search term | Frequency | Arts Faculty search term | Frequency | Science Faculty search term | Frequency | Campus search term
2994 | muso | 1840 | EMPTY QUERY | 327 | EMPTY QUERY | 145 | courses
2243 | q manual | 711 | cover sheet | 196 | honours | 105 | map
1741 | referencing | 525 | handbook | 82 | handbook | 104 | campus centre
1562 | Endnote | 463 | timetable | 81 | psychology | 85 | sport
1488 | proquest | 409 | journalism | 78 | chemistry | 79 | short courses
1429 | EMPTY QUERY | 408 | psychology | 60 | green chemistry | 75 | bookshop
1105 | Webct | 356 | japanese | 59 | summer semester | 59 | psychology
973 | cover sheet | 348 | summer semester | 57 | biotechnology | 58 | book shop
850 | assignment cover sheet | 343 | summer | 52 | units | 56 | open day
677 | past exams | 325 | chinese | 50 | physics | 56 | shuttle bus
667 | Qmanual | 319 | music | 44 | summer | 54 | calendar
615 | voyager catalogue | 288 | behavioural studies | 43 | biology | 51 | EMPTY QUERY
542 | wireless | 283 | subjects | 43 | cover sheet | 50 | gym
524 | Factiva | 277 | units | 43 | synchrotron | 50 | security
514 | document delivery | 257 | sociology | 40 | microbiology | 48 | sports
450 | harvard referencing | 247 | honours | 39 | environmental science | 47 | food
435 | Docdel | 245 | history | 37 | biochemistry | 46 | parking permit
422 | Fines | 227 | international studies | 36 | chemstock | 41 | parking
412 | google | 206 | webct | 36 | muso | 40 | education
410 | law library | 201 | english | 34 | sci1020 | 40 | monash college
405 | coversheet | 192 | philosophy | 32 | imperviousness Yarra | 38 | bank
402 | Ovid | 191 | anthropology | 32 | jobs | 38 | music
383 | Allocate | 180 | referencing | 32 | zoology | 37 | orientation
376 | Renew | 174 | korean | 30 | sci2010 | 36 | commerce
346 | Ibis | 173 | crimonology | 29 | geology | 36 | pictures
333 | assignment coversheet | 172 | communications | 29 | store | 34 | scholarships
331 | journals | 163 | courses | 28 | chemstore | 34 | library
315 | my monash | 160 | politics | 28 | imperviousness | 33 | course
308 | web ct | 154 | fees | 7 | dean | 33 | engineering
299 | Harvard | 151 | coversheet | 27 | fees | 32 | cafe
289 | caval card | 151 | french | 27 | physiology | 32 | clubs
283 | Kinetica | 147 | clayton map | 26 | mathematics | 32 | hair
283 | reference | 147 | ges1000 | 25 | chem store | 32 | short course
278 | lectures online | 147 | social work | 25 | honours application | 31 | jobs
245 | thesis | 139 | library | 25 | scholarship | 31 | law
241 | citation | 135 | spanish | 24 | astronomy | 30 | bus
241 | key law resources | 132 | undergraduate handbook | 24 | near pass | 30 | parking permits
241 | newspapers | 130 | tourism | 24 | nursing | 29 | hairdresser
240 | opening hours | 129 | summer course | 24 | risk assessment | 28 | graduation
239 | marketing myopia | 127 | cover sheets | 24 | science | 28 | robert blackwood hall
235 | exams | 124 | asian studies | 24 | statistics | 28 | arts
232 | bibliography | 124 | bachelor of letters | 23 | bachelor of science | 28 | menzies building
232 | medline | 124 | visual culture | 23 | ian cartwright | 27 | student services
224 | vancouver | 122 | bachelor of arts | 23 | map | 25 | monash international
220 | harvard business review | 122 | communication | 23 | subjects | 24 | courses
219 | Coolcat | 121 | mutts | 23 | undergraduate handbook | 24 | photos
217 | citing and referencing | 117 | allocate + | 22 | courses | 24 | post office
217 | theses | 117 | linguistics | 22 | genetics | 23 | accomodation
216 | bookshop | 117 | scholarships | 22 | immunology | 23 | photo
215 | monash college | 111 | summer school | 22 | pharmacology | 22 | accounting
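
A frequency table such as the one above can be produced with a simple tally of the logged queries. The following Python sketch is illustrative only. The function names and the sample queries are hypothetical, and the normalisation rules (lower-casing, collapsing repeated spaces, labelling blank queries as EMPTY QUERY) are assumptions; the paper does not state how queries were normalised before counting.

```python
from collections import Counter


def normalise(query):
    """Lower-case and collapse whitespace; blank queries become 'EMPTY QUERY'."""
    cleaned = " ".join(query.lower().split())
    return cleaned if cleaned else "EMPTY QUERY"


def top_queries(queries, n=50):
    """Return the n most frequent normalised queries as (query, count) pairs."""
    return Counter(normalise(q) for q in queries).most_common(n)


# Hypothetical raw queries; NOT taken from the Monash logs.
sample = ["Cover Sheet", "cover  sheet", "coversheet", "", "muso", "MUSO", " "]

top = top_queries(sample, n=3)
for query, count in top:
    print(count, query)
```

Under these rules spelling variants such as 'cover sheet' and 'coversheet' are still counted separately, as they are in the table; merging them would require an explicit list of equivalent terms.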

Copyright

Sue Steele, © 2006. The author assigns to Southern Cross University and other educational and non-profit institutions a non-exclusive licence to use this document for personal use and in courses of instruction provided that the article is used in full and this copyright statement is reproduced. The author also grants a non-exclusive licence to Southern Cross University to publish this document in full on the World Wide Web and on CD-ROM and in printed form with the conference papers and for the document to be published on mirrors on the World Wide Web.