Sue Steele, Manager, Web Resources and Development, Information Technology Services [HREF1] , Building 203, Monash University [HREF2], Victoria, 3800. Sue.Steele@its.monash.edu.au
The first part of this paper examines public web site searching at thirty-eight Australian universities with particular emphasis on the use of site-specific scoped search options. Google is the most common search engine, in use at eighteen universities. Twelve universities offer more than one search engine on all or part of their web site. Twenty-three universities are using scoped web searches on their web sites.
The second part examines scoped searching at Monash University in more detail by analysing the university's 2005 search engine logs. Monash University uses scoped searches on almost all of its sub-sites. Scoping parameters are inconsistently applied and there is no clear definition of a web-site for searching purposes. Initial analysis indicates a relatively high proportion of searchers choosing the broader search option. Some scoped search options result in very high zero-hit search rates. Further user-centred research is required.
This study was conceived after the author examined the Monash web search engine logs for March 2005 to determine the most popular searches. It appeared, from a cursory examination of the query logs while extracting the popular queries, that a large number of scoped queries were more suited to a university-wide search or to another part of the web site.
A long-standing practice at Monash is to have scoped searches on faculty and divisional web sites and this has been facilitated in the current web site design. In general the search defaults to the local faculty or divisional web site and users can change this to the university-wide option if they select from a drop-down list.
This practice developed organically as a response to perceived issues with the enterprise-level search engine results. It seemed like a good way to produce more meaningful results; it was possible, and so it was done. The effectiveness of a scoped-search approach had not been questioned or tested at any stage.
Search engine research and query log analysis tend to be performed on large search engines or on an institution or enterprise-wide basis (Chau, Fang and Sheng 2005, Jansen and Spink 2005, Park, Lee and Bae 2005, Wang, Berry and Yang 2003). If it is performed on smaller scales it is less frequently published.
The first part of this study examines Australian university web search engines to see what search engines are in use, what search options and enhancements are offered, and to estimate the extent of scoped searching defaults on their faculty or divisional web sites.
The second part examines search logs from a number of Monash web sites to estimate the nature and extent of scoped searching, and the results obtained.
An analysis of Australian university public web search options was performed.
All AVCC member universities [HREF3] were examined between February 11 and February 14 2006.
The search engine, or engines, used by each university was noted for comparison with a similar analysis performed in June 2004. The search engine used was determined by a combination of logos and other obvious references, performing searches, examining the results and their HTML source, and reading search help.
The 2004 examination noted five universities offering two public search engines. This figure may have been higher as the examination was not as comprehensive as the more recent one. In 2006, the number of universities offering two search engines had risen to twelve. Five universities have changed to a different search engine since 2004.
| Search engine | 2006 users | 2004 users |
|---|---|---|
| Altavista | 0 | 1 |
| Freefind | 1 | 1 |
| Google | 18 | 15 |
| htdig | 5 | 5 |
| mngo | 1 | 1 |
| Oracle ultra | 2 | 0 |
| PanOptic | 7 | 4 |
| Teratext | 1 | 1 |
| Ultraseek | 4 | 9 |
| Verity | 2 | 0 |
Use of Google and PanOptic has increased. Use of Ultraseek and Altavista has decreased since 2004. Google remains the most popular search offering, followed by PanOptic. Appendix 1 lists the search engine/s in use at each university in June 2004 and February 2006.
Google University Search is a popular and cost-effective public web site search option for many universities in Australia and overseas. It is interesting that some Australian universities have chosen other free search options, such as htdig, in preference to Google. Google University Search is limited to publicly accessible content and restricted to a single domain. Therefore universities must use a different option for their intranet content (not part of this study). A multi-domain institution such as Monash University cannot utilise Google University Search effectively. Locally hosted search engines allow universities more flexibility in what is included in their public search option as well as providing for intranet searches.
31% of Australian universities had two search engines on all or part of their web site. Ten of these universities offered Google and another search engine. The author was disturbed by this finding and its possible implications. Why offer more than one search option? Is one better than another? What do users think when presented with more than one option? How many users select the second or non-default search engine, and why? Why do some university libraries maintain separate search engines? An interesting followup to this study would be to find out why universities have chosen their search engine offering/s.
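A number of features can be implemented to improve end user searching (Rosenfeld 2005, Smith et al 2003, Boston, Rajapatarina and Missingham 2005). These include default boolean AND searching, spell checking, best bets, word stemming, highlighting of query terms in results and an advanced search option.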
Nine universities using Google as their only search engine were not examined for these features because Google provides a standard range of search and result options that do include spell checking, and default AND searching. Where universities offered Google and another search engine, the second option was examined. Where a university offered more than one option (other than Google) only the primary search engine was further examined.
Several searches were performed at each site to test the site's search engine features. Search help, and the advanced search features provided, where available, were examined to gather more information. Searches performed at each site included: 'course fee', 'course fees', 'biology', 'biol', 'biog'. Searches were also deliberately misspelled to test spell-checking, for example 'bioligy'.
| Feature | Universities offering |
|---|---|
| Default boolean AND | 23 |
| Spell check | 6 |
| Best bets | 6 |
| Word stemming | 11 |
| Highlight query terms | 25 |
| Advanced search | 21 |
A number of universities are taking advantage of the advanced options provided by the search engine software they have installed. Some others are not. Six universities running PanOptic or Verity, for example, have the option of best bets and spell check but have apparently not yet implemented them. Appendix 2 lists the search engine features in use at each university studied in February 2006.
21% of universities examined offer default boolean "OR" searches. This option presents searchers with sub-optimal results. While this could be intentional, it is more likely a result of the search engine software used. Older versions of Ultraseek and HTDig, for example, exhibit this behaviour. Again, an understanding of why universities offer particular search engines would be useful.
A simple search, for example 'enrolment', was performed across each university surveyed. The search was at first performed from the search option on the top level university home page as a benchmark, and the results noted. Within each university the search was repeated on two faculty sites (Arts and Science or equivalents), on the library site and on a campus site. The default search interfaces contained in common navigational elements such as headers, footers and site-navigation bars were examined, and search results were checked to determine if the sub-sites were returning site-specific (scoped) or university-wide results.
Of the 38 universities examined, 37 had a university-wide default search option on the top level home page. One university had a default "Courses" search option at the top level and across all other sites and pages examined; university-wide search could be selected from a drop-down list. Appendix 3 lists the search scoping options in use at each university in February 2006.
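The difference between default AND and default OR matching can be sketched on a toy corpus. The page texts, the function name and the whitespace tokenisation below are illustrative assumptions; they do not reproduce the behaviour of any particular search engine surveyed.

```python
# Toy illustration of default-AND versus default-OR query matching.
# The page texts are invented for the example.
PAGES = {
    "fees": "course fees and payment dates",
    "handbook": "course handbook and unit guide",
    "parking": "parking fees on campus",
}


def match(query, pages, mode="and"):
    """Return the ids of pages matching the query terms.

    mode="and" requires every term to appear; mode="or" accepts any term.
    """
    terms = query.lower().split()
    combine = all if mode == "and" else any
    return sorted(
        page_id
        for page_id, text in pages.items()
        if combine(term in text.split() for term in terms)
    )


and_hits = match("course fees", PAGES, mode="and")
or_hits = match("course fees", PAGES, mode="or")
print(and_hits)  # only the page containing both terms
print(or_hits)   # every page containing either term
```

With OR as the default, the two-word query also returns the handbook and parking pages, which is the kind of dilution the paragraph above describes.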
University libraries were more likely to use scoped search than other areas, and to use a 'local' search engine. Twelve universities examined used more than one search engine. In six of the twelve cases, the library was the source of the second search engine offering.
Seven of the 152 sub-sites examined lacked a search option. At one university this appeared to be part of one of at least two standard university web design options.
| | Number | Percentage |
|---|---|---|
| Total universities examined | 38 | |
| Universities with university-wide search on all sites examined, including top level homepage | 13 | 34% |
| Universities with scoped search on all sub-sites examined. | 2 | 5% |
| Universities with a mixture of site-scoped and university-wide searches on sub-sites examined. | 23 | 61% |
| Universities having one or more sub-site with no search option | 7 | 18% |
| University libraries with scoped search | 20 | 53% |
| Universities with university-wide searches except for library site | 9 | 24% |
14 of the 38 universities examined showed a consistent approach to web search: the 13 where university-wide search is the only option, and the one university that offers 'Courses' as its default search.
At all of the universities offering scoped searches, effective search engine use may require a user to be certain of exactly where they are within a university web site, and to have an understanding of the university's structure, because scoped searches will return different results for the same search term (Nielsen 2005). Further research in this area is recommended to answer the questions arising. Why are libraries more likely to have scoped searching? Why is scoped searching offered? What benefits does it offer? Why do some sites have no search option? Are users confused by different search scope options within a single university web site? Have different search options been tested for ease of use/understanding and quality of results?
Monash web sites are heavy users of scoped search options. The standard university web templates expect the search option to be customized for each web site.
Search engine and web server logs for 2005 were examined for the information provided in this section. During 2005 Monash used Ultraseek 3.1 as its search engine.
Ultraseek stores useful search log information in two files:
3/05/2005 15:20 1 convection animation
3/05/2005 15:53 13 summe
3/05/2005 18:56 55 chapter 2
3/05/2005 18:57 55 chapter 2
3/05/2005 18:57 9 m0 faults
"A cache hit may result from two or more users making the same query within a short period of time, or from a user requesting an additional page of results from the same query." (Infoseek Corporation 1999a)
Web site managers can customize template headers so that the default search is localized to their site, with an option for the user to select the whole of Monash. The site-specific limits are stored as part of the raw query.log files and were used to isolate site-specific queries for the sites examined in this study. For example, a query received via the engineering faculty web site:
2005/05/15 00:19:36 1314 '+url:http://www.eng.monash.edu.au/ || Materials Engineering
The "||" indicates this query was formed from hidden form elements that included the '+url...' and a query string entered by a user.
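As a minimal sketch, a raw query.log line of the shape shown above could be split into its scope restriction and the user's query as follows. The field names are hypothetical, and reading the third field as a result count is an assumption made for illustration; the Ultraseek documentation should be consulted for the authoritative layout.

```python
def parse_query_line(line):
    """Split one raw query.log line into its parts.

    Assumed layout (inferred from the sample line only):
    date, time, a number, then either "<scope> || <user query>"
    or just the user query when no scope was applied.
    """
    date, time, count, rest = line.split(" ", 3)
    scope_part, separator, user_query = rest.partition("||")
    if not separator:
        # No "||": the whole remainder is what the user typed.
        scope_url, user_query = None, rest
    else:
        # Drop the stray leading quote and the "+url:" operator.
        scope = scope_part.strip().lstrip("'")
        scope_url = scope[len("+url:"):] if scope.startswith("+url:") else scope
    return {
        "date": date,
        "time": time,
        "count": int(count),  # assumed to be a result count
        "scope_url": scope_url,
        "query": user_query.strip(),
    }


sample = "2005/05/15 00:19:36 1314 '+url:http://www.eng.monash.edu.au/ || Materials Engineering"
entry = parse_query_line(sample)
print(entry["scope_url"], "|", entry["query"])
```

Filtering entries on `scope_url` is then enough to isolate the queries issued from one site's default search.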
In addition, the Ultraseek access.log files were used to determine the proportion of default and non-default searches from individual sites. In this case the http_referer was used as the site-specific determinant.
Ultraseek access.log entries for each site examined were isolated based on the http_referer string. An entry was considered a default search option if it contained a query prefix (qp) string representing the escaped form of the scoped site query determinant. For example, an access log entry where the referer comes from the Faculty of Law web site contains the string "qp=%2Burl%3Ahttp%3A%2F%2Fwww.law.monash.edu.au", indicating the search was scoped to law, the default. This is not an ideal method, but no other was available. It assumes that the same proportion of default and non-default queries for each site are cached.
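The default versus non-default classification described above can be sketched as follows. The request paths and the `qt` parameter in the sample data are hypothetical; only the escaped `qp` string is taken from the text, and the entries are assumed to have been filtered by referer already.

```python
from urllib.parse import quote


def scoped_qp(site_url):
    """Build the escaped qp marker that identifies a site-scoped search."""
    return "qp=" + quote("+url:" + site_url, safe="")


def non_default_share(requests, site_url):
    """Proportion of requests that do NOT carry the site's scoped qp marker."""
    requests = list(requests)
    if not requests:
        return 0.0
    marker = scoped_qp(site_url)
    non_default = sum(1 for request in requests if marker not in request)
    return non_default / len(requests)


law = "http://www.law.monash.edu.au"
marker = scoped_qp(law)

# Hypothetical request strings from entries whose referer is the law site.
requests = [
    "/query.html?qp=%2Burl%3Ahttp%3A%2F%2Fwww.law.monash.edu.au&qt=torts",
    "/query.html?qp=%2Burl%3Ahttp%3A%2F%2Fwww.law.monash.edu.au&qt=handbook",
    "/query.html?qp=%2Burl%3Ahttp%3A%2F%2Fwww.law.monash.edu.au&qt=timetable",
    "/query.html?qp=%2Burl%3Ahttp%3A%2F%2Fwww.law.monash.edu.au&qt=moot",
    "/query.html?qt=enrolment",
]
share = non_default_share(requests, law)
print(marker)
print(f"{share:.0%} non-default")
```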
Web searches across the whole of Monash totalled 3,534,000 in 2005. 28% of searches were generated from the www.monash.edu.au domain with over half of those coming from the university home page. The remaining 72% of searches were generated from other university web sites.
The Monash Ultraseek search index contained a single 'collection' of documents with public and some intranet content in the same index. It attempted to index all Monash domains, not just monash.edu.au. Some of the other domains include monash.edu.my, monash.ac.uk, monash.org, monyx.com and many domains specific to research centres and Monash companies.
The four Monash sub-sites: Library, Arts Faculty, Science Faculty and a Campus web site studied in section 2.3 were examined in more detail. Each site's web server logs were analysed to determine the relative size of the sites and their levels of activity in 2005. The Ultraseek logs were analysed to determine the scoped searches for each site, the number of zero result scoped searches and the proportion of non-default searches (i.e. searches where the user has deliberately selected a different search option) emanating from each site.
| Site | Total pages served | Distinct files served | Total scoped searches | Zero result scoped searches | Proportion of zero result searches | Proportion of non-default search selections |
|---|---|---|---|---|---|---|
| Library | 19,206,527 | N.A. | 157,097 | 24,056 | 15% | 20% |
| Arts | 12,126,495 | 153,366 | 88,749 | 8,732 | 10% | 17% |
| Science | 968,356 | 12,199 | 14,168 | 3,175 | 22% | 28% |
| Campus | 250,845 | 489 | 10,745 | 6,439 | 60% | 10% |
Search represents a small proportion of each web site's total activity (represented by total pages served), generally less than 1%. On the campus site, the number of searches is just over 4% of the site's total activity. The campus site is very small by Monash standards and contains a limited amount of information. This may result in an increased proportion of search as users fail to find the desired information by browsing. However, as the default search is restricted to the campus site, this also results in a very large proportion of zero result searches in relation to the other sites examined, and in relation to the University of Tennessee (Wang, Berry and Yang 2003) where there was a 30% zero hit rate.
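The proportions discussed above follow directly from the table. A short sketch of the arithmetic, using the figures reported for the four sites (the variable and function names are illustrative):

```python
# (total pages served, total scoped searches, zero-result scoped searches)
SITES = {
    "Library": (19_206_527, 157_097, 24_056),
    "Arts": (12_126_495, 88_749, 8_732),
    "Science": (968_356, 14_168, 3_175),
    "Campus": (250_845, 10_745, 6_439),
}


def search_share(pages, searches):
    """Scoped searches as a fraction of total pages served."""
    return searches / pages


def zero_rate(searches, zero):
    """Zero-result searches as a fraction of scoped searches."""
    return zero / searches


for name, (pages, searches, zero) in SITES.items():
    print(
        f"{name}: search share {search_share(pages, searches):.2%}, "
        f"zero-result rate {zero_rate(searches, zero):.0%}"
    )
```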
The Monash University web presence is not quite a public site and not quite an intranet. Internal and public content are interspersed across the site, and Ultraseek is configured to index both in a single index. As a result, web site searching may be considered mostly 'known item' searching, especially for Monash students and staff. They are looking for something they have seen before, or been told of, or believe should be available on the site (Mukherjee and Mao 2004, Chau, Fang and Sheng 2005). The queries they enter often have a 'right' answer, or can be answered by a very small subset of the university's web offering (Fagin et al 2003, Chau, Fang and Sheng 2005). Users may get a false-negative for their search for one of two reasons:
The non-default search options were selected approximately 20% of the time. This is a relatively high proportion (Park, Lee, and Bae 2005, Chau, Fang and Sheng 2005) and may indicate a level of familiarity with local versus global Monash search options among regular users.
Query logs were examined to extract scoped queries for the selected web sites, so that query terms could be analysed. The most popular search terms were extracted for each site, as well as the most common zero-result search terms. The top ten search terms for each site are listed in Table 5. The top fifty search terms for each site are detailed in Appendix 4. Each extract was sorted alphabetically and manually examined in addition to extracting search term counts.
| Library | | Arts Faculty | | Science Faculty | | Campus | |
|---|---|---|---|---|---|---|---|
| Frequency | Search term | Frequency | Search term | Frequency | Search term | Frequency | Search term |
| 2994 | muso | 1840 | EMPTY QUERY | 327 | EMPTY QUERY | 145 | courses |
| 2243 | q manual | 711 | cover sheet | 196 | honours | 105 | map |
| 1741 | referencing | 525 | handbook | 82 | handbook | 104 | campus centre |
| 1562 | Endnote | 463 | timetable | 81 | psychology | 85 | sport |
| 1488 | proquest | 409 | journalism | 78 | chemistry | 79 | short courses |
| 1429 | EMPTY QUERY | 408 | psychology | 60 | green chemistry | 75 | bookshop |
| 1105 | Webct | 356 | japanese | 59 | summer semester | 59 | psychology |
| 973 | cover sheet | 348 | summer semester | 57 | biotechnology | 58 | book shop |
| 850 | assignment cover sheet | 343 | summer | 52 | units | 56 | open day |
| 677 | past exams | 325 | chinese | 50 | physics | 56 | shuttle bus |
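Term counts of this kind can be produced with a simple frequency tally. The sketch below assumes the scoped queries have already been extracted as (hit count, query) pairs; the sample entries are invented, and case is deliberately left untouched because the published lists keep variants such as 'Webct' and 'WebCT' separate.

```python
from collections import Counter


def top_terms(entries, n=10, zero_only=False):
    """Return the n most frequent search terms.

    entries: iterable of (hits, query) pairs.
    zero_only: count only searches that returned no results.
    """
    counts = Counter()
    for hits, query in entries:
        if zero_only and hits != 0:
            continue
        # Blank submissions are reported under a single label.
        counts[query.strip() or "EMPTY QUERY"] += 1
    return counts.most_common(n)


# Invented sample entries for illustration.
entries = [(12, "muso"), (0, "muso"), (5, "q manual"), (0, ""), (0, "muso")]
popular = top_terms(entries, n=1)
zero_hits = top_terms(entries, n=2, zero_only=True)
print(popular)
print(zero_hits)
```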
Examination of the site-specific search logs, and of the most popular search lists indicates that most users are aware of the place they are searching, in that the majority of searches entered are the kinds of things reasonably expected to be found in that part of the web site. This may be particularly true for the library and faculty sites as they are 'branded' sites, and have their own web domain addresses.

Figure 1: Monash Arts Faculty sub-brand header with Arts Faculty search as default
They may be more obvious because of this. The campus site is in 'masterbrand' style and, as can be seen from the example in Figure 2, the masterbrand header does not distinguish the site, except by breadcrumbs and headings. The URI of the campus site studied is a sub-directory of the main Monash web address. It may not be as readily distinguished as a separate site and separate search option.

Figure 2: Monash masterbrand subsite header for Peninsula campus site, showing search options drop-down, with campus-specific search as default.
Some users may not be aware that they are in a specific sub-site and their searches may be 'inappropriate' as a result (Nielsen 2005). Even when users are aware of their site-location, they may need to know 'more' in order to successfully execute a search. Within the science faculty, for example, several schools have their own web domains. Their content is not contained within the main science web site, nor within the domain 'sci.monash.edu'. As a result, their content is not included in a 'science faculty' search. Nor is the content of 'scientific disciplines' such as biology and psychology that are part of the medical faculty. This may explain the relatively higher proportion of zero result searches and non-default search selections for the science site. Science is not the only area at Monash with a diverse web presence. Issues such as these should lead to a questioning of, and further study of, the value of scoped searches.
Variant spellings and variant terms are quite common, as are spelling errors and mistypes. Wang, Berry and Yang (2003) found up to 26% misspellings on a university web search engine. The exact level of misspelling at Monash was not calculated, but visual inspection confirms it is of that order, especially if variant spellings are taken into account. For example students are required to use a coversheet when submitting assignments. Variations on coversheet found in the Arts faculty search logs include:
| Frequency | Search Term | Frequency | Search Term |
|---|---|---|---|
| 711 | cover sheet | 30 | english cover sheet |
| 151 | coversheet | 23 | coversheets |
| 127 | cover sheets | 21 | communications cover sheet |
| 108 | assignment cover sheet | 19 | arts coversheet |
| 76 | essay cover sheet | 17 | cover |
| 46 | arts cover sheet | 17 | essay cover sheets |
| 37 | cover page | 13 | assignment cover |
| 35 | assignment cover sheets | 13 | english coversheet |
| 34 | assessment cover sheet | 13 | history cover sheet |
| 31 | assignment cover sheet | 13 | sociology cover sheet |
Common search themes included unit codes such as 'ges1000', 'sci1020' and 'sci2010', unit code prefixes such as 'ges' and 'sci', department names, individuals' names, and Monash acronyms and abbreviations such as 'muso', 'wes' and 'webct'. These terms did not necessarily make it into the 'top' lists but there were many examples of them. Closer examination of the logs by subject matter experts from each area would result in a comprehensive list of recommended links for a best-bets service (Smith et al 2003).
A library example further illustrates the level of mistyping and misspelling. How many ways can you type 'reference'?:
| Frequency | Search Term | Frequency | Search Term |
|---|---|---|---|
| 6 | refencing | 2 | referance |
| 2 | referances | 10 | referancing |
| 12 | referecing | 9 | referenceing |
| 2 | Reference-Internet | 3 | referencin |
| 5 | REFERENCING | 2 | referencing & science |
| 2 | referencing powerpoint | 3 | referencng |
| 4 | referening | 4 | referensing |
| 6 | refernce | 33 | referncing |
| 2 | referneces | 4 | refernecing |
| 4 | refferencing | 6 | refrences |
| 5 | refrence | 26 | refrencing |
Each of the searches in Table 10 would have resulted in a zero-result search. A spell-checker, depending on its configuration, could have prompted many of these for the correct term (Boston, Rajapatarina and Missingham 2005).
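How a spell-checker might prompt for the correct term can be sketched with Python's standard `difflib` module. This is a stand-in technique for illustration, not the mechanism of any search product named in this paper, and the vocabulary list and cutoff value are assumptions.

```python
from difflib import get_close_matches

# Assumed vocabulary; in practice it might be drawn from indexed terms
# or from popular successful queries.
VOCABULARY = ["referencing", "reference", "enrolment", "psychology", "timetable"]


def suggest(query, vocabulary=VOCABULARY):
    """Return the closest known term for a query, or None if nothing is close."""
    matches = get_close_matches(query.lower(), vocabulary, n=1, cutoff=0.8)
    return matches[0] if matches else None


for typo in ["referncing", "refrencing", "referancing", "muso"]:
    print(typo, "->", suggest(typo))
```

A local acronym such as 'muso' has no near neighbour in the vocabulary, which is why spell-checking alone would not help there and a best-bets entry would be needed instead.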
The most common zero-result searches for each site are listed below:
| Library | | Arts Faculty | | Science Faculty | | Campus | |
|---|---|---|---|---|---|---|---|
| Number | Search term | Number | Search term | Number | Search term | Number | Search term |
| 2532 | muso | 1840 | EMPTY QUERY | 327 | EMPTY QUERY | 85 | sport |
| 1429 | EMPTY QUERY | 26 | WebCT | 28 | chemstore | 75 | bookshop |
| 216 | coversheet | 25 | cover sheet | 15 | e-rat | 59 | psychology |
| 150 | Qmanual | 15 | monquest | 12 | stilwell | 58 | book shop |
| 99 | psychinfo | 14 | LSS | 12 | veterinary | 56 | open day |
| 99 | Wes | 14 | referencing | 10 | BIO2051 | 54 | calendar |
| 93 | Textron.com | 13 | JRN | 10 | erat | 51 | EMPTY QUERY |
| 73 | mutts | 13 | mymonash | 10 | moresi | 50 | gym |
| 60 | Webct | 12 | monet | 10 | scishop | 50 | security |
| 55 | amh | 12 | yasumasa morimura | 9 | asthenosphere | 48 | sports |
| 48 | mymonash | 11 | enrolement | 9 | microstructures | 47 | food |
| 46 | mus1110 | 11 | hpl1503 | 9 | neuroscience | 38 | bank |
| 42 | assignment cover sheet | 11 | Morimura | 8 | entomology | 38 | music |
| 41 | punishment | 11 | PSS1711 | 8 | mscourse | 36 | commerce |
| 40 | psycINFO | 11 | psycology | 8 | staff | 36 | pictures |
| 39 | legibook | 11 | SILL | 7 | casper | 34 | scholarships |
| 36 | gym | 10 | employment | 7 | ci1010 | 32 | cafe |
| 36 | mgc1010 | 10 | mail.monash.edu.au | 7 | forsyth | 32 | clubs |
| 35 | man11 | 9 | apy1910 | 7 | heatflow | 32 | hair |
| 33 | referncing | 9 | Com1010 | 7 | overload | 31 | jobs |
Zero-result searches appear to be caused by a number of factors:
Examination of the raw query logs indicates a certain level of frustration among some users who do not find what they are expecting. Presumably users often 'know' that Monash has information relevant to their search query, but they cannot find it. Studies have shown that the majority of web search sessions are short with most users entering a single query and only looking at a single results page (Jansen and Spink 2003, Mat-Hassan and Levene 2005). This may also be the case at Monash, but a small group of users repeat the same search many times. Without interviewing the individuals who repeat searches one can only speculate as to the level of frustration of a person who repeats a search more than 10 times without realizing they are searching the 'wrong' part of the web site. For example, this extract from the library query log:
24/08/2005 14:09 0 cse2309
24/08/2005 14:09 0 m0 cse2309
24/08/2005 14:09 0 m0 cse2309
24/08/2005 14:09 0 m0 cse2309
24/08/2005 14:09 0 m0 cse2309
24/08/2005 14:09 0 m0 cse2309
24/08/2005 14:09 0 m0 cse2309
24/08/2005 14:09 0 m0 cse2309
24/08/2005 14:09 0 m0 cse2309
24/08/2005 14:09 0 m0 cse2309
24/08/2005 14:09 0 m0 cse2309
24/08/2005 14:09 0 m0 cse2309
24/08/2005 14:09 0 m0 cse2309
24/08/2005 14:09 0 m0 cse2309
Or this one from the Arts query log:
21/11/2005 0:39 0 plaguariased
21/11/2005 0:39 0 plaguariasem
21/11/2005 0:39 0 m0 plaguarilism
21/11/2005 0:40 0 m0 plaguariase
21/11/2005 0:40 0 m0 plaguariase
21/11/2005 0:40 0 m0 plaguariasm
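Runs of repeated zero-result searches like these can be found mechanically. The sketch below assumes each entry has the layout date, time, hit count, an optional 'm0' marker, then the query; the meaning of 'm0' is not relied on and the marker is simply skipped. The sample is a shortened version of the library extract.

```python
from itertools import groupby


def parse(line):
    """Return (hits, query) from one entry, skipping an optional 'm0' marker."""
    parts = line.split()
    hits = int(parts[2])
    terms = parts[3:]
    if terms and terms[0] == "m0":
        terms = terms[1:]
    return hits, " ".join(terms)


def repeated_zero_runs(lines, min_repeats=3):
    """List (query, run length) for consecutive identical zero-result searches."""
    runs = []
    for (hits, query), group in groupby(map(parse, lines)):
        length = sum(1 for _ in group)
        if hits == 0 and length >= min_repeats:
            runs.append((query, length))
    return runs


lines = ["24/08/2005 14:09 0 cse2309"] + ["24/08/2005 14:09 0 m0 cse2309"] * 4
runs = repeated_zero_runs(lines)
print(runs)
```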
Best-bets and spell-checking would improve the results of these searches in the same way they would for all search results. In addition, offering a 'would you like to search the whole of Monash' option as part of the results page for site-specific searches could assist. Where the search returned zero hits, the 'whole of Monash' could be displayed by default, with a suitable explanation.
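The suggested zero-hit fallback can be sketched as a thin wrapper around whatever search function is in use. Everything here is hypothetical: the `search(query, scope)` signature, the toy index and the example URLs are invented for illustration.

```python
WHOLE_OF_MONASH = None  # hypothetical marker meaning "no scope restriction"


def search_with_fallback(query, scope, search):
    """Run a scoped search and broaden it automatically on zero hits.

    search: any callable taking (query, scope) and returning a list of results.
    """
    results = search(query, scope)
    if results or scope is WHOLE_OF_MONASH:
        return {"scope": scope, "results": results, "broadened": False}
    # Zero hits in the sub-site: repeat the search across the whole index,
    # flagging it so the results page can explain what happened.
    return {
        "scope": WHOLE_OF_MONASH,
        "results": search(query, WHOLE_OF_MONASH),
        "broadened": True,
    }


# Toy index (URL -> page text), invented for the example.
INDEX = {
    "http://www.example.edu/campus/map.html": "campus map",
    "http://www.example.edu/sport/gym.html": "sport gym membership",
}


def toy_search(query, scope):
    return sorted(
        url
        for url, text in INDEX.items()
        if query in text.split() and (scope is None or url.startswith(scope))
    )


campus = "http://www.example.edu/campus/"
gym = search_with_fallback("gym", campus, toy_search)
site_map = search_with_fallback("map", campus, toy_search)
print(gym["broadened"], gym["results"])
print(site_map["broadened"], site_map["results"])
```

The `broadened` flag is what would drive the 'suitable explanation' on the results page.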
The poor search performance of the campus web site examined above led to an analysis of all campus web sites to see if this was a common problem.
The results of this examination are shown in Table 9. The campuses are not named. Two campuses are not represented as they do not have a scoped search option.
| Site | Total pages served | Unique pages served | Total site specific searches | Zero result site specific searches | Proportion of zero result searches | Proportion of non default searches selected |
|---|---|---|---|---|---|---|
| Campus 1 | 250,845 | 489 | 10,745 | 6,439 | 60% | 10% |
| Campus 2 | 106,075 | 369 | 2,996 | 929 | 30% | 17% |
| Campus 3 | 121,933 | 4,562 | 718 | 718 | 100% | 10% |
| Campus 4 | 1,140,302 | 5,654 | 11,860 | 2,070 | 17% | 16% |
| Campus 5 | 320,551 | 7,154 | 3,086 | 498 | 16% | 19% |
| Campus 6 | N.A. | N.A. | 6,073 | 4,823 | 79% | N.A. |
| Campus 7 | 192,795 | 968 | 480 | 147 | 30% | 48% |
| Campus 8 | N.A. | N.A. | 8,665 | 1,799 | 20% | 13% |
In addition to Campus 1, two other campus web sites showed very high zero-result searches. In both cases this was because of a syntax error in their sites' header codes. These errors had been fixed before the study was undertaken. The remaining campuses have zero-result proportions within the range of other Monash web sites and other university sites (Wang, Berry and Yang 2003). The default search option for Campus 1 was changed to 'whole of Monash' by the site owner before this paper was completed, after he was made aware of the high level of zero search results for the site. However, this also creates additional inconsistencies in that search is not operating the same way on all campus sites. The campus sites are not internally consistent in the kind and depth of information provided and are potentially confusing.
Further examination of the search results of 'small' Monash sites, especially sub-sites that are masterbrand, is required because it is unclear from this study whether there is a real issue with micro-level scoped search limiting.
During the examination of the Arts faculty it was noted that there were additional scoping restrictions within the faculty web site - sub-sites within a sub-site so to speak. The remaining faculties were examined at a basic level, and also for sub-site scope restrictions. The results are shown below:
| Faculty | Total pages served | Distinct files requested | Total scoped searches | Zero result scoped searches | Proportion zero result scoped searches | Proportion of non default options selected | Search options |
|---|---|---|---|---|---|---|---|
| Art & Design | 577,822 | 6,838 | 9,822 | 1,568 | 15% | 11% | 1 |
| Arts | 12,126,945 | 153,666 | 88,749 | 8,732 | 10% | 17% | 2 |
| Business | N.A. | N.A. | 120,732 | 24,151 | 20% | 12% | 2 |
| Education | 1,099,459 | 42,674 | 20,686 | 4,148 | 20% | 16% | 1 |
| Engineering | 2,040,613 | 62,797 | 42,398 | 10,630 | 13% | 18% | 2 |
| IT | 4,750,849 | 26,114 | 49,442 | 6,912 | 13% | 24% | 1 |
| Law | 2,253,262 | 20,430 | 24,044 | 2,523 | 10% | 14% | 1 |
| Medicine | 6,380,074 | 153,286 | 103,706 | 14,877 | 14% | 15% | 1 |
| Pharmacy | 1,140,302 | 5,654 | 11,865 | 2,073 | 17% | 16% | 1 |
| Science | 968,356 | 12,199 | 14,148 | 3,175 | 22% | 28% | 2 |
Four faculties have 'local' scoped searches that are more specific than 'whole of faculty'. Six faculties have faculty-level search across their whole site. Upon examination of the sub-faculty searches it was found that there are effectively nested sub-scope searches within some faculty sub-sites. It was also found that sub-site restriction is not uniform across a faculty. Neither the Monash web style guide [HREF4] nor the web site templates define a 'web site' for the purposes of search engine restriction. It is clear that there are a number of possible interpretations in current use. How useful and how apparent these subsite distinctions are to users is unclear at this stage. A clear definition of a 'web site' for the purposes of search scope would be a good first step.
| Faculty | Subsites with local scoped search | Total number of subsites at equivalent level | Sub-sub sites with local scoped search |
|---|---|---|---|
| Arts | 7 | 26 | 5 |
| Business and Economics | 26 | 28 | 7 |
| Engineering | 5 | 9 | 0 |
| Science | 3 | 12 | 2 |

Figure 3: Example of faculty sub-site header with three drop-down search options
Sub-site specific scoped searches for the Arts and Business faculties were examined further. The proportion of zero-result searches varied from 8% to 100%, with a median of 42% and an average of 56%, more than twice as high as for the faculties as a whole. The proportion of non-default searches selected at sub-sub-site level varied from 12% to 48%, with a median of 22% and an average of 24%, somewhat higher than for the faculties as a whole.
Different default search options on faculty web sites cause differences in search results for the same search, depending on where in the site it is performed. Log extracts noticed during the analysis show major differences in hit counts for common terms:
2005/11/16 09:21:17 385 timetable
2005/11/16 09:27:11 385 timetable
2005/11/17 12:58:47 12 timetable
2005/11/17 12:59:04 12 timetable
2005/01/19 21:03:24 63 alcohol
2005/03/16 22:36:21 63 alcohol
2005/04/11 15:43:59 0 alcohol
2005/04/11 15:44:02 0 alcohol
2005/04/11 15:45:56 65 alcohol
2005/05/15 10:45:50 64 alcohol
This is also true across the university in general. A search performed on a faculty or divisional site will not produce the same results as one performed from the university home page or any other area defaulting to 'whole of Monash' search. It appears that most users can tell they are on a faculty or other uniquely branded site, but it is not clear whether they can distinguish when they are on a designated sub-site of one of those sites. It is not possible to tell from search logs which type of result set is more useful to a user, unless one of the result sets is zero. However it may be confusing to users that the same search can produce vastly different results (Nielsen 2005). Further research is required in this area.
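Queries whose hit counts swing widely between scopes can be picked out by grouping log entries on the query text. The sketch below uses the 'timetable' and 'alcohol' entries quoted earlier and assumes the third field is the hit count; the function name is illustrative.

```python
from collections import defaultdict

LOG = """\
2005/11/16 09:21:17 385 timetable
2005/11/16 09:27:11 385 timetable
2005/11/17 12:58:47 12 timetable
2005/11/17 12:59:04 12 timetable
2005/01/19 21:03:24 63 alcohol
2005/03/16 22:36:21 63 alcohol
2005/04/11 15:43:59 0 alcohol
2005/04/11 15:44:02 0 alcohol
2005/04/11 15:45:56 65 alcohol
2005/05/15 10:45:50 64 alcohol
""".splitlines()


def hit_ranges(lines):
    """Map each query to the (lowest, highest) hit count seen for it."""
    seen = defaultdict(set)
    for line in lines:
        _date, _time, count, query = line.split(" ", 3)
        seen[query].add(int(count))
    return {query: (min(counts), max(counts)) for query, counts in seen.items()}


ranges = hit_ranges(LOG)
for query, (low, high) in ranges.items():
    print(f"{query}: {low} to {high} hits")
```

Small drifts over the year (63 to 65 for 'alcohol') are what index growth would produce; the jumps to 0 or from 385 to 12 are the ones that point to a different search scope.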
Examination of public web search at Australian universities raises a number of questions such as:
Initially the author assumed that Monash University's reliance on scoped searching would be uncommon. This was not the case. 66% of Australian universities have some scoped searching. Have the search options and results been evaluated? Is scoped search offered because the enterprise search is seen as sub-optimal? Further research in these areas is recommended.
Within Monash, additional research is required to determine suitable best-bets for commonly performed searches so that these can be offered as results for any relevant search, no matter where it is performed in the site, and to determine a suitable configuration level for the spell-check feature provided with the Verity search software currently in use. Usability testing should also be conducted to determine the optimal level, if any, for sub-site search default options, and to ascertain the usefulness of a 'broaden this search' option on result pages of scoped searches.
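One way to offer a best bet regardless of where a search is performed is to look the query up in a curated table before any scoped search runs. The following is a minimal sketch under that assumption. The table contents, the placeholder URLs and the function names are hypothetical, and the sketch does not describe any feature of the Verity software. The normalisation step is there because the query logs contain spelling variants of the same request, such as 'cover sheet' and 'coversheet'.

```python
import re

# Hypothetical best-bets table: normalised query -> curated result.
# The URLs are placeholders, not real Monash addresses.
BEST_BETS = {
    "coversheet": "https://example.org/assignment-cover-sheet",
    "qmanual": "https://example.org/q-manual",
    "webct": "https://example.org/webct",
}


def normalise(query):
    """Lower-case the query and drop everything except letters and digits,
    so that 'Cover Sheet', 'coversheet' and 'cover-sheet' compare equal."""
    return re.sub(r"[^a-z0-9]", "", query.lower())


def best_bet(query):
    """Return the curated result for a query, or None if there is none."""
    return BEST_BETS.get(normalise(query))


for q in ["cover sheet", "Coversheet", "q manual", "Qmanual", "web ct", "timetable"]:
    print(q, "->", best_bet(q))
```

This simple normalisation does not handle plurals or misspellings, so 'cover sheets' would not match. Which variants are worth covering is a question the proposed research would need to answer.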
Alexander, D. (2005) How usable are university websites? In Proceedings of the Eleventh Australian World Wide Web Conference (AusWeb), Gold Coast, Australia [HREF5]
Boston, T., Rajapatirana, B. and Missingham, R. (2005) Libraries Australia: Simplifying the search experience. Online 2005 Conference, Sydney, Australia [HREF6]
Chau, M., Fang, X. and Sheng, O. (2005) Analysis of the query logs of a web site search engine. Journal of the American Society for Information Science and Technology, 56(13)
Fagin, R. et al. (2003) Searching the workplace web. WWW 2003, May 2003, Budapest, Hungary
Infoseek Corporation (1999a) Ultraseek Server 3.1 Administrator Guide
Infoseek Corporation (1999b) Ultraseek Server 3.1 Customization Guide
Jansen, B. and Spink, A. (2005) An analysis of web searching by European AlltheWeb.com users. Information Processing and Management, vol 41
Mat-Hassan, M. and Levene, M. (2005) Associating search and navigation behavior through log analysis. Journal of the American Society for Information Science and Technology, 56(9)
Mukherjee, R. and Mao, J. (2004) Enterprise search: tough stuff. Queue, April 2004
Nielsen, J. (2001) Search: visible and simple. Alertbox [HREF7]
Nielsen, J. (2005) Mental models for search are getting firmer. Alertbox [HREF8]
Park, S., Lee, J. H. and Bae, H. J. (2005) End user searching: a web log analysis of NAVER, a Korean web search engine. Library and Information Science Research, 27
Rosenfeld, L. and Morville, P. (2002) Information Architecture for the World Wide Web, 2nd edition, Sebastopol, CA: O'Reilly
Rosenfeld, L. (2005) Enterprise Information Architecture Seminar Presentation. Fall 2005 [HREF9]
Smith, J. et al. (2003) Enhancing end-user searching on HealthInsite. 10th Asia Pacific Special, Health and Law Librarians' Conference, Adelaide, Australia
Wang, P., Berry, M. and Yang, Y. (2003) Mining longitudinal web queries: trends and patterns. Journal of the American Society for Information Science and Technology, 54(8)
| University | Search engine Feb 2006 | Search engine June 2004 |
|---|---|---|
| University of Adelaide | Oracle ultra and Google (Google on library site) | htdig |
| Australian Catholic University | local (library uses Google) | |
| Australian National University | PanOptic | PanOptic |
| University of Ballarat | Oracle ultra | |
| Bond University | Freefind.com | Freefind.com |
| University of Canberra | PanOptic | PanOptic |
| Central Queensland University | ||
| Charles Darwin University | Google (site search) | |
| Charles Sturt University | ||
| Curtin University of Technology | Google (library uses htdig) | |
| Deakin University | ||
| Edith Cowan University | ||
| Flinders University | Google and htdig | Google and htdig |
| Griffith University | PanOptic and Google | Something local and Google |
| James Cook University | Google and htdig | Google and htdig |
| La Trobe University | ||
| Macquarie University | htdig | |
| University of Melbourne | Ultraseek 3.1 | Ultraseek 3.1 |
| Monash University | Verity k2 | Ultraseek 3.0 |
| Murdoch University | mnoGoSearch and Google | mnoGoSearch 3.2.15 and Google |
| University of New England | PanOptic | PanOptic |
| University of New South Wales | Google and Verity | |
| University of Newcastle | Verity k2 | Ultraseek 4.3.1 |
| University of Queensland | Google (library uses htdig) | |
| Queensland University of Technology | PanOptic | AltaVista |
| RMIT University | Teratext | Teratext |
| Southern Cross University | htdig | htdig |
| University of South Australia | Local and Google | Local and Google |
| University of Southern Queensland | Ultraseek 4.5.0 | Ultraseek 4.0 |
| University of the Sunshine Coast | undetermined | |
| Swinburne University of Technology | ||
| University of Sydney | PanOptic | PanOptic |
| University of Tasmania | Ultraseek 4.3.3 | Ultraseek 4.3.3 |
| University of Technology Sydney | Ultraseek 4.1.1 (htdig on library site) | Ultraseek |
| Victoria University | ||
| University of Western Australia | Google (library uses its own local search engine) | |
| University of Western Sydney | ||
| University of Wollongong | PanOptic | PanOptic |
| University | Search engine examined | Default boolean AND | Spell check | Best bets | Word stemming | Highlight query terms | Advanced search |
|---|---|---|---|---|---|---|---|
| The University of Adelaide | Oracle ultra | NO | NO | NO | YES (advanced) | YES | YES |
| Australian Catholic University | local | YES | NO | NO | NO | NO | YES |
| The Australian National University | PanOptic | YES | YES | YES | YES (advanced) | YES | YES |
| University of Ballarat | Oracle ultra | NO | NO | NO | YES (advanced) | YES | YES |
| Bond University | freefind.com | YES | NO | NO | YES (explicit with *) | YES | NO |
| University of Canberra | PanOptic | YES | NO | YES | NO | YES | YES |
| Curtin University of Technology | htdig (library) | YES | NO | NO | NO | YES | NO |
| Flinders University | htdig | YES | NO | NO | YES for plurals | YES | NO |
| Griffith University | PanOptic | YES | YES | YES | NO | YES | YES |
| James Cook University | htdig | YES | NO | NO | YES for plurals | YES | NO |
| The University of Melbourne | Ultraseek 3.1 | NO | NO | NO | NO | YES | YES |
| Monash University | Verity k2 | YES | NO | NO | NO | NO | YES |
| Murdoch University | mnoGoSearch | YES | NO | YES (default) | YES | YES | YES |
| The University of New England | PanOptic | YES | NO | NO | NO | YES | YES |
| The University of New South Wales | Verity | YES? | NO | NO | YES | YES | YES |
| The University of Newcastle | Verity k2 | YES? | YES | NO | YES for plurals | YES | YES |
| The University of Queensland | htdig (library) | YES | NO | NO | YES | YES | NO |
| Queensland University of Technology | PanOptic | YES | YES | YES | NO | YES | YES |
| RMIT University | Teratext | YES | NO | NO | NO | YES | YES |
| Southern Cross University | htdig | YES | NO | NO | NO | YES | NO |
| University of South Australia | Local | YES | NO | NO | NO | NO | NO |
| University of Southern Queensland | Ultraseek 4.5.0 | NO | NO | NO | NO | YES | YES |
| University of the Sunshine Coast | undetermined | YES | YES | NO | NO | YES | YES |
| The University of Sydney | PanOptic | YES | NO | NO | NO | YES | YES |
| University of Tasmania | Ultraseek 4.3.3 | YES | NO | NO | NO | YES | YES |
| University of Technology Sydney | Ultraseek 4.1.1 | NO | NO | NO | NO | YES | YES |
| The University of Western Australia | Local (library) | NO | NO | NO | NO | NO | NO |
| University of Western Sydney | | YES | NO | NO | YES (advanced option) | YES | YES |
| University of Wollongong | PanOptic | YES | YES | YES | NO | YES | YES |
| University | Library | Science | Arts | Campus | Notes |
|---|---|---|---|---|---|
| University of Adelaide | Scoped | University | University | University | |
| Australian Catholic University | University | University | University | University | |
| Australian National University | Scoped | University | University | Scoped | |
| University of Ballarat | University | University | University | University | |
| Bond University | Scoped | None | None | None | |
| University of Canberra | Scoped | None | University | University | Library default is scoped with radio button options on page for whole site. No search option on top level health sciences site, uni-wide search option on lower level pages |
| Central Queensland University | Scoped | University | University | University | |
| Charles Darwin University | # | # | # | # | Default search option is 'courses' (this is the header default across the entire site). The web site search is towards the bottom of the options list. |
| Charles Sturt University | Scoped | University | None | University | |
| Curtin University of Technology | Scoped | University | University | Scoped and University | Humanities site search appears uni-wide. However the search is broken and no search terms return results - a blank search page is returned. Campus web site offers two search options in the header |
| Deakin University | University | University | University | University | |
| Edith Cowan University | None | Scoped | University | University | Radio button default restricts to site. Campus websites were not apparent. Campus information page has uni-wide search |
| Flinders University | Scoped | University | University | University | Library search offers local htdig. Top level site search offers Google and htdig. Faculty of Science and Engineering has no search option on home page, Faculty of Health Sciences has default university search. Campus sites not available, campus information page has uni-wide search |
| Griffith University | University | University | University | University | |
| James Cook University | University | University | University | University | Default is Google, but can select 'local search engine'. No campus web sites. Campus maps and campus locations pages have uni-wide search |
| La Trobe University | Scoped | University | University | University | |
| Macquarie University | University | University | University | none | |
| University of Melbourne | Scoped | University | University | University | Does not have separate campus sites, campus information page has uni-wide search |
| Monash University | Scoped | Scoped | Scoped | Scoped | |
| Murdoch University | Scoped | University | University | University | Default on uni homepage is 'search a-z index' (a form of best bets). Arts and Science search defaults to the a-z index search; the user must then select the 'search' button from the index page to get a web search option. |
| University of New England | University | Scoped | None | University | Initial default on library search box (radio button) is staff directory search. On science faculty search page, users must explicitly select the type of search, faculty wide is first choice. It is also the default option if a user elects to search the site via Google |
| University of New South Wales | Scoped | University | Scoped | Scoped and University | Top level site search defaults to Google. Library search is not Google. Only ADFA campus appears to have its own site, it is a Scoped search. Campus maps page has university-wide search |
| University of Newcastle | University | University | University | University | |
| University of Queensland | Scoped | Scoped | Scoped | Scoped | Campus searches are part of the 'About' site, which has a site specific search |
| Queensland University of Technology | Scoped | Scoped | Scoped | University | |
| RMIT University | University | University | University | University | |
| Southern Cross University | University | University | Scoped | University | Library site says it is scoped, but this does not seem to be the case. Arts faculty has Scoped search in left hand navigation box and uni-wide one in header |
| University of South Australia | Scoped | University | University | University | |
| University of Southern Queensland | University | University | University | University | |
| University of the Sunshine Coast | University | University | University | University | |
| Swinburne University of Technology | University | University | University | University | |
| University of Sydney | Scoped | Scoped | Scoped | University | |
| University of Tasmania | University | Scoped | Scoped | University | Arts and Science faculty sites offer two search boxes, the default university-wide one in the top page header and a Scoped one beneath the faculty header, this appears to be default behaviour for some utas sites |
| University of Technology Sydney | Scoped | Scoped | Scoped | none | No 'search' option on campus pages - must go to 'find' link in footer and this includes a link to the university search page |
| Victoria University | University | University | University | University | |
| University of Western Australia | Scoped | University | University | ||
| University of Western Sydney | Scoped | University | University | University | Library search uses Google, uncertain about top level site engine |
| University of Wollongong | University | University | University | University | |
| Library: frequency | Library: search term | Arts faculty: frequency | Arts faculty: search term | Science faculty: frequency | Science faculty: search term | Campus: frequency | Campus: search term |
|---|---|---|---|---|---|---|---|
| 2994 | muso | 1840 | EMPTY QUERY | 327 | EMPTY QUERY | 145 | courses |
| 2243 | q manual | 711 | cover sheet | 196 | honours | 105 | map |
| 1741 | referencing | 525 | handbook | 82 | handbook | 104 | campus centre |
| 1562 | Endnote | 463 | timetable | 81 | psychology | 85 | sport |
| 1488 | proquest | 409 | journalism | 78 | chemistry | 79 | short courses |
| 1429 | EMPTY QUERY | 408 | psychology | 60 | green chemistry | 75 | bookshop |
| 1105 | Webct | 356 | japanese | 59 | summer semester | 59 | psychology |
| 973 | cover sheet | 348 | summer semester | 57 | biotechnology | 58 | book shop |
| 850 | assignment cover sheet | 343 | summer | 52 | units | 56 | open day |
| 677 | past exams | 325 | chinese | 50 | physics | 56 | shuttle bus |
| 667 | Qmanual | 319 | music | 44 | summer | 54 | calendar |
| 615 | voyager catalogue | 288 | behavioural studies | 43 | biology | 51 | EMPTY QUERY |
| 542 | wireless | 283 | subjects | 43 | cover sheet | 50 | gym |
| 524 | Factiva | 277 | units | 43 | synchrotron | 50 | security |
| 514 | document delivery | 257 | sociology | 40 | microbiology | 48 | sports |
| 450 | harvard referencing | 247 | honours | 39 | environmental science | 47 | food |
| 435 | Docdel | 245 | history | 37 | biochemistry | 46 | parking permit |
| 422 | Fines | 227 | international studies | 36 | chemstock | 41 | parking |
| 412 | | 206 | webct | 36 | muso | 40 | education |
| 410 | law library | 201 | english | 34 | sci1020 | 40 | monash college |
| 405 | coversheet | 192 | philosophy | 32 | imperviousness Yarra | 38 | bank |
| 402 | Ovid | 191 | anthropology | 32 | jobs | 38 | music |
| 383 | Allocate | 180 | referencing | 32 | zoology | 37 | orientation |
| 376 | Renew | 174 | korean | 30 | sci2010 | 36 | commerce |
| 346 | Ibis | 173 | crimonology | 29 | geology | 36 | pictures |
| 333 | assignment coversheet | 172 | communications | 29 | store | 34 | scholarships |
| 331 | journals | 163 | courses | 28 | chemstore | 34 | library |
| 315 | my monash | 160 | politics | 28 | imperviousness | 33 | course |
| 308 | web ct | 154 | fees | 27 | dean | 33 | engineering |
| 299 | Harvard | 151 | coversheet | 27 | fees | 32 | cafe |
| 289 | caval card | 151 | french | 27 | physiology | 32 | clubs |
| 283 | Kinetica | 147 | clayton map | 26 | mathematics | 32 | hair |
| 283 | reference | 147 | ges1000 | 25 | chem store | 32 | short course |
| 278 | lectures online | 147 | social work | 25 | honours application | 31 | jobs |
| 245 | thesis | 139 | library | 25 | scholarship | 31 | law |
| 241 | citation | 135 | spanish | 24 | astronomy | 30 | bus |
| 241 | key law resources | 132 | undergraduate handbook | 24 | near pass | 30 | parking permits |
| 241 | newspapers | 130 | tourism | 24 | nursing | 29 | hairdresser |
| 240 | opening hours | 129 | summer course | 24 | risk assessment | 28 | graduation |
| 239 | marketing myopia | 127 | cover sheets | 24 | science | 28 | robert blackwood hall |
| 235 | exams | 124 | asian studies | 24 | statistics | 28 | arts |
| 232 | bibliography | 124 | bachelor of letters | 23 | bachelor of science | 28 | menzies building |
| 232 | medline | 124 | visual culture | 23 | ian cartwright | 27 | student services |
| 224 | vancouver | 122 | bachelor of arts | 23 | map | 25 | monash international |
| 220 | harvard business review | 122 | communication | 23 | subjects | 24 | courses |
| 219 | Coolcat | 121 | mutts | 23 | undergraduate handbook | 24 | photos |
| 217 | citing and referencing | 117 | allocate + | 22 | courses | 24 | post office |
| 217 | theses | 117 | linguistics | 22 | genetics | 23 | accomodation |
| 216 | bookshop | 117 | scholarships | 22 | immunology | 23 | photo |
| 215 | monash college | 111 | summer school | 22 | pharmacology | 22 | accounting |