Ranjit Sidhu, SiD, Statistics into Decisions [HREF1] Sydney, New South Wales, 2039. Email: info@SiDecisions.com
In this paper I would like to focus on the lessons learnt while working with 40 universities in the UK and, more recently, with the education community in Australia. In particular, I would like to focus on the important factors that must be considered by the organisation and its staff before an analytics system is implemented.
Because education institutions deliver multiple objectives to many different customer groups, universities face unique challenges when trying to perform detailed analysis of their websites.
Too often, attention is focused on functionality designed for retail and e-commerce sites, to the detriment of understanding the university's particular goals and creating a technical framework within which those goals can be achieved. Other considerations, such as creating a non-technical structure so that all institutional stakeholders can be involved in the analytics effort, are also imperative.
Within site analytics there are two main technologies: log file analysis and the page tagging method. Others do exist, such as web beacons, but most universities use one of these two. Logfile analysis uses a program to measure activity from the web server's log files; the first such system was introduced around 1994 [HREF2], and the one most commonly used by universities in the past is Webalizer. The page tagging method originated around the mid-1990s, when web counters were extended with JavaScript that requests an invisible image and passes information about the user along with it. Often a cookie is also used to track the visitor further.
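To make the mechanism concrete, a minimal page tag might look something like the sketch below; the collection hostname stats.example.edu and the parameter names are hypothetical, not those of any particular vendor.

// Minimal page-tagging sketch: build an invisible 1x1 image request that
// carries page, title and referrer information back to a collection server.
(function () {
  var params = [
    'page=' + encodeURIComponent(location.pathname),
    'title=' + encodeURIComponent(document.title),
    'ref=' + encodeURIComponent(document.referrer)
  ].join('&');
  var beacon = new Image(1, 1);   // the image request itself is what gets counted
  beacon.src = 'https://stats.example.edu/collect?' + params;
})();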
A full discussion of the pros and cons of each system can be found elsewhere [HREF2] [HREF3] [HREF4]. In my experience the university sector has traditionally run logfile systems and is now generally embracing the page tagging (JavaScript) option. The main reasons some universities use page tagging instead of logfile systems are that collecting the logfiles from the different servers is costly and can even be impossible; storing the logfiles takes up valuable server space and time; and reprocessing is costly both in human time and in the actual time taken to reprocess. Another problem is that requests from all the separate departments and faculties have to be processed by the web team, which means a severe increase in workload [HREF5]. From a technical point of view, the fact that the JavaScript request is not cached gives better insight for navigational analysis.
Conversely, the reasons for using logfile analysis instead of page tagging, again strictly from the university perspective, are firstly ones of control: a logfile analytics licence is usually a one-off payment, unlike the continual payments for a JavaScript service, and universities keep their own files and do the processing on their own servers. There are also several technical pluses, such as the ability to measure bandwidth and the fact that, on implementation, no piece of JavaScript needs to be added to every page.
The general consensus is that, with dynamic sites and the need for quicker analysis, universities are considering the JavaScript method more and more.
In this section I would like to tackle the main proposition of this paper: that before implementing an analytics solution the university should take a step back and review its online space, so that a bespoke set of guidelines can be created as a driver for the solution it requires.
In short, it is essential for universities to be able to create meaningful segmentation, so that the differing sections of the university, and the differing groups of visitors to the website, can be individually measured.
For this to happen, five main questions should be asked:
Unlike e-commerce sites or information sites, which have distinct goals for their visitors [HREF3], university sites can serve a combination of visitor groups with differing demands of the website. This can be seen in the clear difference between the 'current student' and the 'prospective student'. A current student looks at many pages and is a regular visitor, so important criteria of success could include the percentage of returning visitors, the length of time spent on the site, the degree of interaction with different parts of the site (e.g. Blackboard, the library), and ease of use and navigation. These are often the goals of 'information' sites such as bbc.co.uk [HREF6] or smh.com.au [HREF7].
In contrast, for a prospective student the website becomes a competitive space and is often seen as a driver to application, so the criteria of success could include the percentage of visitors who reach the apply stage and the percentage of visitors lost between the start of a campaign (e.g. email, mail shot) and the completed application. These are the goals typically associated with e-commerce sites, e.g. lastminute.com.au [HREF8], www.travel.com.au [HREF9].
It is therefore important for a university to understand all the customer groups it serves. Doing so helps it understand where the balance of focus of the website should lie between these differing groups. For example, a university that is more traditional and sees itself first as a research organisation will place greater emphasis on the 'information' provision of its site than a 'newer' institution. This was backed by the sector analysis in the UK [HREF10], where we found that external visitors to a 'traditional' (Russell Group) university behaved more in keeping with a news (information) site, whilst visitors to the newer universities tended to use the website more in the manner of an e-commerce site.


Once you have identified your visitor groups, it is useful to identify the internal customers (stakeholders) who would be interested in each particular grouping. In other words, have a clear understanding of who is going to use the statistics and derive value from the expenditure on analytics. This is also a good way of bringing internal stakeholders on board, and it keeps us from becoming unhealthily inward-looking about problems we may believe we face alone [HREF11].

Opening a dialogue with the stakeholders is imperative, as it will help clearly define how to break up the website into meaningful parts and which users need to be identified.
Pages, or groups of pages, must be able to be seen as a consistent group by the analytics tool. For universities this becomes even more important when thinking about distinct internal stakeholders who will be interested in only their part of the site, e.g. faculties, international, alumni, etc.
A common issue is that many analysis tools use the URL to define a page. This seems sensible where the grouping can be extracted cleanly; for example, 'International' can simply be extracted from:
http://www.international.newcastle.edu.au/ or
http://info.anu.edu.au/studyat/International_Office
However, the top page may be well labelled only for a lower page to be completely different, because it is served by a different template, server, etc.
From: http://www.griffith.edu.au/international/ = international homepage
To: https://www81.secure.griffith.edu.au/psp/EX88PD/GI/EX/c/EPPCM_CONTENT_MGMT.EPPCM_PUB_VIEWER.GBL?EPPCM_CONTENTID=2022&SHOW_SUMMARY=N = international exchange page.
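The sketch below illustrates both sides of this: taking the first path segment as the reporting group works for the simpler addresses above, but yields nothing meaningful for the secure application server (the fallback label 'ungrouped' is my own, and subdomain-based addresses would need a similar rule applied to the hostname).

// Derive a reporting group from a URL's first path segment (illustrative only).
function groupFromUrl(url) {
  var segments = new URL(url).pathname.split('/').filter(Boolean);
  return segments.length ? segments[0].toLowerCase() : 'ungrouped';
}

groupFromUrl('http://www.griffith.edu.au/international/');   // "international"
// For the secure exchange page above the first segment is "psp", which tells
// the international office nothing about what the page actually is.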
It is important to create a directory structure so that ALL pages within a grouping are understood to sit within that directory; this also allows non-technical staff to interpret the statistics. There are several methods of attaining a sensible structure [HREF5]. The main ways used by universities are: using the page title instead of the URL; customising the JavaScript so that a meaningful name can be pulled from elsewhere in the page; labelling the pages manually (which can be costly and time-consuming); and, in some systems, uploading CMS descriptions of the pages into the analytics console. The first two approaches are sketched below.
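A sketch of those first two methods, assuming the CMS writes a label into a meta tag named report-section (the tag name is an assumption, not a standard):

// Pick a meaningful page name for the analytics call, preferring a CMS-supplied
// label, then the page title, and only then the raw URL.
function pageNameForAnalytics() {
  var meta = document.querySelector('meta[name="report-section"]');
  if (meta && meta.content) return meta.content;   // e.g. "international/exchange"
  if (document.title) return document.title;       // e.g. "International exchange"
  return location.pathname;                        // last resort: the URL itself
}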
It is essential, when considering all this, to keep in mind that the reporting must ultimately be able to answer questions such as how many visitors went to a given directory and everything beneath it, with * acting as a wildcard or 'ALL'.

A clear understanding of how this is going to be achieved should exist before implementation is started. It is an important question for any supplier!
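With clean directory naming in place, such a wildcard report reduces to a simple prefix count; the sketch below uses invented page-view records purely for illustration.

// Count page views whose grouping falls under a directory, '*' meaning ALL.
function viewsUnder(pageViews, directory) {
  if (directory === '*') return pageViews.length;
  return pageViews.filter(function (pv) {
    return pv.page.indexOf(directory + '/') === 0;   // prefix match on the grouping
  }).length;
}

var views = [
  { page: 'international/homepage' },
  { page: 'international/exchange' },
  { page: 'alumni/events' }
];
viewsUnder(views, 'international');   // 2
viewsUnder(views, '*');               // 3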
Once clear directories are in place, it is then important to identify the 'actions' that need to be measured and grouped. These may cut horizontally across the site and therefore not sit in a single directory. Again, discussions with the different stakeholders should take place before implementation so that clear naming structures are agreed, allowing coherent reporting with a minimum of fuss. For example, marketing will need to measure all of its campaigns (open days, emails, physical campaigns), while particular stakeholders such as Alumni and International will have their own special demands.
Most universities also need to measure clicks on particular objects, for example clicks through to the central application body, clicks to download the open day application, clicks on telephone numbers, or clicks to a podcast.
Two methods are often used to track the click: the redirect or the query string. The query string is preferable with regard to SEO, although the redirect wins on reliability. In both cases the identification, or naming, of the click must carry enough information for it to be reported on by the different reporting groups; a sketch of the query-string approach follows.
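A minimal sketch of the query-string method, using one of the labels from the example that follows; the parameter name 'click' is an assumption rather than any vendor's convention.

// Append a reporting label to a link's destination (query-string method).
function tagLink(href, label) {
  var sep = href.indexOf('?') === -1 ? '?' : '&';
  return href + sep + 'click=' + encodeURIComponent(label);
}

tagLink('http://apply.example.edu/postgrad', 'postgrad.apply.120107');
// "http://apply.example.edu/postgrad?click=postgrad.apply.120107"
// The redirect alternative would instead point the link at a tracking URL that
// records the same label and then forwards the visitor to the real destination.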
For example, the two clicks:
grad.apply.120107 and postgrad.apply.120107
will then appear in the final reports as distinct actions that can also be grouped together, for instance as all 'apply' clicks on that date.

Simple naming structures and conventions must be put in place for everyone using the tracking system. A lack of a coherent naming convention can mean hours and hours of reprocessing. Consider, for example, the inconsistent naming below instead of the example above:
grad.apply.120107
Postgrad.application.120107
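A cheap safeguard is to validate labels against the agreed pattern before they go live; the sketch below assumes lower-case group and action names followed by a six-digit date, as in the consistent example above.

// Check a click label against an assumed "group.action.date" convention.
function isValidLabel(label) {
  return /^[a-z]+\.[a-z]+\.\d{6}$/.test(label);
}

isValidLabel('postgrad.apply.120107');        // true
isValidLabel('Postgrad.application.120107');  // false: the capital letter breaks the convention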
It has been argued above that it is important to have fluid breakdowns of the different website visitor groups, something the JavaScript system allows to better effect than the logfile system [HREF4]. Universities are also almost unique in the size of their internal audience. The internal/external breakdown we performed in our sector statistics analysis [HREF5] was perhaps the greatest step forward in helping universities understand their websites, because universities should not simply remove internal IP traffic but use it. Many analytics tools can exclude IP ranges, but for a university this would be tantamount to not caring about 45% of its traffic!
There are three main ways of achieving this through analytics: Behaviour, technical (IP) and labelling.
1 Behaviour:
Here particular groups are recognised by the behaviour the visitor exhibits; this can be called 'funnel reporting' or 'related viewings'. For this to happen it is important to have the clear directory naming convention mentioned above. For example, a visit that moves from the current students' section on to the postgraduate course pages can be taken to indicate a current student interested in postgraduate study (note that with a cookie it does not matter whether the visitor logs off at any stage).
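As an illustration (the funnel steps here are hypothetical directory prefixes of my own choosing), matching a visit against such a funnel amounts to checking that its pages pass through an ordered list of prefixes.

// Does one visit's ordered list of pages pass through every funnel step in order?
function matchesFunnel(visitPages, funnelSteps) {
  var step = 0;
  for (var i = 0; i < visitPages.length && step < funnelSteps.length; i++) {
    if (visitPages[i].indexOf(funnelSteps[step]) === 0) step++;
  }
  return step === funnelSteps.length;
}

matchesFunnel(
  ['currentstudents/home', 'library/catalogue', 'postgraduate/courses'],
  ['currentstudents/', 'postgraduate/']
);   // true: the visit passed through both steps in order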

The benefit of this approach is that it requires no extra implementation. The negative is that it does not report beyond the directory structure of the website and can therefore be considered crude and limited in functionality and applicability.
2 Technical (IP) breakdown
Here, either by a server-side action or by product-side functionality (preferred, because of the minimal implementation required), you create reports that separate your university's IP range from the rest of the world, giving an internal/external split. This is the method I helped create for the universities in the UK; unlike simply excluding internal IP addresses, it gave insight into how the different visitor groups were using the university sites [HREF10].
The benefit of this system is fully flexible reporting; the negative is that you are hamstrung by how the IP range is broken down within the university. It also misses current students and staff who log in from home, and indeed it may be impossible to distinguish between staff, students, postgraduates and undergraduates. The basic split is sketched below.
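At its simplest, the split is a test of whether the visitor's address falls inside the ranges the university owns; the campus block below is a documentation range standing in for a real allocation.

// Classify a visitor as internal or external from an IPv4 address.
function ipToNumber(ip) {
  return ip.split('.').reduce(function (n, octet) { return n * 256 + Number(octet); }, 0);
}

function segmentVisitor(ip, campusRanges) {
  var n = ipToNumber(ip);
  var internal = campusRanges.some(function (r) {
    return n >= ipToNumber(r.from) && n <= ipToNumber(r.to);
  });
  return internal ? 'internal' : 'external';
}

var campus = [{ from: '192.0.2.0', to: '192.0.2.255' }];   // illustrative range only
segmentVisitor('192.0.2.44', campus);    // "internal"
segmentVisitor('203.0.113.9', campus);   // "external"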
3 Labelling Method
Here a cookie is used in conjunction with JavaScript labelling, so that once visitors have performed a particular action they are identified as belonging to a group [HREF4]. The benefit is that this is the most accurate method, as it is not simply geographically based (students on holiday or working from home can still be measured) and visitors can be broken down into sub-groups. The negative is that it does require further implementation and planning, along the lines sketched below.
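A sketch of the idea: when a visitor completes an identifying action (say, landing on the student portal's post-login page, a hypothetical trigger here), a long-lived cookie records the group and the page tag sends it with every later visit.

// Label the visitor once they have performed an identifying action.
function labelVisitor(group) {
  var oneYear = 365 * 24 * 60 * 60;   // cookie lifetime in seconds
  document.cookie = 'visitor_group=' + encodeURIComponent(group) +
    '; max-age=' + oneYear + '; path=/';
}

// Read the label back so it can be reported alongside each page view.
function visitorGroup() {
  var match = document.cookie.match(/(?:^|;\s*)visitor_group=([^;]+)/);
  return match ? decodeURIComponent(match[1]) : 'unidentified';
}

labelVisitor('current-postgraduate');   // called once, on the identifying page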
With the number of reports available within a web analytics reporting system (over 150 from most providers), it can often be hard to keep sight of which reports would be useful to the university, rather than which reports an analytics provider can supply.
Again, the importance of dialogue with internal stakeholders cannot be overstated. An external consultant may also be helpful, as the possibilities of tracking often move on from month to month.
University reports often show visitors, visits and bounce rates at the top level for the whole site, with no relevance to stakeholders' requirements, and are therefore completely unactionable. A common example is reporting the number of international visitors to the site; although useful, this provides far less insight for the international department than a report listing, by popularity, the education organisations in each country that have visited the application pages or the pages under *international*. Rather than giving bulk visitor numbers, why not use segmentation analysis to see how groups of users interact with the different parts of the site?
In this way, web analytics becomes a pro-active tool to change rather than simply a retrospective method to report.
Web analytics is perhaps the most feature-driven service around. It is often very hard to step back from delving into product features and to focus instead on the online requirements and needs of your organisation.
Universities, because of their diverse nature, require information to be segmented into groups. This is true for the visitors, website and the information provided by the website. An overall strategy which outlines clear conventions for implementation and naming as well as segmentation of visitors should be in place before a full implementation is carried out. In this way the University will be in a better position to provide the answer required rather than all the answers technology can provide!
Clifton, Brian (2007). Web Traffic Data Sources & Vendor Comparison. Omega Digital Media. Available online [HREF2].
Emetrics Summit conferences. Available online [HREF12].
Peterson, Eric T. (2005). Web site measurement hacks. O'Reilly.
Wenzel, S. (2007). Web analytics book. Available online [HREF13].