Link? Rot. URI Citation Durability in 10 Years of AusWeb Proceedings.

Baden Hughes, Research Fellow, Department of Computer Science and Software Engineering, The University of Melbourne, Victoria 3010, Australia. Email: badenh@csse.unimelb.edu.au


Keywords

citation analysis, link durability, quantitative studies, web survey, AusWeb


Abstract

The AusWeb conference has played a significant role in promoting and advancing research in web technologies in Australia over the last decade. In addition, the AusWeb forum serves as a point of reflection for practitioners engaged in web infrastructure, content and policy development, allowing the distillation of best practice in the management of web services, particularly in higher educational institutions in the Australasian region. Papers contributed to AusWeb are highly connected to the web in general, and to digital libraries and other conference sites in particular, owing to the publication of proceedings in HTML and full support for hyperlink-based referencing. Authors have exploited this medium with increasing effectiveness, and the vast majority of references in AusWeb papers are now URIs rather than more traditional citation forms. Although the proceedings editors have taken care to ensure that AusWeb authors adhere to a high degree of syntactic standards compliance, the editors are not responsible for paper content or for the citations made within it. As such, it is the responsibility of the authors (and arguably the AusWeb reviewers) to ensure that the citations made in a given paper, particularly those cited by URI, are available for consideration by interested readers.

The objective of this paper is to examine the reliability of URI citations in 10 years' worth of AusWeb proceedings, in particular to determine the durability of such references and to classify the causes of their unavailability. To our knowledge, this is the only work specifically focusing on URI citation durability in conferences with web-based proceedings.

We consider the availability and persistence of URIs cited in the proceedings of AusWeb 1995-2005. A total of 5975 unique URIs were extracted from 611 papers (refereed papers and edited posters) published in the 10 years' worth of AusWeb proceedings available on the web via http://www.ausweb.scu.edu.au/. The availability of each URI was checked once per week for 6 months, from September 2005 to March 2006. We found that approximately 42% of those URIs failed to resolve initially, and an almost identical proportion (42%) failed to resolve at the last check. The vast majority (99%) of the unresolved URIs were due to 404 (page not found) errors. We explore possible factors which may cause a URI to fail, based on its age, path depth, top level domain and file extension. Based on the data collected, we conclude that the half-life of a URI referenced in AusWeb proceedings papers is approximately 6 years, and we compare this to previous studies of similar phenomena. We also find that URIs are more likely to be unavailable if they pointed to resources in the net or edu domains or country-specific top level domains, used non-standard ports, or referenced a resource with an uncommon or deprecated file type extension.
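As an illustration of the factors named above, the following is a minimal sketch of how a cited URI might be broken down into top level domain, path depth, file extension and port usage. It is not the tooling used in the study: the function name `uri_features`, the table of default ports, and the choice to count every non-empty path segment towards depth are all assumptions made for this example.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Assumed default ports per scheme; anything else is treated as non-standard.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def uri_features(uri: str) -> dict:
    """Split a cited URI into the factors discussed in the abstract.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(uri)
    host = parts.hostname or ""
    # Path depth: number of non-empty segments in the path (an assumption;
    # other definitions, e.g. counting slashes, are equally plausible).
    segments = [s for s in parts.path.split("/") if s]
    # Extension of the final path segment, lower-cased; "" if there is none.
    extension = PurePosixPath(segments[-1]).suffix.lower() if segments else ""
    port = parts.port  # None when the URI carries no explicit port
    return {
        "tld": host.rsplit(".", 1)[-1] if "." in host else "",
        "path_depth": len(segments),
        "extension": extension,
        "nonstandard_port": port is not None
        and port != DEFAULT_PORTS.get(parts.scheme),
    }
```

Calling `uri_features("http://www.ausweb.scu.edu.au/")` would report a top level domain of `au`, a path depth of 0, no file extension and no non-standard port. Features of this kind could then be tabulated against the weekly availability results.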


All materials Copyright AusWeb06. The Twelfth Australasian World Wide Web Conference, Australis Noosa Lakes, from 1st to 5th July 2006