Link? Rot. URI Citation Durability in 10 Years of AusWeb Proceedings.
Baden Hughes, Research Fellow, Department of Computer Science and Software Engineering, The University of Melbourne, Victoria 3010, Australia. Email: badenh@csse.unimelb.edu.au
Keywords
citation analysis, link durability, quantitative studies, web survey, AusWeb
Abstract
The AusWeb conference has played a significant role in promoting and advancing research in
web technologies in Australia over the last decade. In addition, the AusWeb forum serves as a point
of reflection for practitioners engaged in web infrastructure, content and policy development,
allowing the distillation of best practice in the management of web services, particularly
in higher education institutions in the Australasian region.
Papers contributed to AusWeb are highly connected to the web in general, to digital libraries
and to other conference sites in particular, owing to the publication of proceedings
in HTML and full support for hyperlink-based referencing. Authors have exploited this
medium with increasing effectiveness, with the vast majority of references in AusWeb
papers now being URIs rather than more traditional citation forms.
Although care has been taken by the proceedings editors to ensure that a high degree of syntactic
standards compliance is adhered to by AusWeb authors, the editors are not
responsible for paper content, or the citations made within it. As such, it is
the responsibility of the authors (and arguably perhaps, the AusWeb reviewers) to
ensure that the citations made in a given paper, particularly those cited by URI,
are available for consideration by interested readers.
The objective of this paper is to examine the reliability of URI citations in 10
years' worth of AusWeb proceedings, particularly to determine the durability of such
references, and to classify the causes of their unavailability. To our knowledge, this
paper is the only work focusing specifically on URI citation durability in
conferences with web-based proceedings.
We consider the availability and persistence of URIs cited in the proceedings of AusWeb 1995-2005.
5975 unique URIs were extracted from 611 papers (refereed papers and edited posters) published in the 10 years' worth of AusWeb proceedings
available on the web via http://www.ausweb.scu.edu.au/.
The availability of URIs was checked once per week for 6 months from
September 2005 to March 2006. We found
that approximately 42% of those URIs failed to resolve initially, and an almost
identical proportion (42%) failed to resolve
at the last check. Almost all (99%) of the unresolved URIs were due to 404 (page not found)
errors. We explore factors that may cause a URI to fail: its age, path depth, top-level domain and
file extension. Based on the data collected, we conclude that the half-life of a URI referenced
in AusWeb proceedings papers is approximately 6 years, and we compare this to previous
studies of similar phenomena. We also find that URIs are more likely
to be unavailable if they point to resources in the net or edu domains or country-specific top-level
domains, use non-standard ports, or reference a resource with an uncommon or deprecated
file type extension.
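The URI properties examined above (path depth, top level domain, port and file extension) can be derived from the URI string alone. The following is a minimal sketch of such a feature extraction, not the tooling used in the study; the function name `uri_features`, the returned field names and the table of default ports are assumptions made for illustration.

```python
from urllib.parse import urlsplit
import posixpath

# Assumed default ports per scheme; anything else counts as non-standard here.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def uri_features(uri):
    """Return illustrative structural features of a URI (hypothetical helper)."""
    parts = urlsplit(uri)
    host = parts.hostname or ""

    # Top level domain: the last dot-separated label of the host name.
    # Note: for a numeric IP host this is just the last octet.
    tld = host.rsplit(".", 1)[-1] if "." in host else ""

    # Path depth: number of non-empty path segments.
    segments = [s for s in parts.path.split("/") if s]

    # File extension of the final path segment, without the dot.
    extension = posixpath.splitext(parts.path)[1].lstrip(".").lower()

    # A port is non-standard if given explicitly and not the scheme default.
    port = parts.port
    nonstandard_port = port is not None and port != DEFAULT_PORTS.get(parts.scheme)

    return {
        "tld": tld,
        "path_depth": len(segments),
        "extension": extension,
        "nonstandard_port": nonstandard_port,
    }
```

For example, `uri_features("http://www.example.edu.au:8080/a/b/page.html")` yields a top level domain of `au`, a path depth of 3, the extension `html`, and flags the port as non-standard. Grouping checked URIs by these features is one way to compare failure rates across categories.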