Building an Institutional Research Repository from the Ground Up: The ARROW Architecture Experience


Dr Andrew Treloar [HREF29], Project Manager, Strategic Information Initiatives, Information Technology Services [HREF30] & ARROW [HREF32] Technical Architect & Adjunct Librarian, Monash University Library [HREF33]. Building 3A, Monash University [HREF31], Victoria, 3800. Email: Andrew.Treloar@its.monash.edu.au

Abstract

This paper describes the thinking behind the technical architecture for ARROW - Australian Research Repositories Online to the World (a DEST-funded project under the Research Information Infrastructure Framework for Australian Higher Education). The paper begins by describing the vacant lot - the context in which the project came about. It then moves on to the design brief for the architect - the list of requirements. Next comes the resulting architectural drawings - the broad model and list of functions. In order to turn a blueprint into reality, one needs building materials - in this case the pieces of software required. Finally the paper discusses the state of the building site, and when the 'house' might be open to its first visitors.

1. Vacant Lot

This is a story about building an institutional research repository. More specifically, it is about designing the architecture to make this building possible (later papers will talk about the 'building' itself). But before the architecture can occur, the right environment needs to exist. What was the vacant lot that made it possible for this building to even be thought about? In this case the vacant lot has two components: a general one and a more specifically Australian one.

1.1 Overall context

There is a growing interest among academic institutions in collecting, preserving, reusing and creating value-added services from digital content produced in and for research, teaching and learning. The emphasis on research outputs and collaboration, and distance, flexible and online learning, together with developments in information technology, has led to an increased awareness that the digital content being created by members of the academic community is an institutional asset. This content is also increasingly being recognised as an institutional challenge, requiring both tactical management and a strategic response.

At the same time many academic libraries are responding to the challenges of new technologies by taking the opportunity to redefine their fundamental role in the creation, distribution and provision of access to information. Over the past decade libraries have moved almost completely towards a digital platform for management of the information (both print and electronic) that they acquire or subscribe to. They have built significant digital collections of material published by others, and they are increasingly producing new content themselves [Harboe-Ree et al. (2004)]. Often this content originates from, or is the intellectual property of, their own institutions.

Meanwhile, all around the world, universities, their libraries, faculties, research centres and information technology and course development units, are trying to cope with the digital revolution. There is a growing recognition and articulation of the convergence that is occurring among the various digital initiatives in which universities are engaged, and the opportunities for potential synergies and more significant outcomes through collaboration and interoperability.

As one example, the COLIS (the Collaborative Online Learning and Information Services model [HREF3]) work at Macquarie University has focused on testing the feasibility of interoperable standards as a way of managing interactions between a range of electronic services. Through the success of the COLIS model, McLean and others have demonstrated that the new electronic environment can and must comprise a complex interactive matrix that is dependent on the information resources mentioned above, as well as on user directories, content and rights management software, and metadata repositories.

Sally A Rogers, from Ohio State University, argues that the full array of a university's digital assets and information services should be broadly defined, and should include the library's catalogue, the electronic journals, reference databases and other electronic resources available through the library, as well as institutional repositories and resources created or collated elsewhere in the university, such as course material [Rogers (2003)]. She notes the overlapping of such initiatives as digital collections, course web sites, electronic course packs and learning objects, the desirability of integration to search across these repositories and the development of standards to promote interoperability. Rogers also highlights the potential of increased interoperability and connectivity to generate innovation in research, teaching and learning.

1.2 Australian context

It was against this backdrop that the November 2002 report of the Higher Education Information Infrastructure Advisory Committee (HEIIAC) of the Australian Government Department of Education, Science and Training (DEST) [DEST (2002)] identified the following critical features of an enhanced research infrastructure:

The HEIIAC report was primarily concerned with managing the current problems associated with scholarly communication and publishing, and it stressed the need to adopt a national collaborative approach. As already discussed, a range of players embrace scholarly communication strategies and argue that they should be incorporated into a more holistic approach to the management of institutional digital content and intellectual capital.

The merging of these two approaches would yield substantial benefits to Australian university communities, consistent with the following statements of principle:

  1. Australian universities have a commitment to support and promote their institutions' research activity through the creation and preservation of digital content, especially institutional repositories and electronic publishing.
  2. Australian universities have a commitment to help their institutions achieve their goals more effectively by assisting with the integration of digital resources.
  3. Australian universities have a commitment to collaborating nationally and internationally in the achievement of a more integrated approach to the management and interoperability of digital content. [Harboe-Ree and Treloar (2004)]

These statements reflect the HEIIAC objectives and place them into a framework that, if implemented, would improve institutional and national efficiency and effectiveness. The challenge for HEIIAC was to turn these principles and objectives into action.

1.3 DEST RII Process

In June of 2003, the Australian Commonwealth Department of Education, Science and Training issued a call for proposals to "further the discovery, creation, management and dissemination of Australian research information in a digital environment" [DEST (2003a)]. This sought to "fund proposals which help promote Australian research output and help to build the Australian research information infrastructure, through the development of distributed digital repositories and common technical services that manage access and authorisation to these."

The guidelines for submissions identified the following requirements to be met by successful bids:

In response to this call, 14 projects were submitted, of which four were funded [DEST (2003b)]. The successful projects were:

These four projects were funded for a combined total of A$12 million over a period of 3 years, with funding commencing at the start of 2004 [HREF11].

The focus of this paper will be the architectural design of the ARROW Project.

2. Design Brief

The original design brief was encapsulated in the Summary section of the ARROW Bid document sent to DEST. This read:

"The ARROW project (ARROW) will identify and test a software solution or solutions to support best-practice institutional digital repositories comprising e-prints, digital theses and electronic publishing. A wide range of digital content types will be managed in these repositories. The NLA will develop a repository and associated metadata to support independent scholars (those not associated with institutions). A complementary activity of ARROW is the development and testing of national resource discovery services (developed by the NLA) using metadata harvested from the institutional repositories, and the exposing of metadata to provide services via protocols and toolkits. This will include a potential path for the redevelopment of the Australian Digital Theses (ADT) metadata repository incorporated into the NLA’s national resource discovery services.

Initially ARROW will be tested in the four partner institutions, prior to it being offered more widely across the higher-education sector. The solution will be open-standards based, or will support open standards, and will facilitate interoperability within and between participating institutions."

This is a very high-level statement. What does it mean when fleshed out a bit? The best way to get an accurate sense of this is to focus on the content streams that ARROW will have to manage and the content types it will have to deal with.

2.1 Content Streams

The functions that ARROW will perform can best be characterised in terms of different content streams.

2.1.1 E-print repositories

An e-print repository stores and makes available (in digital form) working papers, pre-prints (not yet published in the traditional literature) and post-prints. E-print repositories have been proliferating in recent years. Most have been set up by universities, but many have also been established by scholarly and professional societies and higher education research centres. Australian universities running e-prints repositories include The Australian National University, Monash University, The University of Melbourne, The University of Queensland, and Queensland University of Technology. The increased activity around e-prints has been facilitated by the development of free, open-source software [HREF12] that manages e-print repositories.

A key feature of these repositories is that content is usually available on an open-access basis (anyone can read or view it and no fees are payable). Many e-print repositories also work on a self-submission basis, with researchers depositing material into the repository themselves using an online deposit process. The rationale behind the growing e-prints movement is to reclaim institutional scholarly output and make it widely accessible internationally, thus removing barriers to learning and research, and improving its availability and citation.

2.1.2 Digital thesis repositories

A digital thesis repository stores and makes available online, in digital form, graduate research output (M.A. by research and Ph.D.). Digital theses in these repositories are offered on an open access basis. In Australia the Australian Digital Theses Program [HREF13] is a national collaborative distributed database of digitised theses produced at Australian Universities. Twenty-two higher education institutions are participating members of the Program, which uses deposit-process software [HREF14] first developed at Virginia Polytechnic Institute in the United States of America.

2.1.3 Electronic publishing

A growing number of higher education institutions are trying to establish sustainable publishing alternatives to reclaim the scholarly output currently published in heavily protected commercial journals and monographs. Institutional e-presses aim to offer electronic publishing services and functionality similar to those offered by commercial presses publishing product online, but in a way that is more aligned with institutional objectives, thereby tackling problems associated with the current scholarly publishing climate. These problems include pricing and intellectual property issues, as well as long lead times for publication and publishing models that do not allow for publication of media rich titles.

The activities of an e-press can range from digitising material originally designed for print and making it available online, through to the publication (in the sense of making public) of material that was born digital and that can only be fully represented digitally. E-presses are more akin to traditional publishing than e-print repositories in that e-press content tends to be offered on a subscription and/or pay-per-view basis.

As with e-print repositories, the Australian higher education sector is experiencing significant activity in this area. Both Monash University and The Australian National University are establishing e-presses, and Royal Melbourne Institute of Technology Publishing [HREF15] has been engaged in electronic publishing for several years now.

2.1.4 DEST Returns

Each year, Australian universities need to send to DEST information about their research output for the previous year. In most universities, this process involves manual data collection using paper forms which are then keyed into a database or spreadsheet. This is tedious and susceptible to error. In addition, the end result is a largely static document with no way to link from the publication information to the publications themselves.

ARROW wanted to see if it was possible to partially automate the gathering of publications for the annual Department of Education, Science and Training returns and storage of both the publications and required metadata in the institutional ARROW repository. This would meet the following objectives:

ARROW also wanted to see if it would be possible to enable universities to enter into an ongoing dialogue with their researchers about the issues associated with academics signing over copyright in research output.

2.1.5 Independent Scholars

Of course, not all research takes place in a university. Much also occurs in research institutes of one sort or another, in R&D centres in corporations or even in informal locations (I call this the Researcher in the Backyard Shed). Researchers at institutions without institutional repositories would find it difficult to make their research visible. As ARROW was seeking to capture and make visible as much Australian research as possible, it would be useful to find a way to deal with this potential content stream.

2.2 Content types

2.2.1 Content Type Philosophy

Another part of the design brief process was deciding on what content types (as opposed to streams) would be accepted. The project decided to adopt a variant of the model developed by MIT in its DSpace [HREF16] implementation. The DSpace philosophy can be summarised as follows:

We also decided to be informed by the National Archives of Australia guidelines on digital formats [HREF26]. Based on this, ARROW decided to accept three types of content:

On the vexed subject of lossy versus lossless formats, the decision was made that, wherever possible, ARROW would endeavour to store data objects in lossless digital formats (these are formats that do not throw away information when compressing the file). Lossy formats (which do throw away information during compression) might be stored in addition, or rendered on the fly (where possible). Storage in lossy formats would be used only as a last resort.
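The lossy/lossless distinction can be illustrated with a small Python sketch. It is not part of the ARROW design; the sample data and the `lossy_quantise` helper are hypothetical, and `zlib` simply stands in for any lossless compression scheme.

```python
import zlib

# Hypothetical sample data standing in for a stored data object.
original = bytes(range(256)) * 4

# Lossless: compressing and then decompressing recovers every byte.
lossless_roundtrip = zlib.decompress(zlib.compress(original))


def lossy_quantise(data: bytes, step: int = 16) -> bytes:
    """Toy 'lossy' step: round each byte down to a multiple of `step`."""
    return bytes((b // step) * step for b in data)


# Lossy: information is discarded before storage and cannot be restored.
lossy_roundtrip = zlib.decompress(zlib.compress(lossy_quantise(original)))

print(lossless_roundtrip == original)  # is the lossless copy identical?
print(lossy_roundtrip == original)     # is the quantised copy identical?
```

The point of the sketch is that a lossless copy can always regenerate a lossy rendering on demand, whereas the reverse is not possible.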

2.2.2 Supported Formats

For Textual content, the supported formats are:

For Still Images, the supported formats are:

For Moving Images, the supported formats are:

For Audio, the supported formats are:

For Multimedia content, the supported formats are:

2.2.3 Known Formats

For Textual content the following formats are known:

NOTE: The reason for including Microsoft Office file formats is simply a recognition of the market reality. If alternatives (such as StarOffice [HREF39] or OpenOffice [HREF40]) become more widely deployed in the target environments for ARROW, this list may well be augmented.

For Still Images the following formats are known:

For Moving Images the following formats are known:

For Audio the following formats are known:

For Multimedia content, the following formats are known:

2.2.4 Unsupported Formats

All other formats would be unsupported.

3. Architectural Drawings

Now that we had a clear design brief it was possible to move on to the next step: deciding the broad architecture. This involved a series of iterative steps, as well as a lot of research into what approaches similar projects overseas had adopted. We ended up defining three categories of required repository functionality.

3.1 Common Repository

We decided that, if possible, we wanted all the various content types to be stored in a common repository. This would:

3.2 Content Management and Workflow

In order to get the content into the common repository, we needed a way to efficiently manage different classes of content contributors and different content streams. We ended up deciding to define a series of Content Management and Workflow modules, corresponding to the content streams discussed under section 2.1. Each of these modules would have its own content submission forms and workflow. Each would also have specific functionality to deal with the requirements of that particular stream type.

3.2.1 ePrints

Objective: Module to submit and manage e-prints.
Deliverables: Software, based on the ARROW architecture, that provides no less functionality than the eprints.org software.
Issues: Management of content self-submission and administrative management.

3.2.2 eTheses

Objective: A module that will manage thesis metadata and submit digital theses.
Deliverables: Software, based on the ARROW architecture, that provides no less functionality than the current Australian Digital Theses Program software and includes OAI-PMH compliance.
Issues: Data capture from various sources; efficient harvesting from institutional repositories; identification of software; performance and scalability requirements; interactions with other metadata services.

3.2.3 ePress


Objective: To create or integrate a module to manage a fully functional electronic press.
Deliverables: Software, based on the ARROW architecture, that provides sufficient functionality to run an open-access ejournal electronic press.
Issues: Integration of existing electronic press software.

3.2.4 DEST Research Directory

Objective: Testing of the feasibility and effectiveness of using an ARROW repository to support the annual Department of Education, Science and Training returns.
Deliverables: Repository holding a proportion of the institution's Department of Education, Science and Training 2003 returns.
Issues: Management of content submission from academics; embedding use of repository in institution-collection process.

3.2.5 NLA Repository

Objective: Installation of an independent scholars' repository at the National Library of Australia.
Deliverables: Repository, compliant with the ARROW architecture, adapted for independent scholar submission.
Issues: Management of content submission from independent scholars.

3.2.6 Other applications

We also recognised that the ARROW infrastructure would be potentially applicable to a wider range of problems. For this reason we left open the possibility of adding other Content Management and Workflow modules later on.
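One way to picture the relationship between the common repository and the per-stream Content Management and Workflow modules is the following Python sketch. It is purely illustrative: the class, function and field names are hypothetical and do not describe the software ARROW will actually use.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class CommonRepository:
    """Hypothetical stand-in for the single store shared by all streams."""
    items: List[dict] = field(default_factory=list)

    def store(self, stream: str, metadata: dict) -> int:
        self.items.append({"stream": stream, **metadata})
        return len(self.items) - 1  # list index used as a toy identifier


# A workflow module is modelled as a function that checks a submission and
# returns the metadata to deposit. Real modules would add submission forms,
# review steps and stream-specific behaviour.
WorkflowModule = Callable[[dict], dict]


def eprints_workflow(submission: dict) -> dict:
    if "title" not in submission:
        raise ValueError("an e-print needs a title")
    return {"title": submission["title"], "type": "e-print"}


def etheses_workflow(submission: dict) -> dict:
    if "degree" not in submission:
        raise ValueError("a thesis needs a degree")
    return {"title": submission.get("title", ""),
            "degree": submission["degree"], "type": "thesis"}


# New modules can be registered later without touching the repository.
MODULES: Dict[str, WorkflowModule] = {
    "eprints": eprints_workflow,
    "etheses": etheses_workflow,
}


def deposit(repo: CommonRepository, stream: str, submission: dict) -> int:
    """Run the stream's workflow, then store the result in the common repository."""
    return repo.store(stream, MODULES[stream](submission))
```

Under these assumptions, supporting a further application amounts to adding one more entry to `MODULES`, which is the kind of extensibility this section has in mind.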

3.3 Search and Exposure

The ability to locate appropriate content for citation purposes is a critical success factor in creating reliable scholarly communication and increasing the impact of research. ARROW decided to develop a nationally available resource discovery service to provide access to Australian research output. The project will establish automated mechanisms for harvesting and re-purposing metadata from institutions and individual researchers. This will be done by applying international standards, specifications and technologies to ensure interoperability. Resource discovery will be supported by descriptive metadata. Other types of metadata may also be generated to support digital rights management, persistent identification, and archiving and preservation to ensure the longevity of scholarly content. In addition, it will be possible to search ARROW repositories through a range of discovery tools (such as education portals or search engines). This exposure will increase awareness of unique Australian content, both nationally and internationally. The project will also seek to expose published Australian research in commercial repositories, such as those created by large journal publishers.

3.4 OLAD

The end result of the architectural decisions in each of the categories of Common Repository, Content Management and Workflow, and Search and Exposure was a layered architecture. The notion of a layered architecture is not particularly controversial. Such architectures have been preferred since at least the days of the International Standards Organisation Open Systems Interconnect seven-layer reference model for network services. In the Digital Library field these sorts of high-level models are so common that the project group took to referring to 'obligatory' layered architecture diagrams. Figure 1 therefore is the OLAD (Obligatory Layered Architecture Diagram) for ARROW.

Figure 1: Obligatory Layered Architecture Diagram for ARROW.

4. Building Materials

Now that we had defined the architecture, we had to work out how we were going to build it. In construction terms, what building materials were available and what were we going to choose?

4.1 Foundation - the repository

We recognised very early on that the choice of repository was foundational. Particular repository technologies would in turn determine the functionality we could provide and the way we could provide it. Much of the latter half of 2003 was spent in careful analysis of available candidates, based on a mixture of:

As a result of this work, we rapidly settled on two likely candidates: DSpace and FEDORA.

4.1.1 DSpace

DSpace [HREF16] is a joint activity between MIT Libraries and Hewlett-Packard to develop a software system that enables institutions to:

It is being made available under the BSD open source license to other groups to run as-is, or to modify and extend as needed.

The current version of DSpace (1.1.1 - version 1.2 is anticipated in April 2004) can best be thought of as a general-purpose repository application, with a series of both hard-wired and preferred behaviours. It is designed to provide stable long-term storage needed to house the digital products of MIT faculty and researchers. DSpace is intended to have different advantages for different stakeholder groups:

"For the user: DSpace enables easy remote access and the ability to read and search DSpace items from one location: the World Wide Web.

For the contributor: DSpace offers the advantages of digital distribution and long-term preservation for a variety of formats including text, audio, video, images, datasets and more. Authors can store their digital works in collections that are maintained by MIT communities.

For the institution: DSpace offers the opportunity to provide access to all the research of the institution through one interface. The repository is organized to accommodate the varying policy and workflow issues inherent in a multi-disciplinary environment. Submission workflow and access policies can be customized to adhere closely to each community's needs." [HREF17]

While DSpace grew out of the needs of MIT, a group of North American and European universities are now participating in the DSpace Federation [HREF18], which will test the existing software, and offer suggestions about how to further develop and improve it.

DSpace supports a wide range of content types [HREF19], and particular installations can easily extend the range available.

4.1.2 FEDORA

FEDORA is both a software platform and an architecture (it stands for the Flexible Extensible Digital Object and Repository Architecture). The architecture came out of Digital Library work done in the computer science field in the late 1990s [Payette and Staples (2002)]. The history of the FEDORA repository software is described on its website as follows:

"In the summer of 1999 ... the [University of Virginia] Library's research and development group discovered a paper about Fedora written by Sandra Payette and Carl Lagoze of Cornell's Digital Library Research Group. Fedora was designed on the principle that interoperability and extensibility is best achieved by architecting a clean and modular separation of data, interfaces, and mechanisms (i.e., executable programs). With Cornell's help, the Virginia team installed the research software version of Fedora and began experimenting with some of Virginia's digital collections. Convinced that Fedora was exactly the framework they were seeking, the Virginia team reinterpreted the implementation and developed a prototype that used a relational database backend and a Java servlet that provided the repository access functionality. The prototype provided strong evidence that the Fedora architecture could indeed be the foundation for a practical, scalable digital library system. In September of 2001 The University of Virginia received a grant of $1,000,000 from the Andrew W. Mellon Foundation to enable the Library, in collaboration with Cornell University, to build a sophisticated digital object repository system based on the Flexible Extensible Digital Object and Repository Architecture (Fedora). The Mellon grant was based on the success of the Virginia prototype, and the vision of a new open-source version of Fedora that exploits the latest web technologies. Virginia and Cornell have joined forces to build this robust implementation of the Fedora architecture with a full array of management utilities necessary to support it." [HREF41].

Increasingly, the term FEDORA (which was first used over 5 years ago as an acronym for the architecture) is now being used to refer to this software implementation. In this latter sense, FEDORA is "an open source, digital object repository system using public APIs exposed as web services." [Staples, Wayland and Payette (2003)]. FEDORA can best be thought of as services-mediation infrastructure, rather than an off-the-shelf application. It can use web services to call other services as well as expose its own services using web services standards. Key to the FEDORA architecture (yes, I know this is like referring to an ATM Machine...) is its underlying object-based model. FEDORA stores digital content objects, either as datastreams contained within the repository or as links to external resources. It also stores disseminators, which are ways to render these digital content objects. The software maintains bindings between content objects and their disseminators. Each object has a default disseminator, but may be able to be disseminated in other ways. This architecture is extremely flexible, and provides significant advantages as a platform on which to build other applications.
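The object model just described can be sketched in a few lines of Python. This is an illustration of the concepts only (datastreams, disseminators, bindings and a default disseminator); the class and function names are hypothetical and are not the FEDORA API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class Datastream:
    """Content held inside the repository, or a link to an external resource."""
    content: Optional[bytes] = None
    external_url: Optional[str] = None


# A disseminator is modelled as a function that renders an object's datastreams.
Disseminator = Callable[[Dict[str, Datastream]], str]


@dataclass
class DigitalObject:
    pid: str
    datastreams: Dict[str, Datastream] = field(default_factory=dict)
    # The binding between the object and its disseminators, keyed by name.
    disseminators: Dict[str, Disseminator] = field(default_factory=dict)
    default: str = "default"

    def disseminate(self, name: Optional[str] = None) -> str:
        """Render with the named disseminator, or the default one if none is given."""
        return self.disseminators[name or self.default](self.datastreams)


def describe(streams: Dict[str, Datastream]) -> str:
    parts = []
    for name, ds in sorted(streams.items()):
        where = ds.external_url if ds.external_url else f"{len(ds.content or b'')} bytes"
        parts.append(f"{name}: {where}")
    return "; ".join(parts)


def as_text(streams: Dict[str, Datastream]) -> str:
    return (streams["body"].content or b"").decode("utf-8")


obj = DigitalObject(
    pid="demo:1",
    datastreams={
        "body": Datastream(content=b"Hello"),
        "image": Datastream(external_url="http://example.org/figure.png"),
    },
    disseminators={"default": describe, "text": as_text},
)
print(obj.disseminate())        # default rendering
print(obj.disseminate("text"))  # alternative rendering of the same object
```

The same stored object is rendered in two different ways without the content itself changing, which is what makes the model attractive as a platform for other applications.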

Version 1.2 of FEDORA, released in late December 2003, provides versioning of both objects and their disseminators, as well as a Java-based Administration GUI.

4.1.3 Other Open-Source Repositories

There is also a range of other open-source repository projects underway. The Soros Institute is currently maintaining a document which summarises the functionality of many of them [HREF8]. In addition to DSpace, the current version also reviews FEDORA, CDSWare, MyCoRe, i-Tor, eprints.org and ARNO. Each of these comes out of a particular response to the challenges of managing large amounts of digital content, and each has its own strengths and weaknesses.

4.1.4 Selection

At the time of writing this paper the final selection had not been announced. It is hoped that by the time of the conference, the announcement will have been made. This paragraph will then be updated to explain the reasons behind the selection.

4.2 Framing it up - the application development framework

One of the things that the repository may determine is the choice of application development framework. This is because some repositories only allow particular languages to call their Application Programming Interfaces (APIs). We wanted to be able to code in a variety of languages (not be restricted to one) and we wanted to be able to expose repository functionality via Web Services. These two points are partially inter-related: having web services makes it much easier to use a range of languages.

4.3 Doors and Windows - the search and exposure layer

As discussed above we wanted to make items in ARROW repositories as accessible as possible. We decided to target three very different technologies.

4.3.1 OAI-PMH

The Open Archives Initiative's Protocol for Metadata Harvesting (OAI-PMH) was created to facilitate discovery of distributed resources, such as those contained in a repository. The OAI-PMH achieves this by providing a simple, yet powerful framework for metadata harvesting. Harvesters can incrementally gather records contained in OAI-PMH repositories and use them to create services covering the content of several repositories. [Van de Sompel, Young and Hickey (2003)]. OAI-PMH is rapidly gathering strength as a way of providing federated resource discovery services and was seen as essential to the success of ARROW.
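As a rough illustration of incremental harvesting, the Python sketch below builds an OAI-PMH `ListRecords` request and extracts Dublin Core titles from a response. The `verb`, `metadataPrefix`, `from` and `resumptionToken` arguments and the two XML namespaces come from the OAI-PMH 2.0 specification; the base URL, the function names and the hand-written sample response are assumptions made for the example, and no network access is attempted.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional
from urllib.parse import urlencode

DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url: str, from_date: Optional[str] = None,
                     token: Optional[str] = None) -> str:
    """Build a ListRecords request; a resumption token replaces all other arguments."""
    if token:
        params = {"verb": "ListRecords", "resumptionToken": token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
        if from_date:
            params["from"] = from_date  # only records changed since this date
    return base_url + "?" + urlencode(params)


def titles(response_xml: str) -> List[str]:
    """Collect the text of every dc:title element in a response."""
    root = ET.fromstring(response_xml)
    return [el.text or "" for el in root.iter(DC_NS + "title")]


# Hand-written sample response, trimmed to the elements used here.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A sample e-print</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

print(list_records_url("http://repository.example.edu.au/oai", from_date="2004-01-01"))
print(titles(SAMPLE))
```

A harvester would fetch the generated URL, store the returned records, and repeat the request with any resumption token supplied until the list is complete.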

The National Library will use OAI-PMH where available (and other technologies where not) to harvest the metadata from ARROW and other institutional repositories. These metadata will then be used to provide national and international resource discovery for Australian research. This national resource-discovery service will also link with other national services delivered by the National Library for the Australian Digital Theses Program and the international Networked Digital Library of Theses and Dissertations.

4.3.2 Google

There is little need to discuss the success of Google at a Web conference. For most students (and probably for most staff!) Google is the resource discovery mechanism of choice. Enabling Google to access at least the metadata (and preferably the full text) of items in ARROW repositories was an easy choice to make. In practice, this means provision of a robots.txt file and publicly available content in a directory location accessible to Google's spidering software.
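The following Python sketch shows what such a robots.txt file might look like and uses the standard library's `urllib.robotparser` to check how a crawler would interpret it. The directory names and host name are hypothetical; they are not ARROW's actual layout.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: open the public repository area, close the admin area.
ROBOTS_TXT = """\
User-agent: *
Allow: /repository/
Disallow: /admin/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

HOST = "http://repository.example.edu.au"  # assumed host name
item_allowed = parser.can_fetch("Googlebot", HOST + "/repository/item/1")
admin_allowed = parser.can_fetch("Googlebot", HOST + "/admin/login")

print(item_allowed, admin_allowed)
```

Checking the file in this way before deployment is a cheap guard against accidentally hiding the repository content from search engines.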

4.3.3 SRU/SRW

The third exposure layer was in some ways a less obvious choice. Both OAI-PMH and Google are 'proxy' search services. That is, they collect proxy records and place them in a database where they can be searched. Such proxy systems always run the risk of being out of date (if only slightly). We therefore wanted to make it possible for other search services to connect directly to ARROW repositories and run interactive searches. The standard protocol for such connections in the library world is Z39.50 (more formally known as ISO 23950: "Information Retrieval (Z39.50): Application Service Definition and Protocol Specification") [HREF21]. Z39.50 has not been taken up as quickly as its proponents had hoped (for a variety of reasons too complex to cover here). As a result the Z39.50 Next Generation group (ZNG) has been working on more modern and lightweight protocols to achieve much of the original Z39.50 functionality. These newer protocols are called SRU (Search/Retrieve over URL) [HREF22] and SRW (Search/Retrieve for Web Services) [HREF23]. ARROW decided to support both SRW and SRU connections to make real-time searching possible through things like the portlet technology being developed by education.au [HREF24].
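Because SRU carries the whole search in a URL, a request is easy to sketch. The Python example below assembles a `searchRetrieve` request from the standard SRU parameters (`operation`, `version`, `query`, `startRecord`, `maximumRecords`). The base URL, the function name, the choice of version 1.1 and the `dc.title` index in the CQL query are assumptions made for illustration; a real server advertises the indexes it supports.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sru_search_url(base_url: str, cql_query: str,
                   start_record: int = 1, maximum_records: int = 10) -> str:
    """Build an SRU searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",          # assumed protocol version
        "query": cql_query,        # CQL, percent-encoded by urlencode
        "startRecord": start_record,
        "maximumRecords": maximum_records,
    }
    return base_url + "?" + urlencode(params)


url = sru_search_url("http://repository.example.edu.au/sru",
                     'dc.title = "institutional repositories"')
print(url)

# Decoding the query string recovers the original CQL expression.
print(parse_qs(urlsplit(url).query)["query"][0])
```

A portlet or other search client would issue this request directly against the repository and receive an XML result set, so the results reflect the repository's contents at the time of the search.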

5. Building Site

5.1 Where we are now

Up to now, this paper has described activities that have already taken place. We have now come up to the time of writing this paper. The point of all the preceding work is, of course, to actually build something. The ARROW project started to receive funds in late January 2004. Since that time we have:

5.2 Plans for rest of this year

Over the rest of this year, the ARROW project will:

6. Open House!

6.1 When are we going to be open for business?

We hope to have functional software available by the end of 2004. This would be the Open House date, and from that point onwards we will be loading content and providing a semi-production service. Initially this service will only be available at the four project partner institutions. We have made an allocation in the budget in year 3 (2006) to roll out the ARROW initiative to up to 10 other institutions across Australia. We may be able to start this phase earlier if all goes well, but we don't want to commit to this at such an early stage.

6.2 Plans for the future

The initial round of DEST funding runs out at the end of 2006. One of the DEST requirements was that successful projects should address the issues of sustainability. Both DEST and ARROW are keen to see the initiative continue beyond the end of 2006 and are thinking hard about how to ensure long-term viability for the project (assuming it is successful). It is far too early to say what these plans might be, but one idea that we keep playing with can be summarised as "Embedding ARROW into the things that universities have to do anyway".

7. Conclusions

The process of developing the architecture for ARROW has been a constant interaction between our vision for what we wanted to do and what the software might make possible. Sometimes the software possibilities constrained the vision. Sometimes they expanded it. But the end result is, we hope, a flexible architecture that will enable us to meet the DEST requirements to make Australian research more visible. And, who knows, perhaps ARROW will end up becoming something more. In our less-guarded moments we (the ARROW Project Team) like to talk about ARROW becoming part of the fundamental infrastructure of higher education in Australia. Perhaps it will, but we have quite enough on our plates already, and our first challenge is to succeed with the initial (and quite daunting enough) list of deliverables. The architectural work described in this paper is just the first step.

8. Acknowledgement

The ARROW Project is sponsored as part of the Commonwealth Government's Backing Australia's Ability [HREF42].

References

DEST (Australian Commonwealth Department of Education, Science and Training) (2002), Research Information Infrastructure Framework for Australian Higher Education. The Final Report of the Higher Education Information Infrastructure Advisory Committee (Systemic Infrastructure Initiative). [HREF4]

DEST (2003a), Information Infrastructure - Call for Proposals 2003. [HREF5]

DEST (2003b), Information Infrastructure - Outcomes of Selections Process. [HREF6]

Harboe-Ree, C., Sabto, M. and Treloar, A. (2004), "The library as digitorium: new modes of creation, distribution and access", Proceedings of VALA 2004, Melbourne, February. [HREF1]

Harboe-Ree, C. and Treloar, A. (2004), "Connecting the Dots Downunder: Towards An Integrated Institutional Approach To Digital Content Management", High Energy Physics Libraries Webzine, issue 9, March. [HREF44]

Lynch, C.A. (2003), "Institutional Repositories: Essential Infrastructure for Scholarship in the Digital Age", ARL, no. 226, February, pp. 1-7. [HREF7]

Open Society Institute (2004), A Guide to Institutional Repository Software version 2.0. [HREF8]

Payette, S. and Staples, T. (2002), "The Mellon Fedora Project: digital library architecture meets XML and web services", Sixth European Conference on Research and Advanced Technology for Digital Libraries, Lecture Notes in Computer Science, vol. 2459, Springer-Verlag, Berlin Heidelberg New York, pp. 406-421. [HREF9]

Rogers, S.A. (2003), "Developing an institutional Knowledge Bank at Ohio State University: from Concept to Action Plan", portal: Libraries and the Academy, January. [HREF2]

Staples, T., Wayland, R. and Payette, S. (2003), "The Fedora Project: an open-source digital object repository management system", D-Lib Magazine, April. [HREF10]

Van de Sompel, H., Young, J. and Hickey, T. (2003), "Using the OAI-PMH ... Differently", D-Lib Magazine, July/August. [HREF20]

Hypertext References

HREF1
http://www.vala.org.au/vala2004/2004pdfs/21HrSaTr.pdf
HREF2
http://www.lib.ohio-state.edu/Lib_Info/rogersKBdoc.pdf
HREF3
http://www.colis.mq.edu.au/
HREF4
http://www.dest.gov.au/highered/otherpub/heiiac/exec_summary.htm
HREF5
http://www.dest.gov.au/highered/research/proposal.htm#1
HREF6
http://www.dest.gov.au/highered/research/outcomes2003.htm
HREF7
http://www.arl.org/newsltr/226/ir.html
HREF8
http://www.soros.org/openaccess/software/
HREF9
http://www.fedora.info/documents/ecdl2002final.pdf
HREF10
http://dlib.org/dlib/april03/staples/04staples.htm
HREF11
http://www.dest.gov.au/Ministers/Media/McGauran/2003/10/mcg002221003.asp
HREF12
http://www.eprints.org
HREF13
http://adt.caul.edu.au
HREF14
http://etd.vt.edu/
HREF15
http://www.rmitpublishing.com.au/
HREF16
http://www.dspace.org
HREF17
http://libraries.mit.edu/dspace-mit/
HREF18
http://dspace.org/federation/index.html
HREF19
http://dspace.org/faqs/index.html#content
HREF20
http://www.dlib.org/dlib/july03/young/07young.html
HREF21
http://lcweb.loc.gov/z3950/agency/
HREF22
http://www.loc.gov/z3950/agency/zing/srw/sru.html
HREF23
http://www.loc.gov/z3950/agency/zing/
HREF24
http://www.educationau.edu.au/
HREF25
http://arrow.edu.au/
HREF26
http://www.naa.gov.au/recordkeeping/preservation/digital/xml_data_formats.html
HREF27
http://sts.anu.edu.au/downloads/APSR.pdf
HREF28
http://www.melcoe.mq.edu.au/projects/MAMS/index.htm
HREF29
http://andrew.treloar.net/
HREF30
http://www.its.monash.edu.au/
HREF31
http://www.monash.edu.au/
HREF32
http://arrow.edu.au/
HREF33
http://lib.monash.edu.au/
HREF34
http://home.earthlink.net/~ritter/tiff/
HREF35
http://www.libpng.org/pub/png/
HREF36
http://www.w3.org/Graphics/SVG/
HREF37
http://www.w3.org/AudioVideo/
HREF38
http://www.state.ma.us/mgis/mrsid.htm
HREF39
http://www.staroffice.com
HREF40
http://www.openoffice.org/
HREF41
http://www.fedora.info/history.shtml
HREF42
http://backingaus.innovation.gov.au/
HREF43
http://aarlin.edu.au/
HREF44
http://library.cern.ch/HEPLW/9/papers/1/

Copyright

© Dr Andrew Treloar, 2004. The author assigns to Southern Cross University and other educational and non-profit institutions a non-exclusive licence to use this document for personal use and in courses of instruction provided that the article is used in full and this copyright statement is reproduced. The author also grants a non-exclusive licence to Southern Cross University to publish this document in full on the World Wide Web and on CD-ROM and in printed form with the conference papers and for the document to be published on mirrors on the World Wide Web.