Beverley Forsyth. La Trobe University Library, [HREF 8] Bundoora, Victoria 3083. Australia. Phone: +61 3 285 5371 Fax: +61 3 285 5186.
The technologies of Gopher, WAIS, and HTML have been with us for only a little over three years, and yet they have had a profound effect on the traditional roles of the academic library and the librarian. Our initial experience of Gopher was one of wonder and excitement as we used it to explore the resources of the Internet and to mark out the sites which were most likely to be of direct benefit to us and our clients. We quickly seized this means of dramatically expanding our individual collection resource base by accessing full-text resources and other libraries' catalogues and creating genuine union catalogues which our predecessors could only have dreamt of. Together with the next generation of Internet tools, HTML and the Web, these advances mean that not only can we access other collections, we can also mount specialist resources which may otherwise have remained in-house. Typically major libraries are offered the papers of notable scholars as donations for their archive collections, but these rarely see the light of day as they are too expensive to catalogue and make available to the wider community. As the research process migrates to a computerised base, such donations may be in electronic form, which may make access easier for interested researchers. It is a bit like unchaining the books with the advent of the printing process. However, in embracing this brave new world we agree with the challenge made by the Director of the Office of Scientific and Academic Publishing at the Association of Research Libraries, Ann Okerson (1990):
It is critical that in starting virtually "from scratch" with a brand new "making public" vehicle, we are unfettered by old modes of viewing and doing publishing by existing notions of publishing offices; marginal cost structure of publishing; the ideas of "circulation;" indexing and abstracting "monographs" and "serials;" advertising; ownership; possibly even profits. We have the opportunity to begin with a blank page -- even that notion needs a new metaphor.
La Trobe University Library's venture into mounting (we purposely avoid the word "publish") a new previously unpublished resource on the Web was the result of a Research Forum held almost exactly a year ago. La Trobe University Library enjoys the unusual advantage of having retained academic equivalence for its professional staff. One tangible benefit of this is that the Library actively fosters and encourages the research activities of the Library staff. There is a Library Research Committee which as part of its brief arranges Research Forums where people engaged in research of interest to the Library are invited to share their research experiences with our staff. Last April, John Stinson, an academic in the Music Department at La Trobe, gave a stimulating Research Forum on the topic of 'Databases as Tools in Humanities Research'.
He pointed out that his musicological research made extensive use of computers in the collection, collation, storage and retrieval of data.
One result of a ten-year collaborative research endeavour conducted by John Stinson, Associate Professor John Griffiths, Professor Margaret Manion, Professor Carsaniga, and Brian Parish was the compilation of a unique suite of databases covering every musical work known to have existed in the fourteenth century, in which is stored data on all the manuscript sources, facsimile editions, editions in modern notation, scholarly studies of the works and recordings. Stinson was keen to make this research available to the musicologists of the world but had been advised that in print form it would be prohibitively expensive to produce. Authors and researchers now frequently encounter the problem of finding a publisher who is prepared to publish lengthy original source material, because of the large costs and very limited returns. Over ten years ago Eric Cochrane (1981) lamented, "Printing costs have all but extinguished the four-century-old tradition of European text editing", and this situation has worsened.
In the case of this particular database the dynamic power of an electronic form of publishing was more suitable for the continuing amendments and additions the compiler envisaged. Electronic publishing also held the potential for adding multimedia dimensions to the database at some future time. For example when the recording project of accompanying fourteenth century music is completed it could be mounted with links to the database together with the digitised images of the photographs of the manuscripts.
At the time John Stinson presented the Library Research Forum, the Library was developing a CWIS as a joint venture with the University's Computing Services (since renamed Information Technology Services). In the week following the Forum there was a Library Research Lunch with discussions based on the issues John had raised about developing databases and mounting electronic materials for the Library. A subgroup of particularly interested (and available) Library staff was formed to investigate the possibility of mounting the Stinson databases on the Library server as a trial.
Our interest in the 14th Century Databases arose from several factors. Primarily we recognised them as a valuable resource for musicological research, but on a less altruistic level we also saw that these computer files formed a perfect model for a pilot project which would give some library staff the opportunity to acquire important skills in manipulating files under different programs. We thought experience could be gained in the formatting, indexing and display of electronic data while at the same time the project team would be investigating in a methodical way the process of electronic publishing on the Internet. Another benefit we foresaw was a greater awareness of Internet technologies and resources.
In conceptualising the presentation of the data it was decided to group it into three categories: composer (160 items), manuscript (427 items) and repertoire (3197 items). In this comprehensive collation some of the items were quite large - it is amazing how prolific some composers were and how much has been written about others. In all, there was about 7.5 MB of data stored on a PC as a Foxpro database.
The next step was to decide on a publication platform. The most relevant ones were FTP, WAIS, Gopher, and HTML. By mid 1994, FTP was already considered too primitive and was quickly ruled out. WAIS was rarely used as the preferred access tool but provided the easiest and most commonly used method on the Internet to provide full text searching. Gopher had the greatest base with stable clients for most operating systems and its ability to present information within a hierarchical menu of items and submenus suited the data structure as conceived by the music database compiler. The Gopher model also provided a familiar way of organising and retrieving documents and with the addition of full text searching techniques was considered a suitable platform. Furthermore, some Library staff already had expertise with Gopher servers through their work with the University's CWIS.
However, the new kid on the block, WWW, seemed the most exciting platform and the GUI based browsers (Cello and Mosaic - remember them?) seemed laden with many useful features. While HTML at this stage (June-July 1994) provided a host of promising possibilities it was still somewhat immature, and the well featured clients for PCs and Macintoshes were only in the alpha/beta testing phase. It was argued by John Price-Wilkin (1994) that there were limitations to HTML as a markup language and to Web servers in their ability to deliver structured information. Nick Arnett (1994) also argued that the HTML format provided documents with good appearance but that when there are many documents, appearance becomes less important than navigability. Just as the printed books in the 15th century were not as beautiful as the illustrated manuscripts which preceded them, so resources on the 'Net need to be navigable rather than glossy productions consuming unnecessary bandwidth.
For these reasons we chose WAIS as the full text search mechanism, Gopher as our primary protocol to provide structure and a more global means of access to the documents, and the WWW to package and publicise the project. The WWW Homepage [HREF 1] for the project consists of a glossy GIF image taken from the Squarcialupi Codex (a very expensive facsimile of this mediaeval music manuscript is held in our Library) on which are superimposed a title and compilers' names, making it quite a good dust jacket. There are introductory notes, and of course links to the searchable and browsable files. The use of all of these protocols fitted in well with our project aims of exploring and gaining experience with the emerging Internet protocols suited to electronic publications.
The step that remained was mounting the data. Extracting the files in the required format from Foxpro was easy, as John Stinson was highly skilled with Foxpro. The question was which method would be the most expedient. Originally we expected to export the data as separate items for each of the three files of Composer, Manuscript and Repertoire. This would have proved a time-consuming task considering the number of items, and a painful one given that DOS file names are limited to 8 characters. We considered the item names better suited to UNIX's 255-character file name convention, which would help give meaning to the items; we did not want another database with 'computerese' file names. These meaningful file names could then simply be used to provide a sorted and browsable Gopher listing. However, after consultation with a member of Computing Services, Paul Nankervis, who was responsible for the university's Gopher server, we learnt he had written a C program which could divide a single text file into separate UNIX files based on a delimiting character (*), making the line following the delimiter the UNIX file name. So, once we knew what we wanted, it became a fairly simple matter to export three text files from Foxpro, FTP them (it still has a place!) to the appropriate Gopher server directories, explode them out using the invaluable C program, create WAIS full text indexes using another pre-established script called "makeindex" and, hey presto, 7.5 megabytes of scholarly research was made accessible to the world.
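The "explode" step described above can be sketched as follows. This is not Paul Nankervis's C program, which is not reproduced in this paper; it is a minimal Python illustration of the same idea. It assumes that the delimiter (*) sits on a line of its own, and the function names explode and safe_name are hypothetical.

```python
import os
import tempfile


def safe_name(line, max_len=255):
    """Turn a heading line into a usable UNIX file name (hypothetical helper)."""
    # "/" cannot appear in a UNIX file name, so replace it.
    name = line.strip().replace("/", "-")
    return name[:max_len]


def explode(text, out_dir, delimiter="*"):
    """Split one exported text file into one file per record.

    Assumption: a record starts at a line holding only the delimiter,
    the next line supplies the file name, and every following line up
    to the next delimiter is the body of that file.
    """
    os.makedirs(out_dir, exist_ok=True)
    records = []          # list of (file name, list of body lines)
    expect_name = False
    for line in text.splitlines():
        if line.strip() == delimiter:
            expect_name = True
        elif expect_name:
            records.append((safe_name(line), []))
            expect_name = False
        elif records:
            records[-1][1].append(line)

    written = []
    for name, body in records:
        if not name:
            continue      # skip records whose name line was blank
        path = os.path.join(out_dir, name)
        with open(path, "w", encoding="ascii", errors="replace") as fh:
            fh.write("\n".join(body) + "\n")
        written.append(name)
    return sorted(written)


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    sample = "*\nMachaut\nfirst line\n*\nAnonymous\nsecond line\n"
    print(explode(sample, tempfile.mkdtemp()))
```

Because each file is named after its heading line, a sorted directory listing already serves as a browsable menu. Note that in this sketch two records with the same heading would overwrite one another.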
The only major problem we encountered along the way was what to do with diacritics, which abounded in a database with extensive French, German and Italian content. We experimented with loading text from Foxpro and retaining the diacritics which had been entered into the DOS Foxpro version. We discovered that the DOS character set is different from the standard Windows character set, which fortunately is the same as the Macintosh standard, but of course totally useless for a VT100 client. Again we chose a solution which least compromised the data but is, nevertheless, the lowest common denominator. This was 7-bit ASCII, the character set guaranteed to be understood by all clients. In future the increasing ubiquity of GUI clients and a greater adherence to ISO standard character sets mean it may be possible to mount data without loss of this crucial detail.
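One way of reducing DOS text to 7-bit ASCII is sketched below. The paper does not record exactly how the diacritics were handled, so this is only an illustration: it assumes the DOS data used code page 437 (it may equally have been another code page such as 850), and the function name to_ascii and the fallback table are hypothetical.

```python
import unicodedata

# Letters that have no base-letter decomposition need an explicit
# replacement (illustrative table, not exhaustive).
FALLBACK = str.maketrans({"ß": "ss", "æ": "ae", "Æ": "AE"})


def to_ascii(raw, source_encoding="cp437"):
    """Decode DOS bytes and strip diacritics, returning plain ASCII text.

    Assumption: source_encoding is the DOS code page of the export.
    """
    text = raw.decode(source_encoding).translate(FALLBACK)
    # NFKD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Dropping everything outside ASCII removes the combining marks.
    return decomposed.encode("ascii", "ignore").decode("ascii")


if __name__ == "__main__":
    print(to_ascii("trouvère".encode("cp437")))
```

The cost of this lowest-common-denominator approach is visible in the output: the base letters survive on any client, but the accents themselves are lost.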
The project was sufficiently complete by the end of June for it to be launched in conjunction with the launch of another of John Stinson's projects - a CD recording of mediaeval music. The work that remained outstanding was the creation of a context for the Fourteenth Century Music Databases. This involved writing some explanatory notes and publicising its existence. A link to it can be found on the Labyrinth Homepage [HREF 2] at Georgetown University, which contains comprehensive links to Internet accessible works relating to the mediaeval and Renaissance period, and in the more familiar subject lists such as The World Wide Web Virtual Library [HREF 3], Yahoo [HREF 4], and Gopher Jewels [HREF 5].
In transforming the Foxpro database there was concern that we would lose some searching capabilities by converting a "fielded" database to a "relevance feedback" searchable full text database. Searching the Music Databases is based on an early version of the WAIS full text, relevance feedback, searching algorithm with no Boolean capabilities. This means that when searching by a particular term it is sometimes not immediately obvious why a particular item is retrieved without scanning the full document. However, relevance feedback is becoming a feature of Internet searching and users are becoming increasingly familiar with it in the same way that the advent of CDROMs made Boolean logic a standard part of library thinking. Once John Stinson became familiar with full text searching and relevance feedback he found it a powerful tool and was able to manipulate the data in ways not possible before.
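The difference between ranked retrieval and Boolean retrieval can be illustrated with a toy example. The sketch below is not the WAIS algorithm; it is a deliberately simple term-frequency ranking, with hypothetical function names, intended only to show why a document matching just one of the query words is still retrieved, with a lower score.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank(query, documents):
    """Rank documents (a dict of name -> text) against a free-text query.

    Score = occurrences of any query word / total words in the document.
    There is no AND/OR/NOT: every document containing at least one
    query word is returned, best score first.
    """
    terms = tokenize(query)
    results = []
    for name, text in documents.items():
        counts = Counter(tokenize(text))
        total = sum(counts.values())
        if total == 0:
            continue
        score = sum(counts[t] for t in terms) / total
        if score > 0:
            results.append((name, score))
    # Highest score first; ties broken alphabetically by name.
    results.sort(key=lambda item: (-item[1], item[0]))
    return results


if __name__ == "__main__":
    # Made-up miniature "documents", for illustration only.
    docs = {
        "a": "ballade ballade virelai",
        "b": "virelai rondeau",
        "c": "motet",
    }
    print(rank("ballade virelai", docs))
```

Here document "b" is retrieved even though it never mentions "ballade", which is the kind of result that can puzzle a searcher accustomed to Boolean matching.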
With the changes that have occurred in the last year, notably good implementations of forms searching through the Common Gateway Interface (CGI) and reliable cross-platform WWW clients capable of handling forms, it might now be possible to "pubnet" the database as created by the information provider. However, we suspect it might be more time consuming than it first looks, and we do wonder about licensing agreements for network access to proprietary database products, such as Microsoft's Foxpro program. Also, in mounting "documents" for global access it is vital that their structure reflects the needs of the information seekers rather than those of the provider. Databases are so often designed for the efficient and effective input and storage of data that at the making public (publication :-) stage the database needs to be reformatted to suit a whole new class of users. This is a role well suited to libraries as they have a good understanding of the particular information needs of their communities.
The mounting of this unique scholarly information resource enabled us to reflect on the roles of libraries in an emerging electronic environment and to question whether libraries need to exist for electronic information. It could be argued that a direct link into an author's hard disk is a sufficient means of ensuring distribution. So far experience has shown that this form of 'publishing' cannot be guaranteed by the author in the way that it can be by mounting files on a Library server. Authors move on, lose interest in their earlier works, or upgrade their PCs and cannot guarantee continual and consistent access. Our experience showed that a resource resident in a Foxpro program could not become available beyond the immediate user, whereas a translation into a common protocol gave it unlimited potential access. The power of the Web is such that despite a lack of publicity for the databases, within days of the files being mounted an email message requesting contact with John Stinson was received at La Trobe Computing Services from an author in the United States who had been commissioned by Garland Press to write a book on the fourteenth century composer Machaut and had already accessed the information about him in the database!
Others might argue that libraries have not been in the business of publishing printed works, so why should they now feel the need to get involved in electronic publishing? Originally libraries were publishers (and some still are in a limited way) until the economies of scale gave rise to the commercial presses. When we put forward our application for financial support for this project to the Library Research Committee we were required to show why it should be funded by the Library, as in many respects it appeared to be an exercise in publishing rather than a research endeavour. We were able to demonstrate that our interest was in the process and not the product.
Maybe we are not interested in publishing per se but in 'pubnetting' as referred to by Laura Fillmore (1994). 'Pubnetting' offers many advantages over traditional publishing and collection development of libraries because the product is immediate, global, dynamic, manipulable, and multimedia. Perhaps 'pubnetting' is an extension of the traditional role of libraries in collection management, acquisitions, cataloguing, indexing and classification, and, most of all, in providing efficient access in a universally understandable form. An electronic product need not simply mirror a print product; it needs to exploit features made possible by the new medium. Recently there has been so much hype about the multimedia capabilities of the new media that their other qualities are being overlooked. These include the ability to easily navigate and search through documents and the ability to update, manipulate, and reformat the information as needed.
Another aspect of the library's role in 'pubnetting' is to create the context around the core content. By this we mean creating links from a "document" to works outside itself, links that will help locate a resource and give it contextual meaning. Individual information resources are likely to miss the ramp onto or off the information highway without the supporting intellectual structures. The technology has created the physical web, the transport mechanism. What will be a continuing process is developing the navigational aids. This is the electronic equivalent of cataloguing and classifying a document. Maybe when we 'pubnet' we are doing no more than acquiring a single item - similar to current library practices - but instead of making it available only to our own clientele we make it available to a world wide web of clients.
To conclude, the project achieved its stated aims: data was mounted, knowledge and skills were gained, co-operation with other university departments was made possible, and the profile of the Library was raised. Through publication in "La Trobe University Library News" it will, we hope, lead to the mounting of other local scholarly resources that might otherwise have gone unpublished. Additionally, one of the important realisations of our project was the recognition that the role of libraries, at least for their own communities, will be associated with the development, maintenance and enhancement of the systems, standards and conventions that enable networked information resources to appear and interoperate as part of a common environment. If we abdicate this role then we are in danger of ending up with a veritable tower of Babel.
Cochrane, Eric. Historians and Historiography in the Italian Renaissance. University of Chicago Press, 1981.
Fillmore, Laura. "Internet Publishing in a Borderless Environment: Bookworms into Butterflies" [HREF 7] Presented at the Frankfurt Book Fair - Frankfurt Electronic Media Conference. 7 October 1994.
Price-Wilkin, John. "Using the World-Wide Web to Deliver Complex Electronic Documents: Implications for Libraries." The Public-Access Computer Systems Review 5, no. 3 (1994).
Okerson, Ann. "Scholarly Publishing in the NREN." ARL: A Bimonthly Newsletter of Research Library Issues and Actions 151 (July 4, 1990): 1-4.