Providing Self-Study Resources for Computer Science Students: Five Years of the BURKS Project

John English [HREF1], Faculty of Information Technology [HREF2], University of Brighton [HREF3], Brighton BN2 4GJ, UK.
je@brighton.ac.uk

Abstract

This paper describes BURKS, a non-profit collection of useful resources for Computer Science students on CD-ROM which was originally developed by the author during the summer of 1996 and is now in its fifth edition. BURKS is a self-contained website spread across multiple CD-ROMs which includes an integrated web browser and about 2 gigabytes of software and documentation covering a wide range of Computer Science topics. For the benefit of novices, it also uses helper applications to provide automated installation of software from the CD to the user’s machine. In keeping with the non-profit nature of the project, the contents of the CDs are also available online at http://burks.brighton.ac.uk/. The paper discusses the rationale for the project, the techniques used to implement a single website on multiple CDs, and other issues involved in providing web-based material for offline use by students with little or no network connectivity.

Introduction

BURKS (the Brighton University Resource Kit for Students [HREF4]) was initiated in 1996 as a non-profit, zero-budget project to provide Computer Science students with a self-contained collection of useful resources from the Internet on CD-ROM. It was intended for use by complete novices, and so a major challenge was developing a product that was as simple to use as possible. It uses a preconfigured web browser included on the CD to provide the user interface and HTML indexing to catalogue the material on the CD. This provides an easy-to-use interface and has the additional benefit that the entire collection can be made available online. The collection includes software, tutorial material, reference manuals, specifications, journal articles and complete textbooks covering a wide range of computing-related topics.

New editions of BURKS are released every year in August (in time for the start of the new academic year in the UK). Industrial sponsorship covers the initial production costs of each edition; past sponsors include GEC-Marconi, Pavilion Internet, ROCC Computers, Macmillan Press and Net Monitor. Income from sales is then used to fund further print runs throughout the year.

The size of the collection has grown by roughly 50% each year, from 450 megabytes in the first edition to about 2 gigabytes in the current (fifth) edition. It is now a set of three CDs which form a self-contained website; the user is prompted to change disks when necessary, or the disks can be loaded into multiple drives if they are available. As the collection has grown, the price per megabyte has dropped steadily; the fifth edition (a set of three CDs) actually costs less than the first edition (a single CD produced in limited quantities) due to economies of scale.

To date, about 45,000 copies have been distributed worldwide (between 10,000 and 15,000 copies of each edition since the first, of which only 500 copies were ever produced). It is now a recommended resource for students at many universities in the UK and elsewhere (e.g. van Scoy 1998), and IBM recently took delivery of 5,000 copies for inclusion in an information pack for educators. The George Washington University acts as our distributor for sales within the USA and Canada.

BURKS has also proved attractive in industry, to the author’s initial surprise. This is apparently because Internet access in industry is often more restricted than it is in academia, and few people have the luxury of being able to spend hours scouring the net for potentially useful material. Although this is tangential to the original project aims, it is extremely useful for attracting the interest of potential sponsors.

To emphasize the non-profit nature of this project, the entire collection is also available online; unrestricted access to information is a cornerstone of the philosophy behind this project. The online version of BURKS is available at http://burks.brighton.ac.uk/ [HREF4]. During 2000, the BURKS website recorded about 12 million hits from 174 countries (an average of over 200,000 hits per week), transferring a total of over 200 gigabytes of data. Sharp rises in usage occurred during October 1999 and December 2000, and so far during 2001 the site has averaged about 800,000 hits per week. Full statistics are available on the website [HREF5], including usage figures for different sections of the material; since the level of activity is constantly changing, interested readers should visit the website for the latest facts and figures.

A project like this has many facets. This paper describes the rationale for BURKS and some of the aspects involved in its realisation as a successful product: the choice of content, implementation techniques and production issues.

Rationale

Over the last few years, the Internet (and particularly the World-Wide Web) has had a major impact on Computer Science students. Previously, the primary information centre of a University was its library; now, students are expected to use the Internet as a source of information as a matter of course. The main difference between a library and the Internet is that a library is organised, catalogued and stable, whereas the Internet is chaotic, uncharted and constantly changing. The amount of accessible information is enormous, but trying to separate the wheat from the chaff in such an anarchic system is extremely difficult (Knight 1996).

Search engines will locate information worldwide given a key word or phrase, but too broad a search will result in thousands of hits while too narrow a search may not find anything relevant (Eastman 1999). Much of the skill in using the Internet as an information resource is still based on knowing where to look (Gresham 1998).

There have also been some accompanying demographic changes. Most Computer Science students now have their own computers at home, and CD-ROM drives are commonplace. At the same time, some things have not improved:

Students buy their own computers for many reasons. There is the convenience of being able to work at home; there is also the fact that the machine is always accessible, whereas college machines may only be accessible during certain opening hours, and at peak times the demand for college resources may also restrict accessibility. College machines may also be restricted in other ways; for example, there may be quota restrictions on disk space, limits on network access, or security restrictions which prevent students from installing software on the system.

However, Internet access for students from home is either nonexistent or limited by cost considerations; for example, local calls in the UK cost between 1p and 3p (approximately 3 to 8 Australian cents) per minute depending on the time of day. The maximum transfer speed for a 56k modem is about 5 kilobytes per second (and even this is rarely achieved in practice), so downloading a 23 megabyte software package would take well over an hour.
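The arithmetic behind these figures can be checked with a short sketch. It assumes decimal units (1 megabyte = 1000 kilobytes) and a flat per-minute call rate; the function names are illustrative only.

```python
# Rough download-time and call-cost estimate using the figures quoted above.
# Assumptions: decimal units, a sustained 5 kilobytes per second, flat call rates.

def download_minutes(size_megabytes: float, kilobytes_per_second: float = 5.0) -> float:
    """Minutes needed to transfer size_megabytes at a sustained rate."""
    seconds = (size_megabytes * 1000.0) / kilobytes_per_second
    return seconds / 60.0


def call_cost_pence(minutes: float, pence_per_minute: float) -> float:
    """Cost of a local call of the given length at a flat per-minute rate."""
    return minutes * pence_per_minute


minutes = download_minutes(23)             # 4600 seconds, i.e. just under 77 minutes
cheapest = call_cost_pence(minutes, 1.0)   # cost at the 1p per minute rate
dearest = call_cost_pence(minutes, 3.0)    # cost at the 3p per minute rate
print(f"{minutes:.0f} min, {cheapest:.0f}p to {dearest:.0f}p")
```

Even at the theoretical maximum speed, the 23 megabyte package ties up the phone line for about 77 minutes, which is the "well over an hour" quoted above.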

Locating useful material or reading documentation online can also be a lengthy (and hence costly) business. Teaching staff will often provide links to useful material on course websites, but such links must be continually updated as sites appear and disappear; maintenance is a perennial problem. It is often easier to link to pages which act as portals to pages covering a particular subject domain; however, these too appear and disappear from time to time.

Material could perhaps be downloaded at college and taken home, but this would usually involve a substantial investment in time and effort for students. Software may need to be transferred in 1.44 megabyte chunks on floppy disks, which requires a tool to split the package across multiple disks. Documentation in HTML format is difficult to download without special tools; it often consists of a large number of files, and links to images and other files will often need to be corrected manually for local use. Complete novices are therefore unlikely to find it easy to copy material for use at home.

BURKS is an attempt to alleviate some of these problems by providing a collection of useful resources from the Internet on CD-ROM. It removes much of the effort involved in locating resources, and the resources it provides are not subject to random disappearances. Notess (1995) remarked that "some of the best Internet information resources should start being published in other formats". By doing precisely this, BURKS allows students to take away a substantial portion of the Internet for use at home. This in turn can help to reduce the load on college resources, since less time and effort is spent locating and downloading useful resources from the Internet. It also avoids duplication of effort, where an entire cohort of students ends up downloading multiple copies of the same software package. It is distributed on a non-profit basis so that as much information as possible is available for use by students at home at a price they can afford; the current edition costs about the same as 10 hours of cheap-rate phone calls in the UK, but the information it contains would take about 100 hours to download at maximum speed.

The generality of the collection means that it is attractive to a wide audience. This means that it is feasible to manufacture it in quantity, which reduces the per-unit costs and thus makes it even more attractive.

About BURKS

The vast majority of students own IBM PC-compatible machines running Microsoft Windows, so this was chosen as the target platform for BURKS. However, in 1996 many users were still running Windows 3.1, and many did not have a web browser or any TCP/IP network connectivity. It was therefore necessary to design a product that was usable on such systems, which was completely self-contained and did not rely on the presence of any other software. At the same time, it was necessary to ensure that it did not impact on any other software which might be present on the user’s system.

BURKS includes a preconfigured web browser (Netscape Navigator 3.04) which can be used without any prior installation step and which does not make any changes to the system (e.g. registry entries) which might interfere with any existing browser. On systems which provide CD autorun features (Windows 95 and later), the browser is started automatically when the disk is inserted in the CD drive, and it initially displays the main index page from the CD. Helper applications are also provided to automate the installation of software packages from the CD, to make it as simple as possible for complete novices to use. The browser is preconfigured so that clicking on a link to a software package will launch the installer.

Now that most users have their own web browsers, it is also possible to configure BURKS to use an existing browser, based on the use of a special-purpose HTTP server included on the CD. When server-based operation is selected, both the server and the user’s web browser are started together. The server will automatically run helper applications such as the installer without the need for any special reconfiguration of the user’s browser. The server also incorporates a free-text search engine to simplify locating relevant material on the CD. The only reason that BURKS does not use this approach by default is that it relies on the user having a working TCP/IP installation, and unfortunately this is still far from universal.

Content

The information on BURKS includes selected reference material, tutorials, textbooks and journal articles, FAQs from a variety of Usenet newsgroups, and an assortment of other useful documentation. The main index is currently divided into the following major categories:

Searching a collection of this size (over 20,000 files) can be a problem. Hyperlinked indexes help by making it possible to catalogue the same items under many different headings. Thus, items which cross category boundaries can be listed under each category; for example, compilers are listed in the section for the corresponding language, as well as in the software section. There is also a permuted master document index and an alphabetical software index which can be searched for keywords. As mentioned earlier, a free-text search utility is also available when server-based operation is selected.
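As an illustration of what a permuted index involves, the sketch below lists each title once under every significant word it contains. It is not the BURKS production script; the stop-word list and the function name are assumptions made for the example.

```python
# Minimal permuted (keyword-in-title) index: each title appears once
# under every significant word it contains. The stop-word list is an assumption.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def permuted_index(titles):
    """Return a sorted list of (keyword, title) pairs."""
    entries = []
    for title in titles:
        for word in title.split():
            key = word.strip(".,:;()").lower()
            if key and key not in STOP_WORDS:
                entries.append((key, title))
    return sorted(entries)


for keyword, title in permuted_index(["Thinking in Java",
                                      "Java: An Object-First Approach"]):
    print(f"{keyword:14} {title}")
```

Both titles end up listed under "java", which is the cross-category effect described above: one item can be found from several starting points.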

Implementation issues

A startup utility included on each CD is used to launch the preconfigured browser. The first time BURKS is used, essential files are copied into a directory on the hard disk. These include the initialisation file used to provide configuration information for the browser and for other elements of the system. The CD drive letter is obtained from the pathname of the startup utility, and this is used to build full pathnames for the helper applications as well as a file URL for the main index page on the CD. 16-bit versions of Netscape Navigator prior to version 4 support a command line switch [HREF9] which can be used to specify the location of the initialisation file, and so can be used without interfering with any existing Netscape browser the user may already have installed. (Later versions of Netscape Navigator seem to recognise this switch, but except in the earlier 16-bit versions it appears to have no effect.) The only drawback to the use of Netscape 3.04 is the lack of support for Java in the 16-bit version, but in all other respects it is ideally suited for a CD-based system.
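The path handling described above can be sketched as follows. The helper directory, the installer name and the index page name are hypothetical; only the idea of deriving everything from the startup utility's own pathname comes from the text.

```python
import ntpath  # Windows path rules, usable on any platform


def cd_locations(startup_exe: str) -> dict:
    """Derive the CD drive, a helper pathname and the index URL from the
    full pathname of the startup utility (e.g. r"E:\\START.EXE")."""
    drive, _rest = ntpath.splitdrive(startup_exe)   # "E:" for a drive-letter path
    if len(drive) != 2 or drive[1] != ":":
        raise ValueError(f"no drive letter in {startup_exe!r}")
    letter = drive[0].lower()
    return {
        "drive": drive,
        # Hypothetical location of the installer helper on the CD
        "installer": ntpath.join(drive + "\\", "burks", "helpers", "install.exe"),
        # Hypothetical name for the main index page
        "index_url": f"file:///{letter}:/burks/index.htm",
    }


print(cd_locations("E:\\START.EXE"))
```

Because nothing is hard-coded except paths relative to the CD root, the same code works whichever drive letter the CD happens to be mounted on.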

For ease of use, BURKS preserves the appearance of a single connected website even though it now spans three CDs. In the current edition, disk 1 contains documentation and some software packages, disk 2 contains the bulk of the software collection, and disk 3 contains the Linux distribution and some remaining software packages. For this to work, the web browser must be configured to prompt the user to change disks when necessary. When a link to a document on one of the CDs is followed, Netscape will attempt to load it from the current disk, and will report an error if it does not exist.

Multi-disk operation

To maintain the illusion of a single shared web, it is possible to start BURKS by inserting any of the disks. All disks contain identical copies of the startup code, Netscape Navigator and its associated software, as well as copies of all the top-level indexes. However, on each disk the opening page of every document that is stored on a different disk is replaced by a blank page containing some simple JavaScript to display a dialog box. As a result, attempting to view a document which is not on the current disk will bring up a blank page and a dialog asking the user to insert the correct disk. Pressing OK will invoke location.reload() to reload the current page; if the correct disk has been inserted, this will now be the opening page of the requested document, and if not, the same dialog will be displayed. Pressing Cancel will call history.back(); this will go back to the previously displayed page, which will be the index page which referenced the missing document.
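A placeholder page of this kind might be generated as in the sketch below. The page text and the function names are assumptions rather than the actual BURKS pages; the embedded JavaScript uses only confirm(), location.reload() and history.back().

```python
# Template for a placeholder page standing in for a document on another disk.
# Doubled braces are literal braces in the generated JavaScript.
PLACEHOLDER = """<HTML><HEAD><TITLE>Insert disk {disk}</TITLE>
<SCRIPT LANGUAGE="JavaScript">
function askForDisk() {{
  if (confirm("Please insert BURKS disk {disk} and press OK.")) {{
    location.reload();   // OK: retry; the real page loads if the disk was changed
  }} else {{
    history.back();      // Cancel: return to the index page that linked here
  }}
}}
</SCRIPT></HEAD>
<BODY onLoad="askForDisk()"></BODY></HTML>
"""


def placeholder_page(disk: int) -> str:
    """Return the HTML for a blank page that asks for the given disk."""
    return PLACEHOLDER.format(disk=disk)


print(placeholder_page(2))
```

Since the placeholder has the same pathname as the real opening page on the other disk, reloading after a disk change fetches the real document without any change to the link that was followed.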

For installing software, a more elaborate scheme is necessary. When a link to a file associated with a helper application is followed, Netscape makes a temporary copy of the referenced file on the user’s hard disk, and then launches the helper application with the name of the temporary file (possibly something meaningless like X5QP3STV.SW) as a parameter. The file must exist, and must not be empty. While the helper application is executing, Netscape maintains an entry in its initialisation file which maps the temporary filename to the corresponding URL, and this enables the helper application to discover the original URL. BURKS uses links to a file called install.sw which exists in every software-related directory on each of the disks, together with a subsection reference which describes the file to be installed (disk number and file name). For example, a typical URL for a software package might look like this:


  file:///x:/burks/software/langs/install.sw#2.gnat313d.zip

This will be interpreted as a request to install the file gnat313d.zip from the directory x:/burks/software/langs on disk 2. The helper application checks for the presence of a particular file in the root directory of the CD to identify which disk is currently loaded, and will then ask the user to insert the correct disk if necessary. It also allows the user to specify an alternative drive letter, to cater for systems with more than one CD drive, and records the drive letters for each disk in the initialisation file.
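The "subsection reference" is the fragment part of the URL (the text after the '#'). A minimal sketch of how a helper could decode it is shown below; the function name is illustrative, and the real helper obtains the URL from Netscape's initialisation file as described above.

```python
import posixpath
from urllib.parse import urlsplit


def parse_install_url(url: str):
    """Split an install.sw URL into (disk number, directory, file name)."""
    parts = urlsplit(url)
    disk_text, dot, filename = parts.fragment.partition(".")
    if not dot or not disk_text.isdigit() or not filename:
        raise ValueError(f"unexpected fragment in {url!r}")
    # Directory containing install.sw, without the leading slash of the URL path
    directory = posixpath.dirname(parts.path).lstrip("/")
    return int(disk_text), directory, filename


print(parse_install_url(
    "file:///x:/burks/software/langs/install.sw#2.gnat313d.zip"))
```

For the example URL this yields disk 2, the directory x:/burks/software/langs and the file gnat313d.zip.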

Installation scripting

Since the file install.sw will have been copied to the user’s hard disk, it seems sensible to try and find a use for it. The primary reason for its existence in the scheme described above is to provide a filename which can be mapped to a URL from which the desired installation parameters can be extracted. It was decided to use this file to provide the helper application with installation instructions for the software packages in the corresponding directory. As one of the project requirements is that the CD be available online, the installation instructions are formatted so that they appear as genuine HTML content when an external visitor follows a link to a software package without the benefit of the helper application.

Because web browsers ignore white space in HTML text, a little judicious formatting allows the installation instructions to be expressed as valid HTML. Here is an example:


<DT><A NAME="2.gnat313d.zip" HREF="gnat313d.zip"
>gnat313d.zip</A><DD>
<B>Directions:</B>
Unzip into an empty directory, then
run SETUP.EXE
<BR><B>Notes:</B>
 This is just a compiler and needs to
 be invoked from a command prompt to
 compile a file that you’ve created
 separately with a text editor. See the
 development tools section of the CD for
 Windows-based environments which will
 allow you to edit and compile source
 files.

This is valid HTML, but it also doubles as an installation script in which the first character of each line determines its role. The name of each file which can be installed is given on a line which begins with ‘>’, which also has the effect of closing the HREF tag on the previous line. Lines after this which begin with ‘<’ (i.e. lines beginning with an HTML tag) are ignored by the installer; lines beginning with ‘U’ indicate that the installer should unzip the file, lines beginning with ‘r’ specify the name of a setup utility to be executed, and so on. Lines beginning with a space are displayed as installation notes by the installer. All this formatting is ignored by a web browser which is rendering it as an HTML document. A web browser will render the above example like this, including a link which will allow the file to be downloaded:

gnat313d.zip
Directions: Unzip into an empty directory, then run SETUP.EXE
Notes: This is just a compiler and needs to be invoked from a command prompt to compile a file that you’ve created separately with a text editor. See the development tools section of the CD for Windows-based environments which will allow you to edit and compile source files.

The installer will interpret the same text as instructions for installing the file gnat313d.zip. It displays the message ‘This is just a compiler and needs to be invoked from a command prompt to compile a file that you’ve created separately with a text editor. See the development tools section of the CD for Windows-based environments which will allow you to edit and compile source files’. It will then let the user select an installation directory, unzip the file, then switch to that directory and run SETUP.EXE to complete the installation.
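The first-character convention can be sketched as a small interpreter. Only the roles named in the text ('>', '<', 'U', 'r' and a leading space) are handled; the other line types covered by "and so on" are not documented here and are simply skipped. The sample script is an abbreviated copy of the example, and the function name is illustrative.

```python
# Abbreviated copy of the install.sw example (the notes are shortened).
SAMPLE = """<DT><A NAME="2.gnat313d.zip" HREF="gnat313d.zip"
>gnat313d.zip</A><DD>
<B>Directions:</B>
Unzip into an empty directory, then
run SETUP.EXE
<BR><B>Notes:</B>
 This is just a compiler and needs to
 be invoked from a command prompt.
"""


def parse_install_script(text: str, wanted: str):
    """Return (actions, notes) for the entry describing the file `wanted`."""
    actions, notes = [], []
    current = None                        # file whose entry is being read
    for line in text.splitlines():
        if not line:
            continue
        first = line[0]
        if first == ">":                  # ">name</A><DD>" starts a new entry
            current = line[1:].split("<", 1)[0]
        elif current != wanted or first == "<":
            continue                      # other entries and pure HTML lines
        elif first == "U":                # "Unzip ..." line
            actions.append(("unzip", wanted))
        elif first == "r":                # assumed form: "run PROGRAM"
            words = line.split(None, 1)
            if len(words) == 2:
                actions.append(("run", words[1]))
        elif first == " ":                # indented lines are notes
            notes.append(line.strip())
    return actions, " ".join(notes)


print(parse_install_script(SAMPLE, "gnat313d.zip"))
```

Keying everything on the first character keeps the installer's parser trivial while leaving the file free to carry whatever HTML markup the online version needs.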

Server-based operation

As mentioned earlier, BURKS can be configured for server-based operation using an existing browser, rather than relying on the preconfigured browser on the CD. This allows a Java-aware browser to be used, which is useful for some Java tutorials which include applets as examples. It also allows content to be generated dynamically, which is how the server provides an integrated search engine. However, using a server requires the host system to have TCP/IP networking installed. Since this is not universal on Windows systems, BURKS always starts the preconfigured browser and provides a link to a helper application which will locate an unused port, start up the server on this port and then start the user’s default browser pointing at this port using a URL such as http://127.0.0.1:2000/. If the browser contacts the server successfully, an entry is made in the initialisation file so that server-based operation will be used by default in future, although the user has the option to switch back to using the preconfigured browser if required.
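The text does not say how the helper locates an unused port. One common technique, shown here as an assumption rather than as the BURKS method, is to bind to port 0 on the loopback address and let the operating system choose.

```python
import socket


def find_unused_port(host: str = "127.0.0.1") -> int:
    """Ask the operating system for a free TCP port on the loopback address."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))              # port 0 means "pick any free port"
        return sock.getsockname()[1]


def local_url(port: int, host: str = "127.0.0.1") -> str:
    """URL that the user's default browser would be pointed at."""
    return f"http://{host}:{port}/"


port = find_unused_port()
print(local_url(port))
```

The port is released when the probe socket closes, so another program could claim it before the server starts; a real helper would either retry on failure or hand the bound socket directly to the server.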

When a disk is inserted in server-based mode, the startup application on the CD will check for an existing copy of the server, start the server if it is not already running, and then start the default browser. To prevent multiple copies of the browser being started if the disk is changed, a hidden Java applet notifies the server (using non-standard HTTP requests) when it is started and stopped, so the server can tell whether a new copy of the browser is needed.

Production issues

Developing a product like BURKS requires a number of tools to automate the process as much as possible. In the case of BURKS, Perl scripts have been developed to deal with most of the mundane formatting, consistency checking and other basic ‘string-slinging’ jobs involved in its production. For example, a Perl script generates the index pages for each CD from a master database and copies files from a master repository to their correct locations. Other scripts have been developed to create local copies of websites automatically and to trace through a website identifying missing files. This reduces the effort involved in assembling the material, although such automation is never perfect, and a certain amount of manual editing is still sometimes necessary. Improving the tools used to automate this process is an ongoing activity.

Production is carried out in July each year. This mainly involves checking existing material for updates, downloading and testing any updated material, and updating the master database. New material to be added must also be downloaded, tested, and entered into the master database.

HTML testing is done using a Perl script which scans an HTML document tree looking for broken links. Software testing uses VMware [HREF10], a product which creates virtual machines in which different versions of Windows can be installed. It runs under Linux, and the virtual machines appear in separate windows on the Linux desktop. The disks for these virtual machines are implemented as files on the underlying Linux filesystem, so restoring a virtual machine is just a matter of restoring a disk image from a backup. Virtual disks can also be marked ‘non-persistent’; changes to a non-persistent disk are discarded when the virtual machine is shut down, so this is ideal for installing and testing software with no risk to the host system. It also makes it possible to test a new version of BURKS on a wide range of Windows systems by mounting the CD image as a virtual drive within each of the target Windows configurations.
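The BURKS link checker is a Perl script which is not reproduced here. The following sketch shows the same idea using only the standard library; it assumes that pages have .htm or .html extensions and checks only HREF and SRC attributes which point at local files.

```python
import os
from html.parser import HTMLParser
from urllib.parse import unquote, urlsplit


class LinkCollector(HTMLParser):
    """Collect the values of HREF and SRC attributes from one page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def broken_links(root: str):
    """Return (page, link) pairs for local links whose target file is missing."""
    broken = []
    for folder, _dirs, files in os.walk(root):
        for fname in files:
            if not fname.lower().endswith((".htm", ".html")):
                continue
            page = os.path.join(folder, fname)
            collector = LinkCollector()
            with open(page, encoding="latin-1") as f:
                collector.feed(f.read())
            for link in collector.links:
                parts = urlsplit(link)
                if parts.scheme or parts.netloc or not parts.path:
                    continue              # external link or same-page fragment
                rel = unquote(parts.path)
                base = root if rel.startswith("/") else folder
                target = os.path.normpath(os.path.join(base, rel.lstrip("/")))
                if not os.path.exists(target):
                    broken.append((page, link))
    return broken
```

Calling broken_links on the top-level directory of a CD image returns an empty list when every local link resolves to a file in the tree.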

However, the bulk of the effort involved in developing a product like BURKS is selecting appropriate resources to include. A fair amount of time is required to locate and download useful material. Redistribution may require explicit permission from the authors; I am glad to say that in most cases, when I have written to ask permission and explained my intentions, I have immediately been granted the necessary permission. As the project has matured, I have also been approached with offers of material to include. I have also included some work by students; offering to publish high-quality work on BURKS is an excellent incentive to raise the standard of work.

There are some subject areas where it is hard to find good introductory material online. Examples include databases, systems analysis and formal methods. I would be delighted to hear from anyone who could point me in the direction of any suitable resources, or who would like to challenge their students to produce some material as part of an assignment!

Experiences and future plans

A resource of this kind, which is available to students at a price they can afford, is educationally invaluable. It provides a broad selection of software and documentation which they will be able to use directly as part of their studies, as well as sufficient additional resources which may not be immediately useful but which may, by their very accessibility, encourage students to broaden their horizons. It also helps institutions to reduce the load on laboratory machines, both by enabling students to work at home and by reducing the amount of effort involved in searching the Internet and downloading copies of useful resources. Making the contents of the CD available online also means that students who do not have a computer at home are not disadvantaged, and any interested parties can see what material the CD provides.

Other institutions have produced teaching material on CD (e.g. Veraart and Wright 1996, Woodman et al. 1997), but in the author’s experience these are often fairly specific to the curriculum at an individual institution. There are also some excellent examples of commercial CD-based teaching material (e.g. Gries and Gries 2000), but these tend to focus on a single subject and are thus restricted to a limited range of the curriculum. The success of BURKS lies in its generality combined with its low-cost, non-profit nature which enables it to take maximum advantage of the economies inherent in large-scale CD production. This has enabled the price per megabyte to be reduced each year as the collection has expanded, and at least two universities in the UK which have previously produced CD collections of their own now purchase bulk quantities of BURKS to distribute to their students.

Updated versions of BURKS are produced every August; the 6th edition is scheduled for release in August 2001. One major difficulty is the ever-increasing amount of material; the 5th edition completely filled three CDs, so the 6th edition will probably be a set of four CDs. Up until now, a DVD release would probably have been impractical. However, the market penetration of DVD drives (offering a minimum of 4.7 gigabytes per disk, or approximately the capacity of seven CDs) has now reached the point where a DVD edition will probably sell in sufficient quantities to make production worthwhile. DVD production poses no problems; a prototype DVD copy of the current edition has been produced from the CD masters without any difficulty. The only unknown is the size of the market for a DVD release in comparison with a CD release. To enable the state of the market to be assessed, the 6th edition will be released in both multi-CD and single-DVD formats, with the intention of moving to DVD-only releases for subsequent editions if the volume of DVD sales is sufficiently high.

Acknowledgements

BURKS has only been made possible by those authors who have generously given their permission for their work to be used. This includes many Usenet FAQ maintainers, those who have released their work into the public domain or under the terms of the Free Software Foundation’s GNU General Public License, and especially Netscape Communications Corporation for permission to use Netscape Navigator as the central engine for the whole project. Thanks are also due to my colleagues at the University of Brighton who have supported me in producing this product. A full list of credits is available on the BURKS website [HREF11].

References

Culwin, F. (1997). Java: An Object-First Approach. Prentice Hall. Available online [HREF12].

Eastman, C. M. (1999). "30,000 hits may be better than 300: precision anomalies in Internet searches", in Proceedings of the 22nd International ACM SIGIR Conference on Research and Development in Information Retrieval, ACM Press 1999, p.313-314.

Eckel, B. (2000). Thinking in C++ (2nd ed). Prentice Hall. Available online [HREF13].

Eckel, B. (2000). Thinking in Java (2nd ed). Prentice Hall. Available online [HREF14].

Gries, D., and Gries, P. (2000). ProgramLive: a Multimedia Java Learning Resource (CD-ROM). Data Description Inc. Available online [HREF15].

Gresham, K. (1998). "Surfing with a purpose", in EDUCOM Review, v.33 n.5 p.22-29.

Knight, J. P. (1996). "Resource Discovery on the Internet", in The New Review of Information Networking, v.2 p.3-14.

Notess, G. R. (1995). "Using CD-ROMs with the Internet", in Online, v.19 n.6 p.40-44.

Terry, P. D. (1997). Compilers and Compiler Generators: an Introduction with C++. International Thomson Computer Press. Available online [HREF16].

van Scoy, F. (1998). "Using the BURKS 2 CD-ROM in a Principles of Programming Languages Course", in Integrating Technology into Computer Science Education, ACM Press 1998, p.239-242.

Veraart, V. E., and Wright, S. L. (1996). "Using CD-ROMs and local web pages to provide course material for distance students", in Integrating Technology into Computer Science Education, ACM Press 1996, p.90-92.

Woodman, M., Law, A., Holland, S., and Griffith, R. (1997). "The Object Shop: Using CD-ROM multimedia to introduce object concepts", in Proceedings of the SIGCSE Technical Symposium on Computer Science Education, ACM Press 1997, p.345-349.

Hypertext References

HREF1
http://www.it.brighton.ac.uk/staff/je/
HREF2
http://www.it.brighton.ac.uk/
HREF3
http://www.brighton.ac.uk/
HREF4
http://burks.brighton.ac.uk/
HREF5
http://burks.brighton.ac.uk/stats.htm
HREF6
http://foldoc.doc.ic.ac.uk/
HREF7
http://www.redhat.com/
HREF8
http://www.linuxdoc.org/LDP/
HREF9
http://developer.netscape.com/docs/manuals/deploymt/options.htm
HREF10
http://www.vmware.com/
HREF11
http://burks.brighton.ac.uk/burks/readme/credits.htm
HREF12
http://www.scism.sbu.ac.uk/jfl/jflintro.html
HREF13
http://www.mindview.net/ThinkingInCPP2e.html
HREF14
http://www.mindview.net/TIJ2/index.html
HREF15
http://www.datadescription.com/ProgramLive/
HREF16
http://www.scifac.ru.ac.za/compilers/

Copyright

John English, © 2001. The authors assign to Southern Cross University and other educational and non-profit institutions a non-exclusive licence to use this document for personal use and in courses of instruction provided that the article is used in full and this copyright statement is reproduced. The authors also grant a non-exclusive licence to Southern Cross University to publish this document in full on the World Wide Web and on CD-ROM and in printed form with the conference papers and for the document to be published on mirrors on the World Wide Web.