The Web in your briefcase, or, experiences with a portable Web


Ken Yap, CSIRO Division of Information Technology, Locked Bag 17, North Ryde. Email: ken@syd.dit.csiro.au. Home page: Ken Yap
Keywords: WorldWideWeb, Web server, HTTP, Unix, Networking, SLIP

Abstract

Around mid-1995 our marketing manager made a request to have a notebook computer with a self-contained Web for demonstrations "on the road". This would be a notebook computer with a Web client and a Web server containing various pages captured from the real World Wide Web, serving up these pages as if the notebook were connected to the Internet. The various tribulations encountered in the installation and testing of the software and hardware, and their resolution, reveal details about how the Web protocols and various Web software work. They also demonstrate how information and services are increasingly interdependent as they get tied into the World Wide Web.

How it started

Around mid-1995 our marketing manager made a request to have a notebook computer with a self-contained Web for demonstrations "on the road". He would use this notebook to run Web demonstrations to convince industry of the benefits of Electronic Commerce. Demonstrations would be held in places where no easy phone access, let alone network access, to our office computers was available. Therefore the notebook had to be a standalone subset of the World Wide Web. It was not deemed feasible to do the demonstration from files alone, because we wanted to use real live examples of pages from the Web, which change daily. It would take too much time to change all the references from HTTP URLs into FILE URLs and verify the links. The Web pages had to be used as-is. The notebook had to have graphics that could be projected onto a screen with an LCD panel. This meant a windowing system and a GUI browser.

Configuration

The required configuration of the notebook boiled down to a Web client talking to a proxy server with a cache. This meant the operating system had to be multitasking. At that time Web browsers such as Netscape did not have caches fit for our purpose, so we needed a caching proxy. We also thought it would be useful to create some server-side scripts for some demonstrations.

Microsoft Windows was deemed unsuitable because of its poor multitasking. We didn't want to delve into Windows NT as we had no experience with it at this site at the time. So Unix remained the only serious choice.

The initial plan was to get a Sparc notebook and run SunOS on it. This had the advantage that we are familiar with SunOS because it is what we use on our desktop workstations. We have found it very easy to make Web tools work on Sun workstations. Unfortunately Sparc notebooks are not cheap, so we took the alternative, which was to buy an Intel x86 notebook and run Solaris x86 or one of the free x86 Unixes on it.

The hardware chosen was an IBM Thinkpad 755CE [HREF1] with a transmission LCD screen. It has a built-in 800 MB disk and an expansion connector for a docking station. In retrospect, this was a good choice because of the quality and reliability of the hardware. In particular, the transmission LCD screen is bright and responds faster to image changes, which makes it possible to see the cursor when it moves, unlike the older LCD technology.

A Thinkpad and docking station were ordered. The docking station was to be used to connect network cards or SCSI devices, in particular the CDROM drive for installing the OS. We decided to install Solaris x86 2.3. For Mosaic we needed the Motif libraries. We had a source licence for Motif, but it required a non-GNU C compiler to build, so we ordered a copy of the Sparcworks C compiler. For httpd, we knew that it would build with gcc without any problems.

Installation problems

The Thinkpad arrived first; the docking station was backordered. Since we had to have a docking station to bring up the operating system from a SCSI CDROM drive, we were grateful when the distributors lent us a docking station for the installation.

The first problem encountered was trying to make a boot floppy for Solaris x86, which would then allow us to load the rest of the operating system from the CDROM. The one supplied was not sufficiently recent to recognise the Thinkpad 755CE as the model had been released recently. We then made an attempt to boot Solaris x86 over the network but this didn't work because we didn't have any other Solaris machines on site to be the boot server.

We had only the weekend to do the installation, because the docking station had to be returned soon. As time was running short, and we would still have had to install the compilers even if we got Solaris up and running, it was decided to install Linux [HREF2], a free Unix operating system, instead.

We still weren't on the home stretch because the 2.88 MB floppy drive on the Thinkpad was sufficiently novel that older versions of Linux did not recognise it. A recent release of Linux solved the problem. After some hassles making a boot disk we finally got Linux up and running. Around the same time it was decided to ditch Mosaic in favour of Netscape, for which a Linux binary was available.

Problems with running a standalone Web

Our setup was in principle simple. While the notebook was connected to the LAN for fetching interesting pages off the Web, our proxy server would be chained onto the main proxy server in the office, and this would ensure that only one copy of any particular document would be fetched across the whole office. When the notebook was disconnected from the LAN ("on the road"), it would serve pages from its cache.

Diagram 1: Configuration of our browser and proxy

The first thing we forgot was that Domain Name System (DNS) lookups no longer work in standalone mode. We had no desire or need to set up a DNS server on the notebook, so we changed all references to the domain name of the Thinkpad to "localhost".

We also needed to configure the proxy so that when a document is not in the cache it returns an error to the client instead of waiting for a network connection to time out. Fortunately this option had thoughtfully been put in CERN httpd [HREF3] already. (Network clients cannot tell the difference between a host that is slow to respond to the initial connection and one that is unreachable until the timeout happens, at around 30 seconds [Stevens 94].) Since this mode of operation requires a different configuration from the one used when connected to a LAN, we had to detect whether or not the notebook was connected to the LAN. This we did by sending a "ping" to a well-known host on the office LAN and using the result to switch configuration files. This is done automatically at bootup to reduce the number of things the user needs to remember to do. The downside is that if the network cable is reconnected the notebook needs to be rebooted to make it detect the LAN. However, booting is very quick anyway.
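
A bootup check of this kind might be sketched as follows. This is an illustration in Python, not the script used on the notebook: the host name, port and configuration file paths are hypothetical, and a TCP connection attempt stands in for the ICMP "ping" because sending ICMP packets from Python needs raw sockets.

```python
import socket

# Hypothetical names; none of them come from the original setup.
ONLINE_CONF = "/etc/httpd/online.conf"          # proxy chained to the office proxy
STANDALONE_CONF = "/etc/httpd/standalone.conf"  # serve from cache only, fail fast
PROBE_HOST = "gateway.example.org"
PROBE_PORT = 80


def lan_reachable(host=PROBE_HOST, port=PROBE_PORT, timeout=2.0):
    """Return True if a TCP connection to the probe host succeeds quickly.

    The short timeout matters: without it the check would wait for the
    full connection timeout when the notebook is disconnected.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and failed name lookups.
        return False


def choose_config(probe=lan_reachable):
    """Pick the configuration file according to the probe result."""
    return ONLINE_CONF if probe() else STANDALONE_CONF
```

A boot script could call choose_config() once and start the server with the file it returns. Passing the probe in as a parameter keeps the decision logic testable without a network.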

Usage Problems

The garbage collector built into CERN httpd gave us grief because it kept throwing out pages that we had cached. Sometimes sites set rather short expiry times on their pages to ensure that their audience doesn't retain out-of-date information. Other times the proxy sets a short expiry time because the end server does not send an Expires header line with the document, and so the proxy has to make an estimate from the Last-Modified header. So a document that has been recently modified will have a short expiry time. In any event this behaviour was especially unfortunate on the eve of an important presentation. It is not always easy to ascertain when a page is due for expiry, so to avoid nasty surprises we turned off garbage collection. This meant that the conditional GET invoked from normal browsing is the only way to update pages, so pages which are deleted from the server site are never cleaned out. We haven't run out of disk space, so it's not a problem yet.
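
The estimate from the Last-Modified header can be illustrated with a small calculation. The sketch below assumes the proxy sets the lifetime to a fixed fraction of the document's age at fetch time; the factor of 0.1 is an illustrative assumption, not a value taken from our configuration.

```python
from datetime import datetime, timedelta

# Assumed fraction of a document's age used as its cache lifetime.
LM_FACTOR = 0.1


def estimate_expiry(fetched, last_modified, expires=None, factor=LM_FACTOR):
    """Return the time at which a cached copy is considered stale.

    If the end server sent an Expires time, use it unchanged.
    Otherwise guess: a document that has not changed for a long time
    is likely to stay unchanged, so give it a proportionally long life.
    """
    if expires is not None:
        return expires
    age = fetched - last_modified
    if age < timedelta(0):
        age = timedelta(0)  # clock skew: treat as just modified
    return fetched + age * factor
```

Under these assumptions a page fetched ten days after its last modification would be kept for one day, while a page modified an hour before the fetch would expire after six minutes, which shows why recently modified pages leave the cache so quickly.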

Results of CGI scripts are not cacheable because the CERN server does not attempt to map the result of CGI URLs to cached documents, and performs a fetch from the server anyway. Unfortunately for us, many sites elect to use dynamic pages for various reasons, ranging from being able to generate the page from the most recent information to being able to customise the look based on the identity of the user and maybe the phase of the moon. We could not cache these pages, so we avoided them in presentations. It is not entirely useless to cache the results of CGI scripts, for example a query to the Australian White Pages [HREF4], and in fact Netscape does do this. Future proxy servers should be more diligent in this regard.

Imagemaps don't work because the client sends an (x,y) position on the image to the server and the server uses that information to decide what document to send in return. When running standalone there is no end server available to transact with. We had to be careful during demos not to click on imagemaps embedded in pages. Good page designs provide text links as alternatives to imagemap links, and such text links are essential for users with text-only browsers, but we encountered some oversights.
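
To see why the end server is indispensable here, consider a sketch of what a server-side imagemap handler does. The table of regions lives on the server; the client only sees the image and sends back coordinates. The regions and URLs below are invented for illustration.

```python
# Each region: (left, top, right, bottom, target URL). Hypothetical data.
REGIONS = [
    (0, 0, 99, 49, "/products.html"),
    (100, 0, 199, 49, "/contact.html"),
]
DEFAULT_URL = "/index.html"


def resolve_click(x, y, regions=REGIONS, default=DEFAULT_URL):
    """Return the document the server would send for a click at (x, y)."""
    for left, top, right, bottom, url in regions:
        if left <= x <= right and top <= y <= bottom:
            return url
    return default
```

Because the table is never sent to the client, neither the browser nor the proxy can work out the answer to a click on its own.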

When a URL is a directory, HTTP allows the server to choose the document name that the directory URL is mapped to. Traditionally this is index.html, but it differs from server to server and even from site to site. As the proxy does not have access to the actual name of the file transferred, the proxy saves the returned contents under a special name in the cache. Then when the notebook is running standalone, the translation cannot be made from the directory URL to this special name. This could be a bug in the proxy server. We solved this in some cases by editing the bookmark to remember the file URL and not the directory. At other times, when we could not guess the original URL, we just had to forget about using that URL. This limitation was particularly irritating as there was no simple way to find out if a page had been cached other than to reboot the machine in standalone mode and try to access it.
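
Guessing the file behind a directory URL amounts to trying a handful of conventional names. A sketch follows; the list of candidate names is an assumption based on common server conventions and is by no means exhaustive.

```python
from urllib.parse import urljoin

# Conventional default document names; an illustrative, incomplete list.
CANDIDATE_NAMES = ["index.html", "index.htm", "Welcome.html", "home.html"]


def candidate_file_urls(directory_url, names=CANDIDATE_NAMES):
    """List explicit file URLs that a directory URL might stand for."""
    if not directory_url.endswith("/"):
        directory_url += "/"
    return [urljoin(directory_url, name) for name in names]
```

Each candidate could then be tried while connected to the LAN, and whichever one succeeds stored in the bookmark in place of the directory URL.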

We found that Netscape (version 2.0b6) can be too diligent in caching pages. This means that one may still be receiving old information even though one has pressed the reload button. It seems Netscape has implemented an aggressive and sometimes incorrect optimisation, and the cure is to press Shift-Reload to force a fetch from the server.

And as always, there was the problem that, because network connections are unreliable, interesting URLs were sometimes unavailable for fetching on the day before a presentation. Fortunately on something as large as the WWW, there is usually an alternative available.

Enhancements

At some point we got tired of not being able to access pages that could not be captured, so we bought a 28.8 kbit/s PCMCIA (Personal Computer Memory Card International Association, although PCMCIA is used for more than memory extension cards now) modem so that we could be "online" via a SLIP link when a phone line is handy. This presented its own set of problems.

First we had to get PCMCIA configuration manager software for Linux. Fortunately this software (cardmgr) was available and worked well. Then we configured a script to dial up our terminal server and start a SLIP session. This went well, but we found that sometimes the modem would not initialise. We suspected everything: the software, the hardware settings. It was only after days of frustration that we noticed that login worked well when the modem was cold. A call to the manufacturers of the modem ascertained that some functions would not work if the modem got too hot. So while they develop a fix, we have developed the habit of pulling out the modem to cool it if we need to restart the hardware.

The HTTP server is normally run from the operating system bootup scripts. When the marketing manager dials in and makes a SLIP connection via the modem, the server needs to be reconfigured as if the notebook were connected to our office LAN (and indeed, except for the much reduced speed, it might as well be). Since the HTTP server runs with superuser permissions and we did not wish to grant superuser privileges to everyone, we wrote a shell script that is invoked from a captive account with superuser privileges (a captive account is one that does not give the user an interactive session but runs a predetermined script). The marketing manager "logs in" to this captive account and the HTTP server gets restarted.

The SLIP link works well and on one occasion was used on a Melbourne to Sydney dialin. When the line went down it was actually some time before it was noticed because the proxy server continued to serve out of its cache.

Experiences in use

The first time the notebook was used for a business presentation, I tagged along as a "roadie" (a person who sets up equipment for a travelling performance). This was fortunate because we had a series of hiccups. One was to do with bookmarks. The marketing manager could not find the bookmarks he had laboriously saved. It turned out that Netscape had been run from my account rather than the marketing manager's, and our bookmark files were slightly different. The lesson: don't run a demo as a different user from the one the demo was set up for, even if you think the configurations are identical.

Another hiccup was due to the fact that we had set up the window system (the X Window System) so that we could pan a 640x480 pixel window, which was all that could be displayed on the LCD screen, over a larger virtual window (1024x1024), which was what the video card could actually support. Although it seemed a good way to have more screen real estate at one's disposal, it was distracting to see the window pan when the cursor hit the display edge. We then disabled virtual windows, and furthermore we now start Netscape with a geometry slightly smaller than the screen size so that everything can be seen without panning. This limited the amount of material displayed but was more acceptable. However, we encountered a Netscape bug where the bookmark pulldown menu doesn't take the smaller screen into account and bookmarks fall outside the screen. Fortunately the bookmark editor is still accessible from another menu.

Later we set up a home page for our marketing manager that was a link to his list of bookmarks. Whenever he wanted his bookmarks sorted in alphabetical order he ran a Perl script that we wrote to do this. He could then give a presentation directly from his saved bookmarks. This procedure is no longer necessary as recent versions of Netscape allow nested bookmarks (which appear as cascading menus from the bookmark pulldown menu). This feature is very handy for setting up different presentations for different audiences.
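
The alphabetical sort mentioned above is simple to express. The sketch below is in Python rather than Perl and is not the script we wrote; it assumes the bookmarks have already been read into (title, URL) pairs, and parsing the Netscape bookmark file itself is left out.

```python
from html import escape


def bookmarks_to_html(bookmarks):
    """Render (title, url) pairs as an HTML list sorted by title.

    Sorting ignores case so that "apple" and "Apple" sort together.
    """
    ordered = sorted(bookmarks, key=lambda pair: pair[0].lower())
    items = [
        '<li><a href="%s">%s</a></li>' % (escape(url, quote=True), escape(title))
        for title, url in ordered
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"
```

The resulting list could be saved as the page that the home page links to.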

The notebook has been very useful for taking demonstrations on the road. It has been connected to LCD projection panels to display to large audiences. We have even translated a PowerPoint presentation into a set of Web pages using a homebrew PowerPoint to HTML translator [HREF5] written in Perl.

Finally, our difficulties with setting up a server with pages that are usable in standalone mode highlight the fact that in a Web connected world it is increasingly difficult for information and services to stand on their own, without ties to any other computers. In the future, no computer can be an information island. This is the world that our notebook island is an interim step towards.


Acknowledgements

Thanks to everybody who helped, especially: Douwe Lovius who lent us the docking station for installing software on the Thinkpad, Jeremy Fitzhardinge who gave some useful advice at a critical moment on how to make Linux boot, Bill Simpson-Young with his helpful advice on Web related matters and proofreading of this paper, and Phil McCrea who was the long-suffering user.

References

Stevens 94
TCP/IP illustrated. Vol 1: The protocols. W. Richard Stevens. Addison-Wesley, 1994. ISBN 0201633469.

Hypertext References

HREF1
http://www.pc.ibm.com/thinkpad/index.html - IBM Thinkpad pages
HREF2
http://www.linux.org - Linux home page
HREF3
http://www.w3.org/hypertext/WWW/Daemon - Documentation on CERN (now W3ORG) http daemon
HREF4
http://www.whitepages.com.au - Online Australian Telephone White Pages
HREF5
ftp://ftp.syd.dit.csiro.au/pub/ken/ppttohtml.pl - PowerPoint to HTML translator written in Perl.

Copyright

Ken Yap (CSIRO Australia) ©, 1996. The author assigns to Southern Cross University and other educational and non-profit institutions a non-exclusive licence to use this document for personal use and in courses of instruction provided that the article is used in full and this copyright statement is reproduced. The author also grants a non-exclusive licence to Southern Cross University to publish this document in full on the World Wide Web and on CD-ROM and in printed form with the conference papers, and for the document to be published on mirrors on the World Wide Web. Any other usage is prohibited without the express permission of the author.

AusWeb96 The Second Australian WorldWideWeb Conference "ausweb96@scu.edu.au"