Some simple uses of this are a "slide show" [HREF 5], where an application requests Mosaic to display a sequence of URLs, with a simple scripting language to specify such things as display times, and xwebteach [HREF 6], a master/slave coupling of instances of Mosaic so that a "teacher" can fetch various URLs on the master instance and have them automatically shown on the "student" slaves.
Two things that are specifically not in CCI are: client side execution of applications, and control of the GUI (Graphical User Interface) components of Mosaic. So, for example, while the slaves can show the same URL as the master in xwebteach, other aspects of the interaction such as paging through a document or highlighting sections of a document cannot be passed from the master to the slaves.
This paper reports on an extension of the CCI specification and its API that allows control of the GUI components of Mosaic. For the X Window System (Scheifler et al., 1988) this is done by using an object-oriented approach based on the replayXt library [HREF 7] (Newmarch, 1994). The replayXt system is designed for automated demonstrations or tests of Xt-based software systems using script files.
The extensions to CCI and replayXt allow direct control of the interface elements of Mosaic for X by the same protocol that can control the URLs displayed, and can allow an instance of Mosaic to deliver its GUI actions to an application. This allows, for example, a more sophisticated version of xwebteach to be built that also tracks highlighting and paging through a document.
The structure of this paper is as follows: The next section looks at the CCI protocol and its API. The following section details the principles behind the replayXt system and how it can be used. The next three sections consider the extensions made to both of these to allow GUI control within the CCI specification, and changes made to Mosaic to work with replayXt. This is followed by an example of use. The replayXt system actually uses the command language tcl (Ousterhout, 1994), which is a full programming language with the ability to run arbitrary applications. This introduces potential security problems on the Mosaic side, and the next section details how these are dealt with. Possible extensions that can bring this functionality to Microsoft and Apple versions of Mosaic are then discussed. Finally the paper offers some conclusions on the CCI extensions.
Although the emphasis within this paper is on user interface controls, the implementation can easily be extended to allow execution of programs from the Mosaic side. This is not dealt with here as it involves a set of slightly different issues to those of the user interface.
There is an API [HREF 8] and a library to implement the CCI protocol, and an application gains CCI functionality by using this API and linking in the CCI library. The only Web browser that currently supports CCI is NCSA Mosaic for X, version 2.5. The library is currently in an experimental stage.
The widgets use an object-oriented approach in their behaviour. User events
are mapped onto actions by a translation table. There is a
default translation table per widget, and for the Motif PushButton this
maps Button1Down to the action "Arm()" and Button1Up to the action
"Activate()". An action table that is
hard-coded into each widget associates
a C function with each action, and it is this C function that is responsible
for both changes in appearance such as appearing pressed in, and also for
calling application-specific code via callbacks.
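The two-table dispatch described above can be modelled in a few lines of Python. This is only a toy sketch, not Xt or Motif code: the class `PushButton`, its methods and the `armed` flag are hypothetical stand-ins, and resetting `armed` inside the Activate action is a simplification made for the example.

```python
from typing import Callable, Dict, List


class PushButton:
    """Toy stand-in for a widget; not the Motif implementation."""

    # Default translation table: event name -> action name.
    translations: Dict[str, str] = {
        "Button1Down": "Arm",
        "Button1Up": "Activate",
    }

    def __init__(self, name: str) -> None:
        self.name = name
        self.armed = False
        self.callbacks: List[Callable[["PushButton"], None]] = []
        # Action table: action name -> function implementing the action.
        self.actions: Dict[str, Callable[[], None]] = {
            "Arm": self._arm,
            "Activate": self._activate,
        }

    def _arm(self) -> None:
        self.armed = True  # appearance change: draw the button pressed in

    def _activate(self) -> None:
        self.armed = False  # toy simplification: release the button here
        for callback in self.callbacks:  # application-specific code
            callback(self)

    def call_action(self, action: str) -> None:
        """Invoke an action by name, without any event being delivered."""
        self.actions[action]()

    def handle_event(self, event: str) -> None:
        """Map an event to its action through the translation table."""
        action = self.translations.get(event)
        if action is not None:
            self.call_action(action)
```

Calling `call_action()` directly, rather than `handle_event()`, corresponds to driving a widget by action name instead of by synthesised events.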
The replayXt library is an interface to this that reads a script file of actions and calls XtCallActionProc() repeatedly. This is at a higher level than scripting systems for X that generate low-level X events and send them to the application. It is, however, sometimes necessary to include X event information such as x, y coordinates of the mouse so that the widget can "verify" that the action occurred within its boundaries. A typical piece of script to press and release a button with a 500 millisecond delay would be
There are two main components to replayXt: the first acts as a player that reads the action script and makes the widgets act. The action script may be either a text file or be messages from another application sent using the "send" mechanism of Tk (Ousterhout, 1994), or the port of "send" to Xt [HREF 9] . The other component is a "record" mechanism that tracks widget actions as they occur in an interactive system, using XtAppAddActionHook(), and writes them to a text file. This record mechanism can allow quick generation of replay scripts, but can lose a little of the object-oriented possibilities of the replay side. The next section discusses this in more detail.
tcl has full language capabilities of loops, procedural abstraction, etc. One use of loops is to run a demonstration multiple times, or indefinitely. The use of procedural abstraction allows common sequences of actions such as "press a button, pause, release the button" to be captured as procedures.
In addition to this, a command has been added by replayXt
to allow X resource values to
be extracted from a widget. This allows "presentation independent" action
sequences to be written. For example, X applications may be resized by the
user or have their default configuration altered by the
user setting resources.
These two instances of Mosaic differ in colours, fonts
and placement of scrollbars, entirely controlled by user-set resources:
To see how these kinds of variation can be handled, consider the problem of ensuring that a button press always occurs within a widget. It is necessary to first get the widget resources of width and height, perform a calculation to say locate the centre of this, and then call the Arm() action with x and y set to these values. This can be easily done with scripts prepared by hand, using procedures as in
GET {URL|URN} <url> [OUTPUT {CURRENT|NEW|NONE}]
This forces Mosaic to resolve the URL, with additional parameters to
control the display. To get Mosaic to execute a tcl script of actions, a
suitable
client message from the application to Mosaic is
EXECUTE Content-Type: replayXt Content-Length: <size> <script>
This sends a message of arbitrary length, whose Content-Type tells Mosaic how to deal with it. (Using different Content-Types can allow execution of other types of scripts, but this is beyond the scope of this paper.)
Execution of the tcl command will generate both a tcl command completion code and some result. These are wrapped together as "output" and returned to the client for any action it may choose to take based on this:
218 Execution output Content-Type: tcl-result Content-Length: <size> <output>
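As an illustration of the client side of this exchange, the following Python sketch frames a script as an EXECUTE request. The wire layout is an assumption made for the example (the keyword and each header on its own CRLF-terminated line, with Content-Length counting the bytes of the script), and `build_execute` is a hypothetical helper, not part of the CCI library.

```python
def build_execute(script: str, content_type: str = "replayXt") -> bytes:
    """Frame a script as an EXECUTE request.

    Assumed layout: keyword line, Content-Type line, Content-Length line,
    each ending in CRLF, followed by exactly Content-Length bytes of script.
    """
    body = script.encode("utf-8")
    header = (
        "EXECUTE\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"
    )
    return header.encode("ascii") + body
```

Computing the length from the encoded bytes rather than from the character count keeps the header correct if the script contains non-ASCII text.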
To make Mosaic tell an application which URLs it is fetching (or stop telling), CCI defines the requests
SEND ANCHOR
SEND ANCHOR STOP
To get Mosaic to send (or stop sending) the sequence of actions, suitable requests are
SEND ACTIONS
SEND ACTIONS STOP
The output from the browser as it sends each sequence is
303 Actions output Content-Type: replayXt Content-Length: <size> <script>
Each message should consist of a syntactically correct command or set of commands. The receiving application may do what it likes with this, such as store it or send it on to another instance of Mosaic for execution.
In the case of Mosaic for X, the script and the actions will be in the tcl language, using replayXt. For other environments there may be another suitable language or alternative library. This may be determined from the "Content-Type" field in the messages.
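A receiving application might use the Content-Type field along the following lines. This Python sketch assumes CRLF-terminated status and header lines (an assumption, as the exact framing is not given here); `CCIMessage`, `parse_message` and `dispatch` are hypothetical names introduced for the example.

```python
from typing import Callable, Dict, NamedTuple


class CCIMessage(NamedTuple):
    code: int
    text: str
    content_type: str
    body: bytes


def parse_message(raw: bytes) -> CCIMessage:
    """Parse '<code> <text>', two header lines, then the body.

    Assumed layout: status line, Content-Type line and Content-Length
    line, each CRLF-terminated; only Content-Length bytes of body are kept.
    """
    status, type_line, length_line, rest = raw.split(b"\r\n", 3)
    code, _, text = status.decode("ascii").partition(" ")
    headers: Dict[str, str] = {}
    for line in (type_line, length_line):
        name, _, value = line.decode("ascii").partition(":")
        headers[name.strip().lower()] = value.strip()
    size = int(headers["content-length"])
    return CCIMessage(int(code), text, headers["content-type"], rest[:size])


def dispatch(msg: CCIMessage, handlers: Dict[str, Callable[[bytes], object]]) -> object:
    """Hand the body to the handler registered for the message's Content-Type."""
    handler = handlers.get(msg.content_type)
    if handler is None:
        raise ValueError(f"no handler for Content-Type {msg.content_type!r}")
    return handler(msg.body)
```

Keying the handlers on Content-Type means that support for another scripting language can be added by registering one more handler, without touching the parser.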
There is also an API in C to support these additional components of the CCI protocol.
The API on the client side consists of two functions:
MCCISendExecute(MCCIPort serverPort, char *command,
char *contentType, (*callback)() )
MCCISendActions(serverPort,on,callBack,callBackData)
MCCISendExecute() sends a command to Mosaic of a specified Content-Type,
and registers a callback function to deal with any output from execution.
MCCISendActions() is a function that can toggle the sending of actions
from Mosaic.
MCCIExecute(MCCIPort client)
which at present only recognises commands of type replayXt and executes them. Sending of actions is handled by a function
MCCISendAction(client, action)
which returns actions to a requesting client.
Making the application act as a recorder is usually just a matter of calling RXt_StartRecorder(). This then outputs the actions to a file. In the case of Mosaic we want to send them on the TCP/IP port if sending actions is enabled. A reasonably general solution within replayXt is obtained by allowing an application to register an arbitrary function that takes the action string as parameter and is called on each action. For Mosaic this can then be a function that sends the action string to an application. This just required some internal reorganisation of the record code, and adding a function to replayXt to register the callback function.
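The registration idea can be sketched in Python as follows. This is a model of the design only: `Recorder`, `register_sink` and `on_action` are hypothetical names, since the replayXt function that performs the registration is not named in the text, and the action strings are placeholders.

```python
import io
from typing import Callable, Optional, TextIO

ActionSink = Callable[[str], None]


class Recorder:
    """Toy recorder: writes actions to a log unless a sink is registered."""

    def __init__(self, log: TextIO) -> None:
        self._log = log
        self._sink: Optional[ActionSink] = None

    def register_sink(self, sink: Optional[ActionSink]) -> None:
        """Install (or, with None, remove) the function called on each action."""
        self._sink = sink

    def on_action(self, action: str) -> None:
        """Called once per recorded widget action."""
        if self._sink is not None:
            self._sink(action)
        else:
            self._log.write(action + "\n")


log = io.StringIO()
recorder = Recorder(log)
recorder.on_action("first action")    # default behaviour: written to the log

sent = []                             # stands in for a network connection
recorder.register_sink(sent.append)
recorder.on_action("second action")   # now delivered to the registered sink
```

Because the sink is an arbitrary function of one string, the recorder itself needs no knowledge of CCI or of sockets.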
Security required other changes to replayXt. This is discussed later.
The replayXt library uses XtNameToWidget() to locate widgets from their name. In order to work properly, this requires each widget with the same parent to have a different name. Mosaic uses a general menu creation procedure which uses a menu specification, and this assigns the same name to all PushButtons in a PulldownMenu, the same name to all ToggleButtons, etc.
The widgets can all have sensible names, derived from the string that is displayed (although all "."s have to be removed to avoid confusing XtNameToWidget()). This change was made to the Mosaic Xmx functions, so that the menu buttons have names like "File", "Save As", etc., instead of just "pushbutton".
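The naming rule is simple enough to state as code. The Python sketch below is illustrative only (the actual change was made in Mosaic's C source); `widget_name` and `sibling_names` are hypothetical helpers, and the sample labels are assumptions.

```python
from typing import List


def widget_name(label: str) -> str:
    """Derive a widget name from its displayed label by dropping every '.'."""
    return label.replace(".", "")


def sibling_names(labels: List[str]) -> List[str]:
    """Name all children of one parent, insisting that the names are unique."""
    names = [widget_name(label) for label in labels]
    duplicates = sorted({name for name in names if names.count(name) > 1})
    if duplicates:
        raise ValueError(f"duplicate widget names under one parent: {duplicates}")
    return names
```

The uniqueness check reflects the requirement that widgets sharing a parent must have different names if lookup by name is to find the intended one.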
It is not common to hard-code translations, and indeed this would normally be bad practice as it removes the possibility of user customisation. However, Mosaic hard-codes the translation table for the HTML widget, due to an unknown bug. This causes some problems for the Record side of replayXt as it relies on the user being able to override the translation table in resource files. The Record action had to be hard-coded into Mosaic to overcome the "fix" to the bug. This involved changing the translation table so that entries such as
The majority of applications do not bother to track mouse motion, as it usually is of no interest. Mosaic does track mouse motion while the pointer is in the view widget area, as it needs to determine if it is over an anchor so that it can display the URL pointed to. This is done by an action "track-motion()". Usually, to record mouse motions with replayXt it is only necessary to override the Motion event. We must be careful here not to interfere with Mosaic tracking in the view widget, so the appropriate resource file entry is
Another possible use is for remote testing of Mosaic. The standard replayXt library allows a program to be run using a file that exists on the same filesystem, or to be controlled using "send" from another application that is using the same X server ("send" works by using properties on the X server). Since the CCI protocol only requires that both the client application and Mosaic be able to communicate by TCP/IP over a network, these restrictions are removed.
The addition of an EXECUTE mechanism to CCI carries the danger of what may be executed. The interpretation language used here is tcl, and this is a full programming language capable of reading, writing and erasing files. In addition, shell programs can be run from tcl scripts. This is a major security hole, as an unscrupulous client could send a command
This problem with tcl has already been recognised in the context of active mail and a solution found in safe-tcl (Borenstein, 1994). This consists of a two-layer system. The outer layer is a stripped-down "untrusted" version of tcl in which all potentially dangerous commands have been removed. This interpreter is used to evaluate the CCI scripts. There is another "trusted" layer, which consists of the full tcl language, and this is capable of performing any task. However, this is not accessible directly from CCI scripts.
For useful application-specific tasks to be performed by the untrusted layer, the trusted layer may define to it commands that will be passed to the trusted interpreter. These obviously have to be chosen very carefully! For example, in the original safe-tcl, access to the file system was only through commands such as SafeTcl_loadlibrary which will only access the file system for certain files.
The additional commands defined to the untrusted layer are those of replayXt, that is, sleep, callActionProc, etc. The original Tk commands and those related to mail handling are removed.
This ensures that no security hole comes from the untrusted tcl layer at all. There is only the question of how secure the trusted commands are. Any widget action can be invoked; this includes pressing any menu button. There is a "Save As" button on the file menu which invokes a dialog prompting for a file name. By choosing the right file name, there is a possible scenario in which an aggressive client loads a URL with contents
The danger occurs when the SaveAs dialog is invoked, a file name is chosen, and an attempt is made to press the "Ok" button.
To counteract this, the untrusted interpreter is not given direct access to the callActionProc() command of replayXt, but instead to a command that may examine the arguments before deciding what to do. This command is defined in the trusted interpreter on a per-application basis, and defaults to just calling the original callActionProc() of replayXt.
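The shape of such a guarded command can be modelled in Python. This is a sketch of the idea rather than the tcl implementation: `call_action_proc` merely records what it is asked to do, and the widget paths used in `redirect_ok_to_cancel` are hypothetical.

```python
from typing import Callable, List, Tuple

Call = Tuple[str, str]            # (widget path, action name)
Policy = Callable[[str, str], Call]

performed: List[Call] = []        # stands in for real widget actions


def call_action_proc(widget: str, action: str) -> None:
    """Stand-in for the trusted command that actually drives a widget."""
    performed.append((widget, action))


def pass_through(widget: str, action: str) -> Call:
    """Default policy: forward the request unchanged."""
    return widget, action


def redirect_ok_to_cancel(widget: str, action: str) -> Call:
    """Example per-application policy: never press the file dialog's Ok."""
    if widget == "*fileSelectionBox.OK":          # hypothetical widget path
        return "*fileSelectionBox.Cancel", action
    return widget, action


def make_guarded(policy: Policy = pass_through) -> Callable[[str, str], None]:
    """Build the command exposed to untrusted scripts."""
    def guarded(widget: str, action: str) -> None:
        call_action_proc(*policy(widget, action))
    return guarded
```

Untrusted scripts would only ever see the function returned by `make_guarded()`, so the policy is applied to every request and cannot be bypassed from the script side.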
For Mosaic, the callActionProc() command is set to the original version of callActionProc, except for the Ok button of the FileSelectionBox, where instead the Cancel button is pressed:
The Microsoft Windows environment does not possess a scripting language common to all applications. Visual Basic controls do, but I doubt if the current browsers are built using these.
Within X, there are different widget toolkits, such as Tk, Interviews and Fresco. Tk is already based on tcl, so there should be no problems in adding CCI control to this using the tcl-dp extension (tcl-dp??). Fresco is designed above CORBA (???) and there is a tcl-interface to CORBA used to control Fresco widgets. However, for other toolkits it may be necessary to design an appropriate interface for whatever scripting language is used, as well as adding the CCI protocol.
This does require an interface between the windowing toolkit and the scripting language. This technique should be fairly easy to extend to any toolkit where this is the case. Of course, there is still a high dependence in scripting commands on the particular toolkit. It would be non-trivial to build a "universal" scripting language for all window systems.
Security issues arise in the capabilities of the scripting language and also in the extra features that any particular application may add via this scripting language. Given that the CCI security itself is fairly weak, it is important for each application to cover holes carefully. The safe-tcl system allows enough control to do this in a simple manner for replayXt.
AusWeb95 The First Australian WorldWideWeb Conference