Extending the Common Client Interface with User Interface Controls


Jan Newmarch, Faculty of Information Science and Engineering, University of Canberra, PO Box 1, Belconnen, ACT 2616. Phone: +61 6 201 2422 Fax: +61 6 201 5231 Email: jan@ise.canberra.edu.au Home Page: Jan Newmarch [HREF 1]
Keywords: WorldWideWeb, CCI, Mosaic, Motif, Xt

Introduction

The Common Gateway Interface (CGI) [HREF 2] is a well-known method for creating HTML documents on the fly by running an application on the HTTP server side. More recently, NCSA Mosaic has introduced the Common Client Interface (CCI) [HREF 3], which allows an external application to control the Mosaic browser on the client side. An application can request such things as resolution and display of a URL, so that applications can cause documents to be displayed by the Mosaic browser. CCI is currently available only in Mosaic 2.5 for X [HREF 4].

Two simple uses of this are a "slide show" [HREF 5], where an application requests Mosaic to display a sequence of URLs, with a simple scripting language to specify such things as display times, and xwebteach [HREF 6], a master/slave coupling of instances of Mosaic so that a "teacher" can fetch various URLs on the master instance and have them automatically shown on the "student" slaves.

Two things that are specifically not in CCI are: client side execution of applications, and control of the GUI (Graphical User Interface) components of Mosaic. So, for example, while the slaves can show the same URL as the master in xwebteach, other aspects of the interaction such as paging through a document or highlighting sections of a document cannot be passed from the master to the slaves.

This paper reports on an extension of the CCI specification and its API that allows control of the GUI components of Mosaic. For the X Window System (Scheifler, et al, 1988) this is done by using an object-oriented approach which is based on the replayXt library [HREF 7] (Newmarch, 1994). The replayXt system is designed for automated demonstrations or tests of Xt-based software systems using script files.

The extensions to CCI and replayXt allow direct control of the interface elements of Mosaic for X by the same protocol that can control the URLs displayed, and can allow an instance of Mosaic to deliver its GUI actions to an application. This allows, for example, a more sophisticated version of xwebteach to be built that also tracks highlighting and paging through a document.

The structure of this paper is as follows: The next section looks at the CCI protocol and its API. The following section details the principles behind the replayXt system and how it can be used. The next three sections consider the extensions made to both of these to allow GUI control within the CCI specification, and changes made to Mosaic to work with replayXt. This is followed by an example of use. The replayXt system actually uses the command language tcl (Ousterhout, 1994), which is a full programming language with the ability to run arbitrary applications. This introduces potential security problems on the Mosaic side, and the next section details how these are dealt with. Possible extensions that can bring this functionality to Microsoft and Apple versions of Mosaic are then discussed. Finally the paper offers some conclusions on the CCI extensions.

Although the emphasis within this paper is on user interface controls, the implementation can easily be extended to allow execution of programs from the Mosaic side. This is not dealt with here as it involves a set of slightly different issues to those of the user interface.

Common Client Interface

The NCSA Mosaic Common Client Interface is a new experimental feature that allows external applications to communicate with running instances of Mosaic via TCP/IP. Applications can use Mosaic to fetch URLs or ask Mosaic to report to the application which URLs an interactive user has selected. This allows the slideshow program to work by an application reading a script file of URLs and asking an instance of Mosaic to display them. The xwebteach application works by asking the master to report all URLs the teacher has selected, and then telling all the slaves to fetch and display the same URLs.

There is an API [HREF 8] and a library to implement the CCI protocol, and an application gains CCI functionality by using this API and linking in the CCI library. The only Web browser that currently supports CCI is NCSA Mosaic for X, version 2.5. The library is currently in an experimental stage.

Replaying Xt Applications

X Window System Toolkit

Applications such as Mosaic for X that are built using the X Window System Toolkit (Xt, the Intrinsics) use interface elements called widgets (Asente and Swick, 1990). Typical Motif widgets are PushButtons, Labels, Lists and containers such as Form and RowColumn (Open Software Foundation, 1993). Widgets are controlled by an event loop mechanism that looks for user events such as button presses and dispatches the events to the appropriate widgets. The widget takes an action based on this, so that a Motif PushButton appears to sink into the surface when the left mouse button is pressed over it.

The widgets use an object-oriented approach in their behaviour. User events are mapped onto actions by a translation table. There is a default translation table per widget, and for the Motif PushButton this maps Button1Down to the action "Arm()" and Button1Up to the action "Activate()". An action table that is hard-coded into each widget associates a C function to each action, and it is this C function that is responsible for both changes in appearance such as appearing pressed in, and also for calling application-specific code via callbacks.

replayXt

The public interface to the widgets is defined by the set of actions. The actions are in fact strings, and these are related to the internal C functions of the widget by means of the action table. There is also a function, XtCallActionProc() which takes the action as parameter and calls the internal C function itself. To control the behaviour of an Xt widget calls need only be made to this function.
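The relationship between the translation table, the action table and XtCallActionProc() can be modelled in a few lines. The following Python sketch is illustrative only: it does not use Xt, and the class and method names are hypothetical. It merely mirrors the two table lookups described above, with call_action() standing in for XtCallActionProc().

```python
class PushButton:
    """Toy model of a widget (hypothetical; not the Motif implementation)."""

    # Default translation table: user event -> action name (a string).
    translations = {"Button1Down": "Arm()", "Button1Up": "Activate()"}

    def __init__(self, on_activate):
        self.armed = False
        self.on_activate = on_activate  # application-specific callback
        # Action table: action name -> internal function of the widget.
        self.actions = {"Arm()": self._arm, "Activate()": self._activate}

    def _arm(self):
        self.armed = True  # the button appears pressed in

    def _activate(self):
        self.armed = False
        self.on_activate()  # hand control to application code

    def dispatch(self, event):
        """Event-loop path: map the event to an action, then run it."""
        action = self.translations.get(event)
        if action is not None:
            self.actions[action]()

    def call_action(self, action):
        """Scripted path: run an action by name, bypassing the event."""
        self.actions[action]()
```

A script player only needs the second path: it never synthesises an event, it just names the widget and the action.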

The replayXt library is an interface to this that reads a script file of actions and calls XtCallActionProc() repeatedly. This is at a higher level than scripting systems for X that generate low-level X events and send them to the application. It is, however, sometimes necessary to include X event information such as x, y coordinates of the mouse so that the widget can "verify" that the action occurred within its boundaries. A typical piece of script to press and release a button with a 500 millisecond delay would be

callActionProc rowcol.button Arm()
sleep 500
callActionProc rowcol.button Activate() \
    -type ButtonRelease \
    -x 20 -y 10

There are two main components to replayXt: the first acts as a player that reads the action script and makes the widgets act. The action script may be either a text file or messages from another application sent using the "send" mechanism of Tk (Ousterhout, 1994), or the port of "send" to Xt [HREF 9]. The other component is a "record" mechanism that tracks widget actions as they occur in an interactive system, using XtAppAddActionHook(), and writes them to a text file. This record mechanism can allow quick generation of replay scripts, but can lose a little of the object-oriented possibilities of the replay side. The next section discusses this in more detail.

Tcl Command Language

The scripting language used by replayXt is the tcl command language (Ousterhout, 1994). This is a convenient language to use for many reasons: it can be embedded into arbitrary applications; source code is freely available; it is actively maintained with a large user group; there are a large number of applications using tcl as command language; there is an active newsgroup comp.lang.tcl [HREF 10].

tcl has full language capabilities of loops, procedural abstraction, etc. One use of loops is to run a demonstration multiple times, or indefinitely. The use of procedural abstraction allows common sequences of actions such as "press a button, pause, release the button" to be captured as procedures.

In addition to this, a command has been added by replayXt to allow X resource values to be extracted from a widget. This allows "presentation independent" action sequences to be written. For example, X applications may be resized by the user or have their default configuration altered by the user setting resources. Two instances of Mosaic can differ in colours, fonts and placement of scrollbars, entirely controlled by user-set resources.

To see how these kinds of variation can be handled, consider the problem of ensuring that a button press always occurs within a widget. It is necessary to first get the widget resources of width and height, perform a calculation to locate, say, the centre of the widget, and then call the Arm() action with x and y set to these values. This can be easily done with scripts prepared by hand, using procedures as in

proc buttonClick {widget} {
    getValues $widget -width w -height h
    set mid_x [expr {$w / 2}]
    set mid_y [expr {$h / 2}]

    # move pointer to middle
    warpPointer $widget $mid_x $mid_y

    # press
    callActionProc $widget Arm() \
        -type ButtonPress \
        -button Button1 \
        -x $mid_x -y $mid_y
    sleep 500

    # release
    callActionProc $widget Activate() \
        -type ButtonRelease \
        -x $mid_x -y $mid_y
}

This is not so easy with scripts generated automatically by the record side of replayXt, so such scripts are less resilient to user configuration.

Extending CCI

Protocol

To make Mosaic fetch a URL, CCI defines the request
GET {URL|URN} <url> [OUTPUT {CURRENT|NEW|NONE}]
This forces Mosaic to resolve the URL, with additional parameters to control the display. To get Mosaic to execute a tcl script of actions, a suitable client message from the application to Mosaic is
EXECUTE 
Content-Type: replayXt
Content-Length: <size>
<script>
This sends a message of arbitrary length, whose Content-Type tells Mosaic how to deal with it. (Using different Content-Types can allow execution of other types of scripts, but this is beyond the scope of this paper.)
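As a minimal sketch of how a client might frame such a message, the following Python function assembles the EXECUTE request. It assumes CRLF line terminators, ASCII text, and that the script follows directly after the Content-Length line; none of these details is fixed by the description above, and the function name is hypothetical.

```python
def build_execute(script: str, content_type: str = "replayXt") -> bytes:
    """Frame a script as an EXECUTE request (line endings are an assumption)."""
    body = script.encode("ascii")
    header = (
        "EXECUTE\r\n"
        f"Content-Type: {content_type}\r\n"
        f"Content-Length: {len(body)}\r\n"  # length of the script in bytes
    )
    return header.encode("ascii") + body
```

Counting the length on the encoded bytes rather than on the string keeps the header correct if a later Content-Type carries non-ASCII text and the encoding is changed.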

Execution of the tcl command will generate both a tcl command completion code and some result. These are wrapped together as "output" and returned to the client for any action it may choose to take based on this:

218 Execution output
Content-Type: tcl-result
Content-Length: <size>
<output>

To make Mosaic tell an application which URLs it is fetching (or stop telling), CCI defines the requests

SEND ANCHOR
SEND ANCHOR STOP
To get Mosaic to send (or stop sending) the sequence of actions, suitable requests are
SEND ACTIONS
SEND ACTIONS STOP
The output from the browser as it sends each sequence is
303 Actions output
Content-Type: replayXt
Content-Length: <size>
<script>
Each message should consist of a syntactically correct command or set of commands. The receiving application may do what it likes with this, such as store it or send it on to another instance of Mosaic for execution.
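A receiving application has to split such a message into its status line, headers and script. The sketch below shows one way to do so in Python, under the same assumptions as the request format shown above (CRLF line terminators, ASCII text, body immediately after Content-Length); the function name is hypothetical and error handling is omitted.

```python
def parse_reply(data: bytes):
    """Split a reply into (code, text, content_type, body).

    Assumes three CRLF-terminated header lines followed by the body.
    """
    status_line, ctype_line, clen_line, rest = data.split(b"\r\n", 3)
    code, _, text = status_line.decode("ascii").partition(" ")
    content_type = ctype_line.decode("ascii").split(":", 1)[1].strip()
    length = int(clen_line.decode("ascii").split(":", 1)[1])
    # Content-Length bounds the body; anything after it belongs to the
    # next message on the connection.
    return int(code), text, content_type, rest[:length].decode("ascii")
```

The Content-Type value returned here is what lets the application decide whether the body is a replayXt script or a tcl result.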

In the case of Mosaic for X, the script and the actions will be in the tcl language, using replayXt. For other environments there may be another suitable language or alternative library. This may be determined from the "Content-Type" field in the messages.

API

There is also an API in C to support these additional components of the CCI protocol.

The API on the client side consists of two functions:

MCCISendExecute(MCCIPort serverPort, char *command,
                char *contentType, (*callback)() )
MCCISendActions(serverPort,on,callBack,callBackData)
MCCISendExecute() sends a command to Mosaic of a specified Content-Type, and registers a callback function to deal with any output from execution. MCCISendActions() is a function that can toggle the sending of actions from Mosaic.
The API on the Mosaic side defines a function
MCCIExecute(MCCIPort client)
which at present only recognises commands of type replayXt and executes them. Sending of actions is handled by a function
MCCISendAction(client, action)
which returns actions to a requesting client.

Extending replayXt

To turn an application such as Mosaic into a player, it is usually only necessary to make a call to RXt_RegisterPlayer(). This will allow the application to be driven from a script in a file or using the "send" protocol. However, the CCI library uses TCP/IP on an arbitrary port. A function on the server (Mosaic) side, MCCIHandleInput(), needs to be modified to recognise the new commands, and then call a function that will execute a script passed in as a string. This can be done very easily by passing the string to the tcl function Tcl_Eval(). A new entry point, RXt_DoCommand(), has been added to the replayXt library for this purpose; it calls Tcl_Eval().

To make the application act as a recorder is usually just a matter of calling RXt_StartRecorder(). This then outputs the actions to a file. In the case of Mosaic we want to send this on the TCP/IP port if sending actions is enabled. A reasonably general solution within replayXt is obtained by allowing an application to register an arbitrary function that takes the action string as parameter and is called on each action. For Mosaic this can then be a function that sends the action string to an application. This just required some internal reorganisation of the record code, and adding a function to replayXt to register the callback function.
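The callback arrangement can be illustrated with a small Python sketch. This is not the replayXt code, and the class and method names are hypothetical; it only shows the idea of a recorder that hands each action string to whatever function the application has registered, so that the same record path can feed a file or a network connection.

```python
class Recorder:
    """Toy recorder: passes each action string to registered functions."""

    def __init__(self):
        self._sinks = []

    def register(self, fn):
        """Register a function taking the action string as its parameter."""
        self._sinks.append(fn)

    def record(self, action: str):
        """Called once per widget action as it occurs."""
        for fn in self._sinks:
            fn(action)


# Usage: one sink collects actions as a file writer would, another stands
# in for the function that forwards actions to a CCI client.
log, sent = [], []
recorder = Recorder()
recorder.register(log.append)
recorder.register(lambda a: sent.append(a.encode("ascii")))
recorder.record("callActionProc rowcol.button Arm()")
```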

Security required other changes to replayXt. This is discussed later.

Fixing Mosaic

Mosaic was not designed with the intent that it should be controlled by other applications. Consequently, some of the implementation needs to be modified to allow this.

The replayXt library uses XtNameToWidget() to locate widgets from their name. In order to work properly, this requires each widget with the same parent to have a different name. Mosaic uses a general menu creation procedure which uses a menu specification, and this assigns the same name to all PushButtons in a PulldownMenu, the same name to all ToggleButtons, etc.

The widgets can all have sensible names, derived from the string that is displayed (although all `.'s have to be removed to avoid confusing XtNameToWidget()). This change was made to the Mosaic Xmx functions, so that the menu buttons have names like ``File'', ``Save As'', etc, instead of just ``pushbutton''.

It is not common to hard-code translations, and indeed this would normally be bad practice as it removes the possibility of user customisation. However, Mosaic hard-codes the translation table for the HTML widget, due to an unknown bug. This causes some problems for the Record side of replayXt as it relies on the user being able to override the translation table in resource files. The Record action had to be hard-coded into Mosaic to overcome the ``fix'' to the bug. This involved changing the translation table so that entries such as

<Btn2Motion>: extend-adjust()

become

<Btn2Motion>: RecordMotionEvent() extend-adjust()

It still does not allow user customisation, though.
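The naming rule is simple enough to sketch. The Python below is an illustration rather than the Xmx code: the function names are hypothetical, and the menu labels in the usage note are assumed examples. The second function checks the property that XtNameToWidget() relies on, namely that no two siblings share a name.

```python
from collections import Counter


def widget_name(label: str) -> str:
    """Derive a widget name from its displayed label, dropping every '.'."""
    return label.replace(".", "")


def duplicate_names(names):
    """Return the sibling names that occur more than once, sorted."""
    return sorted(n for n, c in Counter(names).items() if c > 1)
```

With labels such as "Save As..." and "Print...", widget_name() yields distinct names, whereas a menu whose buttons are all called "pushbutton" is reported by duplicate_names() as ambiguous.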

The majority of applications do not bother to track mouse motion, as it usually is of no interest. Mosaic does track mouse motion while the pointer is in the view widget area, as it needs to determine if it is over an anchor so that it can display the URL pointed to. This is done by an action ``track-motion()''. Usually to record mouse motions by replayXt it is only necessary to override the Motion event. We must be careful here not to interfere with Mosaic tracking in the view widget, so the appropriate resource file entry is

Mosaic*translations: #override\n\
    <Motion>: RecordMotionEvent()
Mosaic*view.translations: #override\n\
    <Motion>: RecordMotionEvent() track-motion()

Examples

Using the example program, xwebteach, it is a fairly simple matter to adapt it to send actions from the master and forward them to all of the slaves. This allows such things as selections to also show up on the slave machines. Some care will need to be taken to ensure that URLs are not resolved twice. In addition, if one of the slaves is resized, then x, y coordinates from the master may not be correct for this slave. This is not an inherent deficiency, but it means that a more complex command should be sent from Mosaic, containing commands to get size information.
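One possible way of adjusting coordinates once the sizes are known is proportional scaling. This is an assumption made for illustration and is not part of the implementation described here; the function name and the (width, height) tuples are hypothetical. A minimal Python sketch:

```python
def scale_point(x, y, master_size, slave_size):
    """Map a point in the master widget to the same relative position
    in a slave widget of a different size (integer pixel coordinates)."""
    master_w, master_h = master_size
    slave_w, slave_h = slave_size
    return (x * slave_w // master_w, y * slave_h // master_h)
```

Scaling keeps a click inside the widget boundaries, but it cannot guarantee that the same text lies under the pointer if the slave has reflowed the document.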

Another possible use is for remote testing of Mosaic. The standard replayXt library allows a program to be run using a file that exists on the same filesystem, or to be controlled using "send" from another application that is using the same X server ("send" works by using properties on the X server). Since the CCI protocol only requires that both the client application and Mosaic be able to communicate by TCP/IP over a network, these restrictions are removed.

Security

There are a number of aspects to security. The first concerns the nature of CCI even in unmodified form. A user of Mosaic can select, either from the menu or by a resource file, that CCI is to be enabled. This then allows any application anywhere on the network to communicate with this instance of Mosaic, without any further security checks. This is in contrast, say, to the ICE protocol library in X11R6 where each new client trying to connect has to pass a security test.

The addition of an EXECUTE mechanism to CCI carries the danger of what may be executed. The interpretation language used here is tcl, and this is a full programming language capable of reading, writing and erasing files. In addition, shell programs can be run from tcl scripts. This is a major security hole, as an unscrupulous client could send a command

exec rm -rf *

to Mosaic, with consequent damage to your file system!

This problem with tcl has already been recognised in the context of active mail and a solution found in safe-tcl (Borenstein, 1994). This consists of a two-layer system. The outer layer is a stripped-down ``untrusted'' version of tcl in which all potentially dangerous commands have been removed. This interpreter is used to evaluate the CCI scripts. There is another ``trusted'' layer, which consists of the full tcl language, and this is capable of performing any task. However, this is not accessible directly from CCI scripts.

For useful application-specific tasks to be performed by the untrusted layer, the trusted layer may define to it commands that will be passed to the trusted interpreter. These obviously have to be chosen very carefully! For example, in the original safe-tcl, access to the file system was only through commands such as SafeTcl_loadlibrary which will only access the file system for certain files.

The additional commands defined to the untrusted layer are those of replayXt, that is, sleep, callActionProc, etc. The original Tk commands and those related to mail handling are removed.
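The two-layer arrangement can be pictured with a small Python model. This is not safe-tcl: the class is hypothetical, and splitting a command on whitespace is a gross simplification of tcl parsing. It shows only the essential property, that the untrusted layer can run nothing except the commands the trusted layer has explicitly defined to it.

```python
class UntrustedInterp:
    """Toy untrusted layer: knows only explicitly declared commands."""

    def __init__(self):
        self._commands = {}

    def declare_harmless(self, name, fn):
        """Called by the trusted layer to expose one command."""
        self._commands[name] = fn

    def eval(self, line: str):
        """Run one command line; reject anything not declared."""
        words = line.split()
        if not words:
            return None
        name, args = words[0], words[1:]
        if name not in self._commands:
            raise PermissionError(f"command not available: {name}")
        return self._commands[name](*args)


# Usage: only "sleep" is exposed, so "exec" cannot be reached.
calls = []
interp = UntrustedInterp()
interp.declare_harmless("sleep", lambda ms: calls.append(int(ms)))
interp.eval("sleep 500")
```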

This ensures that no security hole comes from the untrusted tcl layer at all. There is only the question of how secure the trusted commands are. Any widget action can be invoked; this includes pressing any menu button. There is a "Save As" button on the file menu which invokes a dialog prompting for a file name. By choosing the right file name, there is a possible scenario in which an aggressive client loads a URL with contents

jan pandonia.canberra.edu.au

and then saves it to $HOME/.rhosts. This would then give rlogin access to your account by jan without requiring a password. Even worse, if Mosaic were run by root, and an attempt was made to write to /etc/passwd, then the security of the entire system would be compromised.

The danger occurs when the SaveAs dialog is invoked, a file name is chosen, and an attempt is made to press the "Ok" button.

To counteract this, the unsafe interpreter is not given direct access to the callActionProc() command of replayXt, but instead to a command that may examine the arguments before deciding what to do. This command is defined in the trusted interpreter on a per-application basis, and defaults to just calling the original callActionProc() of replayXt.

For Mosaic, the callActionProc() command is set to the original version of callActionProc, except for the Ok button of the FileSelectionBox, where instead the Cancel button is pressed:

proc callActionProc {widget action args} {
    if {[widgetPath $widget] == "Mosaic.fsb_popup.fsb.Ok"} {
        eval unsafeCallActionProc Mosaic.fsb_popup.fsb.Cancel \
            [list $action] $args
    } else {
        eval unsafeCallActionProc [list $widget] \
            [list $action] $args
    }
}
declareharmless callActionProc
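The logic of this guard is independent of tcl. As a hedged illustration, the Python sketch below wraps an arbitrary "unsafe" call function in the same way; the function names are hypothetical, and it assumes the widget argument is already a full widget path (the job done by widgetPath in the tcl version).

```python
BLOCKED = "Mosaic.fsb_popup.fsb.Ok"
REDIRECT = "Mosaic.fsb_popup.fsb.Cancel"


def make_guarded(unsafe_call):
    """Wrap an unrestricted call function with the Ok -> Cancel rule."""

    def call_action_proc(widget, action, *args):
        # Any action aimed at the file selection box's Ok button is
        # delivered to its Cancel button instead.
        target = REDIRECT if widget == BLOCKED else widget
        return unsafe_call(target, action, *args)

    return call_action_proc


# Usage: record what the unrestricted function would have been asked to do.
performed = []
guarded = make_guarded(lambda w, a, *rest: performed.append((w, a, rest)))
guarded("Mosaic.fsb_popup.fsb.Ok", "Activate()", "-x", "5")
guarded("rowcol.button", "Arm()")
```

Only the guarded function is exposed to scripts, so a client cannot reach the original behaviour by naming the Ok button directly.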

Other Window Systems

In the Apple system, there is a major scripting language available to applications called AppleScript (Apple, 1993). However, this is designed within the Apple environment, and to use it within a TCP/IP environment would probably require browsers to listen on the CCI port and then send themselves the AppleScript messages.

The Microsoft Windows environment does not possess a scripting language common to all applications. Visual Basic controls do, but I doubt if the current browsers are built using these.

Within X, there are different widget toolkits, such as Tk, Interviews and Fresco. Tk is already based on tcl, so there should be no problems in adding CCI control to this using the tcl-dp extension (tcl-dp??). Fresco is designed above CORBA (???) and there is a tcl-interface to CORBA used to control Fresco widgets. However, for other toolkits it may be necessary to design an appropriate interface for whatever scripting language is used, as well as adding the CCI protocol.

Conclusion

It is a relatively straightforward matter to extend CCI to include additional message types. These can be channeled through the replayXt library to give a simple method of controlling Xt applications.

This does require an interface between the windowing toolkit and the scripting language. This technique should be fairly easy to extend to any toolkit where this is the case. Of course, the scripting commands still depend heavily on the particular toolkit. It would be non-trivial to build a "universal" scripting language for all window systems.

Security issues arise in the capabilities of the scripting language and also in the extra features that any particular application may add via this scripting language. Given that the CCI security itself is fairly weak, it is important for each application to cover holes carefully. The safe-tcl system allows enough control to do this in a simple manner for replayXt.

Availability

The version of replayXt extended as described here (version 1.3) is available from ftp://ftp.canberra.edu.au/pub/motif/replayXt
The modified version of Mosaic and the CCI library have been submitted to NCSA. They are also available from ftp://ftp.canberra.edu.au/pub/motif/cci

References

Apple Computer, Inc (1993)
"AppleScript Language Guide", Addison-Wesley.
N. S. Borenstein (1994)
"EMail with a Mind of its Own: The Safe-Tcl Language for Enabled Mail", ULPAA 94, Barcelona.
P. Asente and R. Swick (1990)
"X Window System Toolkit", Digital Press.
J. D. Newmarch (1994)
"Using tcl to Replay Xt Applications", Proc AUUG94, Melbourne, pp. 137-143.
J. Ousterhout (1994)
"Tcl and the Tk Toolkit", Prentice-Hall.
Open Software Foundation (1993)
"OSF/Motif Programmers Reference Manual", Prentice-Hall.
R. W. Scheifler, et al (1988)
"X Window System - C Library and Protocol Reference", Digital Press.

Hypertext References

HREF 1
http://pandonia.canberra.edu.au - home page for Jan Newmarch.
HREF 2
http://hoohoo.ncsa.uiuc.edu/cgi/overview.html - Common Gateway Interface (CGI).
HREF 3
http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/CCI/cci-spec.html - Common Client Interface (CCI).
HREF 4
http://www.ncsa.uiuc.edu/SDG/Software/XMosaic/mosaic-docs.html - NCSA Mosaic for X version 2.5.
HREF 5
http://www.ncsa.uiuc.edu/SDG/Mosaic/CCI/cci-slide-show.html - a slide show built using CCI.
HREF 6
http://www.ncsa.uiuc.edu/SDG/Mosaic/CCI/x-web-teach.html - xwebteach.
HREF 7
http://pandonia.canberra.edu.au/SW.html#replayXt - the replayXt system for Xt applications.
HREF 8
http://www.ncsa.uiuc.edu/SDG/Software/Mosaic/CCI/cci-api.html - CCI Application Programmer's Interface
HREF 9
http://pandonia.canberra.edu.au/SW.html#tclXtSend - the tclXtSend library, a port of Tk's "send" to Xt.
HREF 10
news:comp.lang.tcl - the tcl newsgroup.

Copyright

© Southern Cross University, 1994. Permission is hereby granted to use this document for personal use and in courses of instruction at educational institutions provided that the article is used in full and this copyright statement is reproduced. Permission is also given to mirror this document on WorldWideWeb servers. Any other usage is expressly prohibited without the express permission of Southern Cross University.

AusWeb95 The First Australian WorldWideWeb Conference