With the emerging software component technologies, increasing numbers of small components are expected to be developed, and fewer monolithic applications. To combine components successfully into applications, issues of component interoperability must be addressed. Software component developers will require information about interoperating with existing components, and users will need to know which combinations of components they may employ to fulfil a certain task. Since no generally established way of describing complete semantic information about components exists yet, we propose a more pragmatic approach: describing a level of interoperability based on information inherent in the software development process. The GIPSY (Generating Integrated Process support SYstems) [HREF3] project provides support for distributed software development processes, including the extraction of interoperability information from the process. This paper presents a way of making such information available world-wide by means of the WWW, proposing methods to map the information from the software development process in GIPSY to formats suitable for presentation on the WWW.
As software component technologies receive increasing attention, more and more individual software components are being produced (Adler 1995) (Orfali et al 1996). These components are combined by users to fulfil specific tasks, replacing large monolithic applications. As components are developed and distributed via the WWW, their usage will likely increase rapidly, thereby multiplying the importance of component interoperability and configuration management. It is no longer sufficient for an application to be compatible with just one operating system; a component must be compatible and interoperate with a whole range of other components (Manola 1995) (Heiler 1995). Interoperability information includes information about how different versions of different components interoperate, which documentation refers to which components, etc. This is required both by users and by developers of new components.
Developers of different components need to cooperate in order to produce properly interoperating components. As producers open up their enterprise boundaries to create so-called Virtual Enterprises, new methods to share and connect different software development processes will be sought. Developers require ways to find existing components, to find out how to interoperate with them, and means to refer to them from their own development process.
Users need to find a suitable and consistent set of components for a given purpose, i.e. they need information for component configuration management. This is difficult already today: merely managing the extensions of an operating system is often too much to ask of a user, and with the coming large numbers of components, the problem will become even more serious. It encompasses the functional aspect of finding components that solve a given set of problems, and the technical aspect of ensuring that components work together.
While the issue of finding components for given functions is not treated further here, we propose methods to deal with component interoperability aspects. Since interoperability information is often available during development, we propose to extract such information from the development process and make it accessible to others. This requires a common understanding of the meaning of this information to be defined, and the information to be published. Two main issues therefore arise:
- how interoperability information can be described with a commonly understood meaning;
- how this information can be published and made accessible world-wide.
In order to solve the problem of describing interoperability, an understanding of interoperability information is required. This may be achieved by classifying interoperability information in different levels. We propose a new intermediary level between two established ones, thus identifying three levels:
- the interface level, describing the syntactic interfaces of a component;
- the originator level, the proposed intermediary level, describing a component's development context as recorded in its development process;
- the semantic level, describing the complete meaning and behaviour of a component.
From the interface level to the semantic level, each subsequent level reduces the set of potentially interoperating components, as they are described more precisely. The interface level is clearly not sufficient for proper employment of a component, as it does not provide complete information about how to use it correctly (Brown et al 1991). On this level, interoperability information is left to the accompanying documentation or, worse, to the interpretation of the person working with the component. The semantic level, on the other hand, provides full information about the meaning and behaviour of a component. To access this information, a component's implementation would need to be studied in depth, but the source code is rarely published and is not easily understandable. Besides, a component's user is often not interested in the implementation details, but only in a more abstract description of what it does. Unfortunately, no generally established and understandable abstract way to describe complete semantics exists yet.
Our more pragmatic approach focusses on the development process and makes use of the fact that interoperability issues are often investigated during development already but, unlike interfaces and semantics, are usually not retained. It therefore assumes a certain confidence in developers to provide this information correctly. We postulate that regarding the development context represents an improvement over current practice in managing interoperability, which often deals only with the interface level, while bearing in mind that the next improvement, dealing with the semantic level, is at present usually not feasible. Furthermore, the reasons semantic-level information is required frequently have to do with understanding the correct usage of a component, and these questions can often be answered by information on the originator level without specific semantic knowledge. The proposed intermediary level thus offers steps towards a solution for finding correctly interoperating sets of components that is feasible with today's technology.
In order for interoperability information to be expressed, visualized, inspected and utilized by developers, users, etc., a common understanding of this information is required. This is achieved by defining a common model of software development processes which carries this information. The GIPSY (Generating Integrated Process support SYstems) [HREF3] (Murer et al 1996) project, dealing with distributed software engineering, defines such a process model.
The working hypothesis for the model is that all data (contract, specification, implementation, code, test case, manual, etc.) produced during development and maintenance of a software product can be represented as a partially ordered set of objects. Every object carries the information of part of the software product and typically depends on other objects. These dependencies define the ordering of the set. The definition of the process structure is then derived from the product structure, leading to a 2-dimensional directed acyclic graph (DAG) (fig. 1).
Fig. 1: Product Structure Induces Process Structure
In the DAG, a node is either an atomic process representing an individual planned object, or it is a compound process containing other processes, thus defining a hierarchy of processes (sub- and superprocesses). During process enactment, objects are created and assigned to atomic processes, and dependencies to objects on neighbouring processes are validated. The edges in the DAG represent these dependencies (to predecessor and successor processes). Objects can assume different states, represented by colours in the DAG. The completion of an object requires a well-defined formal condition associated with the object to be fulfilled; a dependent object can only be completed after the previous ones have been done.
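The completion rule for atomic processes can be illustrated with a minimal sketch. All names (`AtomicProcess`, `complete`) are illustrative assumptions, not the GIPSY API; the point is only that an object's formal completion condition includes all its predecessor objects being done:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the completion rule: an object assigned to an
# atomic process may only be completed once the objects it depends on
# (its predecessor processes' objects) are themselves complete.

@dataclass
class AtomicProcess:
    name: str
    predecessors: list = field(default_factory=list)  # processes this one depends on
    done: bool = False

    def complete(self):
        # Formal completion condition: every predecessor object is done.
        if any(not p.done for p in self.predecessors):
            raise ValueError(f"{self.name}: predecessors not yet complete")
        self.done = True

spec = AtomicProcess("specification")
impl = AtomicProcess("implementation", predecessors=[spec])
spec.complete()
impl.complete()  # succeeds only because the specification is done
```

Attempting `impl.complete()` before `spec.complete()` would raise, mirroring the partial ordering defined by the dependency edges.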
The process structure may change either when it is edited, i.e. when new processes and dependencies are added or existing ones are deleted, or when previously done process steps are redone. In order to record the previous results, the process model provides a third dimension, the history dimension. The object to be revised and all its dependants move down the history dimension, creating a history layer, and a new part of the top process layer is created (fig. 2). Different versions of an object may thus be found by navigating along the history dimension of a process.
Fig. 2: Sample 3-Dimensional Process Structure
Since full versioning and dependency information is contained in the process structure, this enables straightforward integration of configuration management in an architecture that implements the proposed process model.
Typically, a component cannot be used on its own, but has to work together with others. This is reflected in the fact that already during development of a component, information about existing components is accessed in their development processes, either explicitly or implicitly. These references to other components' processes may be registered in the development process by installing links to them. Thus, dependencies among different components' processes may be defined in a similar way as the dependencies within one component's process. This powerful feature allows interoperability information among different components to be established. As a vision, the development efforts of many individual developers (companies) together may be regarded as one large global development process consisting of many linked processes.
Interoperability information may be defined both by defining dependencies within a component's process and by defining dependencies (links) among different components' processes. These dependencies in and among processes represent transitive information along which developers and users may navigate to find out about the development context of a component. Since the process structure is derived from the product structure, they also represent information about the structure of the product, i.e. the component.
Examples of interoperability information are presented using the following multi-purpose figure (fig. 3). It shows a component's process, "Process 2", consisting of the four processes A, B, C and D that each have been redone once, i.e. every process currently has two versions of an object (the newer one being in the top layer). Process D depends on B and C, and these two depend on A. A is linked to process Y contained in another component's process, "Process 1", which consists of the two processes X and Y.
Fig. 3: Sample Linked Processes
The two components' processes "Process 1" and "Process 2" could be owned by different developers. For instance, X could represent a library and A could represent a component that uses this library.
If B and C are source code objects (module implementations) and D is also an implementation, the dependencies will associate the correct versions of B and C with D, i.e. only the new B and C may be used with the new D, and only the old B and C may be used with the old D.
If A is a component and B is its documentation, the dependency will associate the correct version of the documentation with the component.
If A is a compiler front-end, B is a compiler back-end and C is a browser, the dependencies signify that compiler and browser refer to the same language and share a common understanding of that language, as opposed to every component interpreting the language semantics individually which may lead to subtle differences.
If A is a file format (or an object model, or a (link to a) compound document architecture, or a component, etc.) and B and C are applications (or components, etc.), the dependencies signify that B and C have a common understanding of A on a specific integration level and that they will not interpret A in different ways on that level, i.e. they are able to exchange and interpret data that conforms to A's specifications. For such complex examples, A, B and C would typically be compound processes containing again many atomic or compound processes to represent numerous objects with specifications, source code, object code, test cases, etc.
Interoperability information may also be used to define configurations, consistent sets containing components that may be used together. Still referring to fig. 3, this information is used if for example A, B and C are to be combined in a configuration. The new B and the old C may not both be in the same configuration, as the new B requires the new A and the old C requires the old A. Two different versions of the same object, here A, cannot be in the same configuration, therefore such a configuration can be recognized as invalid.
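The validity rule above can be checked mechanically: compute the transitive closure of the configuration's dependencies and reject it if two versions of the same object appear. The following sketch hard-codes the fig. 3 dependencies (versions 1 = old, 2 = new); the data layout is an assumption for illustration:

```python
# Sketch of the configuration consistency rule: a configuration is invalid
# if its transitive dependency closure requires two different versions of
# the same object. Dependency data mirrors fig. 3 (illustrative only).

REQUIRES = {
    ("B", 2): [("A", 2)],  # new B requires new A
    ("B", 1): [("A", 1)],
    ("C", 2): [("A", 2)],
    ("C", 1): [("A", 1)],  # old C requires old A
}

def closure(config):
    # Collect every (object, version) reachable through REQUIRES.
    seen = set(config)
    stack = list(config)
    while stack:
        for dep in REQUIRES.get(stack.pop(), []):
            if dep not in seen:
                seen.add(dep)
                stack.append(dep)
    return seen

def is_valid(config):
    versions = {}
    for name, ver in closure(config):
        if versions.setdefault(name, ver) != ver:
            return False  # two versions of the same object in one configuration
    return True
```

With this data, `is_valid([("B", 2), ("C", 2)])` holds (both pull in the new A), while `is_valid([("B", 2), ("C", 1)])` fails because the new A and the old A would both be required.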
In order for developers and users to be able to navigate through 3-dimensional processes independently of a GIPSY environment to find interoperability information, the information is mapped to the WWW. The WWW presents itself as an ideal medium to make interoperability information accessible world-wide to developers and users, as it provides a coherent view that abstracts from heterogeneity and location of the information. It may be accessed with minimal effort by clients - they do not need a GIPSY environment or other specialized software, a WWW browser is sufficient. Furthermore, electronic publishing enables automated navigation and searches to be performed on the data.
While a software development process as a whole is a dynamic structure which grows incrementally, the developer can make information about the completed parts available on the WWW as soon as they are ready (producing a static snapshot of process information), as the recorded process history is not modified later on. Publishing information involves traversing the process to extract information, translating this information to HTML and other formats as detailed below and placing it on WWW servers. This may either be initiated by the developer from time to time, e.g. when it is decided to release information about a new part of the process, or requested by a client (e.g. via CGI), if the developer services such requests. In either case, the developer has full control over what information is released in public and where to impose access restrictions, such as on source code objects.
According to the information required, and limited by access restrictions, only a subset of a component's process may be handled, or only a specific part of information about an individual object may be used. In this way, individual configurations may be highlighted. For instance, users will only be interested in a specific combination of object code and documentation, while developers will also be interested in programmer's reference information, and developers in the same company will have full access to specification and source code objects.
A possible mapping of process information to the WWW is described below. An algorithm is used that traverses a component's process and handles every process contained within at most once, or it handles only specific parts of the process as required or permitted.
When the algorithm visits a process, the process is requested to generate an HTML page containing its required information, including its dependencies. The process translates its dependencies into a list of hyperlinks to all its neighbouring processes (any number of predecessors, successors, linked processes and subprocesses; up to one superprocess, older version and newer version). These hyperlinks may be used for straightforward navigation in the 3-dimensional process structure. Further information typically included in the HTML page is the process's relative position in the process structure and information about its object, such as its name, type, possibly also state, responsible developer, access privileges, planned and actual time of completion, required conditions for completion, etc. This information fills the first part of the HTML page, followed by the process content, which depends on the process type (atomic or compound) and generates the rest of the page.
A compound process generates a 2-dimensional GIF-graphic of the dependency graph of its subprocesses which are in the same history layer (as in fig. 1, right half). Since the coordinates of every process are known, this is used to build a map that allows activation of the appropriate hyperlinks via a CGI script. Furthermore, a 3-dimensional graphic of the process structure is also generated to visualize the process in a 3-dimensional view (such as Apple QuickDraw 3D or VRML), which can (optionally) be used by those clients having appropriate viewers. The 2- and 3-dimensional graphics provide overviews of the process history, configurations, etc. to enable clients to easily orient themselves when navigating through the process structure by following hyperlinks. Clients having Apple QuickDraw 3D viewers can even zoom, pan and turn 3-dimensional process views in real time.
An atomic process requests its associated object - via the host where the object is located - to generate its part of the HTML page that represents its content (or a hyperlink to its content), whereby the object is free to respond to that request in an appropriate manner. While generating its part, the object may also refer to its context in the process, i.e. to other objects, even in a different component's process, in order to be able to generate appropriate hyperlinks to referenced information. Depending on the type of object, different content information may be generated, for example:
An example [HREF8] of the visualization of a component's process on the WWW is available.
Once interoperability information from the common 3-dimensional process model has been presented on the WWW, a range of applications that make further use of this information may be envisaged, of benefit both to developers and users, as the following examples show. This is in addition to the basic application of process linking for component developers.
The WWW provides a simple and coherent interface to software development processes to allow navigation through the 3-dimensional process structure with minimal effort. No new software is required for the user, no specific viewer applications are required for the heterogeneous objects (e.g. components), no knowledge is required about the location of objects. The hierarchic structure of processes is presented in a clear way and allows straightforward navigation by following hyperlinks. Through navigation, the existence, versions, documentation, dependencies, specific content information, etc. of components can be found.
A more fine-grained level of navigation not only among but also inside components is possible in those cases where object code and/or source code objects of atomic processes are able to translate their interface definition or source code text into HTML and translate references to other components into the appropriate hyperlinks. Source code and interface definition browsing may be performed simply by using WWW browsers.
For languages that do not qualify externally declared identifiers by their module name (such as the language C), it is a great advantage if such an identifier is rendered as a hyperlink pointing to the correct module. Not only can the module be found, but also its correct version: referring again to fig. 3, if B and C are source code objects imported by D, then the hyperlink references to B and C in D's source text will point to the correct versions of B and C. Furthermore, identifiers declared in B and C and used in D may be expressed in D as hyperlinks to the correct locations inside the imported modules' source texts.
To search for specific information following search criteria and navigation rules, programs based on process navigation techniques may be written, even by independent companies. Automatic agents may navigate through processes and collect such information in specific databanks or build indexes into processes, or searches may be performed on request.
A developer may specify certain sets of his components as valid configurations and present this information on the WWW. It would imply that the suggested combination has been determined to function together properly and has been tested, possibly by making use of semantic knowledge about the components available only to the developer, such as the source code. Since large numbers of valid configurations could exist, the information could be translated into HTML pages only upon request (from a user request processed by a CGI script).
Managing components, in particular finding out whether a given set of components is consistent, is often a difficult task. Here, software agents may be envisaged that roam through the global network of linked processes to search for this information, as consistency checking can be performed automatically according to fixed rules. Similarly, developers or third parties may offer to automatically process queries about valid configurations, so that a user may check the consistency of a specific set of components, i.e. those currently on his machine, or a set assembled by himself, possibly containing components of many different developers.
Electronic marketing of components on the WWW requires product information to be presented in an electronic catalog, in which interoperability and consistency information are very important. Attractive 3-dimensional navigation through virtual software stores may be performed in order to find consistent sets of components. Once a user has determined a configuration of components to be valid, they can be automatically downloaded and installed. When issues of security and electronic payment (for purchasing and/or leasing of components) are also resolved, the presented possibilities open up a new world of electronic marketing of interdependent software components, examples of which [HREF6] are already emerging.
In the GIPSY (Generating Integrated Process support SYstems) [HREF3] (Murer et al 1996) project, we have built a prototype implementation of a distributed integrated software engineering environment based on the presented process model. By using the environment to represent the development process of the GIPSY project itself, the model has proven to fulfil expectations. Version and configuration management is possible in a straightforward way. The translation of process information to formats suitable for the WWW has already been partially achieved (example [HREF8]); there are no obstacles in principle to the complete automatic translation described in this paper.
Navigation through software processes has been performed on the WWW, and hyperlink browsing possibilities have been demonstrated successfully. Using an automatically generated compiler front-end for the Oberon [HREF9] language that performs declaration analysis, Oberon source code modules have been translated to HTML, whereby every imported identifier is represented as a hyperlink to the specific location in the referenced module, thus making use of the dependency information in the process.
With the emerging software component technologies, much effort is needed to manage interoperability and configuration aspects for large numbers of software components. Given the proliferation of small interdependent software components to be expected in the near future, it is essential that interoperability information is provided. Developers and users will need access to interoperability and version information provided by other developers.
We introduced a pragmatic approach to manage components' interoperability on a certain level by extracting interoperability information from the corresponding component development process. Users can apply this information without having to learn about details not of interest to them. Considering the quality of much of today's software, it would be reassuring for a user to know that a developer has specifically approved a set of components as a valid configuration. This is certainly preferable to an accompanying "readme" text that only lists known incompatibilities. Thus, essential contributions to the success of users' software component management can be made. A common understanding of the software development process is required and is achieved by the proposed 3-dimensional model of software process state, progress and history. This model is the underlying concept of the GIPSY project, which focusses on integrated support of distributed software engineering.
We believe that the WWW is the optimal medium for providing interoperability and configuration information about components to developers and users. We showed how interoperability information can be presented on the WWW by providing access to process information. The WWW provides a coherent interface to the diverse data that exists in software development processes; it abstracts from the heterogeneity and location of this data. The dynamic nature of the WWW allows such information to be added as soon as new components become available. The outlined mapping of process information to the WWW can be done with present technology. This is preferable to having only interface-level interoperability information, given that no established method yet exists to present the higher semantic level of interoperability information.
The outlined applications and the following outlook give an idea of how the presented approach can be transformed into commercial services on the WWW.
A possible extension to the proposed scheme is to add type information to the generated HTML hyperlinks; while the dependencies they represent are well-defined in the process, their exact meaning is not translated to the hyperlinks. This may also lead to the definition of an interoperability definition language (as an extension to an interface definition language) to formally define interoperability.
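One lightweight way to carry such type information, sketched below, is to encode the dependency's meaning in the hyperlink itself, for instance via HTML's `rel` attribute. The `gipsy-` vocabulary is invented for illustration; no such naming is defined in the paper:

```python
# Hypothetical sketch of typed hyperlinks: the kind of dependency a link
# represents (import, documentation, version, ...) is made explicit in the
# generated HTML instead of being lost in the translation.

def typed_link(target, text, dep_type):
    return f'<a href="{target}" rel="gipsy-{dep_type}">{text}</a>'

link = typed_link("B.html", "B", "imports")
# <a href="B.html" rel="gipsy-imports">B</a>
```

An automated agent could then filter links by `rel` value, e.g. following only `gipsy-imports` edges when checking configurations.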
As a next step towards specifying a common process model for software development processes, the mapping to the WWW of a complete architecture implementing such a process model may be studied. Software development may itself be regarded as a world-wide distributed application running on a global network such as the WWW. The idea would be to enable platform-independent world-wide participation in software development processes, so that process information is not only statically mapped to the WWW but dynamically created on the WWW itself. It may make use of existing [HREF4] or upcoming [HREF7] ideas for integrating distributed applications in the WWW.
F Manola (1995) "Interoperability Issues in Large-Scale Distributed Object Systems" in ACM Computing Surveys, Vol.27, No.2, June 1995, pp. 268-270.
S Heiler (1995) "Semantic Interoperability" in ACM Computing Surveys, Vol.27, No.2, June 1995, pp. 271-273.
R Orfali, D Harkey, J Edwards (1996) "The Essential Distributed Objects Survival Guide", Wiley, 1996.
R M Adler (1995) "Emerging Standards for Component Software" in IEEE Computer, March 1995, pp. 68-77.
A W Brown, J A McDermid (1991) "On Integration and Reuse in a Software Development Environment" in Software Engineering Environments, Vol. 3, Ed. Fred Long, Conference Proceedings, Aberystwyth, Wales, Ellis Horwood, 1991, pp. 171-194.
T Murer, A Würtz, D Scherer, D Schweizer (1996) "Generating Integrated Process support SYstems", submitted to ASWEC'96 (Australian Software Engineering Conference).