Donating Cycles over the Internet using Web Services

Dr Wayne Kelly [HREF1], Centre for Information Technology Innovation, QUT, w.kelly@qut.edu.au

Assoc. Prof. Paul Roe [HREF2], Centre for Information Technology Innovation, QUT, p.roe@qut.edu.au

(This work was supported by Microsoft Research)

Abstract

Web services have been widely promoted as the next major WWW innovation. Web services basically provide a remote procedure call mechanism, but unlike earlier approaches are implemented using simple open standards such as XML that allow them to be used across platforms, programming languages and object models. Web services were originally designed for developing distributed web-based applications and for integrating heterogeneous legacy applications. We, however, propose to use the web service model to create a parallel computing framework based around cycle donation. Our framework is identical to web services with respect to both the programming model exposed to application programmers and the underlying communication mechanisms used. Compared to other parallel programming frameworks, our framework is designed to be simple and lightweight. For application programmers, exploiting the additional computational power provided by a dynamically changing set of volunteer machines is no more complex than exploiting a simple web service. By hosting volunteer components in a web browser, volunteers are able to donate cycles with a minimum of pain - absolutely no special software needs to be pre-installed on the volunteer machines. The use of open standards such as SOAP and HTTP means that components created on different platforms, such as .NET or the JVM, can be freely substituted.

1 Introduction

The idle cycles of networked PCs are increasingly being recognised as a huge and largely untapped source of compute power [1]. There are many applications that could seriously benefit from exploiting such idle cycles. Unfortunately, writing parallel programs is difficult; writing programs to run on changing sets of unreliable, heterogeneous PCs is harder still. Some specific applications, the best known being SETI@Home, have been able to harness the elusive idle cycles of networked PCs. Until now, there have been no simple frameworks which allow programmers to use commodity programming languages and operating systems to easily exploit the processing power of idle PCs. Existing systems are either difficult to use, e.g. they have complex and unfamiliar programming models, or are based on custom programming languages and systems which isolate programmers from conventional software technology - they are environmentally unfriendly. The goal of our system, G2, is to provide a framework that leverages existing technology so that a programmer familiar with conventional client-server and OO technology can write programs that run adaptively on idle PCs across the Internet. The framework abstracts away all issues to do with the adaptive utilisation of idle PCs and enables programming in much the same way as conventional distributed programming, e.g. client/server. Our framework uses XML, the lingua franca of the Internet, and web services for communication.

2 Web Services

Web services are part of the next generation of web technologies designed to allow software components to be accessed across the Internet in a programming language, object model and platform independent manner. They therefore hold great promise for supporting long-awaited business-to-business electronic commerce systems. Mechanisms such as TCP/IP have provided the necessary transport layer for such communication for some time, but impose no standards on the content of messages. Protocols such as DCOM, CORBA and Java RMI impose standards on the message content, but additionally impose a particular object model, and in some cases a programming language, on the communicating applications. Web services use XML-based protocols such as SOAP (Simple Object Access Protocol) to encode the parameters passed to and from web methods. The open nature of these protocols means that they have received almost unprecedented broad industry support [HREF3, HREF4, HREF5]. The format of web service messages must conform to published service descriptions expressed in the XML-based Web Services Description Language (WSDL) [HREF6]. The application that implements the web methods needs only to be able to generate and parse XML messages; it is therefore free to be implemented using any language and object model. Web service messages are commonly communicated using HTTP, which gives them the greatest chance of being able to pass through firewalls.
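For example, invoking a simple Add web method (like the one developed in the next section) produces a SOAP request carried over HTTP along the following lines. The host, path and namespace here are illustrative; .NET defaults to the http://tempuri.org/ namespace.

POST /LocalPath/MathService.asmx HTTP/1.1
Host: www.servername.com
Content-Type: text/xml; charset=utf-8
SOAPAction: "http://tempuri.org/Add"

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
    <soap:Body>
        <Add xmlns="http://tempuri.org/">
            <num1>1</num1>
            <num2>2</num2>
        </Add>
    </soap:Body>
</soap:Envelope>

The server replies with a similar envelope whose body contains an AddResponse element carrying the integer result.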

3 Creating and Using Web Services

Development environments for web-based applications, such as Microsoft's Visual Studio.NET and Sun's Web Services Developer Pack, have made the process of creating and using web services much simpler. The generation and parsing of SOAP messages at the client and server ends are taken care of by automatically generated proxy and stub classes, respectively.

Figure 1: Web Service proxies and stubs

The following example shows how to create a simple web service using Microsoft's .NET platform. Any common language runtime (CLR) conformant language can be used to implement a web service; in this example we use the new C# language. Methods that are to be exposed as part of the web service are marked with a WebMethod attribute.

<%@ WebService Language="C#" Class="MathService" %>

using System.Web.Services;

public class MathService {
    [WebMethod]   // marks this method as a web service operation
    public int Add(int num1, int num2) {
        return num1 + num2;
    }
}

From this information, a WSDL specification can be automatically generated; given such a specification, a utility program can then be used to create a client proxy class for accessing the web service.
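To give a sense of what a generated service description contains, the following abridged sketch shows the heart of the WSDL for MathService (bindings and most sections are elided; element names follow the ASMX defaults, so the exact output may differ):

<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:s="http://www.w3.org/2001/XMLSchema"
             targetNamespace="http://tempuri.org/">
    <types>
        <s:schema elementFormDefault="qualified" targetNamespace="http://tempuri.org/">
            <s:element name="Add">
                <s:complexType>
                    <s:sequence>
                        <s:element name="num1" type="s:int"/>
                        <s:element name="num2" type="s:int"/>
                    </s:sequence>
                </s:complexType>
            </s:element>
            <s:element name="AddResponse">
                <s:complexType>
                    <s:sequence>
                        <s:element name="AddResult" type="s:int"/>
                    </s:sequence>
                </s:complexType>
            </s:element>
        </s:schema>
    </types>
    <!-- message, portType, binding and service sections elided -->
</definitions>

The client proxy class generated from such a description looks like the following: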

public class MathServiceProxy : System.Web.Services.Protocols.SoapHttpClientProtocol {
    
    public MathServiceProxy() {
        this.Url = "http://www.servername.com/LocalPath/MathService.asmx";
    }
    
    public int Add(int num1, int num2) {
        object[] results = this.Invoke("Add", new object[] {num1, num2});
        return ((int)(results[0]));
    }
    
    public System.IAsyncResult BeginAdd(int num1, int num2, 
                                        System.AsyncCallback callback, object asyncState) {
        return this.BeginInvoke("Add", new object[] {num1, num2}, callback, asyncState);
    }
    
    public int EndAdd(System.IAsyncResult asyncResult) {
        object[] results = this.EndInvoke(asyncResult);
        return ((int)(results[0]));
    }
}
The following example shows how this proxy class can be used to invoke web methods synchronously or asynchronously.
class ExampleClient
{
    static MathServiceProxy mathProxy = new MathServiceProxy();
	
    static void Main(string[] args) {
        // invoke Add method synchronously 
        int result = mathProxy.Add(1, 2);
        Console.WriteLine("1 + 2 = {0}", result);

        // invoke Add method asynchronously
        IAsyncResult resultHandle = mathProxy.BeginAdd(1, 2, new AsyncCallback(MyHandler), 
                                                       new int[] {1, 2});
		
        // processing continues without a result ...

        // later, can block until result becomes available
        resultHandle.AsyncWaitHandle.WaitOne();
        // ...
    }

    // callback function used to process asynchronous results
    static void MyHandler(IAsyncResult ar) {
        int [] savedState = (int[]) ar.AsyncState;
        int result = mathProxy.EndAdd(ar);
        Console.WriteLine("{0} + {1} = {2}", savedState[0], savedState[1], result);
    }
}

Invoking web methods asynchronously is obviously more complex than invoking them synchronously. However, it conforms to the design pattern used for asynchronous operations throughout the .NET platform, and will therefore soon be familiar to many programmers.

4 Parallel Computing using Web Services

Web services allow clients to initiate a computation on a remote computer and to have the results of that computation returned. Web services were mainly designed to support distributed/client-server type applications. If, however, web methods are invoked asynchronously on a number of different server machines at the same time, then parallel computation can be achieved. Note that this is atypical of client-server computing: normally a large number of clients access a single server; here we have each client accessing a large number of servers.

Figure 2: Multiple Server Architecture

static void Main(string[] args) {
    // set up proxies for each server
    MathServiceProxy[] mathProxy = new MathServiceProxy[args.Length];
    for (int i=0; i<args.Length; i++) {
        mathProxy[i] = new MathServiceProxy();
        mathProxy[i].Url = args[i];
    }
    // ...

    WaitHandle[] waitHandles = new WaitHandle[args.Length];
    // invoke Add method asynchronously N times on different servers.
    for (int i=0; i<args.Length; i++) {
        IAsyncResult resultHandle = mathProxy[i].BeginAdd(i, i+1, new AsyncCallback(MyHandler),
                                                          new int[] {i, i+1});
        waitHandles[i] = resultHandle.AsyncWaitHandle;
    }

    // wait for all results 
    WaitHandle.WaitAll(waitHandles);
    // ...
}

While this architecture could be used to create parallel applications, it is very heavyweight with respect to the amount of installation and configuration required on each volunteer machine. Firstly, each volunteer machine must be set up as a web server. Secondly, a web service that implements the parallel portion of the application in question must be deployed on each volunteer machine. This architecture makes it difficult for additional, individually "owned" machines to donate their cycles as they become idle. Most workstation owners won't have, or won't be willing to install, a web server, and won't be willing to hand over the administrative privileges necessary to allow others to deploy new web services on their machines. Firewalls pose an even greater problem for this architecture - even if a volunteer machine behind a firewall has a web server installed, clients outside of the firewall still won't be able to contact it.

5 The G2 Approach to Parallel Computing using Web Services

This paper describes a system called G2 that uses the basic facilities of web services to implement a framework for parallel computing over the Internet and requires nothing more than a web browser to be installed on the volunteer machines. In our system, when a web method is invoked, rather than being executed on a specific machine, it is queued in a central job repository, to be executed by the next available volunteer machine.

Figure 3: Single Server Architecture

The server is the only machine that exposes actual web services through a web server. Client applications push "jobs" (web method invocations) into the job repository and volunteer machines pull jobs out when they wish. From the client's perspective, the web service appears to be implemented on a single server machine. The server transparently delegates the actual execution of web services to whichever volunteer machines are available at the time. Neither the client nor the server need know in advance who the potential volunteers might be. Volunteers located anywhere on the Internet can contact the server at any time and thereby participate in parallel computations, even if they are behind a firewall (in most cases). Clients, too, can connect to the server from any machine on the Internet. Clients don't have to log on to the server in order to deploy or launch applications; client applications execute directly on the end user's desktop and seamlessly tap into the computational resources made available via the server. Note that this architecture allows clients to communicate (indirectly) with volunteers even if both parties are behind (different) firewalls. Such communication is generally not possible for pure peer-to-peer systems.
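To make the division of labour concrete, the server's generic job-brokering service might expose methods along the following lines. This is a minimal sketch only; the method and type names are hypothetical, as the paper does not publish the actual interface.

// Hypothetical shape of a queued job and of the generic G2 web service.
public class JobInfo {
    public int JobId;            // identifies the queued invocation
    public string TypeName;      // volunteer component class implementing the method
    public string MethodName;    // web method to execute
    public object[] Arguments;   // serialised actual parameters
}

public interface IG2Server {
    int EnqueueJob(string typeName, string methodName, object[] args);  // client pushes a job
    JobInfo DequeueJob(int volunteerId);          // volunteer pulls the next waiting job (null if none)
    void SubmitResult(int jobId, object result);  // volunteer returns a completed result
    object GetResult(int jobId);                  // client collects a result once available
}

Clients and volunteers would each talk to such a service through ordinary SOAP proxies, so neither side ever needs a direct connection to the other.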

Our framework is currently implemented using Microsoft's new .NET platform; however, all communication is performed using the open SOAP protocol, so any of the components (client, server or volunteer) could easily be implemented using alternative technologies (such as Java).

6 Creating Parallel Applications using G2

The process of creating a G2 volunteer component is exactly the same as creating a web service using .NET. In fact, any web service implemented using .NET can be used totally unchanged as a volunteer component. The process of creating a client application that invokes methods of a volunteer component is also no different to creating a client that invokes methods of a normal web service. The way in which volunteer components are deployed is, however, fundamentally different from the way in which web services are deployed.

Web services are typically developed by third parties and deployed on a particular machine. Volunteer components, however, are typically not black-box components provided by third parties, and they are not pre-deployed on a particular machine. Volunteer components are simply the parts of an application that need to be executed in parallel. They are typically created by the same programmer as the client component, and as we will see, need to be present at runtime on the same machine as the client.

Visual Studio.NET allows two types of runtime dependencies to be specified for a component: local component references and web service references. Local references are dependencies on .NET components that are expected to exist on the local machine at runtime. Local references are added to a project by specifying a local path to the actual component. The referenced component must exist at compile time, but it is used at that time only for the purpose of determining its strong name (which is compiled into the dependent component). Web references, by comparison, are added by referring to their WSDL specification. A client-side proxy class is automatically generated from the WSDL specification and added to the project whenever a web reference is added.

We have integrated into Visual Studio.NET a third type of reference, a G2 reference. Figure 4 shows a screen shot of a G2 reference being added to a project.

Figure 4: Adding a G2 reference in Visual Studio.NET

Like local references, G2 references are added by specifying a local path to an actual volunteer component. However, rather than simply extracting the volunteer component's strong name, we use reflection to determine the signatures of the web methods implemented within it. We use this information to generate our own custom client-side proxy class, analogously to how WSDL specifications are used to create client-side proxy classes for web services. The inclusion of a G2 reference also triggers the automatic creation of a volunteer stub class. This is required because the volunteer component will execute on volunteer machines, and so cannot rely on the functionality Microsoft's Internet Information Server (IIS) would normally supply to marshal and unmarshal web service messages on the server.
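The following fragment illustrates the kind of reflection involved, assuming web methods are identified by the standard WebMethod attribute (a sketch only; the actual G2 tooling is not shown in this paper):

using System;
using System.Reflection;
using System.Web.Services;

class WebMethodScanner {
    static void Main(string[] args) {
        // load the volunteer component from the path given on the command line
        Assembly assembly = Assembly.LoadFrom(args[0]);
        foreach (Type type in assembly.GetTypes()) {
            foreach (MethodInfo method in type.GetMethods()) {
                // a method is exposed as a web method if it carries [WebMethod]
                if (method.GetCustomAttributes(typeof(WebMethodAttribute), false).Length > 0)
                    Console.WriteLine("{0}.{1}", type.FullName, method.Name);
            }
        }
    }
}

From signatures gathered in this way, both the custom client-side proxy and the volunteer stub can be generated.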

At runtime, the volunteer components of the application are placed in the same directory as the client application. This enables the middleware to locate volunteer components and lazily upload them to the server. We refer to this as automatic code upload. This spares the application programmer from having to manually deploy volunteer components to the server each time they are updated, and also makes it easy to switch between servers from one run to the next. From the client's perspective, the entire application (both client and volunteer components) logically resides on their desktop - everything to do with the server and volunteers is kept transparent. The server, by comparison, contains no pre-deployed application-specific code - it acts like a BYO restaurant: clients on remote machines provide their own code to be executed on the volunteers. For efficiency reasons, volunteer components are cached on the server, so code upload to the server normally takes place only the first time a new version of a component is used on a particular server (old versions are automatically flushed).
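A minimal sketch of the lazy upload check follows, assuming a hypothetical caching API on the server (the real G2 upload protocol is not detailed in this paper):

using System.IO;
using System.Reflection;

public interface IG2Upload {   // hypothetical server-side component cache service
    bool IsComponentCached(string name, string version);
    void UploadComponent(string name, string version, byte[] assemblyBytes);
}

public class CodeUploader {
    // Before queuing jobs, check whether the server already caches this version
    // of the volunteer component, and upload it only if necessary.
    public static void EnsureUploaded(IG2Upload server, string path) {
        AssemblyName name = AssemblyName.GetAssemblyName(path);
        string version = name.Version.ToString();
        if (!server.IsComponentCached(name.Name, version)) {
            FileStream fs = File.OpenRead(path);
            byte[] bytes = new byte[fs.Length];
            fs.Read(bytes, 0, (int) fs.Length);
            fs.Close();
            server.UploadComponent(name.Name, version, bytes);
        }
    }
}

Because components are identified by name and version, a rebuilt component is uploaded afresh, while unchanged components are never transferred twice.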

7 Donating Cycles using G2

Volunteers donate cycles by pointing their web browser at an ASP.NET page on the server. The server retrieves information from the job repository and uses it to dynamically build a web page for the volunteer. If there are no jobs currently awaiting execution in the job repository, a web page containing a message to this effect is generated. The page contains a count-down display and a meta tag that causes it to automatically reload after a given number of seconds. In this way, the volunteer continues to poll the server (at an appropriate interval) to check for newly arrived jobs.

Figure 5: No Jobs Available
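A page of roughly the following shape achieves this polling behaviour (illustrative markup only; the server's actual generated page is not reproduced here):

<html>
    <head>
        <title>G2 Volunteer - No Jobs Available</title>
        <!-- re-request the page after 30 seconds to poll for newly arrived jobs -->
        <meta http-equiv="refresh" content="30">
    </head>
    <body>
        No jobs are currently available. Checking again in 30 seconds...
    </body>
</html>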

If jobs are available, then the server determines the oldest waiting job (other scheduling policies could be used) and creates a web page for it. The HTML page uses an <object> tag to embed a .NET object of the class that implements the method corresponding to the oldest job. The presence of the object tag results in the .NET code being automatically downloaded to the volunteer machine and cached for possible future use. Note that this is the last step of the volunteer component's journey: from client, to server, to volunteer.

Figure 6: Automatic code upload and download

JavaScript code executed at load time calls a generic Process method on this object and waits for all jobs associated with that class to be processed. The Process method repeatedly retrieves jobs associated with that class of object from the server (via a generic G2 web service), executes them locally, and returns the results to the server. The Process method creates a new thread to do the majority of its processing, so as to allow the web browser’s main thread of control to continue (otherwise the web browser would "hang"). When jobs for the current class of object are exhausted, a new page is loaded containing an object for the next job class (if any). The following shows the contents of such a page:

<html>
    <head>
        <title>G2 Volunteer #915</title>
        <script language="javascript">
            function Process() {
                vol.Process(915, 14, "http://g2.fit.qut.edu.au/g2/G2Server/");
                setTimeout("CheckFinished()", 1000);					
            }
				
            function CheckFinished() {
                if (vol.Finished()) {
                    window.location.search="?volunteerId=915";
                    window.location.reload(true);
                }
                else
                    setTimeout("CheckFinished()", 1000);
            }
        </script>	
    </head>
    <body onload="Process()">	
        <div> RayTracer (version 1.0.741.17882)</div>
        <object id="vol" classid="http:/Root/RayTrace/1.0.741.17882/RayTrace.dll#RayTracer">
        </object>
    </body>
</html>
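The Process method called from this page's JavaScript might be structured roughly as follows, reusing the hypothetical JobInfo and IG2Server shapes sketched in Section 5 (the actual G2 implementation is not shown in this paper):

using System.Threading;

public class VolunteerControl : System.Windows.Forms.Control {
    private bool finished = false;
    private IG2Server server;     // SOAP proxy to the generic G2 web service (assumed)
    private int volunteerId;

    // Called from the page's JavaScript; returns immediately so the browser stays responsive.
    public void Process(int volunteerId, int jobClassId, string serverUrl) {
        this.volunteerId = volunteerId;
        this.server = ConnectTo(serverUrl);   // hypothetical helper that creates the proxy
        new Thread(new ThreadStart(this.ProcessJobs)).Start();
    }

    // Worker thread: pull jobs of this class, execute them locally, return the results.
    private void ProcessJobs() {
        JobInfo job;
        while ((job = server.DequeueJob(volunteerId)) != null) {
            object result = ExecuteLocally(job);      // reflective dispatch to the web method
            server.SubmitResult(job.JobId, result);
        }
        finished = true;   // observed by the page's CheckFinished() polling loop
    }

    public bool Finished() { return finished; }

    private IG2Server ConnectTo(string url) { return null; }      // proxy creation elided
    private object ExecuteLocally(JobInfo job) { return null; }   // reflective invocation elided
}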
Volunteer classes inherit from System.Windows.Forms.Control and so can implement a graphical user interface that provides a visualization of whatever computation is being performed on that volunteer. The following figures show screen shots of two of our applications (TSP and RayTracing) executing on volunteers.
Figure 7: RayTracing executing on a volunteer
Figure 8: TSP executing on a volunteer

8 Sample Applications and Results

The proposed system is fully implemented and available for the general public to try out at http://g2.fit.qut.edu.au/.

Figure 9: G2 Home page

The demonstration system allows people to volunteer their PCs, and also to download and execute a number of sample client applications. This section briefly describes two of our sample applications, RayTracing and TSP. Figures 10 and 11 show screen shots of the client interfaces.

Figure 10: RayTracing client
Figure 11: TSP client

Our implementation of the travelling salesperson problem uses a parallel genetic algorithm [7]. Informally, the algorithm works as follows. An initial randomly generated population is partitioned and migrated to separate "islands" where the sub-populations evolve in isolation for some number of generations. The populations then return to the mainland, where they are mixed with individuals from other islands and migrated back to the islands to evolve further. This process continues until the person running the client is happy with the result. It is, of course, not possible to determine when the optimal solution has been reached. The algorithm used is actually a hybrid algorithm [12] that uses a heuristic called 2-opt to augment the random mutation process.
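As a rough illustration, the client's main loop for this algorithm might look like the following fragment (the proxy and helper names are hypothetical; the actual application code is not reproduced in this paper):

// Island-model genetic algorithm driven from the client (sketch).
// TspProxy, Population, RandomPopulation, Partition, Migrate and Satisfied
// are all assumed helpers; each BeginEvolve call becomes a job for a volunteer.
TspProxy tsp = new TspProxy();                                  // G2-generated proxy class
Population[] islands = Partition(RandomPopulation(5000), 50);   // 50 islands of 100 individuals
IAsyncResult[] handles = new IAsyncResult[islands.Length];

while (!Satisfied()) {
    // evolve every island in parallel on the volunteers
    for (int i = 0; i < islands.Length; i++)
        handles[i] = tsp.BeginEvolve(islands[i], generationsPerMigration, null, null);
    // collect the evolved islands as the asynchronous calls complete
    for (int i = 0; i < islands.Length; i++)
        islands[i] = tsp.EndEvolve(handles[i]);
    // back on the mainland: mix individuals between islands, then go around again
    islands = Migrate(islands);
}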

Figure 13 shows the performance of our system for various numbers of volunteers. The experiment involved solving a randomly generated TSP problem consisting of 200 cities. The initial randomly generated population consisted of 5000 individuals divided into 50 groups of 100. The 2-opt heuristic, being relatively expensive, is applied to populations only randomly, with a probability of 0.01.

Clearly, not all parallel applications will perform well on our system. As with all parallel systems, the speedup obtained depends on the ratio of computation to communication. By varying the number of generations that a population evolves for between migrations, we can vary the computation-to-communication ratio. As the number of generations increases, the rate at which generations are processed increases, but the rate of genetic crossover between populations, and therefore the efficiency of the algorithm, decreases. The machines used for the experiments were all 864MHz Pentium IIIs running Windows 2000 Professional, connected by 100Mb/s Ethernet. Separate machines were used for the client and the server.

Our RayTracing application works by partitioning each image into a number of rectangular segments and computing each segment in parallel. The computational core of the system is based on code from the Intel peer-to-peer RayTracing demo [HREF7] written by Bryan Wilkerson. The algorithm is embarrassingly parallel, but still presents a challenge in that the size of the image data returned is high compared to the amount of ray tracing computation required. In this case, increasing the size of the tasks doesn't help greatly, as the size of the image data returned increases linearly with the amount of ray tracing computation required. The computation-to-communication ratio can, however, be affected by the complexity of the image being raytraced. The number of objects, and their relationships to one another, can affect the amount of ray tracing computation required (see Figure 12).

So, as can be seen with both examples, it’s not simply a question of how many volunteers the system will scale to; as always, it all depends on the granularity of the tasks and the ratio of communication to computation. SETI@Home, for example, scales to thousands of volunteers by generating tasks that take 12 or more hours to complete on a typical PC. We are primarily targeting applications that require far fewer volunteers, but much smaller task sizes (in the order of seconds).

Figure 12: Speedup for RayTracing
Figure 13: Speedup for TSP

9 Related Work

Early cycle stealing systems such as Condor [10] and Piranha [4] relied on native code being executed, and used raw TCP/IP for communication. Condor used automatic checkpointing and task migration to load balance tasks amongst a fixed set of possible volunteers. Piranha supported adaptive parallelism by using the Linda tuple space model to decouple computation from processes.

Early Internet-based cycle stealing systems such as GIMPS [HREF8] and SETI@Home [9] were specific to particular applications. Subsequent general-purpose Internet-based systems have mostly been specific to the Java language. They include the Charlotte [2] and subsequent Knitting Factory [8] projects at NYU and Arizona State, the Javelin [5] and CX [3] projects at UCSB, and Parabon [HREF9]. The Charlotte system was the first to use Java applets embedded in HTML pages to distribute code to volunteers. While these systems internally operate very similarly to ours, they require programmers to create jobs using (in some cases, quite) complex APIs. They also generally require volunteer components to be manually deployed on a web server.

Direct connection between clients and volunteers requires the client to run a web server (Charlotte) or to rely on RMI over TCP/IP (Knitting Factory). This bypasses the need to channel messages through a server, but all client machines must be directly accessible to the volunteer machines, i.e. not behind a firewall.

Grid systems such as Globus [6] and Legion [11] provide an interface through which distributed computing resources such as supercomputers, scientific instruments and archival data storage can be easily shared and used across organisational boundaries. Our project has a tighter focus than these projects and concentrates on cycle stealing and its associated problems, especially the programming and deployment of such applications.

10 Conclusions and Future Work

Our framework provides a simple and language-independent programming model based on web services, where creating and communicating with processes on volunteer machines is as simple as a regular method call. The middleware works through firewalls, takes care of fault tolerance and shields the application programmer from the changing set of volunteer machines. Mechanisms such as our automatic code upload facility further simplify the use of our system. The use of off-the-shelf technology, including web servers and relational databases, means we leverage the significant effort that has gone into optimising such systems.

Our current single server implementation of the middleware is obviously a hindrance to scalability. Future work will focus on investigating alternative network topologies to minimise or eliminate this bottleneck while retaining the positive elements of our current system.

References

  1. Anderson, T.E., D.E. Culler and D.A. Patterson. A Case for NOW (Networks of Workstations). IEEE Micro, February 1995.
  2. Baratloo, A., et al. Charlotte: Metacomputing on the Web. In 9th International Conference on Parallel and Distributed Computing Systems, 1996.
  3. Cappello, P. and D. Mourloukos. A Scalable, Robust Network for Parallel Computing. In Joint ACM JavaGrande-ISCOPE 2001 Conference, Stanford University, California, 2001.
  4. Carriero, N., et al. Adaptive Parallelism and Piranha. IEEE Computer, 28(1): 40-49, 1995.
  5. Christiansen, B.O., et al. Javelin: Internet-Based Parallel Computing Using Java. Concurrency: Practice and Experience, 9(11): 1139-1160, 1997.
  6. Foster, I. and C. Kesselman. Globus: A Metacomputing Infrastructure Toolkit. International Journal of Supercomputer Applications, 11(2): 115-128, 1997.
  7. Jog, P., J.Y. Suh and D. Van Gucht. Parallel genetic algorithms applied to the travelling salesman problem. SIAM Journal of Optimization, 1(4): 515-529, 1991.
  8. Karaul, M. Metacomputing and Resource Allocation on the World Wide Web. PhD thesis, Computer Science, New York University, New York, 1998.
  9. Korpela, E.J., et al. SETI@home: Massively Distributed Computing for SETI. Computing in Science and Engineering, 3(1): 79, 2001.
  10. Livny, M., et al. Mechanisms for High Throughput Computing. SPEEDUP, 11(2), 1997.
  11. Natrajan, A., M. Humphrey and A. Grimshaw. Capacity and Capability Computing in Legion. In International Conference on Computational Science, 2001.
  12. Sengoku, H. and I. Yoshihara. A Fast TSP Solver Using GA on JAVA. In Third International Symposium on Artificial Life and Robotics (AROB III'98), 1998.

Hypertext References

HREF1
Wayne Kelly's home page: http://www.fit.qut.edu.au/~kellyw
HREF2
Paul Roe's home page: http://www.fit.qut.edu.au/~proe
HREF3
IBM Web services: http://www.ibm.com/webservices
HREF4
Sun web services: http://java.sun.com/webservices
HREF5
Microsoft web services: http://msdn.microsoft.com/webservices
HREF6
Web Services Description Language (WSDL) standard: http://www.w3.org/TR/wsdl
HREF7
Intel Peer-to-Peer Shared Cycles Demo: http://www.intel.com/ebusiness/products/peertopeer
HREF8
GIMPS - Great Internet Mersenne Prime Search http://www.mersenne.org
HREF9
Parabon: http://www.parabon.com

Copyright

© Wayne Kelly and Paul Roe. The authors assign to Southern Cross University and other educational and non-profit institutions a non-exclusive licence to use this document for personal use and in courses of instruction, provided that the article is used in full and this copyright statement is reproduced. The authors also grant a non-exclusive licence to Southern Cross University to publish this document in full on the World Wide Web and on CD-ROM and in printed form with the conference papers, and for the document to be published on mirrors on the World Wide Web.