A study of web-based application architecture and performance measurements


Weigang Zhao, School of Information Systems, Curtin University, GPO Box U1987, Perth 6845, WA, Australia, zhaoc@cbs.curtin.edu.au


Abstract

Corporate databases can be linked to the Web in a manner that allows clients or employees to access corporate data through a Web browser. This paper first describes the bridge between the Web and corporate databases and discusses a series of related concepts. Secondly, a number of linking methods are presented and analysed. Thirdly, an example web-based application developed using different linking methods is described. Finally, an analysis of the application architecture and preliminary performance measurement results are reported.

Keywords

World Wide Web (Web), Database connectivity, Performance Measurement

Introduction

The World Wide Web (known as "WWW" or "Web") is growing at a phenomenal rate. The current Web is largely based on file system technology, which deals well with resources that are primarily static. However, with the unprecedented growth of resources, it is no longer adequate to rely on this conventional file technology for organising, storing and accessing large amounts of information on the Web. Thus, many large Web sites today are turning to database technology to keep track of the increasing amount of data. Database technology has played a critical role in the information management field in past years. It is believed that the integration of the Web and database technology will bring many opportunities for creating advanced information management applications (Feng and Lu 1998).

With the increasing popularity and advancement of Web technology, many organisations want to Web-enable their existing applications and databases without having to modify existing host-based applications. This not only gives all of the existing applications a common, modern look and feel but also allows them to be deployed on corporate Intranets, the public Internet, and newer Extranets (Lu et al. 1998).

Taking simple data from a database and placing it on the Web is a relatively simple task. However, in most cases, corporate data is maintained in a variety of sources, including legacy, relational, and object databases. The task is much more complicated when these diverse data sources must be queried or updated (Carriere and Kazman 1997). Methods, techniques, and tools are in great demand to bridge the gap between the Web and database applications so that smooth, interactive, and integrated Web-to-database applications are made possible (Frey 1996).

Many players in the industry are taking up this challenge. These include major database vendors, mainframe vendors, third party software firms, Web browser vendors, and Web server vendors. A wide range of tools and philosophies has been proposed for connecting and integrating the Web and databases (Kim 1997). In our previous paper (Lu et al. 1998), we presented a formal specification of web-to-database interfacing models. It is believed that the architecture of a web-based application, and in particular the interfacing and integrating method it uses, has a considerable impact on the application's performance (Lazar and Holfelder 1997). This paper presents our study of this issue.

This paper discusses the approaches and models in Web-to-database connecting technologies, building on some results of our previous paper. The remainder of the paper is organised in four sections. Section 2 describes the bridge between the Web and corporate databases and introduces the related concepts. A number of linking methods and their analysis are provided in Section 3. An example web-based application is described in Section 4, together with the application architecture analysis and preliminary performance measurement results. Conclusions and future work are reported in Section 5.

The Bridge Between The Web And Databases

Delivering data over the Web is cost effective and fast, and gives Internet users easy access to databases from any location. Users hope to access databases via Web browsers with the same functions as provided by normal database application software. Businesses want to provide their users or customers with various functions such as purchasing goods, tracking orders, searching through catalogues, receiving customised content, and viewing interesting graphics. Web-to-database integration has become central to the construction of corporate information systems.

Making database information available to Web users requires converting it from the database format to a markup language such as HTML or XML. Database packages store information in files optimised for quick access by front-end programs. When the Web server sends information to a client, the internal database format must be converted to HTML so that it is displayed correctly (Reichard 1996). A bridge between the Web and databases needs to be built. This bridge lets the Web browser replace the front-end program normally used to access the corporate databases.
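
The conversion step described above can be sketched in a few lines. This is only an illustrative sketch in Python, not the code used in the application discussed later; the function name rows_to_html_table and the sample values are assumptions made for the example.

```python
from html import escape


def rows_to_html_table(columns, rows):
    """Render column names and row tuples as a minimal HTML table."""
    # Escape every value so that characters such as < or & in the data
    # cannot be mistaken for markup by the browser.
    header = "".join(f"<th>{escape(str(c))}</th>" for c in columns)
    body = "".join(
        "<tr>" + "".join(f"<td>{escape(str(v))}</td>" for v in row) + "</tr>"
        for row in rows
    )
    return f"<table><tr>{header}</tr>{body}</table>"


if __name__ == "__main__":
    # Hypothetical rows, standing in for the result of a database query.
    print(rows_to_html_table(["AssetNumber", "Make"], [(1001, "Cisco"), (1002, "3Com")]))
```

Whatever bridging method is used, some component has to perform this kind of translation between the result of a query and the markup sent to the browser.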

Web-to-database connecting technology

To build a bridge between the Web and enterprise databases, a number of alternative technologies and architectures are available. These include the methods analysed later in this paper: CGI, server APIs, and database access middleware such as ODBC and JDBC.

Each of these technologies has strengths and weaknesses. Several factors should be considered when making a selection. These include the complexity of the data, the speed of deployment, the expected number of simultaneous users, and the frequency of database updates. In addition, new technology is emerging, and several tools are already available that optimise Web-to-database access for improved performance (Carriere and Kazman 1997).

Database middleware

Generally, middleware can be said to be the glue (or logic) that lies between clients and servers. It deals with all the "grim stuff" of incompatible operating systems and file structures (Bernstein 1996). Programmers on both client and server ends use APIs for requesting or receiving services and data. Middleware is used to connect diverse products that do not have a common language. There are five different kinds of middleware: object request brokers (ORB), message-oriented middleware (MOM), database middleware, transaction-processing (TP) monitors middleware, and remote procedure call (RPC) middleware (Lu et al. 1998).

Middleware technology is becoming popular to connect databases with the Web. Middleware is in the midst of an evolutionary growth spurt. As it relates to the Web, the middle tier will evolve to play an important role in things such as enabling advanced multitier-application deployment, using the Web for distributed transactional systems, managing multiple execution environments with Java, C++, and ActiveX, and providing the links to existing mission-critical information resources.

Analysis of Different Connecting Methods

CGI

CGI is a standard for interfacing external programs with Web servers. The server submits client requests encoded in URLs to the appropriate registered CGI program, which executes and returns results encoded as MIME messages back to the server. CGI's openness avoids the need to extend HTTP. Most vendors of Web server extension tools continue to support CGI even as more advanced APIs have been added. This is due to the fact that many prewritten scripts are freely available for a variety of platforms and most of the popular Web servers.

CGI programs are executable programs that run on the Web server. They can be written in any scripting language (interpreted) or programming language (compiled first) that can be executed on the Web server, including C, C++, Fortran, Perl, Tcl, Unix shells, Visual Basic, AppleScript, and others. Arguments to CGI programs are encoded in URLs by the client, and the server passes them to the CGI program through environment variables. The CGI program typically generates HTML pages on the fly (Deep and Holfelder 1996). CGI lets Webmasters add common features such as counters, date/time displays, on-line order forms, chat pages and search engines.
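
The request flow described above can be sketched as follows. This is a minimal sketch in Python rather than one of the languages listed, and it assumes the server places the URL-encoded query in the QUERY_STRING environment variable, as the CGI convention specifies. The parameter name asset and the function name handle_request are hypothetical.

```python
import os
from html import escape
from urllib.parse import parse_qs


def handle_request(environ):
    """Build a CGI response (header, blank line, HTML body) from the environment."""
    # The server hands the URL-encoded arguments to the program in QUERY_STRING.
    params = parse_qs(environ.get("QUERY_STRING", ""))
    asset = params.get("asset", ["(none)"])[0]  # "asset" is a hypothetical parameter
    body = f"<html><body><p>Requested asset: {escape(asset)}</p></body></html>"
    # The content type header is followed by a blank line, then the page itself.
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # When run by a Web server as a CGI program, the response goes to standard output.
    print(handle_request(os.environ), end="")
```

Because the program reads its input from the environment and writes its output to standard output, the same script can be run by any server that supports CGI.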

CGI also has several drawbacks. Each request for a CGI script spawns an additional process on the server machine, slowing the server's response time. Also, if the CGI script is not set up correctly, security holes can occur on the server, rendering the Web site vulnerable to attacks by hackers. Another problem is that it is difficult to maintain state, that is, to preserve information about the client from one HTTP request to the next (Deep and Holfelder 1996).

CGI is an early Web-to-database integration mechanism that is being replaced by more complex software programs that lie between the Web and database servers.

Server API

An alternative way of modifying or extending the abilities of the server is to use its API. APIs allow the developer to modify the server's default behaviour and give it new capabilities. In addition to addressing some of the drawbacks of CGI, the use of an API offers other features and benefits, such as the ability to share data and communications resources with the server, the ability to share function libraries, and additional capabilities in authentication and error handling. Because an API application remains in memory between client requests, information about a client can be stored and used again when the client makes another request (Frey 1996).
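
The point about an API application remaining in memory can be illustrated with a small sketch. The class and method names below are hypothetical and do not correspond to any real server API. The sketch only shows that an object living inside the server process can carry client information from one request to the next, which a freshly spawned CGI process cannot do without external storage.

```python
class InProcessHandler:
    """Stand-in for a server extension that stays loaded between requests."""

    def __init__(self):
        # This dictionary survives across requests because the handler
        # object is created once and kept in the server's memory.
        self.visits = {}

    def handle(self, client_id):
        """Count the requests seen so far from the given client."""
        self.visits[client_id] = self.visits.get(client_id, 0) + 1
        return f"client {client_id}: request {self.visits[client_id]}"


if __name__ == "__main__":
    handler = InProcessHandler()
    print(handler.handle("a"))
    print(handler.handle("a"))
```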

There are, however, some drawbacks to this approach. Unlike CGI programs, API functions are server-specific, because each server has a different API. Buggy API code can crash a server. Developing the code is also more complex, because it must manage multiple process threads and clean up memory after it has run.

ODBC and JDBC

ODBC and JDBC are types of database access middleware. ODBC is, by far, the most popular database access middleware in use today. Vendor support for ODBC is pervasive. JDBC support isn't quite at the level of ODBC support, but JDBC is growing and flourishing. Database vendors and several third-party software houses offer ODBC and JDBC drivers for a variety of databases and operating environments.

From a network administrator's point of view, they consist of client and server driver software (i.e., program files). From a programmer's point of view, they are APIs that the programmer calls in his or her software to store and retrieve database content. While a system analyst perceives ODBC or JDBC as a conceptual connection between the application and the database, database vendors regard ODBC and JDBC as ways to entice customers who say they want to use industry standard interfaces rather than proprietary ones. Managers of data processing departments view ODBC and JDBC as insurance interfaces that offer some measure of flexibility should they find it necessary to replace one database product with another (Wong 1997).

ODBC technology now allows Web servers to be used to directly connect with databases, rather than using third party solutions. JDBC can also directly access server ODBC drivers through a JDBC/ODBC Bridge driver, available from SunSoft. ODBC driver vendors are also building bridges from ODBC to JDBC. JDBC is intended for developing client/server applications to access a wide range of backend database resources.
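
The idea of a standard database access interface can be sketched in Python, whose DB-API plays a role comparable to ODBC and JDBC. This is an analogy only: the sqlite3 module and the in-memory database stand in for an ODBC or JDBC driver and are not the configuration used in this paper. The function names and sample rows are hypothetical.

```python
import sqlite3


def fetch_makes(conn):
    """Query through the generic cursor interface only.

    Nothing in this function is specific to SQLite, so in principle the
    connection could come from another DB-API driver.
    """
    cur = conn.cursor()
    cur.execute("SELECT Make FROM DeviceTable ORDER BY Make")
    return [row[0] for row in cur.fetchall()]


def demo_connection():
    """Create a throwaway in-memory database with two illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE DeviceTable (AssetNumber INTEGER PRIMARY KEY, Make TEXT)")
    conn.executemany("INSERT INTO DeviceTable VALUES (?, ?)", [(1, "Cisco"), (2, "3Com")])
    return conn


if __name__ == "__main__":
    print(fetch_makes(demo_connection()))
```

The application code depends on the interface rather than on the database product, which is the flexibility that the programmer and the manager are described as valuing.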

As more and more web-based applications are built using the different bridging methods discussed above, it is important to investigate how to measure the performance of each method in a consistent and fair manner. The next section describes an application implemented using three of the main bridging methods.

Web-based network configuration database development

To measure the performance of each of the bridging methods, a network configuration database application is selected. The database that underlies the application consists of 5 tables. They are given as follows:

DeviceTable: AssetNumber, Make, Model, SerialNumber, PurchaseDate, WarrentyDate, PurchasePrice, VendorID, ConfigFilename

VendorTable: VendorID, Address, Suburb, PostCode, Telephone, Fax, ContactKey

ContactTable: ContactKey, Name, Position, Comments

EmployeeTable: EmployeeID, Name, RoomNumber, Position

DeviceLocation: AssetNumber, RoomNumber
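
The five tables can be expressed as a schema for experimentation. The sketch below uses SQLite through Python purely for illustration; the application itself used Microsoft Access. The table and column names are those listed above (with the spelling of WarrentyDate kept as given), while the column types, primary keys and foreign keys are assumptions.

```python
import sqlite3

# Column types and key constraints are assumptions; only the names come from the paper.
SCHEMA = """
CREATE TABLE ContactTable (
    ContactKey INTEGER PRIMARY KEY, Name TEXT, Position TEXT, Comments TEXT);
CREATE TABLE VendorTable (
    VendorID INTEGER PRIMARY KEY, Address TEXT, Suburb TEXT, PostCode TEXT,
    Telephone TEXT, Fax TEXT,
    ContactKey INTEGER REFERENCES ContactTable(ContactKey));
CREATE TABLE DeviceTable (
    AssetNumber INTEGER PRIMARY KEY, Make TEXT, Model TEXT, SerialNumber TEXT,
    PurchaseDate TEXT, WarrentyDate TEXT, PurchasePrice REAL,
    VendorID INTEGER REFERENCES VendorTable(VendorID), ConfigFilename TEXT);
CREATE TABLE EmployeeTable (
    EmployeeID INTEGER PRIMARY KEY, Name TEXT, RoomNumber TEXT, Position TEXT);
CREATE TABLE DeviceLocation (
    AssetNumber INTEGER REFERENCES DeviceTable(AssetNumber), RoomNumber TEXT);
"""


def create_schema(conn):
    """Create the five tables in an empty database."""
    conn.executescript(SCHEMA)


def devices_in_room(conn, room):
    """Join DeviceTable with DeviceLocation to list the devices in one room."""
    sql = (
        "SELECT d.AssetNumber, d.Make FROM DeviceTable d "
        "JOIN DeviceLocation l ON l.AssetNumber = d.AssetNumber "
        "WHERE l.RoomNumber = ? ORDER BY d.AssetNumber"
    )
    return conn.execute(sql, (room,)).fetchall()
```

The join in devices_in_room is an example of the kind of multi-table query a network administrator might issue against this database.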

The three interfacing methods we have chosen are traditional CGI programming with Perl, Netscape LiveWire applications based on JavaScript, and Java Servlets with JDBC. These three tool sets were chosen because they highlight key design aspects of most Web database tool sets. These aspects include the level of Web server integration, the level of dynamic database integration, the extent to which code is generated automatically, and the code structure (Lazar and Holfelder 1997). The database used is Microsoft Access. Figure 1 shows the screen layout for adding network configuration data to the database. The application allows the network administrator to add, remove, modify, and query network configuration data using any existing Web browser.

Application Architecture Analysis

Perl as a development tool has clear advantages in terms of portability. Perl scripts and other traditional CGI applications are portable across different Web servers. However, using Perl with character-delimited copies of database tables limits its extensibility. Without the built-in SQL support provided by specialised database connectivity extensions, database operations such as JOINs and complex SELECT statements spanning multiple tables require complex programming. Perl code is easy to read because there are few components and concepts, and this simplicity enhances maintainability.
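
To make the extensibility point concrete, the following sketch shows what a join looks like when the tables are character-delimited text rather than tables behind a SQL engine. It is written in Python rather than Perl, and the delimiter, the sample rows and the function names are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical pipe-delimited copies of two tables.
DEVICES = "AssetNumber|Make|VendorID\n1001|Cisco|V1\n1002|3Com|V2\n"
VENDORS = "VendorID|Suburb\nV1|Bentley\nV2|Perth\n"


def load_table(text, delimiter="|"):
    """Parse delimited text whose first line holds the column names."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def join_on(left, right, key):
    """Hand-written equivalent of an inner JOIN on a single shared column."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    # Combine each left row with every right row that has the same key value.
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


if __name__ == "__main__":
    for row in join_on(load_table(DEVICES), load_table(VENDORS), "VendorID"):
        print(row["AssetNumber"], row["Make"], row["Suburb"])
```

With SQL support, the same result is a single SELECT with a JOIN clause. Without it, every additional table or condition means more hand-written code of this kind.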

LiveWire source code is either embedded within special tags in a standard HTML file, or placed in plain text files. LiveWire source code needs to be compiled into a proprietary format to be used by the Application Manager of the Netscape server. The JavaScript source code is fully portable between all supported platforms. LiveWire's only portability drawback is that it can only be used with the Netscape Enterprise and FastTrack servers.

LiveWire's sophisticated database features and development approach make it very extensible. LiveWire developers are able to perform sophisticated database functions, including JOINs, complex multi-table SELECTs, INSERTs, UPDATEs and DELETEs. The integration of database cursors, functions and looping constructs into the HTML file gives LiveWire programmers detailed control over the HTML formatting of database data.

JavaScript functions and user-defined objects supported in LiveWire allow developers to produce clear, maintainable code. In addition, HTML formatting in LiveWire is distinct from functionality. All JavaScript code is embedded in <SERVER> and </SERVER> tags.

Java as a development environment has many advantages in terms of portability, extensibility, and maintainability. Database access source code using JDBC is highly portable across different Web servers. JDBC developers have flexible control of the databases being accessed. The neat code structures and pure object-oriented features make the code highly maintainable.

Figure 1. Screen layout of adding configuration data

Performance Measurement Mechanisms

Based on the high-level architecture analysis given in the last section, we are interested in investigating the performance implications of the different bridging methods. This is vastly different from the general database system benchmarks, general Web server benchmarks, and WWW caching performance measurements reported in the literature (Kim and Garza 1995; Saleeb 1997). In database benchmarking, the targets of performance measurement may include query optimisation, object-oriented features such as pointer traversal search and complex object manipulation, and complex query processing. For Web server benchmarks, items such as server connectivity, HTML document generation, proxy server, average response time, throughput, and security might be of interest.

The performance measurement factors considered in this paper are intended to measure the performance of each bridging method. The tests were performed on a 120 MHz Pentium machine with 80 MB of RAM running Windows 98, Microsoft Web Server, JRun 2.1 and Microsoft Access 7.0. This machine was connected to a 300A MHz Celeron computer with 64 MB of RAM running Windows 98. Automated test tools were used to test the performance of the CGI script written in Perl with DBI and of the Java Servlet with JDBC.

First, a tool called Stop Watch was used to measure the time for a single request for CGI and for the Java Servlet. The tests retrieved 50, 500 and 1000 records from the database, which contains a total of 5 tables. For retrieving 50 records, the Java Servlet was faster than CGI. However, for retrieving 500 and 1000 records, CGI ran faster than the Java Servlet. The average time for each case, in seconds, is shown in Table 1.

Table 1. Average time (in seconds) for retrieving different numbers of records

Method               50 records   500 records   1000 records
CGI/Perl             6.5          13.1          30.5
Java Servlet/JDBC    2            25.8          50.6
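
The comparison in Table 1 can be made explicit with a little arithmetic. The sketch below simply divides the figures reported in the table; the dictionary layout and the function names are assumptions made for illustration.

```python
# Average times in seconds, copied from Table 1.
TABLE_1 = {
    "CGI/Perl": {50: 6.5, 500: 13.1, 1000: 30.5},
    "Java Servlet/JDBC": {50: 2.0, 500: 25.8, 1000: 50.6},
}


def seconds_per_record(method):
    """Average time per retrieved record for one method."""
    return {n: t / n for n, t in TABLE_1[method].items()}


def servlet_to_cgi_ratio():
    """Servlet time divided by CGI time; values below 1 mean the servlet was faster."""
    cgi = TABLE_1["CGI/Perl"]
    servlet = TABLE_1["Java Servlet/JDBC"]
    return {n: servlet[n] / cgi[n] for n in cgi}


if __name__ == "__main__":
    for n, ratio in servlet_to_cgi_ratio().items():
        print(f"{n} records: servlet/CGI = {ratio:.2f}")
```

On these figures the servlet took roughly 0.31 times as long as CGI for 50 records, but about 1.97 times as long for 500 records and about 1.66 times as long for 1000 records.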

In the second approach, a tool called JMeter was used to measure the performance of both CGI and the Java Servlet. Each test ran for about 5 minutes. For 50 records, the Java Servlet ran faster than CGI in both the 1-thread and 10-thread tests. This suggests that, even for small data sets, the database communications overhead of the Java Servlet approach is outweighed by other intrinsic advantages of the architecture. For 500 records, CGI was faster with 1 thread but the Java Servlet was faster with 10 threads. For 1000 records, CGI was again faster with 1 thread, but the Java Servlet was faster with 10 threads. We had expected to see CGI performance decrease significantly under multiple simultaneous accesses, as CGI requires a new process to be spawned for each simultaneous request.
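
The structure of such a multi-threaded test can be sketched as follows. This is not JMeter and does not reproduce its behaviour; it is a minimal Python harness, and the function names, the fake_request stand-in and the thread and request counts are all assumptions.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(request):
    """Return the elapsed wall-clock time of one request, in seconds."""
    start = time.perf_counter()
    request()
    return time.perf_counter() - start


def run_load(request, threads, requests_per_thread):
    """Issue requests from several threads at once and collect every timing."""

    def worker(_):
        return [timed_call(request) for _ in range(requests_per_thread)]

    with ThreadPoolExecutor(max_workers=threads) as pool:
        batches = list(pool.map(worker, range(threads)))
    return [t for batch in batches for t in batch]


def fake_request():
    """Hypothetical stand-in for an HTTP request to the CGI script or servlet."""
    time.sleep(0.001)


if __name__ == "__main__":
    for threads in (1, 10):
        timings = run_load(fake_request, threads, requests_per_thread=5)
        print(threads, "thread(s):", sum(timings) / len(timings), "s on average")
```

In a real test, fake_request would be replaced by a call that fetches the page under test, and the 1-thread and 10-thread runs would be compared as in the text above.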

The performance test of the LiveWire/JavaScript approach is yet to be conducted. We plan to devise query types with different levels of complexity to further test each bridging method's performance. The performance measurement factors include connections per second, throughput (bytes per second), average response time (round-trip time), error rate, and web overhead ratio. Typical usage of a web-based database application will also be simulated. We are currently working on ways to minimise or eliminate factors that may influence the performance measurements.
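
Several of the factors listed above can be computed directly from a log of individual requests. The sketch below is one possible formulation, not the measurement framework itself. The Sample record and the function name summarise are assumptions, and the web overhead ratio is left out because its definition is not given here.

```python
from collections import namedtuple

# One measured request: round-trip time, bytes received, and whether it succeeded.
Sample = namedtuple("Sample", "seconds bytes_received ok")


def summarise(samples, wall_clock_seconds):
    """Compute summary factors for a test run that lasted wall_clock_seconds."""
    n = len(samples)
    total_bytes = sum(s.bytes_received for s in samples)
    return {
        "connections_per_second": n / wall_clock_seconds,
        "throughput_bytes_per_second": total_bytes / wall_clock_seconds,
        "average_response_time": sum(s.seconds for s in samples) / n,
        "error_rate": sum(1 for s in samples if not s.ok) / n,
    }


if __name__ == "__main__":
    # Hypothetical samples, not measured data.
    log = [Sample(0.5, 1000, True), Sample(1.5, 3000, False)]
    print(summarise(log, wall_clock_seconds=2.0))
```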

Conclusions

This paper has first described the bridge between the Web and corporate databases. A number of linking methods and their analysis have then been presented. Finally, an application developed using three different interfacing methods has been described, and the application architecture and the factors for performance measurement have been reported. Future work will include further refinement of the performance factors and experiments to provide comprehensive quantitative performance comparisons. In addition, porting the application to different platforms would produce valuable feedback for refining the performance measurement framework, so that an extensive benchmark test could be conducted.

References

Bernstein, P. A. (1996). Middleware: A Model for Distributed Services, Communications of the ACM, vol. 39, no. 2, pp. 86-87, February.

Carriere, J., & Kazman, R. (1997). WebQuery: searching and visualizing the Web through connectivity, in: Proc. of the 6th International WWW Conference, pp. 701-711.

Deep, J., & Holfelder, P. (1996). Developing CGI Applications with Perl. Wiley Computer Publishing.

Duan, N. N. (1996). Distributed Database Access in a Corporate Environment Using Java, in the 5th International World Wide Web Conference, May 6-10, Paris, France.

Feng, L., & Lu, H. (1998). Integrating Database and Web Technologies, International Journal of World Wide Web, Vol.1, No.2, pp. 73-86.

Frey, A. (1996). Web-to-database communication with API based connectivity software. Network Computing Nov 15 v7 n18: 134(7).

Kim, W., & Garza, J. (1995). Requirements for a Performance Benchmark for Object-Oriented Database Systems, Modern Database Systems: 203-215, Addison Wesley.

Kim, P. C. (1997). A Taxonomy on the Architecture of Database Gateways for the Web. In Proceedings of The 13th International Conference on Advanced Science and Technology (ICAST97). pp. 226-232.

Lazar, Z. P., & Holfelder, P. (1997). Web Database Connectivity with Scripting Languages. Web Journal, Vol. 2, Issue 2.

Lu, J., Zhao, W. G., & Glasson, B. C. (1998). Formal specifications of Web-to-database interfacing models. In Proceedings of Asia Pacific Web Conference (APWeb98). International Academic Publishers. pp. 133-140.

Rao, B. R. (1995). Making the most of middleware. Data Communications International, vol. 24, no. 12, pp. 89-96.

Reichard, K. (1996). Web servers for database applications. DBMS v9(n11), p31-36.

Saleeb, H. (1997). Real-Time Database Theory and World Wide Web Caching, Harvard University.

Whetzel, J. K. (1996). Integrating the World Wide Web and Database Technology. AT&T technical journal 75(2): 38-46.

Wong, W. (1997). Back-end Web Databases (Making corporate data available through Web servers). Network VAR, 5(12), pp. 67-72.
 

Copyright



Weigang Zhao, © 1999. The author assigns to Southern Cross University and other educational and non-profit institutions a non-exclusive licence to use this document for personal use and in courses of instruction provided that the article is used in full and this copyright statement is reproduced. The author also grants a non-exclusive licence to Southern Cross University to publish this document in full on the World Wide Web and on CD-ROM and in printed form with the conference papers and for the document to be published on mirrors on the World Wide Web.


 AusWeb99, Fifth Australian World Wide Web Conference, Southern Cross University, PO Box 157, Lismore NSW 2480, Australia Email: "AusWeb99@scu.edu.au"