|Web Techniques Magazine
Volume 2, Issue 1
By Jeff Straathof
Just as telephone lines used to get tied up on Mother's Day, Web sites are prone to traffic overload. According to the February 14, 1996, issue of USA Today, thousands of users per second were turned away from IBM's Web site during the chess competition between Garry Kasparov and Big Blue's chess-playing computer. Another example of a high-profile cyberevent that left Web surfers high and dry was the 1996 Super Bowl Web site, which turned away millions of would-be users.
Web-server bandwidth is a problem not only for Internet Web sites, but also for corporate intranets, which often function as the backbone of a company's operations. A recent report from Zona Research (Redwood City, CA) makes some startling predictions about the growth of the intranet marketplace: by the end of 1998, intranet-market revenue will reach $8 billion, almost four times that of the Internet.
Businesses everywhere are launching internal applications on the World Wide Web, using it as a channel to reach scores of employees in moments, and these applications must deliver quality, performance, and scalability. Test engineers must measure how fast Web-system components work together, so they can determine what kind of workload their Web servers can withstand: how many simultaneous hits the Web site can handle, and how this affects quality and performance.
Load testing also charts the time a visitor to the site has to wait for browser responses. It finds hidden bugs and bottlenecks and gives developers the chance to correct them before the site goes into production. All hardware, software, and database vendors boast of the speed of their products, but load testing discloses how fast those products work within a unique environment, for primary transactions, during peak business hours.
Furthermore, load testing checks and maintains applications as their workloads increase, so systems can be adjusted accordingly. As businesses grow and change, it is important to confirm a Web site's ability to sustain growth. Developers can reuse scripts to alter usage levels, transaction mixes and rates, and application complexity. Load testing is the only way to verify the scalability of components working together.
Software-based load-testing tools require a capture or recording agent that records real-world user activity, including HTTP transactions, into script format. These scripts are then executed from a single driver machine, which behaves as if it were an infinite number of PCs and load tests the server. This method is cost effective, especially for large numbers of users. It's easy to change the script content and/or the number of scripts, which simplifies checking the scalability of your Web site as it grows.
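The commercial tools described here use proprietary script formats, but the driver's core idea can be sketched in a few lines of Python. Everything below, from the request paths to the function names, is invented for illustration: a single driver process replays one recorded script across many virtual users, each on its own thread, timing every request.

```python
import threading
import time

# A "recorded script": the sequence of HTTP requests captured from one
# real user session. The paths here are hypothetical.
SCRIPT = ["GET /", "GET /quotes", "POST /trade"]

def run_virtual_user(send, script, timings):
    """Replay one recorded script, timing each request.

    `send` stands in for the HTTP transport so the sketch stays
    self-contained; a real driver would open a socket to the server.
    """
    for request in script:
        start = time.perf_counter()
        send(request)                      # issue the recorded request
        timings.append(time.perf_counter() - start)

def drive(send, script, n_users):
    """Launch n_users virtual users from a single driver machine."""
    timings = []
    threads = [threading.Thread(target=run_virtual_user,
                                args=(send, script, timings))
               for _ in range(n_users)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return timings

if __name__ == "__main__":
    # Example: a stub transport that "serves" each request instantly.
    results = drive(lambda req: None, SCRIPT, n_users=25)
    print(len(results))  # 25 users x 3 requests = 75 timed requests
```

Scaling the emulated load up or down is then just a matter of changing `n_users`, which is what makes the single-driver approach so cost effective.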
A combined hardware/software-based solution allows you to drive both virtual users and client PCs from one control monitor using the same scripts. This ideal test configuration allows simultaneous measurement of both client and server, as well as easy detection of network bottlenecks.
Planning. The first step in a load test can itself be divided into four parts:
1. Define and prioritize your goals. An example of a goal might be to demonstrate scalability, that is, to find the maximum number of concurrent hits that meet the requirements of acceptable response time and minimum throughput. In such a test, response times are reported as a function of the number of concurrent hits. Another example is to determine "breaking points," that is, the outright failures of the hardware, software, operating system, database server, or other system components, including failures due to insufficient memory or similar resources. A breaking point should be based on response time and/or throughput. For example, the breaking point for a short query might be ten times worse than the acceptable response time, whereas the breaking point for a long query might be two times worse than the acceptable response time.
2. Determine the number and size of your database(s). In creating test databases, you must choose between real-life and artificially created databases. Choose the one that gives you the most realistic test database with a reasonable amount of effort.
3. Determine whether your databases need refreshing between tests. If so, to what degree? The optimal case requires the least refreshing for the most testing. Database refreshes between tests can be time consuming, especially with large databases; often, refreshing can take more time than the actual test.
4. Define the application workload. For example, a bank wants to test its intranet application, so the company creates ten different scripts of typical user activity. Script 1 might be an employee submitting an expense report; script 2 might be an employee searching for information about medical insurance; and so on. In the end, the bank might have ten scripts, each representing a common user activity. Script 1 might account for 20 percent of the calls, whereas the other nine scripts comprise the other 80 percent. In this way, the bank would design a realistic load to test its internal Web application.
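A workload mix like the bank's can be expressed as a weighted random choice over the scripts. This is a minimal Python sketch; the script names are hypothetical, and only the 20/80 split comes from the example above.

```python
import random

# Hypothetical workload mix for the bank's intranet test: script 1
# (the expense report) accounts for 20 percent of calls, and the
# remaining nine scripts split the other 80 percent evenly.
WORKLOAD = {"expense_report": 20.0}
WORKLOAD.update({f"script_{i}": 80.0 / 9 for i in range(2, 11)})

def pick_script(rng=random):
    """Choose the next script to replay according to the mix."""
    names = list(WORKLOAD)
    weights = list(WORKLOAD.values())
    return rng.choices(names, weights=weights, k=1)[0]

if __name__ == "__main__":
    rng = random.Random(42)          # fixed seed for a repeatable demo
    sample = [pick_script(rng) for _ in range(10_000)]
    share = sample.count("expense_report") / len(sample)
    print(round(share, 2))           # roughly 0.20, as designed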
Building the scripts. Scripts are built using a recording agent that captures user activity and HTTP transactions from any browser. The recording agent captures not only typing and mouse movements, but also the time it takes for the employee to think and pause. More important, the recording agent captures the HTTP transactions between the PC and server and measures the time required for the numerous connects and disconnects. It emulates at SLIP and PPP connection speeds-even when driving users from a machine on the same network as the server under test.
Without the recording agent, you would have to create scripts by hand, and since there is no foolproof method to recreate HTTP transactions generated by today's development tools, there would be no guarantee that the test were accurate, making the results meaningless.
Generally, load-testing tools require some programming knowledge (usually C) to modify scripts that emulate complex environments. However, advanced products offer libraries of commands to simplify script modification. For example, reading from files of shared input should be simple for the script writer.
Once the scripts are captured, they must be transferred to a driver machine for compilation and execution. At this point, you can alter them for realism, adding functions, loops, and think delays. You might also adjust typing and mouse-pointer delays, and utilize global variables and fixed-throughput pacing. You can add looping to make a single captured activity act like many, branching to randomize function order and change data so your emulated users don't make identical queries and updates. User entries can be substituted by generating random values, sharing pools of input on the driver, accessing data returned from the application under test, or passing data between scripts. With simple programming changes, the replayed stream represents exactly the requests generated by multiple users continuously operating a Web application.
Replaying the scripts. This determines the effect of the emulated workload on the system. Replay is accomplished by the touch of a button, and can be altered to increase slowly the number of users on the system. The driver machine executes scripts, which stresses the server.
Analysis. With graphs and charts depicting the performance of an application under the weight of many users, test engineers spot bottlenecks and system slowdowns. Why did the application work perfectly with 50 users, but break with 51? Why did the application perform to expectations with 100 users, but take a plunge with 200?
Because load-testing scripts can be saved, a test engineer can stress an application any time a piece of the application changes. This is particularly useful after there is a new release of the operating system, changes to hardware, or upgrade revisions of the application with new functionality or bug "fixes."
Note that while response-time charts can demonstrate application performance, there is no substitute for practical human feedback, so consider inviting a few "live" users to browse the Web site being tested. These users frequently provide valuable insights that no report can offer.