Synopsis
Intermittent non-repeatable field-reported failures
that were having a significant impact
on E-business profitability
were effectively identified by running regular eValid
transaction-oriented monitoring tests
over a period of time combined
with careful analysis of the eValid playback event logs
corresponding to the failures detected.
Background
Over a period of time the customer had observed server failures that caused
loss of E-business transactions --
but which were not directly identifiable internally.
The most common failure, as seen from the website,
was an abandoned shopping cart,
but the reason for the abandonment was not clear.
Regular human navigation of the site failed to uncover any problems, and even recreating individual transactions "always seemed to work OK" when done manually. The common thread to the failures was that they all seemed to happen during periods of heavy load and during a user sequence in which the website had accumulated a great deal of context.
eValid Application Description
To assist in the diagnosis and repair process,
eValid created three realistic E-business simulation scripts.
These scripts were "deep" transactions in the sense that they
performed a large number of steps, created a local context,
and exercised all the different parts of the web server "back end" functions.
We set these scripts up in "monitoring mode" using an in-house machine connected to the web via standard DSL lines. This effectively simulated real users over a typical last mile connection.
Results Achieved
Here is a summary of the results achieved
after several weeks of continuous monitoring with these transaction scripts.
All of these observations were made by careful study of the "red screen" event logs.
All of the events noted were detected essentially "at random": the great majority of the tests, which ran 10 per hour, 24x7 (1,680 per week), PASSed without incident.
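The test volume quoted above follows directly from the monitoring schedule. As a minimal sketch of that arithmetic (the function names are illustrative only and are not part of eValid):

```python
def runs_per_week(runs_per_hour: int, hours_per_day: int = 24, days_per_week: int = 7) -> int:
    """Scheduled playbacks in one week of round-the-clock (24x7) monitoring."""
    return runs_per_hour * hours_per_day * days_per_week


def minutes_between_runs(runs_per_hour: int) -> float:
    """Average spacing between playbacks, assuming they are spread evenly over the hour."""
    return 60 / runs_per_hour


# Schedule stated in the text: 10 tests per hour, 24x7.
weekly_runs = runs_per_week(10)
spacing = minutes_between_runs(10)
print(weekly_runs)  # 1680
print(spacing)      # 6.0
```

At this rate a playback starts roughly every six minutes, assuming even spacing, which is why failures that occur only under heavy load can be caught without anyone watching the site by hand.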
Results Obtained
The net result of this forensic event log analysis work was to
identify, from the evidence collected,
a number of relatively simple required changes
in the server front-end and back-end processing activity.
After changes to the server complex were completed,
over a period of several weeks,
the rate of "red screen" playbacks dropped significantly.
The overall result of these improvements was a significant increase in sales for the company, because "in process" sales were no longer being lost, and improved customer satisfaction, because the site had become more reliable for its users.