A GSG CASE
Rugs, Inc. vs Sys, Inc.
Rugs, Inc. and Sys, Inc. are fictitious names chosen to disguise the identities of the parties in this case history from the files of General Systems Group.
In the late eighties, Sys, Inc. sold Rugs, Inc. a computing system to handle Rugs, Inc.'s order processing. Rugs, Inc., a carpet wholesaler, depended critically on being able to dispatch orders rapidly and accurately to its carpet-retailer customers.
At the time of the purchase, Rugs, Inc. was already using an older-generation computing system, which processed orders in a reasonably satisfactory fashion.
The Sys, Inc. system was installed without allowing for a phase-in period, i.e. a period of parallel operation to verify that the new system was operating correctly. Such parallel operation is extremely difficult to simulate realistically in an order-processing environment: executing an order reduces the stock on hand in the warehouse, so when two systems run in parallel, one of them has to simulate its access to the warehouse; otherwise each order would reduce the stock twice.
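The idea of a parallel run with simulated warehouse access can be sketched in a few lines of Python. This is an illustration only: `fill_order`, `run_in_parallel` and the stock figures are hypothetical names and values, not anything taken from the Rugs, Inc. or Sys, Inc. systems. The old system works on the real stock, the new system works on a shadow copy, and any order on which the two disagree is flagged.

```python
def fill_order(stock, order):
    """Hypothetical order handler: reduce stock if enough carpet is on hand."""
    roll, yards = order
    if stock.get(roll, 0) < yards:
        return "backorder"
    stock[roll] -= yards
    return "shipped"


def run_in_parallel(orders, stock, old_system, new_system):
    """Feed every order to both systems; only the old one touches real stock."""
    shadow_stock = dict(stock)  # simulated warehouse for the system under test
    mismatches = []
    for order in orders:
        old_result = old_system(stock, order)
        new_result = new_system(shadow_stock, order)
        # Flag the order if the answers or the resulting inventories differ.
        if old_result != new_result or stock != shadow_stock:
            mismatches.append(order)
    return mismatches


stock = {"beige": 100}
orders = [("beige", 30), ("beige", 30)]
mismatches = run_in_parallel(orders, stock, fill_order, fill_order)
print(mismatches)      # []
print(stock["beige"])  # 40: each order reduced the real stock only once
```

In a real phase-in the second argument would be the candidate system rather than the same function, and the list of mismatches is what the staff would review before switching over.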
Shortly after installation of the new system, the staff of Rugs, Inc. began to notice strange goings-on. Orders were filled with the wrong kind of carpet, the wrong footage, etc. At the same time, the new computer system appeared not to know the real status of the warehouse. It would declare certain types of carpet in need of reorder when physical inspection of the stock revealed plenty of carpet on hand, or the converse: carpet reported to be available in abundance was found, on physical inspection, to be low or nonexistent.
As the days passed, the situation rapidly grew worse. Nothing the Rugs, Inc. staff attempted was able to remedy it. No matter how fast the staff entered correct inventory figures, the system rapidly corrupted them again.
A few months after installing the faulty system, Rugs, Inc. was forced to close its doors: its retailer customers had left in droves because of Rugs, Inc.'s unreliable and faulty order dispatching. Thus a business that had thrived for over half a century was destroyed by a faulty computer system.
These events triggered a number of investigations and eventually a lawsuit, which ended in a mistrial. After more than five years, and with a number of experts having looked at these events and the associated evidence, still no one had a clear idea of what had happened at Rugs, Inc.
GSG, Inc. was called in by the second litigation team organized by Rugs, Inc. Shortly after reading all the relevant depositions from the previous trial, we diagnosed the probable cause of the Sys, Inc. system's failure as an improperly designed, or at least improperly implemented, Concurrency Control System (CCS). A CCS protects a computer system from the problems that arise when more than one program attempts to write to the same spot on the computer's disk. It also protects against the problems that arise when information read from disk by one program is altered by a second program before the first one is done with it.
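As a rough illustration of what such protection looks like, the Python sketch below guards a single inventory record with a lock, so that reading the stock level, checking it and writing the new level happen as one indivisible step. The class `StockRecord`, the function `run_clerks` and all the numbers are hypothetical; this is a minimal sketch of one common locking approach, not a description of how the Sys, Inc. system was built or should have been built.

```python
import threading


class StockRecord:
    """Hypothetical inventory record guarded by its own lock."""

    def __init__(self, yards):
        self.yards = yards
        self._lock = threading.Lock()

    def ship(self, yards):
        """Check and reduce the stock as one indivisible step."""
        with self._lock:  # only one clerk's program may be in here at a time
            if self.yards < yards:
                return False
            self.yards -= yards
            return True


def run_clerks(record, clerks, orders_per_clerk, yards_per_order):
    """Let several clerk threads ship from the same record concurrently."""
    def clerk():
        for _ in range(orders_per_clerk):
            record.ship(yards_per_order)

    threads = [threading.Thread(target=clerk) for _ in range(clerks)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return record.yards


roll = StockRecord(1000)
remaining = run_clerks(roll, clerks=8, orders_per_clerk=100, yards_per_order=1)
print(remaining)  # 200: all 800 one-yard orders were accounted for
```

Without the `with self._lock:` line, two threads could read the same stock level and each write back its own result, which is exactly the kind of silent loss a CCS exists to prevent.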
The first kind of problem results in the second program obliterating whatever was written by the first program. The second problem is subtler, and we can explain it with an example. Suppose that two people in two different cities call in a reservation for the same flight on the same day. The travel agent for the first customer calls up the flight on his computer and selects a seat that is still available. If a second agent calls up the same flight before the first agent's program has finished, the airline reservation system will show that seat as still available. The result is that two people can end up assigned to the same seat.
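The seat example can be replayed step by step in a short Python sketch. The interleaving is written out by hand so that the outcome is the same on every run; the seat number, the customer names and the function names are all invented for the illustration. The unchecked write trusts what each agent saw when the flight was first called up, while the checked write looks at the seat again at the moment of writing (in a real system that re-check and the write would themselves have to be made indivisible).

```python
def read_free_seats(seats):
    """Step 1 of a booking: the seats that look free right now."""
    return [seat for seat, passenger in seats.items() if passenger is None]


def write_booking_unchecked(seats, seat, passenger):
    """Step 2 without concurrency control: trusts the earlier, stale view."""
    seats[seat] = passenger
    return True


def write_booking_checked(seats, seat, passenger):
    """Step 2 with a re-check: refuse if the seat was taken in the meantime."""
    if seats[seat] is not None:
        return False
    seats[seat] = passenger
    return True


def interleaved_bookings(write):
    """Replay the scenario: both agents read before either one writes."""
    seats = {"12A": None}
    view_1 = read_free_seats(seats)  # agent 1 sees 12A as free
    view_2 = read_free_seats(seats)  # agent 2 sees 12A as free as well
    ok_1 = write(seats, view_1[0], "customer 1")
    ok_2 = write(seats, view_2[0], "customer 2")
    return ok_1, ok_2, seats["12A"]


unchecked = interleaved_bookings(write_booking_unchecked)
checked = interleaved_bookings(write_booking_checked)
print(unchecked)  # (True, True, 'customer 2'): both were told they have 12A
print(checked)    # (True, False, 'customer 1'): the second booking is refused
```

In the unchecked run both customers receive a confirmation, and the record of the first booking is gone, which also shows the first kind of problem: the second write obliterates the first.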
A diagnosis, of course, does not provide evidence that will stand up in court. However, the correct diagnosis of a computer systems problem is absolutely necessary to focus and orient the discovery activities in the right direction. Once we made our diagnosis of incorrect CCS design we immediately proceeded to formulate an approach to either validate or refute the diagnosis. The approach, in turn, suggested what materials should be the subject of discovery.
There are two ways to test a hypothesis of an inadequate or malfunctioning CCS. They are:
In this case we recommended using both approaches. We were lucky to unearth the actual system that had been used at Rugs, Inc., abandoned in a warehouse where it had sat since the dissolution of Rugs, Inc. Later, during our experiments, we were thrilled to find that the system still had on its disk the record of some 2,500 orders that had been processed in the last few days before the shutdown!
We were able to restore the system and get it to run. We then held a number of sessions in which clerical personnel, recruited from a temp agency, sat at the terminals and entered orders just as had been done at Rugs, Inc. before the shutdown.
The early experiments showed us that a low percentage (between 1% and 2%) of all orders were being corrupted by the system. This failure rate, however, was too small to account for the total database meltdown that had been observed many years before at Rugs, Inc. We remained baffled by this discrepancy for a while, until one of the GSG, Inc. engineers noticed that each case of database corruption corresponded to a carpet roll being accessed by more than one clerk at the same time.
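The kind of cross-check behind that observation can be sketched as follows. The sketch assumes a log of who accessed which roll and when; the tuple layout, the roll and clerk names, and the function `contended_rolls` are hypothetical and are not the format of the records we actually worked with. It simply reports every roll that two different clerks had open during overlapping time intervals, so that the result can be compared with the list of corrupted records.

```python
from collections import defaultdict
from itertools import combinations


def contended_rolls(access_log):
    """Return the rolls touched by two different clerks at overlapping times.

    access_log: iterable of (roll_id, clerk, start, end) with start < end.
    """
    by_roll = defaultdict(list)
    for roll_id, clerk, start, end in access_log:
        by_roll[roll_id].append((clerk, start, end))

    contended = set()
    for roll_id, accesses in by_roll.items():
        for (c1, s1, e1), (c2, s2, e2) in combinations(accesses, 2):
            # Two intervals overlap when each one starts before the other ends.
            if c1 != c2 and s1 < e2 and s2 < e1:
                contended.add(roll_id)
                break
    return contended


log = [
    ("roll-17", "clerk A", 0, 5),
    ("roll-17", "clerk B", 3, 8),   # overlaps clerk A on the same roll
    ("roll-42", "clerk A", 6, 9),
    ("roll-42", "clerk C", 9, 12),  # starts exactly when clerk A ends
]
print(contended_rolls(log))  # {'roll-17'}
```

The pairwise comparison is quadratic in the number of accesses per roll, which is acceptable for a test session of a few thousand orders.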
This, of course, confirmed that CCS failure was implicated, but it still did not account for the intensity of the data corruption experienced at Rugs, Inc. A review of the experiments by our senior technical management revealed that GSG's engineers had designed the tests so that the orders were spread nearly uniformly across the carpet rolls in the warehouse. In the real world, however, orders are not spread uniformly across all kinds of carpet. At any given moment some carpets are more popular than others, so orders tend to bunch up on popular items. In the system design business this condition is referred to as the database having hot spots.
Following this observation, GSG engineers redesigned the test, obtaining from carpet marketing experts indications of how orders would tend to concentrate on the most popular carpet, the next most popular, and so on. Once the more realistic test series was conducted, we found that 19% of all orders (i.e. approximately one order in every five) ended up being corrupted by the computer system and therefore resulted in a bad delivery to a customer. No business can survive one order in five being mishandled. This finally gave firm empirical proof that a defective CCS was at the heart of the difficulties, and that it could explain the massive extent of the damage experienced at Rugs, Inc. many years before.
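Why hot spots make such a difference can be seen in a small simulation. Everything in it is an assumption made for illustration: 200 rolls, 5 clerks working at the same moment, and a popularity curve in which the roll of rank r is chosen with weight 1/r. It is not the order distribution supplied by the marketing experts, and it is not meant to reproduce the percentages measured in our tests. It only counts how often an order lands on a roll that another clerk is using in the same time slot, first with uniform demand and then with skewed demand.

```python
import random
from collections import Counter


def at_risk_fraction(weights, clerks=5, slots=5000, seed=1):
    """Fraction of orders that share a roll with another order in the same slot."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rolls = list(range(len(weights)))
    at_risk = 0
    for _ in range(slots):
        # Each clerk picks one roll for this time slot, according to demand.
        picks = rng.choices(rolls, weights=weights, k=clerks)
        counts = Counter(picks)
        # Every order on a roll picked more than once is exposed to a collision.
        at_risk += sum(n for n in counts.values() if n > 1)
    return at_risk / (clerks * slots)


n_rolls = 200
uniform = [1.0] * n_rolls                                  # no hot spots
skewed = [1.0 / rank for rank in range(1, n_rolls + 1)]    # a few popular rolls

uniform_rate = at_risk_fraction(uniform)
skewed_rate = at_risk_fraction(skewed)
print(f"uniform demand: {uniform_rate:.1%} of orders at risk")
print(f"skewed demand:  {skewed_rate:.1%} of orders at risk")
```

Under these assumptions the skewed run exposes several times as many orders to a collision as the uniform run, with the same number of clerks and rolls, which is the effect the redesigned test series was built to capture.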
GSG conducted, in parallel with these experiments, a thorough review of the code for the order processing system. This review gave very hard, irrefutable evidence of glaring errors in the implementation of the CCS. More interestingly, it also supplied much more damaging evidence:
The cost of these techniques paled when compared with the damage award, and was not much greater than that of traditional techniques based on getting circumstantial evidence via depositions and trial interrogatories. The advantage of these techniques is that the evidence collected is factual and, like any scientific evidence, can be replicated or duplicated in a reliable way, for the benefit of the jury, judge and even the opposing