Fixing Bugs in Complex Software

In my previous post I gave a list of Akin's Laws. I noted things I felt might be similarly true for both software design and spacecraft design. One of these was:
  • When in doubt, estimate. In an emergency, guess. But be sure to go back and clean up the mess when the real numbers come along. 
How do I think this applies to software engineering? It applies when testing complex software.

A relevant quote comes from the blog posting A Fork in the Road by Matt Youell:
Modern software systems contain so much abstraction and layering that it is really hard to judge the level of effort that will be involved in addressing any one problem.
Youell goes on to describe two very different ways of trying to find a bug in a "quite tangled system".

A really tough bug often occurs at a level of abstraction, or in a layer, far below the application level. This is what can make it hard to find. This is code we most likely did not write. There may be a misunderstanding about how a layer behaves. The bug may actually be in the operating system or in one of the libraries the application is using. It may even show up only because of the way libraries interact.

The blog posting points out that for such complex bugs, it may be better to just refactor a portion of the code rather than try to track down the exact location of the bug. Refactoring often means that different abstractions and layers end up being used, or that the same ones are used in different ways. (And, obviously, it should get rid of a possibly flawed application-layer algorithm.)

But the point of the law given above is that we must be aware that this leaves a real mess as far as testing the refactored code is concerned. How do you test that a bug has been fixed if you did not understand the nature of the bug in the first place? This is over and above having to revalidate and reverify the refactored code from scratch.
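One partial answer to that question, offered here as a sketch and not something taken from Youell's post, is to pin down the symptom even when the cause is unknown: record the inputs that triggered the failure and assert that the refactored code now produces the expected results. Everything named below (parse_amount, the recorded inputs, the expected values) is hypothetical and only illustrates the shape of such a test.

```python
import unittest


def parse_amount(text):
    """Hypothetical refactored routine: turn a string like '1,234.50' into integer cents."""
    cleaned = text.replace(",", "").strip()
    whole, _, frac = cleaned.partition(".")
    frac = (frac + "00")[:2]  # pad or truncate to exactly two decimal digits
    return int(whole) * 100 + int(frac)


# Hypothetical inputs captured from the original bug report, paired with
# the output that was expected. The root cause is never referenced here.
RECORDED_FAILURES = [
    ("1,234.50", 123450),
    (" 0.07", 7),
    ("19.9", 1990),
]


class SymptomRegressionTest(unittest.TestCase):
    def test_recorded_inputs_give_expected_output(self):
        for text, expected in RECORDED_FAILURES:
            with self.subTest(text=text):
                self.assertEqual(parse_amount(text), expected)
```

Saved to a file, this can be run with python -m unittest. A test like this only shows that the symptom is gone for the recorded inputs. It cannot show that the underlying bug is gone, which is exactly the mess the law says we have to come back and clean up.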
