Why operational systems and programs require testing, debugging and refactoring.

An operational environment, such as a web site and its web applications with their corresponding data and infrastructure, has many similarities to an executable program.  Each has a collection of data, logic and resource requirements needed to perform a function, and each processes its execution requests against its data and run-time environment through a series of instructions.

When an error occurs, an exception is thrown.  Sometimes it fails loudly; sometimes it is quietly caught and ignored.  Many classes of errors can’t be found until the program is run, because the compiler cannot test for many kinds of logic problems or for issues with the run-time environment.  Parts of the program that are interpreted at run-time are flexible enough that their problems cannot be tested for in advance of being run.
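As a minimal sketch of this, the Python below loads without complaint and works for small orders, yet contains a type error in a branch that only runs for large ones.  The function names and the order dictionary are hypothetical, invented purely for illustration.

```python
def describe(order):
    """Return a short description of an order (hypothetical example)."""
    if order["quantity"] > 100:
        # Latent bug: joining a str and an int raises TypeError,
        # but only when this branch is actually executed.
        return "bulk order of " + order["quantity"]
    return "order of " + str(order["quantity"])


def describe_quietly(order):
    """Same call, but the error is quietly caught and ignored."""
    try:
        return describe(order)
    except TypeError:
        # The failure is swallowed; the caller just receives None.
        return None
```

Calling describe({"quantity": 5}) succeeds, so a quick run shows no errors.  The problem only surfaces when a quantity over 100 arrives, and in the quiet version it never surfaces as an error at all.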

Similarly, when initially configuring an operational system, some errors can be found while setting up the services and storage and making sure all the pieces connect together.  Like running a program and not seeing any errors on completion, the operational system can be tested for functionality while it is first put together.
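One way to picture "making sure all the pieces connect together" is a small connectivity check run during setup.  This is only a sketch under assumptions: the function name, the endpoint names and the ports are hypothetical, and a successful TCP connection says nothing about whether the service behind it behaves correctly.

```python
import socket


def check_endpoints(endpoints, timeout=1.0):
    """Try a TCP connection to each (host, port) and report which succeed."""
    results = {}
    for name, (host, port) in endpoints.items():
        try:
            # create_connection raises OSError if the connection is refused,
            # unreachable, or times out.
            with socket.create_connection((host, port), timeout=timeout):
                results[name] = True
        except OSError:
            results[name] = False
    return results


if __name__ == "__main__":
    # Hypothetical services; replace with the real hosts and ports.
    services = {
        "web": ("127.0.0.1", 80),
        "database": ("127.0.0.1", 5432),
    }
    for name, ok in check_endpoints(services).items():
        print(name, "reachable" if ok else "NOT reachable")
```

A check like this passing is the operational equivalent of a program finishing without errors: it covers only the paths that were exercised.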

However, the many classes of errors that can’t be found by the compiler may still exist in the areas of the code that have not yet been run, and as a general rule, it is safest to assume they are present.

In the areas of a program or a system that have not been tested, there will be problems that halt the operation, and it may or may not be immediately obvious what is wrong.  Additionally, the fix for the problem may require a number of changes to the program or system.

In the case of a program, this is expected.  After writing a program, it is known to require a testing and debugging phase, and lately it has also become popular to refactor code so that the design stays robust and healthy, and does not become entangled in tightly coupled logic that gets harder to change as it grows in scope and size.

Due to the similarities between a program and an operational system, operational systems also require testing, debugging and refactoring before they are deployed from development to production.  The stages of development, testing, debugging and refactoring have become common sense in the development community, but are not always seen as important in the operational community.

An excellent method in both communities to quickly achieve a result is to build a rapid prototype.  This produces a functioning program or system that can be proved to function, and that provides enormous insight into the requirements for integrating all the components together to create that program or system.

Additionally, rapid prototyping provides a working blueprint for how to transition to a production quality program or system.  By having a functioning prototype, all the pieces can be seen functioning together.  The initial plans and theories about making the program or system work reliably will then have experience to back them up.

Once the rapid prototype is available, translating each of the visible requirements can be done rapidly, as new plans can be drawn up after reviewing the areas where the prototype’s design was strongest and weakest.

Shortly thereafter, a tested, debugged and refactored program or system can be released with confidence that supporting its usage over time will be manageable because of the valuable insight the prototype provided and the robust improvements added by the production release.

After all, unlike a program, which is simply stopped and started, an operational system is running 24/7.  Upgrading an operational system while it is carrying traffic is significantly harder and slower, and so significantly more costly, than upgrading it before it has started to carry traffic in perpetuity.

A short canary test of the prototype, to determine how well it functions under user traffic, can provide another level of insight, and can take place in parallel with preparing the production release.
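A canary test can be sketched as sending a small, stable slice of users to the prototype and comparing how often their requests fail.  The code below is one possible approach, not a prescribed one: the hashing scheme, the percentage parameter and both function names are assumptions made for illustration.

```python
import hashlib


def routes_to_canary(user_id, canary_percent):
    """Place a user in the canary group by hashing the ID into 100 buckets.

    Hashing keeps the decision stable, so the same user always lands
    on the same side for a given percentage.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < canary_percent


def error_rate(outcomes):
    """Fraction of failed requests; outcomes is a list of booleans (True = success)."""
    if not outcomes:
        return 0.0
    return outcomes.count(False) / len(outcomes)
```

Comparing error_rate for the canary group against the rest of the traffic gives an early signal of how the prototype behaves under real users, while work on the production release continues.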

