DETROIT – Legacy code is code that is kept in operation because it is still useful. Age alone does not determine whether a program meets the "legacy" standard. The simple test is whether anyone in the organization knows for certain how the software does what it does.
Economic reality alone ensures that American corporations are riddled with legacy code. Very little software is written from the ground up these days. Instead, a company's inventory normally contains some new or customized code segments, which are well tested and trustworthy but constitute only a small proportion of the actual code base. The rest of the inventory can generally be confirmed to work, but because the details of its actual functioning are not known, it could also contain a host of defects and malicious objects.
As a result, American corporations now sit on huge, complex application bases that contain large proportions of unknown code from unidentified sources. Worse, although it is possible that the supplier wrote the entire program, it is more likely that some of it was purchased from any number of outsourcers, which means the suppliers themselves might not know exactly how their own products work.
It is this lack of transparency that makes legacy code a security risk. Obviously, if the functioning of the code is unknown, it is hard to say for certain that it can be trusted. And if the majority of America's software is a mystery to the people who operate and maintain it, it might be fair to say that much of the nation's software is insecure.
There are several potential outcomes, only one of which is good. The first is that the software is defect-free and functions as intended for its useful lifecycle. Because the actual contents of the program are not known, this can never be confirmed, but it is at least a possibility.
The second option is a lot darker. There might be objects intentionally embedded in the code that would open the system to attack. This could lead to all of the nasty consequences that you can think of, plus some that you have probably never imagined.
In addition, there might also be inadvertent errors. These are very common, and they could eventually lead to a significant breakdown in security or performance. Because the legacy portion is essentially untested, the risk they represent is also unknown.
Finally, given the interconnectedness of systems and the pervasiveness of the computing environment, a secure piece of software can be turned into a highly insecure one by a minor change during its lifecycle. And because the architecture of the program is unknown, that kind of change can never be ruled out or prevented.
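To make that concrete, consider the following minimal, hypothetical C sketch of the kind of "minor change" involved. The function names, buffer sizes, and "site prefix" requirement are all invented for illustration and are not drawn from any real system. The original routine is safe because its length check was written with its buffer in mind; a later maintenance change prepends a prefix without revisiting either one, and the routine quietly becomes exploitable.

    #include <stdio.h>
    #include <string.h>

    /* Original routine: the length check and the buffer were sized together. */
    static void store_id_original(const char *id)
    {
        char buf[16];
        if (strlen(id) >= sizeof(buf)) {      /* check matches the buffer size */
            fprintf(stderr, "id too long\n");
            return;
        }
        strcpy(buf, id);
        printf("stored: %s\n", buf);
    }

    /* Later maintenance change: a site prefix is now prepended, but neither the
     * buffer nor the old check was updated, so a long id can overflow buf. */
    static void store_id_patched(const char *id)
    {
        char buf[16];
        char prefixed[64];
        if (strlen(id) >= sizeof(buf)) {      /* stale check: ignores the prefix */
            fprintf(stderr, "id too long\n");
            return;
        }
        snprintf(prefixed, sizeof(prefixed), "SITE01-%s", id);
        strcpy(buf, prefixed);                /* can now write past the end of buf */
        printf("stored: %s\n", buf);
    }

    int main(void)
    {
        store_id_original("ABCDEFGH");        /* safe in both versions */
        store_id_patched("ABCDEFGH");         /* "SITE01-ABCDEFGH" just barely fits */
        /* store_id_patched("ABCDEFGHIJK");      would pass the stale check yet overflow buf */
        return 0;
    }

In an untested legacy base, nobody is in a position to notice that the check and the buffer have drifted apart, which is exactly the risk described above.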
The University of Detroit Mercy is currently engaged in a research study funded by the Department of Defense (DoD) and the National Security Agency (NSA). The goal of the study is to develop a common process for understanding and securing a complex legacy code base. The outcome of this UDM research will be a universally applicable methodology and tool that will allow organizations of any size both to fully understand their legacy code base and to securely manage its evolution.
The value of this study to national security is significant enough that Joe Jarzombek, the National Director for Software Assurance at the Department of Homeland Security, visited Detroit on Thursday, October 20th, to talk with the participants and discuss future directions. That visit will be detailed in an article next month.
Dan Shoemaker heads the Information Assurance Program at the University of Detroit Mercy.




