
Application Testing: The Data Challenge

By Destination Z posted Mon December 23, 2019 03:24 PM

Rigorous testing should always be a priority if you’re developing new applications or modifying existing ones. But application testing continues to bring with it a variety of challenges and headaches.

One of the biggest issues relates to actually generating the data to use during the testing process. The simplest solution has always been to copy existing production files. And if the applications are restricted to storing internal company data rather than customer information, then this might only cause real concern if the data comes from the payroll system, because copying these production files for testing would reveal exactly how much everyone was earning.

But using production data from customer-focused systems is much more problematic. Apparently it’s still legal to use customer data for testing under data privacy regulations in some countries, but only if you write to each individual customer first and ask if this is OK—and offer an opt-out option. That’s not something most companies are going to try to implement!

Regulations limit the extent to which you may keep copies of your production files. Only copies that are essential to safeguard and act as a security backup of the original are permitted. Using and giving unfettered access to production data in any other event, even if it’s only “internally” and without moving any data outside the computer room environment, can often find you at odds with the latest regulations.

So what’s one to do in terms of testing? While imposing restrictions on using “real” data for testing, the regulations also require you to prove you’ve tested all the possible processing paths that a transaction and its programs may follow. This can only be done by having an extensive set of test data that represents all the possible scenarios that might occur.

So the requirements for full, auditable testing and data privacy seem to be at odds with each other. In reality, the rules are often ignored with transgressions occurring on a regular basis as expedience wins out over formally sticking to the rules.

The situation is made much worse when development, testing and production are not all contained within a single IT operation. With the rise in offshoring and outsourcing of development, operations and support, personal data may regularly leave the computer rooms to which it should be physically restricted, and online access to it may be available to an even bigger community of IT "coworkers" who shouldn't have access to it at all.

One solution is to take extracts from production files and databases, then mask, scramble and hide data in sensitive fields (thus depersonalizing it) before making it available for testing and other purposes. Of course, this means some sort of direct access to the data in its raw format is needed to allow someone (or a small group of people) to perform the depersonalization process before the resulting data can be passed on to others for more general use. Nevertheless, it represents a significant improvement in the situation.
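The article doesn't name a specific tool or file layout, so as an illustration only, a minimal depersonalization pass over extracted records might look like the following sketch. The field names, the hash-based pseudonym scheme and the digit-scrambling rule are all hypothetical choices, not taken from any particular product:

```python
import hashlib
import random

def mask_name(value: str) -> str:
    """Replace a name with a deterministic pseudonym derived from its hash,
    so the same customer always maps to the same (meaningless) label."""
    digest = hashlib.sha256(value.encode()).hexdigest()[:8]
    return f"CUST_{digest}"

def scramble_digits(value: str) -> str:
    """Randomly shuffle the digits of an account number while keeping its
    format (hyphens, spacing) intact, so test programs still parse it."""
    digits = [c for c in value if c.isdigit()]
    random.shuffle(digits)
    it = iter(digits)
    return "".join(next(it) if c.isdigit() else c for c in value)

def depersonalize(record: dict) -> dict:
    """Apply masking rules to sensitive fields; copy everything else as-is."""
    rules = {"name": mask_name, "account": scramble_digits}
    return {field: rules.get(field, lambda v: v)(value)
            for field, value in record.items()}

record = {"name": "Jane Smith", "account": "12-3456-78", "balance": "1042.17"}
print(depersonalize(record))
```

Keeping the pseudonyms deterministic preserves referential integrity across extracted files, while scrambling rather than blanking keeps field formats valid for the programs under test.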

A more complete solution would be to use a tool that incorporates a data privacy component that never permits direct access to certain files and databases, but instead automatically applies predefined scrambling and masking rules to certain fields as it extracts the data for the user to work with. This, in effect, provides a form of program-controlled access to the data, rather than direct access. The result is a simple, effective solution that keeps customer data anonymized.
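The key difference from the manual approach is that the masking rules are bound to the extraction path itself, so no user ever handles raw rows. A rough sketch of that idea, with illustrative rule and field names (not any vendor's actual API), might be:

```python
import hashlib

# Predefined rules applied automatically during extraction. Any field not
# listed here passes through unchanged; sensitive fields never escape raw.
RULES = {
    "name":    lambda v: "CUST_" + hashlib.sha256(v.encode()).hexdigest()[:8],
    "address": lambda v: "***WITHHELD***",
}

def extract(rows):
    """Yield depersonalized copies of each row. Callers only ever see the
    output of the rules; the raw data stays inside this function."""
    for row in rows:
        yield {f: RULES.get(f, lambda v: v)(v) for f, v in row.items()}

production = [
    {"name": "Jane Smith", "address": "1 High St", "balance": "1042.17"},
]
for row in extract(production):
    print(row)
```

Because the only route to the data is through `extract`, masking cannot be skipped by accident or expedience, which is the point of the program-controlled access the article describes.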

Surely, when such straightforward solutions to the problem of creating realistic data exist, there can be no more excuses for continuing to bend the rules and risk the wrath of the authorities and customers—along with the humiliation and financial penalties when the truth eventually comes out.

Philip Mann is principal consultant and mainframe performance management expert for Macro 4. He has been working with IBM mainframes for more than 30 years, including more than 10 years with Macro 4, where application performance tuning is one of his interests.