This entry was originally published on 2019-07-24.
By Chris Crone and Sarbinder Kallar.
Db2 12 Function Level 505 introduces a new Encryption Built-in Function (BIF) ENCRYPT_DATAKEY that can be used to encrypt data at the column level using the AES algorithm.
Previously, Db2 had the ENCRYPT_TDES function, but it was not widely used or enhanced to support AES because it required the application to manage the encryption key. Encryption itself is easy because algorithms are standardized, even to the point that HW implementations exist, key management, however, is hard. Encryption without key management is like buying a good lock and hiding the key under the mat.
What’s different now? The second argument to ENCRYPT_TDES (the old Db2 encryption BIF) is the encryption key and the second argument to ENCRYPT_DATAKEY is a key label. Why does this make a difference? With an encryption key, once you know the key, you simply use it to decrypt the data. In the case of a key label, knowing the key label is not valuable unless you also have authority to use the key label in RACF (See Steps for RACF-protecting keys and services for further information). Hard-coding a key in an application is a security exposure (as is fetching the key from a table or reading it from a file), but encrypting or decrypting using a key label does not expose the data to an unauthorized user. This means that key labels can be visible in tables, files, or source code… but are worthless to users who do not have access to utilize that key label.
Let’s take a simple example with a table and a query from an end user – to explain how the new capability works.
CREATE TABLE SAMPLE_TABLE (NAME CHAR(20), PHONE_NUMBER CHAR(12));
CREATE INDEX IDX ON SAMPLE_TABLE(NAME);
INSERT INTO SAMPLE_TABLE VALUES ('Chris','958-555-5309');
INSERT INTO SAMPLE_TABLE VALUES ('Sarbinder','958-555-1234');
SELECT NAME, PHONE_NUMBER FROM SAMPLE_TABLE;
NAME |
PHONE_NUMBER |
Chris |
958-555-5309 |
Sarbinder |
958-555-1234 |
The result of a recent audit determines that PHONE_NUMBER is personal information and auditors are saying that the column needs to be encrypted. We can modify the statements above as follows to encrypt PHONE_NUMBER:
CREATE TABLE SAMPLE_TABLE (NAME CHAR(20), PHONE_NUMBER VARBINARY(95)); --see Note 1 below
CREATE INDEX IDX ON SAMPLE_TABLE (NAME);
INSERT INTO SAMPLE_TABLE
VALUES ('Chris',ENCRYPT_DATAKEY('958-555-5309','MYKEYLABEL',AES256D)); --see Note 2 below
INSERT INTO SAMPLE_TABLE
VALUES ('Sarbinder', ENCRYPT_DATAKEY('958-555-1234','MYKEYLABEL',AES256D));
SELECT NAME, DECRYPT_DATAKEY_VARCHAR(PHONE_NUMBER) FROM SAMPLE_TABLE;
NAME |
PHONE_NUMBER |
Chris |
958-555-5309 |
Sarbinder |
958-555-1234 |
It should be noted that while Db2 requires the key label to be specified for the ENCRYPT_DATAKEY function, for the DECRYPT_DATAKEY_xxx functions a key label is not required.
It is also noteworthy that there is one encryption function, ENCRYPT_DATAKEY, that takes an expression of any supported datatype as the first input. However, for decryption, there are eight functions, each specifying an output datatype. For example, DECRYPT_DATAKEY_INTEGER returns a integer, and DECRYPT_DATAKEY_VARCHAR returns a VARCHAR. There are no fixed length string versions of the decrypt functions, even if the input was a fixed length string. The output will always be variable length. The result of encrypting data of any data type is a variable length binary string (VARBINARY or BLOB). Decrypting data takes the variable length binary input and returns the datatype as indicated on the function. The data being decrypted must match the input data type to the encrypt function. If you encrypt an integer, you must decrypt using DECRYPT_KEYLABEL_INTEGER or you will get an error (SQLCODE -171).
While the above changes do encrypt the data, the application needs to change their SQL to encrypt/decrypt. That may be OK for new applications, but avoiding application changes for existing applications is desirable.
There are various mechanisms that Db2 for z/OS provides that may be used to more transparently encrypt the data.
Method One – VIEW
By changing the DDL to use a view and instead of triggers, we can avoid changes to the INSERT and SELECT statements in our example:
We can change the above DDL so the INSERT and SELECT remain unchanged.
RENAME TABLE SAMPLE_TABLE TO SAMPLE_TABLE1;
CREATE VIEW SAMPLE_TABLE (NAME, PHONE_NUMBER) AS
(SELECT NAME, DECRYPT_DATAKEY_VARCHAR (PHONE_NUMBER) FROM SAMPLE_TABLE1);
CREATE TRIGGER TR1 INSTEAD OF INSERT ON SAMPLE_TABLE REFERENCING NEW AS NEW_DATA
FOR EACH ROW MODE DB2SQL INSERT INTO SAMPLE_TABLE1 (NAME, PHONE_NUMBER)
VALUES (NEW_DATA.NAME, ENCRYPT_DATAKEY(NEW_DATA.PHONE_NUMBER,'MYKEYLABEL', AES256D));
The INSERT and SELECT are now the same as they were in the original example.
INSERT INTO SAMPLE_TABLE (NAME, PHONE_NUMBER) VALUES ('Chris','958-555-5309');
INSERT INTO SAMPLE_TABLE (NAME, PHONE_NUMBER) VALUES ('Sarbinder','958-555-1234');
SELECT * FROM SAMPLE_TABLE;
NAME |
PHONE_NUMBER |
Chris |
958-555-5309 |
Sarbinder |
958-555-1234 |
Using a view, and an instead of trigger, we can hide the encrypt/decrypt of the data from the application for a simple application. That said, encryption isn’t completely transparent. The results of DESCRIBE will be different because the length of PHONE_NUMBER will be 12 in the case of the base table, but the length attribute returned by the view will be 79 (see note 3).
One other drawback of using a view is that if you use a predicate on PHONE_NUMBER with the view, such as:
SELECT * FROM SAMPLE_TABLE WHERE PHONE_NUMBER = ‘958-555-5309’ ;
Db2 will re-write this query as:
SELECT * FROM SAMPLE_TABLE WHERE DECRYPT_DATAKEY_VARCHAR(PHONE_NUMBER) = ‘958-555-5309’ ;
While this rewrite will enable the query to work without the application knowing the column is encrypted, the predicate will be stage 2. Furthermore, the only way to evaluate the predicate is to decrypt every row and compare it to the literal ‘958-555-5309’. If your query is written as:
SELECT * FROM SAMPLE_TABLE WHERE NAME = ‘Chris’ AND PHONE_NUMBER = ‘958-555-5309’ ;
The predicate on NAME is index-able, and will filter the rows that need to be decrypted. As such you need to understand the SQL that will be used with the encrypted data to understand if a solution based on VIEWs is acceptable.
Method Two – TABLE FUNCTION
A table function can be used to solve the DESCRIBE issue, as well as the predicate issue, but may not be as transparent as a Method One.
CREATE FUNCTION SAMPLE_TABLE_FUNC (INAME CHAR(20), IPHONE_NUMBER CHAR(12))
RETURNS TABLE (NAME CHAR(20), PHONE_NUMBER CHAR(12))
LANGUAGE SQL
READS SQL DATA
NO EXTERNAL ACTION
DETERMINISTIC
RETURN
SELECT NAME, DECRYPT_DATAKEY_VARCHAR(PHONE_NUMBER)
FROM SAMPLE_TABLE1
WHERE
NAME = INAME
AND
PHONE_NUMBER = ENCRYPT_DATAKEY(IPHONE_NUMBER,'MYKEYLABEL',AES256D);
With the table function, DESCRIBE now returns the same length for PHONE_NUMER as it would for the original table. Additionally, the predicates for the following query:
SELECT * FROM TABLE(SAMPLE_TABLE_FUNC('Chris', '958-555-5309'));
are both stage 1. It is possible to create a VIEW on top of a TABLE FUNCTION, which would allow you to use the same SELECT syntax that we started with ( SELECT NAME, PHONE_NUMBER FROM SAMPLE_TABLE; ), but in order to do that, we have to use compound predicates and wild cards that would make the predicates stage 2 (which defeats the purpose of the Table Function).
What about the key label?
Key Labels are protected entities. You have to be authorized to use the key label. In my examples, I didn’t bother trying to hide the key label, I simply hard-coded “MYKEYLABEL”. There may be cases where you need more flexibility and want to use a global variable, host variable, or parameter marker to provide the key label. If that is the case, making encryption/decryption transparent (or at least less intrusive) to the application will be more difficult because you will likely require changes to the application. For example, if you used a global variable, you would need to initialize the global variable. With distributed applications you can set the global variable using profile tables. Unfortunately, profile tables don’t work with local applications, and currently profile tables only support selective global variables, so you would need an application change to use a global variable. If you wish to provide the key label in a host variable or parameter marker for the key label, you would also need to change the application.
Conclusion
With the introduction of the ENCRYPT_DATAKEY and DECRYPT_DATAKEY_xxx functions, Db2 has provided the capability to use key labels to provide key management capability for encryption and decryption at the column level. In this blog, we discussed some of the mechanisms that DBAs can use to enable applications to use this new encryption capability with no, or minimal change to applications. While it may be possible to make encryption completely transparent to an application, that may not be the best choice when it comes to performance. It is important to know not just that data needs to be encrypted, but also how that encrypted data will be accessed to understand the best choices for transparency and performance.
- Note 1: The length of an encrypted column uses the following formula to determine the size of the binary field used to contain the encrypted string:
((((length attribute of source data in bytes + 16 - 1) / 16) * 16) + 15-byte fixed header + 64)`
Chris Crone is an IBM Distinguished Engineer for Db2 for z/OS development and Sarbinder Kallar is a software developer for Db2 for z/OS.
#Db2forz/OS#db2z/os#Db2Znews