
Modern data security

By Trevor Eddolls posted 12 days ago

  
Fully homomorphic encryption


It goes without saying that data security is extremely important. No one wants to see company confidential information for sale on the dark web, and no organization wants to pay huge fines for losing personally identifiable information about customers or staff. To that end, data is usually encrypted when it’s sent anywhere (data in transit) and while it is being stored (data at rest). Most of the time, that seems to be enough. But while you’re working on the data, it needs to be decrypted, doesn’t it? Unfortunately, that leaves a loophole that hackers can exploit: they can steal (exfiltrate) the data while it’s being worked on and is unencrypted. So, what can be done about that?

The answer is Fully Homomorphic Encryption, which is usually referred to as FHE. So, what is FHE? What can it do? And why does it have such a strange name?

Wikipedia tells us that “Homomorphic encryption is a form of encryption allowing one to perform calculations on encrypted data without decrypting it first. The result of the computation is in an encrypted form, when decrypted the output is the same as if the operations had been performed on the unencrypted data.” The idea of homomorphism comes from algebra, where it refers to a structure-preserving map between two algebraic structures of the same type. Homos is Greek for 'same' and morphe means 'form' or 'shape'. For a long time, only partially homomorphic schemes were available, and these supported either addition or multiplication, but not both. The word 'fully' was added to the name in 2009, when IBM researcher Craig Gentry described the first scheme that supports both operations. At that time, the process was far too slow for practical use, but performance has improved considerably since then.
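To make the idea of a partially homomorphic scheme concrete, here is a minimal sketch using textbook (unpadded) RSA, which supports multiplication on ciphertexts but not addition. The tiny primes and the function names are assumptions chosen for illustration only; this is not secure, and it is not how FHE itself works.

```python
# Toy textbook RSA with deliberately tiny primes (illustration only, NOT secure).
# Requires Python 3.8+ for the modular inverse via pow(x, -1, m).
P, Q = 61, 53                 # hypothetical toy primes
N = P * Q                     # public modulus
PHI = (P - 1) * (Q - 1)
E = 17                        # public exponent
D = pow(E, -1, PHI)           # private exponent


def encrypt(m):
    """Encrypt an integer 0 <= m < N with the public key."""
    return pow(m, E, N)


def decrypt(c):
    """Decrypt a ciphertext with the private key."""
    return pow(c, D, N)


a, b = 7, 6
# Multiply the two ciphertexts without ever decrypting them.
c_product = (encrypt(a) * encrypt(b)) % N
result = decrypt(c_product)   # equals a * b, as long as a * b < N
print(result)
```

Multiplying the ciphertexts corresponds to multiplying the plaintexts, but there is no matching trick for addition in this scheme. Closing that gap is what the 'fully' in FHE refers to.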

By using FHE, authorized people can manipulate data while the data remains encrypted. This minimizes the time that the data is vulnerable to exfiltration. With the addition of other techniques, it’s possible to decrypt only the part of a file that a person is entitled to see and needs to see to do their work, which again increases the level of security on the data as a whole. In a nutshell, what FHE provides is the ability to keep data protected while it is being processed, something that was not possible before.

But there’s even more to it than that. It now means that people in the financial or healthcare sectors can share data that previously they couldn’t. The data can be used for analytics or cross-industry collaboration, but at no time are the people working with the shared data given access to private data. So, insurance companies could run analysis on patient healthcare data without any personally identifiable information being visible to the insurer.

Just looking at the cost of a data breach shows how important fully homomorphic encryption is. IBM’s Cost of a Data Breach Report 2020 from June found that the global average total cost of a data breach in 2020 was $3.86 million, with the USA having the highest country average cost at $8.64 million. The report also found that, on average, 39 percent of the costs of a data breach are incurred more than a year after the breach.

You may still not be clear how FHE works. Let me give you a simple example. We’re talking about mathematical operations on encrypted data. Let’s suppose that I encrypt three numeric values, and I send you the encrypted values for those numbers. You can’t see what the values are, and you don’t need the decryption key to do any work on the numbers I sent. If you want to add those numbers or multiply them, you can do so by using the encrypted values. When you return the result to me and I decrypt it, it will be mathematically correct, as if you’d worked on unencrypted data. The difference is that you will never know the true values of the input data or the output data.
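The three-number example can be sketched in a few lines of Python. To keep it self-contained, the sketch below swaps in a toy version of the Paillier cryptosystem, which is only additively homomorphic (it can add encrypted numbers but cannot multiply them), so it is a stand-in for FHE rather than an FHE scheme. The tiny primes, the sample values, and the helper names are all assumptions for illustration, and the code is not secure.

```python
import math
import random

# Toy Paillier parameters (illustration only, NOT secure). Python 3.8+.
P, Q = 61, 53                                      # hypothetical toy primes
N = P * Q
N_SQ = N * N
G = N + 1                                          # common choice of generator
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P - 1, Q - 1)


def _l(x):
    """Paillier's L function: L(x) = (x - 1) / N."""
    return (x - 1) // N


MU = pow(_l(pow(G, LAM, N_SQ)), -1, N)             # part of the private key


def encrypt(m):
    """Encrypt 0 <= m < N using the public values (N, G) and fresh randomness."""
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c):
    """Decrypt with the private values (LAM, MU)."""
    return (_l(pow(c, LAM, N_SQ)) * MU) % N


def add_encrypted(ciphertexts):
    """Add the hidden plaintexts by multiplying ciphertexts. No key is used."""
    total = 1                                      # 1 acts as an encryption of 0
    for c in ciphertexts:
        total = (total * c) % N_SQ
    return total


# I encrypt three values and send only the ciphertexts to you.
values = [12, 30, 100]
ciphertexts = [encrypt(v) for v in values]

# You combine them without seeing the values or holding the private key.
encrypted_sum = add_encrypted(ciphertexts)

# Only I can decrypt the result.
decrypted_sum = decrypt(encrypted_sum)
print(decrypted_sum)
```

Note that `add_encrypted` never touches `LAM` or `MU`, which is the point of the example: the party doing the arithmetic holds no decryption material. Supporting multiplication as well would need a genuinely fully homomorphic scheme, such as those implemented in HElib.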

Looking at this in more detail, the actual unencrypted data is embedded inside a large polynomial. Surrounding the data is intentional random noise. This hides the data, making it computationally infeasible to work out what’s real data and what’s noise without the key. Operations on the ‘real’ data happen by manipulating the polynomials. Each operation adds more noise, so the noise has to be kept below a certain level for the result to decrypt correctly.
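The role of noise is easier to see in a much simpler setting than polynomials. The sketch below is a toy, symmetric-key scheme over plain integers, in the style of the "FHE over the integers" construction by van Dijk, Gentry, Halevi, and Vaikuntanathan. It is not the polynomial-based scheme that HElib uses, and the key size, noise range, and function names are assumptions picked so the arithmetic stays readable. It is not secure.

```python
import random

SECRET_P = 1_000_003          # hypothetical odd secret key (toy size)


def encrypt(bit, rng):
    """Hide a single bit as: large multiple of the key + small even noise + bit."""
    q = rng.randrange(1, 10**6)   # random multiple of the secret key
    r = rng.randrange(1, 10)      # small random noise term
    return SECRET_P * q + 2 * r + bit


def noise(c):
    """The noisy part of a ciphertext (only computable with the secret key)."""
    return c % SECRET_P


def decrypt(c):
    """Strip the multiple of the key, then the even noise, leaving the bit."""
    return noise(c) % 2


rng = random.Random(0)        # seeded only to make the demo repeatable
c1 = encrypt(1, rng)
c0 = encrypt(0, rng)

xor_ct = c1 + c0              # adding ciphertexts XORs the hidden bits
and_ct = c1 * c0              # multiplying ciphertexts ANDs the hidden bits
sq_ct = c1 * c1               # 1 AND 1

print(decrypt(xor_ct), decrypt(and_ct), decrypt(sq_ct))
print(noise(c1), noise(sq_ct))    # noise grows much faster under multiplication
```

Addition roughly adds the noise terms and multiplication roughly multiplies them. Once the accumulated noise exceeds the secret key, decryption stops giving the right answer, which is why practical FHE schemes need a way to manage noise.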

Is there a downside? Originally it was very slow. Now, it needs around 40 to 50 times the computing power and 10 to 20 times the amount of RAM that doing the same thing with unencrypted data would use.

On the plus side, if large-scale quantum computers become a reality, FHE should remain very difficult to crack. Quantum computers, it’s thought, will be able to break the kinds of public-key encryption we’re all familiar with in seconds or minutes, whereas current computing platforms are thought to need hundreds of years. Much of that encryption, such as RSA, relies on the difficulty of factoring the product of large prime numbers. Fully homomorphic encryption algorithms use lattice-based encryption, which is thought to be resistant to quantum computing.

IBM recently released toolkits for people to start working with FHE. There are toolkits for macOS, iOS, Linux, and Android. Each toolkit includes sample programs and IDE integration, making it simple to start writing FHE-based code. The toolkits are available on GitHub. Each toolkit is based on HElib, a mature and versatile open-source homomorphic encryption library.

The FHE toolkit for Linux runs on IBM Z (Ubuntu) and x86 architectures. Unlike the macOS and iOS toolkits, which are based on Xcode, the Linux toolkit is distributed as a Docker container. The containers have been built and tested on several distributions. A pre-built Docker image is also available on Docker Hub.

As to the future, there are plans to bring new AI functionality to the toolkits. The developers are also exploring performance improvements to the FHE toolkits and underlying libraries using the capabilities provided by IBM Z.

Fully homomorphic encryption looks like the way to go to keep data confidential, even while it is being processed. It’s well worth having a play with it now, ready for the next version (or the one after that), which will be needed as hackers get better at accessing supposedly secure data.