Building Aadhaar Data Vault Solution on IBM LinuxONE
Sandeep Batta, Timo Kussmaul, Anbazhagan Mani, Divya K Konoor, Peter Szmrecsanyi
Introduction
An “Aadhaar number” is a 12-digit unique identification number issued by the Unique Identification Authority of India (UIDAI) to every individual in India. Under the Aadhaar Act of 2016, an Aadhaar number can uniquely identify residents in India. It can be used to avail various services including services involving financial transactions and government sponsored programs. Given the sensitive nature of the transactions, it is very important to secure the Aadhaar number itself and all the systems that are created around it to provide services to the citizens of India.
What is Aadhaar Data Vault?
To protect the privacy of Aadhaar numbers and related data, the Unique Identification Authority of India (UIDAI) issued a circular in 2017 making it compulsory to store all Aadhaar numbers, in encrypted form, in a centralized, dedicated store known as the “Aadhaar Data Vault” (ADV).
The Aadhaar Data Vault helps e-Governance applications eliminate the Aadhaar footprint from the IT ecosystem by introducing an abstraction layer, the reference key, that safeguards Aadhaar numbers and related data. The mapping between reference key and Aadhaar number should be maintained only in the ADV. All business systems and databases within the enterprise should use only the reference key.
Any agency that needs to work with Aadhaar numbers can use the Aadhaar Data Vault service, which lowers the risk of unauthorized access to Aadhaar numbers and related data within the organization’s systems. UIDAI mandates the use of a Hardware Security Module (HSM) to generate and protect the encryption keys used to protect Aadhaar numbers and the associated processes built around them.
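The reference-key abstraction can be illustrated with a minimal Python sketch. The class name `ReferenceKeyVault`, its methods, the record layout, and the choice of a random UUID as the reference key are all assumptions made for illustration; they are not taken from the UIDAI circular. Encryption of the stored numbers is deliberately left out here so that the focus stays on the mapping.

```python
import uuid


class ReferenceKeyVault:
    """Toy in-memory stand-in for the ADV mapping (hypothetical API)."""

    def __init__(self):
        self._by_ref = {}      # reference key -> Aadhaar number
        self._by_aadhaar = {}  # Aadhaar number -> reference key

    def tokenize(self, aadhaar_number: str) -> str:
        """Return the reference key for a number, creating it on first use."""
        if not (aadhaar_number.isascii() and aadhaar_number.isdigit()
                and len(aadhaar_number) == 12):
            raise ValueError("expected a 12-digit Aadhaar number")
        ref = self._by_aadhaar.get(aadhaar_number)
        if ref is None:
            # A random value carries no information about the number itself.
            ref = str(uuid.uuid4())
            self._by_aadhaar[aadhaar_number] = ref
            self._by_ref[ref] = aadhaar_number
        return ref

    def resolve(self, reference_key: str) -> str:
        """Map a reference key back to the number (vault-internal use only)."""
        return self._by_ref[reference_key]


vault = ReferenceKeyVault()
ref_key = vault.tokenize("123456789012")  # made-up number for the example

# A business system stores only the reference key, never the Aadhaar number.
customer_record = {"customer_id": 42, "aadhaar_ref": ref_key}
```

Because the reference key is random, a leaked business database reveals nothing about the underlying Aadhaar numbers; only the vault can map a key back to its number.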
High Level Generic Architecture of Aadhaar Data Vault
Figure 1: Generic Architecture of Aadhaar Data Vault
The generic architecture of the Aadhaar Data Vault solution comprises the following functional components:
Tokenizer Service
The Tokenizer is a standard application, accessible via APIs, that converts a given Aadhaar number into a “reference key”, as described in the UIDAI documentation. Given the sensitivity of the Aadhaar number and the associated KYC data being handled, we propose to fortify the platform itself, making it a “secure tokenizer” accessible to business applications. How the platform is secured is covered in the following sections.
Aadhaar Data Vault
The Aadhaar Data Vault (ADV) is a centralized storage for all the Aadhaar numbers collected by the authorized agency for specific purposes under Aadhaar Act and Regulations. It is a secure system inside the respective agency’s infrastructure accessible only on a need-to-know basis.
In the conceptual architecture above, the “Tokenizer” processes all sensitive information using the services of a Hardware Security Module (HSM).
HSM
A hardware security module (HSM) is a device or service that safeguards and manages secrets such as cryptographic keys, and performs cryptographic functions such as key creation, key derivation, encryption, decryption and signing. An HSM contains one or more cryptographic processors.
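To make these functions concrete, the sketch below models key creation, key derivation and message authentication with the Python standard library. It is a software stand-in only: the class `SoftwareKeyStore` and its method names are hypothetical, a real HSM keeps key material inside tamper-responding hardware, and the HMAC tag used here is a symmetric substitute for a true digital signature. Encryption and decryption are omitted because the standard library has no AES implementation.

```python
import hashlib
import hmac
import secrets


class SoftwareKeyStore:
    """Software stand-in for an HSM: callers hold key labels, never key bytes."""

    def __init__(self):
        self._keys = {}  # label -> secret key bytes, kept private to the store

    def create_key(self, label: str) -> str:
        """Key creation: generate a random 256-bit key and return its label."""
        self._keys[label] = secrets.token_bytes(32)
        return label

    def derive_key(self, parent: str, label: str, info: bytes) -> str:
        """Key derivation: HKDF-SHA256 (RFC 5869), one 32-byte output block."""
        prk = hmac.new(bytes(32), self._keys[parent], hashlib.sha256).digest()
        self._keys[label] = hmac.new(prk, info + b"\x01", hashlib.sha256).digest()
        return label

    def sign(self, key: str, message: bytes) -> bytes:
        """Authenticate a message with HMAC-SHA256 under the named key."""
        return hmac.new(self._keys[key], message, hashlib.sha256).digest()

    def verify(self, key: str, message: bytes, tag: bytes) -> bool:
        """Constant-time check that the tag matches the message."""
        return hmac.compare_digest(self.sign(key, message), tag)


hsm = SoftwareKeyStore()
master = hsm.create_key("master")
token_key = hsm.derive_key(master, "tokenizer", info=b"adv/tokenizer")
tag = hsm.sign(token_key, b"reference-key-request")
print(hsm.verify(token_key, b"reference-key-request", tag))  # True
```

The point of the design is that the application only ever names a key; every operation that needs the key bytes runs inside the store, which is the property a hardware HSM enforces physically.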
Business and KYC applications
The ADV integrates with business and KYC applications through the ADV API. This includes integration with CKYRR systems and with components such as de-duplication services that use AI-based matching algorithms.
Why IBM LinuxONE?
IBM LinuxONE is an enterprise-grade Linux® server that brings together IBM’s experience in building enterprise systems with the openness of the Linux operating system. LinuxONE offers a sustainable and cyber-resilient platform for hybrid cloud and AI applications, which can also help reduce TCO through workload consolidation. The IBM Z® and LinuxONE systems are known as highly securable platforms and come with key security technologies, some of which are listed below:
- Confidential Computing: IBM Secure Execution technology allows sensitive workloads to be deployed, and data to be processed, within a hardware-based secure enclave.
- Hardware Security Module (HSM): tamper-responding HSMs (IBM Crypto Express adapters) support cryptographic operations using secure keys. These secure keys can be used only on the configured HSM, and the plaintext value of a secure key is never observable inside an operating system. IBM HSMs on LinuxONE have earned the highest level of certification, FIPS 140-2 Level 4. You can plug up to 60 Crypto Express adapters into an IBM Z® or LinuxONE system. Each adapter can be logically partitioned into up to 85 domains, each acting as an independent virtual HSM. This partitioning allows thousands of virtual machines to each access a dedicated virtual HSM.
- Pervasive encryption and protected key cryptography: each processor of an IBM Z or LinuxONE system has a special component called Central Processor Assist for Cryptographic Functions (CPACF). CPACF accelerates the most common cryptographic operations standardized by the US National Institute of Standards and Technology (NIST), for example AES, SHA-2, SHA-3, ECDH, and ECDSA.
- Quantum-safe cryptography: the IBM z16 system uses quantum-safe methods inside its hardware and firmware to protect customer hardware investments against potential quantum threats. Starting with Crypto Express 7S, adapters in CCA or EP11 mode provide first versions of quantum-safe cryptographic algorithms accessible to Linux software.
- Integrated Accelerator for AI: the IBM Integrated Accelerator for AI is an on-chip AI accelerator available on the IBM Telum chip that is part of IBM z16 and LinuxONE 4 servers. It is designed to enable high-throughput, low-latency inference for deep learning and machine learning. With IBM z16, your models can be deployed to receive transparent acceleration and optimization.
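As a quick check of the scale implied by the Crypto Express figures in the HSM item above, the arithmetic can be written out as follows. The result is an upper bound that assumes every adapter slot is populated and every adapter is fully partitioned; actual limits depend on the machine model and configuration.

```python
MAX_ADAPTERS = 60         # Crypto Express adapters per system (from the text)
DOMAINS_PER_ADAPTER = 85  # logical domains per adapter (from the text)

# Each domain acts as an independent virtual HSM.
max_virtual_hsms = MAX_ADAPTERS * DOMAINS_PER_ADAPTER
print(max_virtual_hsms)  # 5100
```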
More information on IBM LinuxONE security can be found at https://www.ibm.com/linuxone
Building Aadhaar Data Vault on IBM LinuxONE
This section describes how an Aadhaar Data Vault can be built on the security capabilities of IBM LinuxONE servers, and presents a reference architecture with two key aspects:
Securely storing sensitive Aadhaar data
To protect Aadhaar numbers (and other sensitive data) at rest, the LinuxONE ADV solution uses IBM’s FIPS 140-2 Level 4 HSM available on IBM LinuxONE, which also provides encryption services for all the components in the reference architecture.
Securely processing Aadhaar information
To securely process Aadhaar information, our solution will leverage Confidential Computing. This is implemented in IBM LinuxONE using IBM Secure Execution technology and Hyper Protect Virtual Servers (HPVS). This protects the “Secure Tokenizer” Service (Figure 2) and the Aadhaar Data Vault from unauthorized access, even by privileged users. HPVS provides a hardware-based TEE and provides memory protection for data in use.
Figure 2: Reference Architecture for Aadhaar Data Vault with LinuxONE
The LinuxONE Aadhaar Data Vault solution shown in Figure 2 comprises the following components:
Hyper Protect Encryption Services
Hyper Protect Encryption Services uses the FIPS 140-2 Level 4 HSM available on LinuxONE to provide encryption and cryptography services to the entire environment. As shown, all the HPVS instances, the ADV database and other applications can use a self-contained encryption service on the LinuxONE. This ensures that sensitive data, such as Aadhaar numbers, is always protected as it moves through the environment and when it is stored in the ADV database.
Hyper Protect Virtual Server (HPVS)
The IBM Hyper Protect Virtual Server provides a hardware based Trusted Execution Environment (TEE) which ensures confidentiality and integrity of the data in use, such as Aadhaar numbers, and isolation of the applications and the data from unauthorized access, even from privileged actors.
HPVS enables secure control of the application and its environment properties, while ensuring separation of duty between different personas. It does so by enforcing a contract mechanism to enable different personas to define the application container images and the properties of the application and its environment in a secure way. This includes properties such as volumes to attach, environment variables, secrets and seeds for deriving cryptographic keys by the application running in HPVS.
HPVS Contract
The contract is a document comprising multiple sections, which can be created by different personas and encrypted independently. Thus, each persona can keep its contract section confidential while still cooperating with the other personas. The contract is provided to HPVS during deployment. After deployment, the contract is immutable and is subject to attestation. HPVS ensures that the application and its environment properties conform to the contract, and prevents injection attacks that would tamper with the workload.
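The idea of an immutable, attestable contract made of independently authored sections can be sketched in a few lines of Python. This is a simplified model, not the actual HPVS contract format or attestation protocol: the `Contract` class, the section field names, the image names, and the use of a single SHA-256 digest as the attestation value are all assumptions for illustration, and the per-persona encryption step is omitted.

```python
import hashlib
import json
from dataclasses import dataclass


def freeze_section(section: dict) -> str:
    """Serialize a section to canonical JSON so it hashes reproducibly.

    A real contract section would also be encrypted by its author;
    that step is left out of this sketch.
    """
    return json.dumps(section, sort_keys=True, separators=(",", ":"))


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Contract:
    workload: str  # authored by the persona that provides the application
    env: str       # authored by the persona that deploys it

    def digest(self) -> str:
        """SHA-256 over both sections, used here as the attestation value."""
        payload = json.dumps([self.workload, self.env]).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


# All field names and values below are hypothetical.
workload = freeze_section({"image": "registry.example/secure-tokenizer:1.0"})
env = freeze_section({"volumes": ["adv-data"], "variables": {"LOG_LEVEL": "info"}})

contract = Contract(workload=workload, env=env)
expected_digest = contract.digest()

# A verifier later recomputes the digest from the deployed contract.
tampered = Contract(
    workload=freeze_section({"image": "registry.example/other-image:1.0"}),
    env=env,
)
print(contract.digest() == expected_digest)  # True
print(tampered.digest() == expected_digest)  # False
```

Any change to either section changes the digest, so a party that knows the expected value can detect a substituted workload or altered environment.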
Secure Tokenizer Service and Aadhaar Data Vault
In Figure 2, both the Secure Tokenizer and the Aadhaar Data Vault run in an HPVS. The two components interact via cryptographically secured mechanisms, and each can use attestation to verify that the other party runs in HPVS and is executing the expected workload. Running both components in HPVS ensures that sensitive data is never exposed to unauthorized actors, and that even privileged actors cannot gain access to the data (for example, by creating memory dumps).
In addition, the Secure Tokenizer Service uses the HSM for random number generation. Figure 3 illustrates the flow for “Tokenization”.
Figure 3: Secure Tokenization with HSM on IBM LinuxONE
Figure 4 illustrates how the Tokenizer is used to “Process” client requests that carry an Aadhaar reference key.
Figure 4: Processing client request with Aadhaar-Reference-Key with HSM on LinuxONE
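Since the figures are not reproduced here, the following Python sketch gives one plausible reading of the two flows: tokenization draws randomness from the HSM and stores only ciphertext, and processing resolves a reference key inside the trusted component without returning the number itself. Everything in it is an assumption for illustration: the `HsmStandIn` class, the function names, the dictionary used as the ADV database, and in particular the HMAC-based keystream cipher, which is a placeholder for HSM-backed AES and must not be used in production.

```python
import hashlib
import hmac
import secrets


class HsmStandIn:
    """Placeholder for an HSM; the key never leaves this object."""

    def __init__(self):
        self._key = secrets.token_bytes(32)

    def random_bytes(self, n: int) -> bytes:
        return secrets.token_bytes(n)

    def _keystream(self, nonce: bytes, n: int) -> bytes:
        # Illustration-only keystream: HMAC-SHA256 over nonce plus a counter.
        out, counter = b"", 0
        while len(out) < n:
            block = nonce + counter.to_bytes(4, "big")
            out += hmac.new(self._key, block, hashlib.sha256).digest()
            counter += 1
        return out[:n]

    def encrypt(self, plaintext: bytes) -> bytes:
        nonce = secrets.token_bytes(16)
        stream = self._keystream(nonce, len(plaintext))
        return nonce + bytes(a ^ b for a, b in zip(plaintext, stream))

    def decrypt(self, blob: bytes) -> bytes:
        nonce, body = blob[:16], blob[16:]
        stream = self._keystream(nonce, len(body))
        return bytes(a ^ b for a, b in zip(body, stream))


adv_database = {}  # reference key -> encrypted Aadhaar number


def tokenize(hsm: HsmStandIn, aadhaar_number: str) -> str:
    """Tokenization: random reference key from the HSM, ciphertext to the vault."""
    reference_key = hsm.random_bytes(16).hex()
    adv_database[reference_key] = hsm.encrypt(aadhaar_number.encode("ascii"))
    return reference_key


def process(hsm: HsmStandIn, reference_key: str, operation):
    """Processing: decrypt inside the trusted component, return only the result."""
    aadhaar_number = hsm.decrypt(adv_database[reference_key]).decode("ascii")
    return operation(aadhaar_number)


hsm = HsmStandIn()
ref = tokenize(hsm, "123456789012")  # made-up number for the example
masked = process(hsm, ref, lambda number: "XXXXXXXX" + number[-4:])
print(masked)  # XXXXXXXX9012
```

The caller of `process` supplies the operation and receives only its result, so the plaintext number exists only inside the function that would, in the reference architecture, run within HPVS.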
AI Inferencing Engine for De-duplication
The solution can optionally include an AI Inferencing Engine, for example for de-duplication.
User management
The ADV can optionally integrate with a user management system to provide authentication and authorization of API requests that are issued by business applications or components of the KYC system.
Logging
The ADV creates logging data and log events and sends these to a Log service. The ADV ensures the log data does not include sensitive data like the Aadhaar Number.
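One way to enforce this at the logging layer is a filter that redacts anything resembling an Aadhaar number before a record is emitted. The sketch below uses Python's standard `logging` module; the pattern, the `[REDACTED]` marker and the logger name `adv` are assumptions. The pattern is intentionally conservative and will also redact unrelated 12-digit values.

```python
import logging
import re

# Twelve digits, optionally grouped 4-4-4 with spaces or hyphens, and not
# part of a longer digit run. This is an assumed, conservative pattern.
AADHAAR_PATTERN = re.compile(r"(?<!\d)\d{4}[ -]?\d{4}[ -]?\d{4}(?!\d)")


def redact(text: str) -> str:
    """Replace anything that looks like an Aadhaar number with a marker."""
    return AADHAAR_PATTERN.sub("[REDACTED]", text)


class AadhaarRedactingFilter(logging.Filter):
    """Redact the fully formatted message before any handler emits it."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = redact(record.getMessage())
        record.args = ()  # message is already formatted
        return True


logger = logging.getLogger("adv")
handler = logging.StreamHandler()
handler.addFilter(AadhaarRedactingFilter())
logger.addHandler(handler)
logger.setLevel(logging.INFO)

# Emits: tokenized Aadhaar number [REDACTED]
logger.info("tokenized Aadhaar number %s", "1234 5678 9012")
```

Filtering is a safety net rather than a substitute for never passing the number to the logger in the first place; logging the reference key instead avoids the problem at its source.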
High level component requirements
To bring together all the features of the reference architecture on the LinuxONE, the following components are required:
| Feature | Component |
| --- | --- |
| Secure Tokenizer & Aadhaar Data Vault (ADV) | 1 x LPAR for HPVS with 8 x IFLs, 96 GB memory, 400 GB disk; a tokenization application capable of PKCS#11 interaction |
| ADV Database | 1 x LPAR; LinuxONE-compatible flavor of PostgreSQL |
| Hyper Protect Encryption Services | Depending on the LinuxONE model: 1 x SSC LPAR; 1 x LPAR for HPVS with 4 x IFLs, 48 GB memory, 198 GB disk |
| KYC, AI Inferencing & other features (optional) | 1 x LPAR for HPVS with 8 x IFLs, 96 GB memory, 400 GB disk |
*HPVS: Hyper Protect Virtual Server
Conclusion
Implementing the Aadhaar Data Vault solution on LinuxONE has several advantages. LinuxONE’s security mechanisms help not only to meet but to surpass the regulatory and compliance requirements for an Aadhaar Data Vault:
- Protect not only data at rest and data in motion, but also data in use, with confidential computing (viz. Hyper Protect).
- Confidential computing helps protect the application, data and secrets from both insider and external threats. Because the tokenizer is secured via confidential computing, plain-text Aadhaar numbers are protected against leakage.
- Leverage a FIPS 140-2 Level 4 certified HSM, the highest security level defined by that standard. The master keys never leave the HSM. The data encryption key (AES-256) is used by the secure tokenizer to create the reference key and is fully protected.
- Realize secure control with separation of duty over your solution via the HPVS contract mechanism.
References
Aadhaar Data Vault
NIC Conceptual Aadhaar Data Vault Reference
IBM Confidential Computing