Db2

Db2

Where DBAs and data experts come together to stop operating and start innovating. Connect, share, and shape the AI era with us.


#Data


#Data
 View Only

Intel Gaudi Air-Gapped AI Inferencing for IBM Db2 Genius Hub

By Merlin Moncy posted 15 days ago

  

Intel Gaudi accelerators are emerging as a powerful alternative for AI inferencing workloads, offering enterprise-grade performance for large language models in air-gapped environments. For organizations running IBM Db2 Genius Hub with Intel Gaudi 2 or Gaudi 3 infrastructure, deploying customer-managed air-gapped inferencing enables AI-powered database operations while maintaining complete control over data sovereignty and network isolation.

This blog provides a practical guide for setting up IBM Granite 4.0 inferencing on Intel Gaudi accelerators using Red Hat Enterprise Linux AI (RHEL AI) containers, specifically designed for organizations that require AI capabilities without external connectivity.

Why Air-Gapped Inferencing Matters

For many enterprises, AI adoption is not limited by interest or use cases. It is limited by architecture and compliance requirements. Organizations often need guarantees around:

  • Data sovereignty
  • Internal infrastructure ownership
  • Network isolation
  • Regulatory compliance
  • Predictable operational boundaries

These requirements become especially important when AI interacts with operational database systems, particularly in regulated industries where organizations must maintain strict control over operational metadata, infrastructure boundaries, and network exposure.

Customer-Managed Air-Gapped Inferencing allows organizations to keep:

  • Prompts
  • Operational telemetry
  • Database context
  • Inferencing workloads

inside infrastructure they fully control.

The result is an AI deployment model aligned with enterprise governance and security policies while still enabling AI-assisted database operations.

Not Just Private AI—Operationally Useful AI

Air-gapped inferencing matters not only because it preserves isolation, but because it enables practical AI-powered database operations within enterprise-controlled environments.

Db2 Genius Hub uses its agentic AI service to support capabilities such as:

• Natural language interactions for database questions

• Performance and troubleshooting analysis

• Conversational search and guided diagnostics

• Reasoning over live and historical Db2 context

• Database-aware workflows informed by institutional Db2 knowledge

The AI Configuration experience, provider validation workflow, and integrated inferencing support further reinforce this as a built-in operational capability rather than an external add-on.

Intel Gaudi Setup: A Practical Guide

This guide walks through deploying IBM Granite 4.0 on Intel Gaudi accelerators for air-gapped Db2 Genius Hub deployments.

Prerequisites

  • RHEL 9.6+ system
  • Intel Gaudi 2 or Gaudi 3 hardware
  • Gaudi software stack installed and validated (verify with `hl-smi`)
  • Podman 5.x installed
  • Red Hat registry access (account or service token)
  • Habana runtime support for Podman

Step-by-Step Setup

Step 1: Authenticate to Red Hat Container Registry

podman login registry.redhat.io

Step 2: Pull the RHEL AI vLLM Gaudi Container

podman pull registry.redhat.io/rhaii/vllm-gaudi-rhel9:3.4.0

Step 3: Run the Container with Gaudi Support

  • PODMAN_OPTS="-e OMPI_MCA_btl_vader_single_copy_mechanism=none --cap-add=sys_nice --net=host --ipc=host"
  • podman run -it --name vllm-gaudi --runtime=habana   -e HABANA_VISIBLE_DEVICES=all $PODMAN_OPTS   vault.habana.ai/gaudi-docker/1.24.0/ubuntu22.04/habanalabs/pytorch-installer-2.10.0:latest

Step 4: Validate Gaudi Availability

hl-smi

Step 5: Clone and Setup vLLM Gaudi Repository

  • cd /workspace
  • git clone https://github.com/vllm-project/vllm-gaudi.git
  • cd vllm-gaudi
  • git checkout gaudi3Benchmarks

Step 6: Configure HAProxy Load Balancer

Intel Gaudi deployments support scalable inferencing using HAProxy-based load balancing across multiple vLLM instances.

Install HAProxy:

yum install -y haproxy procps-ng

Or manually download if needed:

  • cd /tmp
  • curl -LO https://www.haproxy.org/download/2.8/src/haproxy-2.8.10.tar.gz
  • tar xzf haproxy-2.8.10.tar.gz
  • cd haproxy-2.8.10
  • make TARGET=linux-glibc USE_OPENSSL=1 USE_PCRE= -j$(nproc)
  • cp haproxy /usr/local/sbin/haproxy
  • ln -s /usr/local/sbin/haproxy /usr/local/bin/haproxy

Verify installation:

haproxy -v

Step 7: Start vLLM Servers with Load Balancing

  • cd /workspace/vllm-gaudi/gaudi3Benchmarks/1.23.0/v0.17.1/haproxy_lb
Start with 8 vllm/Gaudi instances (default) 8 vllm → 8 Gaudi3
  • ./start.sh
Start with a custom number of vllm / Gaudi instances 4 vllm → 4 Gaudi3
  • ./start.sh 4
Stop all processes including
  • ./stop.sh

Step 8: Test the Inferencing Endpoint

Example request:

curl -sS http://localhost:30360/v1/models   -H "Authorization: Bearer granite4.0h-g3key"

Step 9: Configure Db2 Genius Hub

1. Select Bring your own AI stack
2. Choose RHEL vLLM as the provider
3. Enter endpoint details
4. Use Test Connection to verify connectivity

Video Walkthrough

For a complete visual demonstration of the Intel Gaudi setup process, watch our step-by-step video guide that walks through container deployment and HAProxy configuration:

Additional Resources

Red Hat AI Container Catalog: https://catalog.redhat.com/en/software/containers/rhaii/vllm-gaudi-rhel9/69e0e3a360eb49b3bbffc4a8

IBM Granite 4.0 Load Balancer Configuration: https://github.com/vllm-project/vllm-gaudi/tree/gaudi3Benchmarks/gaudi3Benchmarks/1.23.0/v0.17.1/haproxy_lb

A Clear Path to Sovereign AI with Intel Gaudi

Intel Gaudi accelerators extend the Db2 Genius Hub AI stack with a cost-effective, enterprise-grade option for air-gapped inferencing. With HAProxy-based load balancing for horizontal scaling and optimized performance for large-scale inferencing workloads, Intel Gaudi provides organizations with another powerful choice for sovereign AI deployment.

Combined with enterprise support from Red Hat and Intel, plus IBM Granite 4.0 models, organizations can achieve AI-powered database operations while maintaining complete control over their infrastructure and data.

For organizations with Intel Gaudi infrastructure, this deployment path enables agentic AI capabilities without compromising on network isolation, infrastructure ownership, or data sovereignty.

Final Thought

The future of enterprise AI will not be defined by a single deployment model or hardware platform.

It will be defined by choice.

Intel Gaudi air-gapped inferencing for IBM Db2 Genius Hub gives organizations another way to adopt agentic AI on their own terms with their own infrastructure, inside their own security boundary, and with the operational control that regulated environments demand.

Ready to deploy Intel Gaudi air-gapped AI for your Db2 environment?
Contact our team today for technical guidance and deployment support.

 

About Authors

IBM Contributors:
  • Ashok Kumar,  Merlin Moncy and Taniya Bagh
    • from IBM's Db2 Genius Hub team specialize in AI-powered database operations and enterprise deployment solutions.
 
Intel Contributors:
  • Murali Madhanagopal, Shankar Ratneshwaran, Pramod Pai, and Suresh B. Nampalli
    • from Intel's Gaudi AI accelerator team bring deep expertise in Intel Gaudi hardware optimization and deployment.
0 comments
25 views

Permalink