Cloud Pak for Data


Detecting Network Policy Changes Causing Timeouts in OpenShift

By Hongwei Jia posted Mon May 05, 2025 12:24 AM

  

Introduction

In a secured OpenShift environment, network policies are critical for controlling traffic between pods and namespaces. However, when these policies are modified—intentionally or unintentionally—they may lead to unexpected issues such as pod communication failures or crawl operation timeouts.

This blog walks you through how to configure audit logging to detect changes to NetworkPolicy resources, monitor for such events, and capture audit logs for analysis.


Use Case

A crawler application (e.g., wd-discovery-crawler) is experiencing timeout issues. The suspected root cause is changes to existing NetworkPolicy resources in the cpd namespace. The goal is to capture and analyze audit logs that record such changes.


Verify the Current Audit Logging Profile

First, check the current audit configuration to confirm whether Kubernetes is logging the body of API write requests. This is crucial for tracking what was changed and by whom.

oc get apiservers.config.openshift.io cluster -o jsonpath='{.spec}'

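If the profile is set to Default, the audit log records only request metadata (who made the request, when, and against which resource), not the request body, so it cannot show what a NetworkPolicy was changed to.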
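
To read only the profile instead of the whole spec, a small helper along these lines may be convenient. This is a minimal sketch: the function name audit_profile is hypothetical, and it assumes that an empty spec.audit.profile means the cluster is on the Default profile.

```shell
# Hypothetical helper: print the active audit profile and whether it
# records request bodies for write requests.
audit_profile() {
  local profile
  profile=$(oc get apiservers.config.openshift.io cluster \
    -o jsonpath='{.spec.audit.profile}')
  # Assumption: an empty value means the cluster uses the Default profile.
  profile=${profile:-Default}
  case "$profile" in
    WriteRequestBodies|AllRequestBodies)
      echo "$profile: write request bodies are logged" ;;
    *)
      echo "$profile: write request bodies are NOT logged" ;;
  esac
}
```

Calling audit_profile before and after changing the configuration gives a quick confirmation that the change was accepted.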

Enable Detailed Audit Logging

Switch the audit profile to WriteRequestBodies. In addition to metadata for all requests, this profile logs the request body of every write request (create, update, patch, delete), so NetworkPolicy changes are captured in full:

oc patch apiserver cluster --type=merge --patch '{"spec":{"audit":{"profile":"WriteRequestBodies"}}}'
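
Changing the audit profile causes a new revision of the kube-apiserver pods to roll out, which can take several minutes. The sketch below polls the NodeInstallerProgressing condition until it reports AllNodesAtLatestRevision. The function name, the default retry count, and the default delay are hypothetical choices, not values from the original procedure.

```shell
# Hypothetical helper: poll until every control plane node runs the latest
# kube-apiserver revision, or give up after a number of attempts.
# Usage: wait_for_apiserver_rollout [attempts] [delay_seconds]
wait_for_apiserver_rollout() {
  local attempts=${1:-60} delay=${2:-30} reason
  while [ "$attempts" -gt 0 ]; do
    reason=$(oc get kubeapiserver \
      -o=jsonpath='{range .items[0].status.conditions[?(@.type=="NodeInstallerProgressing")]}{.reason}{end}')
    if [ "$reason" = "AllNodesAtLatestRevision" ]; then
      echo "kube-apiserver rollout complete"
      return 0
    fi
    echo "rollout in progress (${reason:-unknown}); retrying in ${delay}s"
    sleep "$delay"
    attempts=$((attempts - 1))
  done
  echo "timed out waiting for kube-apiserver rollout" >&2
  return 1
}
```

Right after the patch, the condition may briefly still describe the previous revision, so it is worth waiting a moment before the first check or confirming that the reported revision number has increased.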

Reproduce the Issue

Once the new audit profile is active, re-run the application or repeat the activity that previously resulted in a timeout. Note the time of each attempt so that you can correlate it with audit events, and be ready to capture logs if the issue recurs.
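
As a complement to the audit log, it can help to snapshot the NetworkPolicy resources before reproducing the issue and compare them afterwards. The following is only a sketch; the function name and the file names in the usage comment are hypothetical.

```shell
# Hypothetical helper: dump all NetworkPolicy resources of a namespace to a file.
# Usage: snapshot_netpols <namespace> <output-file>
snapshot_netpols() {
  local ns=$1 out=$2
  oc get networkpolicy -n "$ns" -o yaml > "$out"
}

# Usage sketch:
#   snapshot_netpols cpd netpol-before.yaml
#   ... reproduce the timeout ...
#   snapshot_netpols cpd netpol-after.yaml
#   diff -u netpol-before.yaml netpol-after.yaml
```

A non-empty diff shows that a policy changed during the test window; the audit log then tells you who changed it and when.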

Capture Audit Logs for Analysis

If the timeout issue occurs again, immediately collect the audit logs from all control plane (master) nodes to analyze changes made to NetworkPolicy resources:

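# For each control plane node, keep only the audit events that target
# NetworkPolicy resources in the cpd namespace and save them to one file per node.
for node in $(oc get nodes -l node-role.kubernetes.io/master -o jsonpath='{..metadata.name}'); do
  oc adm node-logs "$node" --path=kube-apiserver/audit.log | \
  jq 'select(.requestURI | startswith("/apis/networking.k8s.io/v1/namespaces/cpd/networkpolicies"))' \
  >> "${node}-$(date -u +%s).json"
done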

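The jq filter keeps every audit event whose request URI targets NetworkPolicy resources in the cpd namespace. That includes read requests such as get, list, and watch, so look for events whose verb is create, update, patch, or delete to isolate actual changes.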
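
To narrow the captured events down to actual modifications, a filter along the following lines could be used. It is a sketch, not part of the original procedure: the function name netpol_changes is hypothetical, and it relies on the standard audit event fields verb, user.username, objectRef, requestReceivedTimestamp, and responseStatus.code.

```shell
# Hypothetical helper: read audit events on stdin and print one tab-separated
# line per NetworkPolicy write in the given namespace:
# timestamp, user, verb, policy name, HTTP response code.
# Usage: netpol_changes [namespace] < events.json
netpol_changes() {
  local ns=${1:-cpd}
  jq -r --arg ns "$ns" '
    select(.objectRef.resource == "networkpolicies"
           and .objectRef.namespace == $ns
           and (.verb == "create" or .verb == "update" or .verb == "patch"
                or .verb == "delete" or .verb == "deletecollection"))
    | [.requestReceivedTimestamp, .user.username, .verb,
       (.objectRef.name // "-"), ((.responseStatus.code // "-") | tostring)]
    | @tsv'
}
```

The function accepts both the raw audit log and the per-node JSON files written by the loop, because jq reads a stream of JSON values in either layout. With the WriteRequestBodies profile, each matching event also carries the submitted object in its requestObject field, which shows what the policy was changed to.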

Run Must-Gather for Comprehensive Audit Logs

For a complete collection of audit data that may assist Red Hat support or internal security audits:

oc adm must-gather -- /usr/bin/gather_audit_logs

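This command runs the gather_audit_logs script, which collects the audit logs into a must-gather.local.* directory in the current working directory. A default must-gather run does not include audit logs, so the script must be named explicitly.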
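
The collected logs can be searched with the same URI filter used for the node logs. The sketch below assumes that the kube-apiserver audit logs end up as gzip-compressed files with "audit" in their name, under a path containing kube-apiserver inside the must-gather directory; verify the layout of your own output before relying on it. The function name scan_must_gather is hypothetical.

```shell
# Hypothetical helper: search a must-gather directory for audit events that
# target NetworkPolicy resources in a namespace.
# Assumption: logs are stored as *audit*.gz files under a kube-apiserver path.
# Usage: scan_must_gather <must-gather-dir> [namespace]
scan_must_gather() {
  local dir=$1 ns=${2:-cpd}
  find "$dir" -type f -path '*kube-apiserver*' -name '*audit*.gz' \
      -exec gzip -dc {} + |
    # -R with fromjson? skips any line that is not valid JSON.
    jq -cR --arg prefix "/apis/networking.k8s.io/v1/namespaces/$ns/networkpolicies" \
      'fromjson? | objects | select((.requestURI // "") | startswith($prefix))'
}
```

Each matching event is printed as one compact JSON line, which keeps the output easy to pass to further jq filters or to grep.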

Summary

By enabling detailed audit logging and proactively monitoring key resources like NetworkPolicy, OpenShift administrators can:

  • Quickly detect unauthorized or accidental changes

  • Pinpoint the root cause of network communication issues

  • Maintain cluster security and operational integrity

Make sure to regularly review your audit configuration as part of your cluster's security posture.
