Purpose:
Collect logs and troubleshoot Sysdig agent issues on Kubernetes (IBM Cloud).
Triage First – Must-Run Commands Before Reaching Out to Sysdig Team
If a customer reports issues such as the agent not running, metrics not showing, or the agent failing probes, run these commands first to collect evidence and potentially resolve the issue without escalation:
1. Kubernetes Node & Resource Checks
a. Get Node Details
kubectl get nodes -o wide
b. Resource Usage
kubectl top nodes
c. List All Pods
kubectl get pods --all-namespaces -o wide
Why:
Identify node health, resource pressure, and where pods are running.
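As a quick first pass over the node list, a small filter can surface nodes that are not in a plain Ready state. This is a minimal sketch, assuming the default column layout of `kubectl get nodes --no-headers` (NAME first, STATUS second); the function name `unhealthy_nodes` is made up for illustration.

```shell
# Hypothetical helper: reads `kubectl get nodes --no-headers` output on stdin
# and prints "NAME STATUS" for every node whose STATUS is not exactly "Ready".
# This also catches cordoned nodes such as "Ready,SchedulingDisabled".
unhealthy_nodes() {
  awk '$2 != "Ready" { print $1, $2 }'
}

# Usage (requires cluster access):
#   kubectl get nodes --no-headers | unhealthy_nodes
```

Empty output suggests every node reports Ready; any listed node is a candidate for `kubectl describe node`.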
2. Namespace-Level Pod Checks
List and Inspect Pods in ibm-observe:
kubectl get pods -n ibm-observe
kubectl get pods -n ibm-observe -o wide
Why:
Confirms agent pods are running and shows pod-to-node mapping.
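When the namespace holds many pods, it may help to filter the listing down to the ones that need attention. The sketch below assumes the default `kubectl get pods --no-headers` columns (NAME, READY, STATUS); `unready_pods` is a hypothetical name, not part of kubectl or Sysdig.

```shell
# Hypothetical helper: reads `kubectl get pods -n ibm-observe --no-headers`
# output on stdin and prints NAME, READY and STATUS for pods that are not
# Running or whose READY count (e.g. 2/4) shows containers that are not ready.
unready_pods() {
  awk '{
    split($2, r, "/")   # READY column: r[1] = ready, r[2] = total containers
    if ($3 != "Running" || r[1] != r[2]) print $1, $2, $3
  }'
}

# Usage (requires cluster access):
#   kubectl get pods -n ibm-observe --no-headers | unready_pods
```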
3. Pod and Node Debugging
Describe a Problematic Pod:
kubectl describe pod sysdig-agent-node-analyzer-qcdnb -n ibm-observe
Describe the Node Hosting the Pod (here the node name is its IP address):
kubectl describe node 10.242.64.4
Why:
Check events, restarts, node pressure, and taints.
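The two describe steps can be chained so the node name does not have to be copied by hand. This is a sketch under the assumption that the pod is already scheduled; it reads the node from the pod's `.spec.nodeName` field, and the function name `describe_pod_and_node` is hypothetical.

```shell
# Hypothetical helper: describe a pod, then describe the node it runs on.
# Arguments: pod name, optional namespace (defaults to ibm-observe).
describe_pod_and_node() {
  pod=$1
  ns=${2:-ibm-observe}
  kubectl describe pod "$pod" -n "$ns" || return 1
  # Look up the hosting node from the pod spec.
  node=$(kubectl get pod "$pod" -n "$ns" -o jsonpath='{.spec.nodeName}') || return 1
  if [ -z "$node" ]; then
    echo "pod $pod is not scheduled on any node" >&2
    return 1
  fi
  kubectl describe node "$node"
}

# Usage (requires cluster access):
#   describe_pod_and_node sysdig-agent-node-analyzer-qcdnb > describe.txt
```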
4. Logs Collection
a. View Console Logs (add -f to stream):
kubectl logs -n ibm-observe sysdig-agent-node-analyzer-qcdnb
b. Pull Full Internal Logs:
kubectl cp -n ibm-observe sysdig-agent-khpfg:/opt/draios/logs ./logs-khpfg
Why:
Retrieve both console output and full agent logs.
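For an escalation bundle it can be convenient to save console logs from every pod in the namespace at once. The following is a rough sketch, not an official Sysdig procedure: `collect_pod_logs` and the default output directory `./sysdig-logs` are assumptions, and it captures only what `kubectl logs` exposes, so the `kubectl cp` step above is still needed for the files under /opt/draios/logs.

```shell
# Hypothetical helper: save console logs for every pod in a namespace.
# Arguments: optional namespace (default ibm-observe), optional output dir.
collect_pod_logs() {
  ns=${1:-ibm-observe}
  outdir=${2:-./sysdig-logs}
  mkdir -p "$outdir" || return 1
  for pod in $(kubectl get pods -n "$ns" -o name); do
    name=${pod#pod/}   # strip the "pod/" prefix that `-o name` adds
    # Current logs from all containers in the pod, with timestamps.
    kubectl logs -n "$ns" "$name" --all-containers=true --timestamps \
      > "$outdir/$name.log" 2>&1
    # Logs from the previous container instance; the file is kept only
    # if kubectl succeeds (typically only after a restart).
    kubectl logs -n "$ns" "$name" --all-containers=true --previous \
      > "$outdir/$name.previous.log" 2>/dev/null \
      || rm -f "$outdir/$name.previous.log"
  done
}

# Usage (requires cluster access):
#   collect_pod_logs ibm-observe ./sysdig-logs
```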
5. Configuration & Workload Verification
a. DaemonSet Check
kubectl get daemonset sysdig-agent -n ibm-observe -o yaml > ds-sa.yaml
Why:
Confirms the agent is scheduled on every node. In the saved YAML, check:
- desiredNumberScheduled
- numberReady
- nodeSelector, tolerations, affinity
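The desired and ready counts can also be compared directly rather than read out of the YAML. A minimal sketch, assuming the DaemonSet is named sysdig-agent as in the command above; `daemonset_ready` is a hypothetical helper name.

```shell
# Hypothetical helper: compare desired vs ready pod counts for a DaemonSet.
# Arguments: optional DaemonSet name, optional namespace.
# Returns 0 only when at least one pod is desired and all desired pods are ready.
daemonset_ready() {
  ds=${1:-sysdig-agent}
  ns=${2:-ibm-observe}
  counts=$(kubectl get daemonset "$ds" -n "$ns" \
    -o jsonpath='{.status.desiredNumberScheduled} {.status.numberReady}') || return 2
  desired=${counts%% *}   # text before the space
  ready=${counts##* }     # text after the space
  echo "$ds: $ready/$desired pods ready"
  [ "$desired" -gt 0 ] && [ "$ready" -eq "$desired" ]
}

# Usage (requires cluster access):
#   daemonset_ready || echo "check nodeSelector, tolerations, affinity"
```

A desired count of 0 is treated as a failure here because it usually means the nodeSelector or affinity rules match no node.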
b. ReplicaSet (if used)
kubectl get rs -n ibm-observe -o wide
Why:
Verify deployment status and whether pods are correctly maintained.
c. Configuration Check
kubectl get configmap sysdig-agent -n ibm-observe -o yaml > cm-sa.yaml
Why:
Validates agent configuration (e.g., endpoint URL, token, log level).
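Once the ConfigMap is saved to cm-sa.yaml, the settings of interest can be pulled out with a simple text search. This is only a sketch: `show_config_keys` is a hypothetical helper, and the key names in the usage line (`collector`, `collector_port`) are assumptions about the agent configuration that should be checked against the actual file.

```shell
# Hypothetical helper: print the lines in a saved ConfigMap dump that set any
# of the given keys. A key matches when "key:" appears at the start of a line
# after optional indentation. Missing keys are reported explicitly.
show_config_keys() {
  file=$1
  shift
  for key in "$@"; do
    grep -E "^[[:space:]]*${key}:" "$file" || echo "$key: not found in $file"
  done
}

# Usage (key names are assumptions, adjust to the real config):
#   show_config_keys cm-sa.yaml collector collector_port
```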
✅ Summary Table
🔍 Task | Command | Purpose | Example |
Describe Pod | kubectl describe pod | View pod failure reasons | kubectl describe pod sysdig-agent-abc -n ibm-observe |
Get Logs | kubectl logs | Console log output | kubectl logs -n ibm-observe sysdig-agent-abc |
Copy Full Logs | kubectl cp | Download full logs | kubectl cp -n ibm-observe pod:/opt/draios/logs ./ |
Verify DaemonSet | kubectl get daemonset | Ensure agent deployed to all nodes | kubectl get ds sysdig-agent -n ibm-observe -o yaml |
Get Config | kubectl get configmap | Review agent config | kubectl get configmap -n ibm-observe -o yaml |
Uninstall | helm uninstall | Clean agent removal | helm uninstall sysdig-agent -n ibm-observe |
✅ Tip
- Always check both DaemonSet and ConfigMap if agents are missing or not working.
- If the pod is restarting, start with describe and then move to logs.
- Use kubectl cp when deeper log analysis is needed.