Why: The need for smarter log collection
In the realm of Quality Assurance (QA), swiftly verifying and addressing each reported bug is paramount. However, the debugging process frequently encounters roadblocks when vital failure logs are absent or incomplete. Comprehensive log collection forms the bedrock of efficient bug resolution, ensuring that no critical information is overlooked.
In IBM Power Systems' extensive testing environments, identifying bugs is primarily achieved by gathering detailed data logs from numerous subsystems, including Baseboard Management Controllers (BMCs), Flexible Service Processors (FSPs), Hardware Management Console (HMC), Hypervisor, and various partitions. Testers frequently encounter issues where bugs are dismissed due to insufficient information, often going unnoticed during standard test cycles because of absent or incomplete logs at the time of failure. This situation leads to wasted system time and increased resource usage for testers.
In simpler terms, when testing IBM Power Systems, it's crucial to collect thorough logs from different components to pinpoint bugs. However, if these logs are missing or incomplete when a bug occurs, it can lead to the bug being overlooked, causing inefficiencies and increased resource usage.
Core problems:
- Fragmented log collection methods: Multiple tools are used for different infrastructure layers, each with distinct usage formats, output styles, and compatibility limitations.
- Tool maintenance challenges: These tools are difficult to maintain, non-uniform, and lack extensibility, necessitating manual interventions and customized scripts for each environment.
- Inefficient debugging: Debugging efforts become time-consuming and inefficient, as testers must repeat scenarios or manually sift through distributed logs.
- Lack of unified framework architecture: The absence of a unified framework architecture design leads to poor traceability, insufficient automation integration, and delays in resolving critical issues.
What: A Unified, Modular, and Secure Solution
To tackle the challenges, there is a pressing need for a unified log collection tool that could streamline the process and improve efficiency. This tool should possess the following characteristics:
- User-friendly and straightforward deployment: The tool should be easy to use and deploy, minimizing the learning curve for testers and reducing the time required for setup.
- Simple maintenance: The tool should be easy to maintain, eliminating the need for extensive manual interventions and customized scripts for each environment.
- Comprehensive log collection: The tool should be capable of collecting logs from all major subsystems, including BMCs, FSPs, the HMC, the Hypervisor, and various partitions, ensuring that no critical information is left out.
In response to these requirements, the E2E (End-to-End) Log Collection Tool was developed. This tool serves as a central utility for test validation, bug triaging, and debugging by unifying log collection across all subsystems.
Key features of the E2E Log Collection Tool include:
- Extensible configurations: The tool supports extensible configurations, allowing it to adapt to various testing environments and requirements.
- Customizable YAMLs: YAML (YAML Ain't Markup Language) files can be customized to define specific log collection parameters, making it easier to tailor the tool to individual testing needs.
- Protocol-driven data retrieval: The E2E Log Collection Tool employs standard protocol-driven data retrieval, ensuring consistent and reliable log collection from all subsystems.
The tool aligns with modern DevOps practices and expands its compatibility across diverse environments. Notable feature improvements encompass:
- Secure Logging: Passwords and sensitive data are now obfuscated in logs, ensuring secure and compliant handling when sharing logs.
- Secure Password Integration: The tool supports lab-level credential management using Secure Password Tools, simplifying secure access across various environments.
- Expanded Protocol Support: The tool supports multiple transport layers, including SSH, HTTPS, and IPMI, with user-defined port handling, making it compatible with hardware, simulation, and QEMU platforms.
- Custom Log Collection: The tool has been augmented with the ability to trigger and capture required dump logs for targeted system diagnostics.
- Extensible Plugins: The architecture allows users to incorporate new or existing scripts to perform custom operations.
- Extensible Application: The tool is designed to support open-source or enterprise tools directly within the YAML, with the parser detecting and orchestrating operations to collect logs.
These enhancements position the E2E Log Collection Tool as a powerful and adaptable solution for test validation, bug triaging, and debugging in IBM Power Systems' testing environments.
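As an illustration of the secure logging feature described above, the following is a minimal sketch of how sensitive values might be masked before a log line is written or shared. It is not the tool's actual implementation: the key list, the `mask_secrets` function name, and the mask string are all assumptions made for this example.

```python
import re

# Hypothetical list of keys whose values should never appear in shared logs.
SENSITIVE_KEYS = ("password", "passwd", "token", "secret")

# Matches "<key>=<value>" or "<key>: <value>", case-insensitively.
_PATTERN = re.compile(
    r"(?i)\b(" + "|".join(SENSITIVE_KEYS) + r")(\s*[=:]\s*)(\S+)"
)


def mask_secrets(line: str, mask: str = "********") -> str:
    """Replace the value that follows a sensitive key with a fixed mask."""
    return _PATTERN.sub(lambda m: m.group(1) + m.group(2) + mask, line)


print(mask_secrets("ssh login user=root password=abc123 port=22"))
# prints: ssh login user=root password=******** port=22
```

Masking at the point where lines are written, rather than after collection, keeps secrets out of every downstream artifact, including reports attached to bug records.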
How: Architecture and workflow
Key points
1. CLI engine: The architecture is centered around a Command-Line Interface (CLI) engine that supports SSH, SCP, REST, and TELNET protocols.
2. YAML configuration: Users specify target IPs and logging configurations using YAML files.
3. Data retrieval: The system connects to service processors, hosts, and external plugins to collect log data in various formats, including text, JSON, binary, and dump.
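The key points above can be sketched as a small resolution step that turns a target configuration into connection endpoints. The structure below is hypothetical: the `targets`, `protocol`, and `port` field names are assumptions for illustration and are not taken from the tool's real YAML schema. The configuration is written as the Python dictionary a YAML parser would produce, so the sketch needs no third-party library.

```python
# Hypothetical caller configuration, shown as the dict a YAML parser would return.
CONFIG = {
    "targets": [
        {"name": "bmc", "host": "192.0.2.10", "protocol": "ssh"},
        # A user-defined port overrides the protocol default.
        {"name": "hmc", "host": "192.0.2.20", "protocol": "rest", "port": 8443},
    ]
}

# Conventional default ports for the protocols the CLI engine is said to support.
DEFAULT_PORTS = {"ssh": 22, "scp": 22, "rest": 443, "telnet": 23}


def resolve_endpoints(config):
    """Return (name, protocol, host, port) tuples, applying default ports."""
    endpoints = []
    for target in config["targets"]:
        protocol = target["protocol"].lower()
        if protocol not in DEFAULT_PORTS:
            raise ValueError(f"unsupported protocol: {protocol}")
        port = target.get("port", DEFAULT_PORTS[protocol])
        endpoints.append((target["name"], protocol, target["host"], port))
    return endpoints


for endpoint in resolve_endpoints(CONFIG):
    print(endpoint)
```

Rejecting unknown protocols at this stage surfaces a configuration mistake before any connection is attempted.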

The tool's workflow begins with a caller YAML file, which triggers the invocation of plugins or scripted tools to gather logs. These logs are then organized into structured reports, enabling testers to analyze and debug issues more efficiently. To further support bug resolution, the tool integrates with test management tools and bug tracking systems, improving traceability and streamlining the workflow in IBM Power Systems' testing environments.
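The workflow described above might be sketched as a loop that dispatches each configured job to a collector and records the outcome, continuing even when one collection fails. The collector names, the job fields, and the `run_collection` function are illustrative assumptions; the real tool connects to remote systems, which this sketch replaces with stub functions.

```python
from datetime import datetime, timezone


def collect_uptime(target):
    """Stub collector standing in for a real remote command."""
    return f"uptime output from {target}"


def collect_broken(target):
    """Stub collector that simulates an unreachable system."""
    raise ConnectionError(f"cannot reach {target}")


# Hypothetical registry mapping collector names to callables.
COLLECTORS = {"uptime": collect_uptime, "broken": collect_broken}


def run_collection(jobs):
    """Run every job, recording status and timestamp; one failure does not stop the run."""
    results = []
    for job in jobs:
        record = {
            "target": job["target"],
            "collector": job["collector"],
            "started": datetime.now(timezone.utc).isoformat(),
        }
        try:
            record["data"] = COLLECTORS[job["collector"]](job["target"])
            record["status"] = "success"
        except Exception as exc:  # keep collecting from the remaining targets
            record["status"] = "failure"
            record["error"] = str(exc)
        results.append(record)
    return results


jobs = [
    {"target": "bmc", "collector": "uptime"},
    {"target": "hmc", "collector": "broken"},
]
for result in run_collection(jobs):
    print(result["target"], result["status"])
```

Capturing failures as records, instead of aborting, preserves whatever logs were still reachable at the time of the fault.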

YAML, Plugins & Customization
The Log Collection Tool lets users tailor log collection logic through YAML files: they can define environment variables (ENV_VARS.YAML) and configure credentials or log sets. User-defined plugins and integration with standard tools extend this flexibility further, and documentation and examples are provided to support customization and usage.
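To make the customization model more concrete, here is a minimal sketch of two mechanisms it implies: substituting environment variables into a command template, and registering a user-defined plugin by name. The `${NAME}` placeholder syntax, the `plugin` decorator, and the variable names are assumptions for this example, not the tool's documented interface.

```python
from string import Template

# Hypothetical plugin registry: name -> callable.
PLUGINS = {}


def plugin(name):
    """Decorator that registers a function as a named plugin."""
    def register(func):
        PLUGINS[name] = func
        return func
    return register


@plugin("count_lines")
def count_lines(text):
    """Example user-defined plugin: count the lines in collected output."""
    return len(text.splitlines())


def expand(command, env_vars):
    """Substitute ${NAME} placeholders; raises KeyError if a variable is undefined."""
    return Template(command).substitute(env_vars)


# Values such as these would come from an environment-variable YAML file.
env_vars = {"LOG_DIR": "/var/log", "HOST": "bmc01"}
print(expand("tar -czf ${HOST}.tgz ${LOG_DIR}", env_vars))
# prints: tar -czf bmc01.tgz /var/log
print(PLUGINS["count_lines"]("a\nb\nc"))
# prints: 3
```

Using `substitute` rather than `safe_substitute` means a misspelled variable fails loudly instead of leaving a literal placeholder in the command.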






Types of logs collected:
The Log Collection Tool has demonstrated the ability to gather a wide range of logs from Baseboard Management Controllers (BMCs), Flexible Service Processors (FSPs), partitions, hosts, and external systems. Supported log formats include plain text logs, JSON outputs, binary files, and dumps. The tool tracks success/failure status and timestamps during collection. Key log sources currently include the following, with more being added:
1. Hardware Management Console (HMC)
2. Service Processor (SP)
3. Hypervisor
4. Partitions – operating system environments such as Linux distributions, AIX, and Virtual I/O Server (VIOS)
5. Kernel-based Virtual Machine (KVM) Host and Guest – Linux-based virtualization platform
6. Power Virtualization Center (PowerVC) – IBM’s cloud and virtualization management platform for Power Systems
This comprehensive log collection capability ensures that all critical information is captured for efficient debugging and bug resolution in IBM Power Systems' testing environments.
Log collection output:

Log format output:
The Log Collection Tool categorizes logs into well-organized directories, sorted by component, timestamp, and status. For each run, a comprehensive summary report is produced, detailing the time taken, success/failure status, and any encountered collection errors. This structured approach ensures efficient log analysis and debugging in IBM Power Systems' testing environments.
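The output organization described above could look roughly like the following sketch, which builds a per-log path from component, timestamp, and status, and aggregates a run summary. The directory order, the field names (`status`, `seconds`, `error`), and both function names are assumptions; the tool's actual layout and report format may differ.

```python
from collections import Counter
from pathlib import PurePosixPath


def log_path(root, component, timestamp, status, filename):
    """Build a path of the form <root>/<component>/<timestamp>/<status>/<filename>."""
    return PurePosixPath(root) / component / timestamp / status / filename


def summarize(results):
    """Aggregate per-log results into a run summary."""
    counts = Counter(r["status"] for r in results)
    return {
        "total": len(results),
        "success": counts["success"],
        "failure": counts["failure"],
        "elapsed_seconds": round(sum(r["seconds"] for r in results), 2),
        "errors": [r["error"] for r in results if r["status"] == "failure"],
    }


results = [
    {"component": "bmc", "status": "success", "seconds": 1.5},
    {"component": "hmc", "status": "failure", "seconds": 0.25, "error": "timeout"},
]
print(log_path("logs", "bmc", "20250101T120000Z", "success", "journal.txt"))
print(summarize(results))
```

Placing the status in the path lets a tester find every failed collection for a component with a single directory listing.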

The E2E Log Collection Tool provides numerous benefits, such as:
- Unified Framework: A single, comprehensive tool for log collection across all subsystems.
- YAML-driven Configuration: Flexible and customizable setup for tailored log collection.
- Easy Deployment: Seamless integration into Jenkins pipelines for streamlined workflows.
- Enhanced Debugging: Structured reports facilitate efficient log analysis and debugging.
- Secure Handling: Password masking and access controls ensure secure log management.
Conclusion: Streamlining QA with Impact
The End-to-End (E2E) Log Collection Tool revolutionizes log aggregation during Quality Assurance (QA) testing, transitioning from a disorganized, manual process to a seamless, automated, and secure procedure. Its modular, adaptable, and unified design significantly decreases bug recurrence, diminishes the need for redundant test runs, and markedly enhances debugging efficiency.
Measured Benefits:
- ~20-30% faster bug resolution, achieved by ensuring complete end-to-end system debug logs.
- ~40-50% reduction in tool maintenance overhead, due to an architecture design featuring a data-driven model, simplified plug-and-play plugin support, and on-demand customization through scripts or applications.