Originally posted by: smo_
When escalating an issue, it is useful to have a structured method to describe it. “Telling a story about the problem” may work, but it may not reveal the key points.
A predefined structure helps to create a clear problem description and is likely to speed up its resolution.
Several years ago, IBM created the EDANT pattern to describe an issue and ensure the key points are not omitted, using the keywords Environment, Description, Action done, Next action, Test cases.
IT environments are becoming more and more complex, and the impact of an IT failure can be dramatic. Customer expectations also change rapidly over time (today’s expectations are tomorrow’s norms). It is therefore important to include these two factors, expectation and impact, in a structured problem description as well.
- The customer’s expectation helps to set the right focus on a complex issue involving multiple components.
- The customer impact helps to escalate the issue in a timely manner.
The EPITED structure (Environment, Problem description, Impact, Troubleshooting, Expectation, Data collection) helps to state a problem efficiently and to set the right focus.
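As an illustration only, the six EPITED sections can be held in a small record so that an empty section is noticed before the escalation is sent. This is a minimal sketch and not an IBM tool: the class name `EpitedReport`, the helper `missing_sections`, and the sample text are all hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class EpitedReport:
    """One field per EPITED section; an empty string means 'not filled in yet'."""
    environment: str = ""
    problem_description: str = ""
    impact: str = ""
    troubleshooting: str = ""
    expectation: str = ""
    data_collection: str = ""


def missing_sections(report: EpitedReport) -> list[str]:
    """Return the names of sections that are empty or contain only whitespace."""
    return [f.name for f in fields(report) if not getattr(report, f.name).strip()]


# Hypothetical, partially completed report.
report = EpitedReport(
    environment="Two sites, dual-fabric SAN, storage at both sites",
    problem_description="Application users report slow response since Monday morning",
    impact="Order entry delayed; no data loss",
)
print(missing_sections(report))  # ['troubleshooting', 'expectation', 'data_collection']
```

A check like this only confirms that every section has some content. Whether the content is specific enough is still a judgment for the person writing the escalation.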
Environment

Is not only
- What are the different locations?
- How are they interconnected?
But includes
- Architecture and cabling drawings
- Configuration details
The description of an environment should contain the locations, the devices/components that are involved and how they are interconnected. The code level of the components may help to quickly spot a known issue, and information about redundancy can help the Support Team to provide an adequate recovery action plan.
In addition, a layout diagram may help in understanding the environment.
Here is an example of a useful SAN layout with information about the impacted components:


[Figure: SAN layout diagram, with a label marking the impacted component]
Problem description

Is not only a
- General problem description
But provides a
- Specific problem description
A problem description should mention what happened, when it started and how the problem was noticed.
It should start with a general statement of the issue and then add the specifics. It should not start with the ‘bits and bytes’ but with the high-level end user statement, and then drill down to the details.
The timeline of the issue is vital to understanding a complex issue. It is also necessary to know whether all the involved systems (their internal clocks) are set to the same time zone, so that events from different logs can be placed in the correct order.
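To show why the time zone matters for the timeline, here is a small sketch that converts timestamps from systems in different time zones to UTC before sorting them. The system names, timestamps and offsets are invented for the example, and a fixed UTC offset per system is assumed (daylight saving changes are not handled).

```python
from datetime import datetime, timedelta, timezone

# Hypothetical log entries: (system name, local timestamp, UTC offset in hours).
events = [
    ("host-a", "2024-03-04 09:15:00", +1),
    ("switch-1", "2024-03-04 08:12:30", 0),
    ("storage-1", "2024-03-04 03:14:10", -5),
]


def to_utc(local_time: str, utc_offset_hours: int) -> datetime:
    """Interpret a local timestamp with a fixed UTC offset and convert it to UTC."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    naive = datetime.strptime(local_time, "%Y-%m-%d %H:%M:%S")
    return naive.replace(tzinfo=tz).astimezone(timezone.utc)


# Sort by the normalized time so that the real order of events becomes visible.
timeline = sorted((to_utc(ts, offset), name) for name, ts, offset in events)
for when, name in timeline:
    print(when.isoformat(), name)
```

Sorted by local clock time, storage-1 would appear to report first. After conversion to UTC, the switch event comes first, then storage-1, then host-a.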


Impact

Is not only a description of the
- Technical impact
But includes
- Business impact
The description of the technical impact and the business impact helps the support team to understand how important it is to have the problem solved.
Most IT companies assign severity / priority levels to assist the escalation of issues.
IBM’s definitions of Severity and Priority are as follows:
Severity: The code that indicates the business impact of the problem to the customer.
Priority: The priority code indicates the urgency and order in which the customer needs the problems to be resolved and levels of communication/feedback on progress.
The IBM Software Handbook details severity level situations and examples, as well as the response objectives for each severity and priority.
Troubleshooting

Not only the
- Actions already performed
But also
- Results of those actions?
The description of the troubleshooting actions already performed by the local team may be useful to the Remote Support Team.
This description helps to avoid confusing the symptoms of the original error with symptoms caused by recovery actions or troubleshooting steps.
Also, bear in mind that the system reporting an error is not necessarily the system experiencing the error. It is therefore sometimes necessary to analyze the connected components as well.
Expectation

Is not only
- What do we want to achieve?
But includes
- What do we expect from the people involved in the issue?
In a long-running problem, the expectation is sometimes forgotten.
It is good to revisit the original issue and clarify the previously stated requirements and expectations.
This may sound obvious: the client needs the solution. But sometimes the client’s expectation is not so obvious. The expectation may include prioritization of what needs to be solved first. It is important that the order in which systems and services are restored is determined by the customer and not by a well-intentioned guess from the support teams. As an example, a local ‘in country’ system may be more vital to the customer than a world-wide service.
Data collection

Is not only
- What data was collected?
- When was the data captured?
But includes
- Where is the data available?
To allow troubleshooting, Support Engineering needs system logs collected during the issue or as soon as possible after the issue.
Multiple data collections may be necessary to troubleshoot one or multiple issues.
Issues can involve many layers and components, such as applications, file systems, multipathing, HBA drivers, Fibre Channel controllers, back-end disks, network etc.
Collecting data from multiple layers and components may be necessary to perform a thorough analysis of the issue.
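The three questions above (what, when, where) can be tracked per layer with a simple manifest. The following is a hedged sketch: the `DataCollection` record, the `incomplete` helper and the sample entries are hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass


@dataclass
class DataCollection:
    component: str    # layer or component the data came from
    what: str         # what was collected
    captured_at: str  # when it was captured (state the time zone)
    location: str     # where the support team can find it


def incomplete(entries: list[DataCollection]) -> list[str]:
    """Return the components whose entry is missing any of the four answers."""
    return [
        e.component or "<unnamed>"
        for e in entries
        if not all(v.strip() for v in (e.component, e.what, e.captured_at, e.location))
    ]


# Hypothetical manifest covering two layers.
collections = [
    DataCollection("host multipathing", "driver logs", "2024-03-04 08:20 UTC", "uploaded to the support case"),
    DataCollection("SAN switch", "switch log dump", "", "uploaded to the support case"),
]
print(incomplete(collections))  # ['SAN switch']
```

Here the switch entry is flagged because its capture time is missing, which is exactly the detail needed to line the data up with the timeline of the issue.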
See also “Are you ready for your troubleshooter”, which provides more information about data collection.
This blog entry describes how to make troubleshooting more efficient using the EPITED pattern.
If you use this structure you will be more than halfway to the solution. In addition, communication and escalation will be facilitated.
Tiny url for this blog post https://ibm.biz/epitedblog
#DS8000