Originally posted by: smo_
Are you ready for your troubleshooter?
When a technical issue occurs, a client may need to call the service/maintenance provider to have it solved as quickly as possible. The responsibility has been handed over, but are you ready to help your troubleshooter?
Why help? The more help you provide, the more likely it is that the issue will be solved quickly. There is no crystal ball (at least for now) to understand the customer environment, which means the support rep needs your information.
Here are 10 points (and they are likely not the only ones) to follow to be well prepared to help your troubleshooter.
Before the problem occurs
1/ Do you know how to open a service request for each of the components you are working with?
If the issue involves multiple components you may need to contact multiple service/maintenance providers.
2/ Do you know which information (support data) to collect, how to collect it and where to send it?
There is always a dilemma about how much information you should provide.
Two major factors play a part in deciding which logs to send when opening a problem ticket: speed, and the risk of being asked for more logs later.
There is no magic answer, but providers usually define a standard data collection. It should take less than 30 minutes to collect (if you are well prepared) and should be enough to at least start troubleshooting most cases.
3/ Did you try to collect this information in advance to be sure there is no firewall / access problem?
If you are well prepared, this can save you some valuable minutes.
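As a rough illustration of such an advance check, the sketch below tests whether a TCP connection can be opened to each device you would need to reach during a data collection. The function names and the example host names and ports are assumptions made for this example; replace them with the management addresses and ports your own environment uses.

```python
import socket

def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution errors
        return False

def check_endpoints(endpoints, timeout=5.0):
    """Map each (host, port) pair to True (reachable) or False."""
    return {(host, port): can_connect(host, port, timeout)
            for host, port in endpoints}

# Hypothetical usage; the host names and ports are placeholders:
# results = check_endpoints([("switch01.example.com", 22),
#                            ("storage01.example.com", 443)])
# for (host, port), ok in results.items():
#     print(host, port, "reachable" if ok else "BLOCKED or down")
```

A successful connection only shows that the network path is open. It does not replace an actual trial run of the data collection with the credentials you intend to use.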
4/ Do you know the time setting of the different boxes?
Time setting is very important in order to correlate event logs from different systems.
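To show why this matters, here is a minimal sketch that merges events from several devices after correcting each timestamp by that device's known clock offset. The device names, offsets and events are made up for this example; in practice you would note the offsets when comparing each box against a trusted reference clock.

```python
from datetime import datetime, timedelta

# Hypothetical offsets: how far each device clock runs ahead of the
# reference clock (a negative value means the device clock is behind).
CLOCK_OFFSETS = {
    "switch_a": timedelta(seconds=0),
    "storage_b": timedelta(minutes=3, seconds=20),
    "host_c": timedelta(seconds=-45),
}

def to_reference_time(device, local_timestamp):
    """Convert a device-local log timestamp to reference time."""
    return local_timestamp - CLOCK_OFFSETS[device]

def merge_events(events):
    """Return (reference_time, device, message) tuples in time order."""
    corrected = [(to_reference_time(dev, ts), dev, msg)
                 for dev, ts, msg in events]
    return sorted(corrected, key=lambda entry: entry[0])

# Made-up events, as each device logged them with its own clock
events = [
    ("storage_b", datetime(2015, 3, 1, 10, 3, 25), "path to host lost"),
    ("switch_a", datetime(2015, 3, 1, 10, 0, 0), "port offline"),
    ("host_c", datetime(2015, 3, 1, 9, 59, 25), "I/O timeout"),
]

for when, device, message in merge_events(events):
    print(when, device, message)
```

Sorted by the raw timestamps, the host timeout would appear to come first. After correction the port going offline comes first, followed by the storage and host events, which tells a very different story.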
5/ Do you have a layout (not a marketing layout)?
A layout says so much more than a description, so do not hesitate to draw on the layout to show the issue.
A good layout should provide a high-level overview showing the data centers and the distance between them, as well as a detailed view with, if possible, the server names and ports used.
6/ Do you have a change management log?
Most issues are caused by changes, which may have been triggered manually or by an automated routine.
When the problem occurs
7/ Try to provide as much information as possible
“We are facing problem, please analyze” is not much help.
With a decentralized datacenter organization it may be difficult to get an overview of the problem, but “a problem well stated is a problem half solved”.
8/ Take a data collection as soon as you are facing the issue
Why? Log memory can be limited, and collecting early prevents the relevant entries from being overwritten when the logs wrap.
9/ Provide a meaningful file name for your data collection
The file name can contain the device name, the date of the data collection and a keyword that describes the data.
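One possible naming scheme is sketched below. The function name, the separator characters and the timestamp format are choices made for this example, not a convention required by any support tool.

```python
import re
from datetime import datetime

def collection_file_name(device, collected_at, keyword, extension="zip"):
    """Build a name of the form <device>_<date-time>_<keyword>.<extension>."""
    def clean(text):
        # Replace runs of anything but letters, digits and hyphens
        # with a single hyphen, then trim hyphens at both ends
        return re.sub(r"[^A-Za-z0-9-]+", "-", text.strip()).strip("-")

    stamp = collected_at.strftime("%Y%m%d-%H%M")
    return f"{clean(device)}_{stamp}_{clean(keyword)}.{extension}"

# Example with made-up values
name = collection_file_name("switch01", datetime(2015, 3, 1, 10, 5),
                            "port offline")
print(name)  # switch01_20150301-1005_port-offline.zip
```

Putting the date in year-month-day order keeps several collections from the same device sorted chronologically in a directory listing.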
10/ Provide a timeline of the events and actions that took place before, during and after the outage.
From the logs it is sometimes difficult to figure out whether an event was manually triggered or not. (For example, it is difficult to tell from the logs whether a cable was pulled out or is defective.)
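A timeline does not need special tooling; a plain text list is enough. As a minimal sketch, the snippet below keeps each entry together with a note on whether it was a manual action, an automated one or unknown, and prints the entries in time order. The class, field names and sample entries are invented for this example.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class TimelineEntry:
    when: datetime
    what: str
    source: str  # "manual", "automated" or "unknown"

def format_timeline(entries):
    """Return the entries as text, one line per entry, oldest first."""
    lines = []
    for entry in sorted(entries, key=lambda e: e.when):
        lines.append(f"{entry.when:%Y-%m-%d %H:%M} [{entry.source}] {entry.what}")
    return "\n".join(lines)

# Made-up entries, recorded in the order people remembered them
timeline = [
    TimelineEntry(datetime(2015, 3, 1, 10, 0), "Port went offline", "unknown"),
    TimelineEntry(datetime(2015, 3, 1, 9, 45), "Cable replaced on host", "manual"),
    TimelineEntry(datetime(2015, 3, 1, 10, 20), "Data collection taken", "manual"),
]

print(format_timeline(timeline))
```

The "source" column carries exactly the information that the logs alone cannot provide.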
If you are troubleshooting a Storage/SAN environment, the following may help you
A1/ Open IBM service request
A2/ Data collection guidance for some SAN Storage products
SAN Brocade: see https://ibm.biz/supportsave
SAN Cisco: see http://ibm.biz/ciscodc
SVC V7000: see http://ibm.biz/svcv7kdc
A3/ To upload the package, please make use of our Secure Upload web frontend:
Use your PMR (preferred), RCMS, or CROSS case number for the upload so that the system can notify the support engineer by updating the case.
The email address is optional, but if you provide it you will receive a short notice as soon as the upload has completed successfully, and the support engineer will be able to contact you by email if needed.
After clicking "Continue" you can drag and drop the archive file containing the supportsave to upload it.
A4/ Time setting on different boxes
Here is a very good document by Falk Schneider on how to set up an NTP server for each of your systems
See also this good reference on Seb’s SAN blog
How to be prepared for the next problem determination