Hardware Management Console Best Practices
6 Problem Determination
This section covers issues and problems that may be encountered
during operation of the HMC itself. Managed systems are mentioned
in the context of server and frame management, but the service
applications needed to enable service environments (e.g. Electronic Service
Agent™, Service Focal Point, and the Remote Support Facility) are not
discussed.
6.1 Problem Analysis
The HMC code stack is composed of many subsystems and layers. Every
subsystem maintains a dynamic trace, a “flight recorder” of sorts. As of HMC
Version 5, Release 1.0, the vital traces are stored in persistent files in the HMC
filesystem. For those subsystems or applications that tend to generate heavier
quantities of trace data, cron jobs are run at periodic intervals, typically every
hour, depending on the process. Each cron job checks the size of its respective
trace file to see if it has exceeded a fixed threshold. If the file does
exceed this threshold, the trace file is backed up and a new trace file is
started. Also, when the HMC is rebooted, the last-running trace file for some
subsystems is backed up; for others, it is overwritten.
Up to eight backup copies are maintained at any given time. While this may seem
sufficient, the size of the trace files and the number of backups maintained
depend on the HMC load. For example, in an environment where an HMC
is managing two frames and 40 logical partitions, the amount of trace data
generated can be voluminous over a short period. If IBM Support and Service
is needed to diagnose an HMC problem, it is vital that the trace files be
extracted from the HMC as soon as possible after the problem has been observed.
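The size-triggered backup scheme described above can be illustrated with a minimal sketch. This is not the HMC's actual implementation: the function name, the numbered-suffix naming of backups, and the threshold value are all assumptions made for illustration. Only the limit of eight backups and the "back up when over a fixed threshold, then start a new file" behavior come from the text.

```python
import os

MAX_BACKUPS = 8          # from the text: up to eight backup copies are kept
THRESHOLD_BYTES = 1024   # hypothetical value; real thresholds are not given here


def rotate_if_needed(trace_path, threshold=THRESHOLD_BYTES, max_backups=MAX_BACKUPS):
    """Back up trace_path and start a new file if it exceeds threshold.

    Returns True if a rotation happened, False otherwise.
    Backups are named trace_path.1 (newest) .. trace_path.N (oldest);
    this naming is an assumption, not the HMC's documented layout.
    """
    if not os.path.exists(trace_path) or os.path.getsize(trace_path) <= threshold:
        return False

    # Discard the oldest backup so the total never exceeds max_backups.
    oldest = f"{trace_path}.{max_backups}"
    if os.path.exists(oldest):
        os.remove(oldest)

    # Shift the remaining backups up by one: .7 -> .8, ..., .1 -> .2
    for n in range(max_backups - 1, 0, -1):
        src = f"{trace_path}.{n}"
        if os.path.exists(src):
            os.replace(src, f"{trace_path}.{n + 1}")

    # The current trace becomes backup .1 and an empty file takes its place.
    os.replace(trace_path, f"{trace_path}.1")
    open(trace_path, "w").close()
    return True
```

Under this sketch, each rotation on a heavily loaded system pushes the oldest backup out, which is why trace data from the time of a problem can disappear if extraction is delayed.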
The HMC provides a command, pedbg, to assist in this process. This command
can only be run as user hscpe with the hmcpe task role. The man page should be
consulted for the full list of options this command provides. The most important
features it provides are:
• -d For pre-V5R1.0 HMC releases, this option can be issued with an
attribute value of on to enable trace logging. Note that enabling trace
logging does not start it: the HMC must be rebooted before logging
begins.
• -l For errors that point to failures in the RMC subsystem (e.g. DLPAR),
this option can be used to list the status of the various RMC Resource
Managers.