Debugging the IBM MQ sensor connection issues

By Xing Tian posted 16 days ago

  

Co-Authors:

Sunny Tian(tianxing@cn.ibm.com) - Software Developer - APM, Instana Observability Platform

Jing Jing Zhang (zjingjbj@cn.ibm.com) - Software Developer - APM, Instana Observability Platform

Li Jian Wang (wlijian@cn.ibm.com) - Software Developer - APM, Instana Observability Platform

IBM MQ has many complex combinations of different levels, object security, and authority settings. This article introduces the common connection issues that are encountered when you monitor IBM MQ using Instana.

IBM MQ connection modes

The IBM MQ sensor uses the following IBM MQ connection modes to connect to a queue manager:

Local bindings mode

In local bindings mode, IBM MQ classes for Java use the Java Native Interface (JNI) to call the queue manager API directly instead of communicating through a network. For more information, see MQ Java bindings mode. To connect in local bindings mode, add the Instana agent user to the mqm group of IBM MQ.

With the local bindings mode, the IBM MQ sensor can fully automate discovery and monitoring.

Client mode

In client mode, IBM MQ classes for Java act as an IBM MQ client. For more information, see MQ Java client. To connect in client mode, configure the queue manager name, the channel name, and the permissions that the sensor requires as a client application in the agent configuration.yaml file.

With the client mode, the IBM MQ sensor can only partially automate discovery when the queue manager is located locally.

How the IBM MQ sensor works

The following steps describe how the IBM MQ sensor connects to a queue manager and monitors it.

After you install the Instana agent, the IBM MQ sensor completes the following steps for monitoring and collecting metrics from IBM MQ:

  1. Discovers local queue managers and collects all the information that can be discovered, such as the port, the MQ library path, and whether the queue manager is an HA queue manager.
  2. Fetches all the queue managers that you configured in the agent configuration.yaml file.

  3. Connects to all the discovered and configured queue managers.

    1. If IBM MQ and the Instana agent run in the same environment, the sensor completes the following steps:

      • Attempts to connect to local queue managers in local bindings mode. If the Instana agent user is a member of the mqm group, the sensor automatically monitors IBM MQ.

      • If the sensor fails to connect to the local queue manager, it uses the channel and credentials specified in the agent configuration.yaml file to connect to the queue manager in client mode.

    2. If IBM MQ and the Instana agent run in different environments, the sensor connects to the remote queue managers in client mode with the channel and credentials specified in the agent configuration.yaml file.

  4. After the sensor connects to the queue managers, the sensor uses PCF commands to get the detailed data. For more information about all the collected metrics, see Instana documentation. Some metrics are only available in local files for monitoring, not through PCF commands.
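The connection decision in step 3 can be sketched in a few lines of Python. This is only an illustration of the flow described above, not the sensor's actual code: the `QueueManager` class, its fields, and the `plan_connection` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class QueueManager:
    """Hypothetical record for one discovered or configured queue manager."""
    name: str
    is_local: bool           # runs in the same environment as the Instana agent
    client_configured: bool  # channel and credentials exist in configuration.yaml


def plan_connection(qm: QueueManager, bindings_succeeded: bool) -> Optional[str]:
    """Return the connection mode that the flow in step 3 ends up using."""
    # Local queue managers are tried in local bindings mode first.
    if qm.is_local and bindings_succeeded:
        return "bindings"
    # Fallback for local queue managers, and the only path for remote ones.
    if qm.client_configured:
        return "client"
    # No usable connection until the configuration or authority is fixed.
    return None


# A local queue manager whose bindings attempt failed falls back to client mode.
print(plan_connection(QueueManager("QM1", True, True), bindings_succeeded=False))
```

In this sketch, `bindings_succeeded` stands in for the result of the real bindings attempt, which in practice depends on whether the agent user is in the mqm group.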

Common connection and authority problems

Most problems that you might encounter are related to IBM MQ connection and authority. Some of the common issues are summarized for reference:

Local monitoring problem with the local bindings mode

If the Instana agent user has the mqm authority, the Instana agent connects to the queue manager in local bindings mode. If you encounter an authority problem in this scenario, ensure that the mqm authority settings have taken effect, especially for a queue manager that is already running, by completing the following steps:

  1. Add the user to the mqm group.

  2. Refresh security to make the settings take effect. To refresh security, complete one of the following steps:

    1. Start runmqsc as a user that has MQ admin authority, such as mqm, and then enter the command refresh security at the runmqsc prompt.

    2. Restart the queue manager so that the mqm authority that you granted takes effect for the user.

Local or remote monitoring problem with the client mode

In client mode, the IBM MQ sensor behaves as a client application that connects to the queue manager, so the sensor requires the same authority as any other application that connects to the queue manager. If you have connection problems in client mode, use one of the following tools to find the cause:

  • Connection tools, such as a remote connection from MQ Explorer or the MQ sample application amqsputc, to test the connection and find a user that is authorized to connect to the queue manager.

  • The MQ connection test tool that is introduced in the last section of this blog.

You can use any of these tools to test your MQ connection. If a tool connects successfully, configure the same user, password, and other connection parameters for the MQ sensor.

The following example shows how to use amqsputc to test a connection:

  1. Assume your configuration.yaml file contains the following parameters:

  queueManagers:
    QM1:
      host: 'hostname1.com'
      port: '1414'
      channel: 'INSTANA.SVRCONN'
      username: 'user1'
      password: 'passw0rd'

  2. Test your connection by exporting an environment variable with the following format:

export MQSERVER='<channel name>/TCP/<hostname of MQ server>(<listener port>)'

Example:

export MQSERVER='INSTANA.SVRCONN/TCP/hostname1.com(1414)'

  3. Run the following command, as a user that has permission to put messages, to put a message to an existing queue in the queue manager:

$MQ_SAMPLE_DIR/bin/amqsputc QUEUE_NAME QM_NAME

Example:

/opt/mqm/samp/bin/amqsputc TEST.QUEUE QM1

  4. When prompted, enter a test string and then press Enter twice to send the message to the queue.
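To avoid typing mistakes when you translate the agent configuration into the MQSERVER format, you can generate the value from the same parameters. The following Python sketch is only a convenience helper. The function name `mqserver_value` and the dictionary layout are assumptions made for this example, and the values are the ones from the sample configuration in this section.

```python
def mqserver_value(channel: str, host: str, port: str) -> str:
    """Build the MQSERVER value: <channel>/TCP/<host>(<port>)."""
    return f"{channel}/TCP/{host}({port})"


# Values copied from the sample QM1 entry in configuration.yaml.
qm1 = {"host": "hostname1.com", "port": "1414", "channel": "INSTANA.SVRCONN"}

value = mqserver_value(qm1["channel"], qm1["host"], qm1["port"])
# Print a line that can be pasted into the shell before running amqsputc.
print(f"export MQSERVER='{value}'")
```

Running the sketch prints an export line that matches the example shown in this section.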

The following sections describe common connection and authority issues that you might encounter.

RC 2538 - MQRC_HOST_NOT_AVAILABLE

Instana agent log message: "Listener not started on{host}:{port} ({exceptionCode}) {message}".

  • Issue:

Listener is not started.

Solution:

Start the listener. To start the listener, run the following IBM MQ runmqsc command:

START LISTENER($Listener_Name)

  • Issue:

Qmgr@host can’t be found.

Solution:

Check that the queue manager name and host name in the connection parameters are correct. Also, check whether a firewall is blocking the connection.

  • Issue:

The host and port are configured in the agent configuration file in the Kubernetes cluster. When the queue manager is restarted in the Kubernetes cluster, its host IP changes. As a result, the IBM MQ sensor can no longer connect to the queue manager.

Solution:

Enable automatic discovery of the host and port information by deleting the host and port in the agent configuration file for the host agent in the Kubernetes cluster.
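Before you change MQ settings for RC 2538, it can help to confirm from the agent host that the configured host and port accept TCP connections at all. The following Python sketch uses only the standard library. The function name `listener_reachable` is hypothetical, and the host and port are the sample values used earlier in this blog. A successful connection shows only that something is listening on that port, not that it is the MQ listener.

```python
import socket


def listener_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host name and applies the timeout.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if __name__ == "__main__":
    # Sample values; replace them with the host and port from configuration.yaml.
    ok = listener_reachable("hostname1.com", 1414)
    print("listener reachable" if ok else "listener NOT reachable")
```

If the check fails, the listener might be stopped, the host name might be wrong, or a firewall might be blocking the port, which are the issues listed above.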

RC 2540 - Channel is not defined.

Issue:

Instana agent log message: "Channel {channel} is not defined ({exceptionCode}). {message}".

Solution:

Check whether the correct SVRCONN channel is configured in the agent configuration.yaml file.

RC 2035 - Authority problem

Issue:

Instana agent log message: "Channel {channel} authorization failed for user {username} ({exceptionCode}). {message}".

Solution:

To debug this problem, first check whether channel security is enabled:

  • Channel security is disabled, but “no authority with user” is still reported. The cause is that the user asserted by the MQ application has no authority. Check which user is used, and then either grant the required authority to this user or connect to the queue manager with another user that has the required authority.

  • Channel security is enabled. Check that the provided user and password have the correct authority to connect to the queue manager. If TLS is enabled, provide the correct keystore, keystorePassword, and the corresponding cipherSuite to connect to the queue manager.

The following flow chart shows the debug flow:

Work flow chart 1

Root causes and solutions:

This problem is an authority problem, and its root cause depends on the MQ configuration.

  • Cause 1: Channel security is disabled, but “no authority with root” is reported.

Usually, when CHLAUTH is disabled and CONNAUTH is not set, MQ channel security is not enabled, and you need to determine which user is used for authorization. The following table lists the order of precedence for security features.

| Priority order | User security feature |
| --- | --- |
| 1 (lowest) | Application asserted user (the operating system user) |
| 2 | Channel definition MCAUSER attribute |
| 3 | Connection authentication with ADOPTCTX(YES) |
| 4 | Channel authentication records with USERSRC(MAP) |
| 5 (highest) | Security exit |

If you haven’t configured a security exit, a channel authentication record with USERSRC(MAP), CLNTUSER, or MCAUSER, the application asserted user (the operating system user in a remote connection) is used. In this scenario, root is the application asserted user because the Instana agent runs as root. If root doesn’t have mqm authority, the MQ error log shows “no authority with root”.

For more information about user authority priority, see Determining which user is used for authorization.
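The precedence table can be read as a simple lookup: the highest-priority feature that supplies a user wins. The following Python sketch models only that reading. It is a simplification and not MQ's real evaluation logic, which the linked documentation describes. The feature keys and the `effective_user` function are hypothetical names chosen for this example.

```python
from typing import Dict, Optional

# Lowest to highest priority, in the same order as the table above.
PRECEDENCE = [
    "application_asserted_user",
    "mcauser",
    "connauth_adoptctx",
    "chlauth_usersrc_map",
    "security_exit",
]


def effective_user(configured: Dict[str, Optional[str]]) -> Optional[str]:
    """Return the user supplied by the highest-priority configured feature."""
    for feature in reversed(PRECEDENCE):
        user = configured.get(feature)
        if user:
            return user
    return None


# Only the asserted user is available: the agent's OS user is used.
print(effective_user({"application_asserted_user": "root"}))
# MCAUSER is set on the channel: it overrides the asserted user.
print(effective_user({"application_asserted_user": "root", "mcauser": "mqmtest"}))
```

The first call models the scenario in Cause 1, where root is used because nothing with a higher priority is configured.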

For MQ authority interaction flow for this scenario, see the red marker 1 in the Work flow chart 2.

Solution:

You can configure a channel authentication record, define CLNTUSER, or define MCAUSER for the channel. The simplest option is to set MCAUSER on your server connection channel to a user that has MQ access authority. This user is then used to connect to the queue manager.

Example:

channel(SVRCONN) chltype(SVRCONN) MCAUSER('mqmtest')

  • Cause 2: No authority is granted to connect to the SYSTEM server connection channel.

Some system server connection channels, such as SYSTEM.AUTO.SVRCONN, are blocked by default. The default rules for CHLAUTH processing, which include the MQ BLOCKUSERS rules, are as follows:

    • NO ACCESS to all channels by any MQ-admin* users

    • NO ACCESS to all SYSTEM.* channels by all users

    • ALLOW access to SYSTEM.ADMIN.SVRCONN channel (non MQ-admin users)

The first two rules block access to all channels. The third rule is more specific, and therefore takes precedence over the other two if the channel is the SYSTEM.ADMIN.SVRCONN channel, thus allowing access to that channel. For more information, see Resolving CHLAUTH access issues.

For MQ authority interaction flow for this scenario, see the red marker 2 in the Work flow chart 2.

Solution:

Use one of the following options:

    • Unblock the user for the system server connection channel before you use that user to connect.

    • Define your own server connection channel for the connection, which works around the blocked user problem.

  • Cause 3: Security is enabled or TLS is enabled

Channel security is enabled with CHLAUTH(ENABLED), or TLS is enabled, but the corresponding username and password or the keystore parameters (keystore, keystorePassword, and cipherSuite) are not provided. After you change security-related MQ configurations, run refresh security type(CONNAUTH) at the runmqsc prompt to make the changes take effect.

Solution:

Provide corresponding username and password or keystore parameters in the agent configuration file.

  • Cause 4: Security is enabled for both CHLAUTH and CONNAUTH, but the user still has an authority problem when connecting to the queue manager.

Solution:

    1. Check the CHLAUTH and CONNAUTH configurations with the MQ runmqsc command display qmgr.

    2. Review the following MQ authority flow to see how the combined configuration is evaluated.

    3. Identify which user and configuration take effect based on priority settings.

    4. Correct the problem.

Work flow chart 2

Red marker 1 is the flow for Cause 1: Channel security is disabled, but “no authority with root” is reported. 

Red marker 2 is the flow for Cause 2: No authority is granted to connect to the SYSTEM server connection channel. 

For more information, see Interaction of CHLAUTH and CONNAUTH.

  • Cause 5: The MQ connection works, but the user's object authority is insufficient to collect other monitoring data.

Solution:

    1. Check the MQ error log to see which object and which user have the authority problem.

    2. Grant the user the correct authority on the object. For more information about the authorities needed for different monitoring data, see Configuring IBM MQ authority.

Instana agent log connection errors list

The following table lists all connection problems that are related to Instana agent log error information and explains the meaning of these errors in MQ. You can check and solve them according to different root causes.

| Instana log error information | Error code | ExceptionCode | MQ error explanation |
| --- | --- | --- | --- |
| "Queue manager not started on {host}:{port} ({exceptionCode}) {message}" | RC 2059 | MQRC_Q_MGR_NOT_AVAILABLE | This error occurs on an MQCONN or MQCONNX call. The queue manager identified by the QMgrName parameter is not available for connection. |
| "Listener not started on{host}:{port} ({exceptionCode}) {message}" | RC 2538 | MQRC_HOST_NOT_AVAILABLE | An MQCONN call was issued from a client to connect to a queue manager but the attempt to allocate a conversation to the remote system failed. A common cause of this reason code is that the listener was not started on the remote system. |
| "Channel {channel} authorization failed for user {username} ({exceptionCode}). {message}" | RC 2035 | MQRC_NOT_AUTHORIZED | The RC2035 reason code is displayed for various reasons, such as an error on opening a queue or a channel, an error received when you attempt to use a user ID that has administrator authority, an error when using an IBM® MQ JMS application, and opening a queue on a cluster. MQS_REPORT_NOAUTH and MQSAUTHERRORS can be used to further diagnose RC2035. |
| "Channel {channel} is not defined ({exceptionCode}). {message}" | RC 2540 | MQRC_UNKNOWN_CHANNEL_NAME | An MQCONN call was issued from a client to connect to a queue manager but the attempt to establish communication failed because the queue manager did not recognize the channel name. |
| "Queue manager {queueManagerName} is not defined ({exceptionCode}). {message}" | RC 2058 | MQRC_Q_MGR_NAME_ERROR | On an MQCONN or MQCONNX call, the value specified for the QMgrName parameter is not valid or not known. This reason also occurs if the parameter pointer is not valid. (Detecting parameter pointers that are not valid is not always possible; if not detected, unpredictable results occur.) |
| "Channel {channel} is not available ({exceptionCode}). {message}" | RC 2537 | MQRC_CHANNEL_NOT_AVAILABLE | An MQCONN call was issued from a client to connect to a queue manager but the channel is not currently available. A common cause of this reason code is that the channel is currently in the stopped state. |
| "Error during connection to {host}:{port} queue manager {queueManagerName}. {message}" | Others | ExceptionCode | For any other ExceptionCode, run the MQ command mqrc to see what the MQ error means, and then fix the related MQ problem. |
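When you scan agent logs for these errors, a small lookup table can turn a reason code into its MQ name and a first thing to check. The following Python sketch contains only the codes and explanations from the table above. The names `REASON_CODES` and `describe` are hypothetical, and the hints are shortened summaries rather than official IBM wording.

```python
# Reason codes and names as listed in the table above.
REASON_CODES = {
    2035: ("MQRC_NOT_AUTHORIZED", "check the user's authority and channel security"),
    2058: ("MQRC_Q_MGR_NAME_ERROR", "check the configured queue manager name"),
    2059: ("MQRC_Q_MGR_NOT_AVAILABLE", "check that the queue manager is started"),
    2537: ("MQRC_CHANNEL_NOT_AVAILABLE", "check whether the channel is stopped"),
    2538: ("MQRC_HOST_NOT_AVAILABLE", "check the listener, host name, and firewall"),
    2540: ("MQRC_UNKNOWN_CHANNEL_NAME", "check the SVRCONN channel name"),
}


def describe(rc: int) -> str:
    """Return a one-line hint for an MQ reason code found in the agent log."""
    if rc in REASON_CODES:
        name, hint = REASON_CODES[rc]
        return f"RC {rc} {name}: {hint}"
    # Codes outside the table: fall back to the mqrc command, as noted above.
    return f"RC {rc}: not in the table, run 'mqrc {rc}' for the MQ explanation"


print(describe(2035))
print(describe(2009))
```

The second call uses a code that is not in the table only to show the fallback message.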

IBM MQ connection test tool for verification

If the sensor cannot connect to the queue manager with your configured parameters, try the same connection parameters with this MQ connection test tool to check whether the queue manager can be reached. Check the MQ error log to find the actual reason for the failure, and fix it in the MQ configuration.

Use this IBM MQ connection test tool, which behaves the same as the MQ sensor.
