How to Connect Externally to Kafka and Elasticsearch Services Installed with IBM Cloud Pak for Business Automation

By Jorge Rodriguez posted Mon April 03, 2023 09:35 AM

  
Author:
Jorge D. Rodriguez 
STSM | Business Automation Solutions Architect | Business Automation Portfolio SWAT Team

Contributors:
Adrian Alexandru Vasiliu
Senior Software Engineer | IBM Event Integration | IBM Automation

Overview

IBM Cloud Pak for Business Automation (CP4BA) provides a fully integrated containerized platform that combines IBM’s best-in-class automation software to digitally transform and automate business operations. Within the Cloud Pak, Kafka and Elasticsearch serve as pillars of the operational intelligence, event processing, and overall cross-component integration capabilities. As expected in a fully integrated platform, all components within the Cloud Pak that require the Kafka and Elasticsearch services are configured automatically, abstracting the configuration and integration details from system administrators and Cloud Pak users alike.

While most users have no need for direct access to the Kafka and Elasticsearch services provided by the Cloud Pak, there are more complex use cases that will require the ability to authenticate and perform API invocation flows against these services. A few examples of these types of use cases include:

  1. Use of Kafka CLI to triage or debug issues related to the Kafka cluster.
  2. Push of externally generated events into a Kafka topic. This could be done to leverage operational visibility tools within the Cloud Pak or to trigger further automation tasks within the Cloud Pak once the external event is received.
  3. Use of tools such as Elasticvue to inspect the cluster, look at index data, manage indices, and so on.
  4. Use of external Kibana installations to visualize and query data stored in an Elasticsearch index.

Although not a comprehensive list, these use cases help us identify the settings required to connect to these services and frame the discussion around a concrete set of examples. The remainder of the article provides a step-by-step guide on how to connect to the Kafka and Elasticsearch services in CP4BA with the Apache Kafka CLI and the Elasticvue browser plugin, respectively. It also shows how to get the credentials, endpoints, and other configuration details, such as TLS certificates, required to connect with other tools not covered in this article.

Prerequisites

  • IBM Cloud Pak for Business Automation version 21.0.3 or later
  • Business Automation Insights or Automation Foundation installed as part of CP4BA
  • OpenShift CLI
  • OpenSSL command
  • Sed command

Kafka security settings on Cloud Pak for Business Automation

Kafka provides a rich and extensible framework used to secure all aspects of the messaging system. Cloud Pak for Business Automation takes advantage of these capabilities to deliver an enterprise-grade deployment of Kafka that includes encryption for data in transit, user authentication, and access controls. When connecting to Kafka services within CP4BA, the client must match the security settings of the deployment, so it is worth reviewing the general security configuration used by CP4BA installations of Kafka.

  • Security Protocol - SASL_SSL
    CP4BA installations of Kafka use the SASL_SSL security protocol: authentication requests are handled through the Simple Authentication and Security Layer (SASL) framework, while the connection itself is encrypted with TLS using certificates. During the CP4BA installation, new certificates are automatically generated and applied to each Kafka broker. Later in this article we discuss how to extract the root certificate required by clients to communicate with the Kafka services.
  • SASL Mechanism - SCRAM-SHA-512
    The SASL protocol also provides a flexible authentication framework that allows authentication requests to be handled by different backend systems. While Kafka supports four different authentication mechanisms under this protocol, GSSAPI, PLAIN, SCRAM-SHA-256/512, and OAUTHBEARER, CP4BA installations of Kafka use SCRAM-SHA-512. The Salted Challenge Response Authentication Mechanism (SCRAM) is a username/password authentication protocol that addresses the security concerns of traditional mechanisms by allowing the server to validate the identity of the client, using cryptographic techniques, without receiving the password as part of the authentication request. During the CP4BA installation of Kafka, the username and password required to configure SCRAM-SHA-512 are automatically generated and securely stored for later retrieval. The next section of this article shows how to extract this information so that clients such as the Kafka CLI can authenticate with the Kafka services.
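
Taken together, these settings define the client-side configuration that any external Kafka client must provide. The snippet below is a minimal sketch with placeholder values; the remaining sections show how to retrieve the actual credentials, endpoint, and truststore, and the full properties file is built step by step in the procedure that follows.

# Client security settings matching a CP4BA Kafka deployment (placeholder values)
security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="<scram-username>" password="<scram-password>";
# Truststore containing the broker certificate generated at install time
ssl.truststore.location=<path-to-truststore>
ssl.truststore.password=<truststore-password>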

Connecting to Kafka using the Apache Kafka CLI

CP4BA installations of Kafka are based on Strimzi, an open source project that provides container images and operators for running Apache Kafka on Kubernetes and Red Hat OpenShift. Fortunately, Strimzi images already include a copy of the Apache Kafka CLI. Therefore, to use the Kafka CLI, we just have to log in to one of the Kafka broker pods and configure the security parameters required by the CLI. This section shows you how to get the security information from the Kafka installation and how to create the necessary configuration file to connect via the Kafka CLI.

  1. Using the oc command, switch to the project where CP4BA has been installed.
    oc project <cp4ba-namespace>

    Replace <cp4ba-namespace> with the name of the project (namespace) where CP4BA is installed.

  2. Validate that Kafka was deployed as part of your CP4BA installation. If Kafka is not installed, you cannot proceed with the rest of the instructions in this section.
    # Get details of the Kafka resource to validate that it is installed.
    oc get kafka -o name

    If Kafka is installed, the output of the command above should look similar to kafka.ibmevents.ibm.com/iaf-system

  3. Retrieve the external endpoint for the Kafka bootstrap server configured during the CP4BA installation. Notice that we are storing this information in a bash-style file called .kafka-connection-info for later use.
    # Get bootstrap server from CP4BA installation
    BOOTSTRAP_SERVER=$(oc get cartridgerequirements icp4ba -o jsonpath='{.status.components.kafka.endpoints[?(.scope=="External")].bootstrapServers}')
    # Store information into bash file
    echo "BOOTSTRAP_SERVER=${BOOTSTRAP_SERVER}" > .kafka-connection-info

    Inspect the .kafka-connection-info file and make sure that the expression ${BOOTSTRAP_SERVER} was properly replaced with the actual value.

  4. Retrieve the username and password automatically created during the Kafka installation. This is the username and password required by the SCRAM protocol. We will append this information to the same bash file created before so that we can use the values later on.
    # Find Kubernetes secret where username and password were stored during installation
    KAFKA_SECRET=$(oc get cartridgerequirements icp4ba -o jsonpath='{.status.components.kafka.endpoints[?(.scope=="External")].authentication.secret.secretName}')
    # Get SCRAM username
    KAFKA_USERNAME=${KAFKA_SECRET}
    # Get SCRAM password
    KAFKA_PASSWORD=$(oc extract secret/${KAFKA_SECRET} --keys=password --to=-)
    # Store information into bash file
    echo "KAFKA_USERNAME=${KAFKA_USERNAME}" >> .kafka-connection-info
    echo "KAFKA_PASSWORD=${KAFKA_PASSWORD}" >> .kafka-connection-info

    Inspect the .kafka-connection-info file to make sure that ${KAFKA_USERNAME} and ${KAFKA_PASSWORD} were properly replaced with the actual values.

  5. Now that we have the bootstrap server endpoint and credential information required to connect to the Kafka cluster, we will copy that information into the Kafka pod where the Kafka CLI commands will run.
    # Copy bash file into a Kafka pod. In this case we are choosing kafka-0
    oc cp .kafka-connection-info $(oc get pods | grep -i kafka-0 | awk '{print $1}'):/tmp/.kafka-connection-info

    Notice that we are copying this file into the very first Kafka pod, broker node zero, but you can choose any other node available.

  6. The rest of the procedure must happen inside the Kafka pod, where the Kafka CLI commands are available. Log in to the same Kafka pod where we copied the .kafka-connection-info file by running the following command:
    # Login into Kafka pod.  In this case we are connecting to kafka-0
    oc rsh $(oc get pods | grep -i kafka-0 | awk '{print $1}')
  7. Once inside the pod, we want to make sure that we can call Kafka CLI commands anywhere within the pod. Add the `/opt/kafka/bin` directory to the PATH environment variable.
    export PATH=$PATH:/opt/kafka/bin

    You can validate that the commands are now accessible by running kafka-topics.sh as a test. If the command is accessible, it prints its help content.

    # Run kafka-topics command to get help details
    kafka-topics.sh
  8. The last step before we are able to use the Kafka CLI is to create a properties file with the format and the security settings required by the commands. We will call the file kafka-cli.properties. To populate this file, we will use the information collected on previous steps as well as other security settings found in the /tmp/strimzi.properties file within the Kafka pod. Here are the commands to create the kafka-cli.properties file:
    # Go to the /tmp directory
    cd /tmp
    
    # Make variables previously defined available
    source /tmp/.kafka-connection-info
    # Get password to access truststore from strimzi configuration file 
    TS_PASSWORD=$(cat /tmp/strimzi.properties | grep "zookeeper.ssl.truststore.password"  | cut -d= -f 2)
    # Get location of truststore from strimzi configuration file 
    TS_LOCATION=$(cat /tmp/strimzi.properties | grep "zookeeper.ssl.truststore.location"  | cut -d= -f 2)
        
    # Create new properties file with the properties used by the CLI
    cat <<EOF > /tmp/kafka-cli.properties
    ##########
    # Kafka CLI parameters
    ##########
    security.protocol=SASL_SSL
    sasl.mechanism=SCRAM-SHA-512
    sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="${KAFKA_USERNAME}" password="${KAFKA_PASSWORD}";
    ssl.truststore.type=JKS
    ssl.truststore.password=${TS_PASSWORD}
    ssl.truststore.location=${TS_LOCATION}
    EOF

    Notice how we are setting security.protocol and sasl.mechanism to the general security configuration values previously discussed. We are also leveraging the username and password obtained for SCRAM-based authentication. Finally, to properly configure the SSL communication between the Kafka services and the Kafka CLI, we use a truststore already generated during installation, which contains the certificate applied to all Kafka brokers. The ssl.truststore.location and ssl.truststore.password properties configure the location of the truststore file within the pod and the password required to access the truststore, respectively.

    Before moving on to the next step, inspect the /tmp/kafka-cli.properties file and make sure that the values for TS_PASSWORD, TS_LOCATION, KAFKA_USERNAME, and KAFKA_PASSWORD were substituted properly.

  9. Now that we are done configuring the parameters required by the Kafka CLI, use any of the commands found under the /opt/kafka/bin directory to work with the Kafka cluster. For example, if you want to list the Kafka topics in the cluster, you can use the following:
    # Make variables such as BOOTSTRAP_SERVER available
    source /tmp/.kafka-connection-info
    # List topics in this Kafka installation
    kafka-topics.sh --list --bootstrap-server ${BOOTSTRAP_SERVER} --command-config /tmp/kafka-cli.properties

    Notice that many Kafka CLI commands require you to specify the bootstrap server to connect to. The BOOTSTRAP_SERVER variable holding the Kafka bootstrap server was previously added to the .kafka-connection-info bash file.

    In this example, we used the --command-config flag to provide the connection details required by the kafka-topics.sh command. The name of this flag can vary from command to command. For example, the kafka-console-producer.sh command uses the --producer.config flag instead. While the flag name can be different, the same properties file can be used across commands; a short producer and consumer sketch follows this procedure.

  10. (Optional) Due to the ephemeral nature of the pod's filesystem, the /tmp/kafka-cli.properties file will disappear if the pod is restarted. If you want to keep a copy of the file for later use, you can run the command listed below. Make sure you are outside of the Kafka pod when running this command.
    # Copy the kafka-cli.properties file to the machine where the oc command is running
    oc cp $(oc get pods | grep -i kafka-0 | awk '{print $1}'):/tmp/kafka-cli.properties ./kafka-cli.properties
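
To illustrate the flag differences mentioned in step 9, the following sketch writes a test message to a topic and reads it back with the console producer and consumer. It is a minimal example run from inside the Kafka pod (as in steps 6 through 9), and it assumes the PATH and connection variables are still set, that the SCRAM user is allowed to create topics, and that the topic name test-topic (chosen here for illustration) does not clash with an existing topic.

# Create a test topic (skip this if the topic already exists)
kafka-topics.sh --create --topic test-topic --bootstrap-server ${BOOTSTRAP_SERVER} --command-config /tmp/kafka-cli.properties

# Produce a message; the console producer takes the same properties file via --producer.config
echo "hello from the Kafka CLI" | kafka-console-producer.sh --topic test-topic --bootstrap-server ${BOOTSTRAP_SERVER} --producer.config /tmp/kafka-cli.properties

# Consume the message back; the console consumer takes the properties file via --consumer.config
kafka-console-consumer.sh --topic test-topic --from-beginning --max-messages 1 --bootstrap-server ${BOOTSTRAP_SERVER} --consumer.config /tmp/kafka-cli.properties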

Getting Additional Information for Kafka Connection

In the previous section we learned how to get the vital information required to establish a secure connection between a Kafka CLI client and the Kafka services installed as part of CP4BA. This information included the security protocol used by the installation, SASL_SSL, the authentication mechanism employed, SCRAM-SHA-512, and how to retrieve the bootstrap server URL as well as the credentials required for the connection. While we obtained that information in the context of connecting the Kafka CLI, the same information can be used to configure other types of clients. However, to configure other clients, you will need additional information: either the TLS certificate used to connect to the Kafka brokers or a truststore containing the certificate. The following steps show you how to retrieve the certificate in PEM format, the truststore containing the certificate in PKCS12 format, and the password required to access the truststore. These commands must be run from a system that has access to the OpenShift cluster using the oc command.

Getting the TLS Certificate

To retrieve the TLS certificate, we will first get the name of the Kubernetes secret created during installation to store the certificate and then extract the certificate from it. The commands to use are as follows:
KAFKA_CA_SECRET=$(oc get cartridgerequirements icp4ba -o jsonpath='{.status.components.kafka.endpoints[?(.scope=="External")].caSecret.secretName}')
KAFKA_CA_KEY=$(oc get cartridgerequirements icp4ba -o jsonpath='{.status.components.kafka.endpoints[?(.scope=="External")].caSecret.key}')
    
oc extract secret/${KAFKA_CA_SECRET} --keys=${KAFKA_CA_KEY} --to=-  2>&1 | grep -v "${KAFKA_CA_KEY}"
Notice how the secret name and the key used within the secret are available via a custom resource called CartridgeRequirements. This custom resource is used by CP4BA to specify the requirements of specific capabilities within the Cloud Pak.
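
If the client you are configuring expects a truststore rather than a PEM certificate, one option is to import the extracted certificate into a new truststore using keytool. The following is a sketch that assumes keytool (part of a JDK or JRE) is available on your workstation; the file names kafka-ca.pem and kafka-truststore.p12 are chosen here for illustration, and you should replace <truststore-password> with a password of your choice.

# Save the certificate to a file
oc extract secret/${KAFKA_CA_SECRET} --keys=${KAFKA_CA_KEY} --to=- 2>&1 | grep -v "${KAFKA_CA_KEY}" > kafka-ca.pem

# Import the certificate into a new PKCS12 truststore
keytool -importcert -noprompt -alias kafka-ca -file kafka-ca.pem -keystore kafka-truststore.p12 -storetype PKCS12 -storepass <truststore-password>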

Getting the Truststore and Truststore Password

Although we could create a truststore from the TLS certificate retrieved in the previous section (as sketched above), we can simply get an already available truststore from one of the pods within CP4BA that connects to the Kafka services. The following set of commands shows you how to inspect the configuration of the jobmanager pod, find the location of the truststore within the pod's filesystem, download it from there, and retrieve the automatically generated password to access the truststore.
JOB_MANAGER_POD=$(oc get pod -l app=flink,component=jobmanager --no-headers -o custom-columns=":metadata.name")
TRUSTSTORE_PATH=$(oc get pods ${JOB_MANAGER_POD} -o jsonpath='{.spec.containers[?(@.name=="jobmanager")].env[?(@.name=="TRUSTSTORE_PATH")].value}')
TRUSTSTORE_PASSWORD_PATH=$(oc get pods ${JOB_MANAGER_POD} -o jsonpath='{.spec.containers[?(@.name=="jobmanager")].env[?(@.name=="TRUSTSTORE_PASSWORD_PATH")].value}')
    
# get truststore file 
oc cp -c jobmanager ${JOB_MANAGER_POD}:${TRUSTSTORE_PATH} truststore.p12

# Retrieve truststore password
echo $(oc exec -it -c jobmanager ${JOB_MANAGER_POD} -- cat ${TRUSTSTORE_PASSWORD_PATH})
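
As a usage example, the downloaded truststore and password can be referenced from a client properties file on your workstation to run the Kafka CLI from outside the cluster. The sketch below assumes a local Apache Kafka CLI installation, that the .kafka-connection-info file created in the earlier steps is in the current directory, and that truststore.p12 was downloaded as shown above; the file name kafka-external.properties is chosen here for illustration, and <truststore-password> must be replaced with the password printed by the previous command. Note that the truststore type is PKCS12 rather than the JKS type used inside the pod.

# Reuse the connection details captured on the workstation in earlier steps
source .kafka-connection-info

# Build a properties file that points at the downloaded PKCS12 truststore
cat <<EOF > kafka-external.properties
security.protocol=SASL_SSL
sasl.mechanism=SCRAM-SHA-512
sasl.jaas.config=org.apache.kafka.common.security.scram.ScramLoginModule required username="${KAFKA_USERNAME}" password="${KAFKA_PASSWORD}";
ssl.truststore.type=PKCS12
ssl.truststore.location=./truststore.p12
ssl.truststore.password=<truststore-password>
EOF

# List topics from outside the cluster using a local Kafka CLI installation
kafka-topics.sh --list --bootstrap-server ${BOOTSTRAP_SERVER} --command-config kafka-external.properties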

This concludes the steps on how to retrieve the security configuration for the Kafka services within CP4BA. You can now connect any other clients to fulfill use cases that have a Kafka connectivity requirement.

Elasticsearch security settings on Cloud Pak for Business Automation

Now that we are done looking at the Kafka configuration, we can focus on the security settings of the Elasticsearch deployment installed as part of CP4BA. Cloud Pak for Business Automation leverages the TLS and basic authentication features in Elasticsearch to ensure that all communications to and from the Elasticsearch cluster are secured. Clients using the RESTful APIs for CRUD operations over HTTPS must authenticate with a set of credentials generated at install time, and the data sent over the network is encrypted using the self-signed certificate generated for the deployment of the Elasticsearch services. The next section of this article shows how to extract these settings so that clients such as Elasticvue can authenticate with the Elasticsearch services.

Connecting to Elasticsearch using the Elasticvue Plugin

In contrast to the Kafka deployment, there is no Elasticsearch client readily available in the distribution images used by CP4BA to deploy the Elasticsearch cluster. However, we can use open-source tools such as Elasticvue to inspect and manage the Elasticsearch cluster. Installing the Elasticvue browser plugin is outside the scope of this article, but you can easily install the plugin in one of the supported browsers: Chrome, Edge, or Firefox. This article was written using the Elasticvue plugin for Chrome. The following steps show you how to get the security information from the Elasticsearch installation and how to use that information to connect the Elasticvue plugin.

  1. Using the oc command, switch to the project where CP4BA has been installed.
    oc project <cp4ba-namespace>

    Replace <cp4ba-namespace> with the name of the project (namespace) where CP4BA is installed.

  2. Validate that Elasticsearch was deployed as part of your CP4BA installation. If Elasticsearch is not installed, you cannot proceed with the rest of the instructions in this section.
    # Get details of the Elasticsearch resource to validate that it is installed.
    oc get elasticsearch -o name

    If Elasticsearch is installed, the output of the command above should look similar to elasticsearch.elastic.automation.ibm.com/iaf-system

  3. Retrieve the external endpoint created for Elasticsearch API calls. Keep this information available as it will be used in a later step.
    ELASTICSEARCH_URL=$(oc get automationbase foundation-iaf -o jsonpath='{.status.components.elasticsearch.endpoints[?(@.scope=="External")].uri}')
    # display the endpoint
    echo $ELASTICSEARCH_URL

    Notice how the URL is available via a custom resource called AutomationBase. This custom resource is used by CP4BA to enable the deployment of foundational infrastructure components such as Elasticsearch.

  4. Retrieve the Elasticsearch username and password assigned during the installation. To retrieve the username and password, we will first get the name of the Kubernetes secret used to store these credentials. Keep this information available as it will be used in a later step.
    export ELASTICSEARCH_SECRET=$(oc get automationbase foundation-iaf -o jsonpath='{.status.components.elasticsearch.endpoints[?(@.scope=="External")].authentication.secret.secretName}')
    ELASTICSEARCH_USERNAME=$(oc extract secret/${ELASTICSEARCH_SECRET} --keys=username --to=- 2>/dev/null)
    ELASTICSEARCH_PASSWORD=$(oc extract secret/${ELASTICSEARCH_SECRET} --keys=password --to=- 2>/dev/null)
    
    # display the elasticsearch user credentials
    echo $ELASTICSEARCH_USERNAME
    echo $ELASTICSEARCH_PASSWORD
  5. Now that we have gathered the URL and credentials, we are almost ready to connect to the Elasticsearch services in CP4BA with the Elasticvue plugin. However, before we configure the Elasticvue plugin, we need to make sure that the browser accepts the self-signed certificate generated for Elasticsearch during the CP4BA installation. To accept the self-signed certificate in the browser, copy the URL obtained in the previous step, place it into the address bar, and press Enter to navigate to it. The browser shows a warning message where you have the opportunity to accept the self-signed certificate. In Chrome the warning message looks as shown in the picture below.

    Accept the certificate and dismiss the dialog box requesting user credentials that opens up right after. For more information on how to accept self-signed certificates in your browser of choice, see the Elasticvue documentation.

  6. To connect Elasticvue with the Elasticsearch services, open the Elasticvue user interface in the browser. In Chrome you can go to the Extensions menu and click on the Elasticvue option.
  7. Once in Elasticvue, enter the Elasticsearch username, the Elasticsearch password, and the external endpoint retrieved in the previous steps as shown in the picture below.

  8. Finally, click the Connect button to establish the connection. If the connection is successful, you should be able to see an Elasticvue Home screen similar to the one shown below.

Now that you have connected to the Elasticsearch cluster, you should be able to leverage all the capabilities available in Elasticvue.
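
If you prefer to verify the connection from the command line before (or instead of) configuring Elasticvue, a quick check with curl works as well. The following is a sketch that assumes the ELASTICSEARCH_URL, ELASTICSEARCH_USERNAME, and ELASTICSEARCH_PASSWORD variables from steps 3 and 4 are still set in your shell; the -k flag skips verification of the self-signed certificate.

# Query cluster health using basic authentication over HTTPS
curl -k -u "${ELASTICSEARCH_USERNAME}:${ELASTICSEARCH_PASSWORD}" "${ELASTICSEARCH_URL}/_cluster/health?pretty"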

Getting Additional Information for Elasticsearch Connection

In the previous section we learned how to retrieve most of the information required to establish a secure connection between Elasticvue and Elasticsearch. While we explicitly retrieved the Elasticsearch credentials and URL, we did not have to download the TLS certificate directly because the browser helped us work around that step when we accepted the security warning. In some instances, however, you might be required to provide the TLS certificate when configuring the Elasticsearch client. To extract the TLS certificate used by Elasticsearch, we use the openssl command as follows to pull the certificates directly from the endpoint in PEM format.
# Extract URL and remove https:// from it 
ES_URL_HOST=$(oc get automationbase foundation-iaf -o jsonpath='{.status.components.elasticsearch.endpoints[?(@.scope=="External")].uri}' | sed -e 's/^https:\/\///g')

# Download certificate
echo | openssl s_client -showcerts -servername ${ES_URL_HOST} -connect ${ES_URL_HOST}:443 2>/dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p'
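
As a usage example, the extracted certificate chain can be saved to a file and passed to a client that validates the server certificate. The following sketch saves the chain as es-ca.pem (a file name chosen here for illustration) and repeats a simple cluster health check with curl, this time validating the certificate instead of skipping verification; it assumes the ELASTICSEARCH_USERNAME and ELASTICSEARCH_PASSWORD variables retrieved in the previous section are still set in your shell.

# Save the certificate chain to a file
echo | openssl s_client -showcerts -servername ${ES_URL_HOST} -connect ${ES_URL_HOST}:443 2>/dev/null | sed -ne '/-BEGIN CERTIFICATE-/,/-END CERTIFICATE-/p' > es-ca.pem

# Call the Elasticsearch API, validating the server certificate against the extracted chain
curl --cacert es-ca.pem -u "${ELASTICSEARCH_USERNAME}:${ELASTICSEARCH_PASSWORD}" "https://${ES_URL_HOST}/_cluster/health?pretty"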
Now that you know how to retrieve all the security settings for Elasticsearch services within a CP4BA installation, you should be able to connect any other client or tool you might use.

Conclusion

IBM Cloud Pak for Business Automation allows you to easily connect external tools such as command line and visualization interfaces to Kafka and Elasticsearch services running as part of CP4BA. All parameters required to configure the tools, including endpoints and user credentials, are automatically generated during the installation and available for retrieval. In this article we learned how to collect these parameters from an existing installation of CP4BA and how to use them to configure tools such as the Kafka CLI and the Elasticvue plugin.
