IBM Spectrum Conductor with Spark integration with Windows Active Directory, based on Kerberos for user authentication

By Archive User posted Mon December 18, 2017 03:54 AM

  

Originally posted by: xhu


Background 

In many organizations, system administrators need to integrate Linux systems into their existing Microsoft Windows Active Directory (AD) domain environments. Windows AD then serves as a centralized account controller, so that AD users can conveniently submit Spark applications to IBM Spectrum Conductor with Spark clusters. This brings the following advantages:

  • User accounts are located in Windows AD, with group membership also defined in Windows AD

  • Windows AD is the central authentication system. AD user accounts can be enabled for Linux systems on the network and users can log in to those Linux systems using the same AD username and password

  • Change management is done only in Windows AD

Solution

Because Windows AD implements the Kerberos authentication mechanism, the existing GSS-Kerberos security plug-in in IBM Spectrum Conductor with Spark has been enhanced for Windows AD integration, to consolidate Windows and Linux systems. AD users are expected to log on to the IBM Spectrum Conductor with Spark cluster with their AD username and password, by using either the CLI or the cluster management console.

 

  • How a Linux client uses the IBM Spectrum Conductor with Spark GSS-Kerberos plug-in to log on to the IBM Spectrum Conductor with Spark cluster:

image

  1. The IBM Spectrum Conductor with Spark management hosts get a management TGT (ticket-granting ticket) from the AD server by using a keytab.

  2. The client gets a TGT by using an AD username and password. With the TGT, the client then gets a service ticket (referred to here as the TGS) for the IBM Spectrum Conductor with Spark management hosts.

  3. The client sends the TGS to the management hosts.

  4. The management hosts verify the client TGS.

  5. The management hosts send the result to the client.
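The client side of the flow above (step 2) can be reproduced by hand with the MIT Kerberos tools from the krb5-workstation package. The following is only a sketch of the equivalent manual steps, not what the GSS-Kerberos plug-in runs internally. The function name krb_client_flow and the RUN variable are hypothetical helpers introduced for illustration; the principal names in the comments are the ones used in the configuration steps of this article.

```shell
# Manual walk-through of the client side of the logon flow (step 2).
# Set RUN=echo to print the commands instead of running them.
krb_client_flow() {
    local user_principal=$1      # for example test1@AD.CITI.COM
    local service_principal=$2   # for example vemkd4cws/cluster1@AD.CITI.COM

    # Get a TGT for the AD user (kinit prompts for the AD password).
    ${RUN:-} kinit "$user_principal" || return 1
    # Use the TGT to request a service ticket for the management-host principal.
    ${RUN:-} kvno "$service_principal" || return 1
    # Show the ticket cache, which should now list both tickets.
    ${RUN:-} klist
}
```

Calling the function with RUN=echo gives a dry run that only lists the three commands, which is a convenient way to review them before running against a real KDC.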

 

 

 

  • How to get users from the AD server:

image

AD users can log on to Linux hosts with Kerberos authentication through pam_sss. The Linux commands "id" and "getent" retrieve AD user information through nss_sss. This means that any application running on a configured Linux host can look up all AD users by calling the glibc APIs.
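As a small illustration of what such a lookup returns, the sketch below splits one passwd-format entry, as printed by "getent passwd", into named fields. The helper name passwd_field is hypothetical; the field positions follow the standard passwd(5) layout.

```shell
# Extract one field from a passwd-format entry (name:passwd:uid:gid:gecos:home:shell).
passwd_field() {
    local line=$1 idx
    case $2 in
        name)  idx=1 ;;
        uid)   idx=3 ;;
        gid)   idx=4 ;;
        home)  idx=6 ;;
        shell) idx=7 ;;
        *) echo "unknown field: $2" >&2; return 1 ;;
    esac
    printf '%s\n' "$line" | cut -d: -f"$idx"
}
```

On a host that is already joined to the domain, a call such as passwd_field "$(getent passwd test1)" home would print the home directory that SSSD resolves for the AD user test1.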

 

Support table

  • Supported cases

Case                     Submit app   Log on by GUI   Log on by CLI   Run workload
Linux management host    Y            Y               Y               Y
Linux client host        Y            N               Y               N
Linux compute host       Y            N               Y               Y
Windows client host      Y            N               Y               N

  • Supported operating systems

OS                                                    Role
Windows2008-x86_64_R2                                 AD/KDC/DNS
Red Hat Enterprise Linux (RHEL) 7.1, 7.2, 7.3, 7.4    IBM Spectrum Conductor with Spark master/compute/client host
Ubuntu 16.04 LTS                                      IBM Spectrum Conductor with Spark master/compute/client host

 

Integrate Active Directory and Linux environments

 

  • IBM Spectrum Conductor with Spark must be installed on the hosts in the cluster, and the Linux hosts in the cluster must be joined to the Windows domain. The following image shows an example topology:

image

  • Install and configure the AD server (key steps)

  1. Log on to the Windows Server.

  2. Navigate to Start > Run.

  3. Type 'dcpromo' to install the AD binaries.

image

  4. Create a new domain and set the FQDN of the new forest root domain.

image
image

  5. Set the functional level and install DNS for the domain controller.

image
image

  6. Set the administrator password and finish the installation. Then, restart the server.

image

image

  7. Go to Server Manager, then install the optional service on the Active Directory controller.

image

  8. Create a group named egogroup.

image

Note: You must set the UNIX attributes of the group, with the corresponding domain and GID.

  9. Create a user named test1 with the password Letmein123.

image

Note: You must set the UNIX attributes of the user, with the corresponding values.

  • Install and configure all Linux servers (key steps). The IBM Spectrum Conductor with Spark master host kvm-013732 is used as an example.

  1. Install sssd:

yum install -y sssd-ad sssd-libwbclient oddjob-mkhomedir

  2. Install Kerberos:

yum install -y krb5-workstation krb5-libs krb5-auth-dialog sssd-krb5 krb5-devel

  3. Install samba:

yum install -y samba samba-winbind-clients samba-client samba-common samba4-libs samba-winbind

  4. Update the DNS settings as follows:

[root@kvm-013732 ~]# cat /etc/resolv.conf 
# Generated by NetworkManager
search ad.citi.com
nameserver 10.0.14.136

  5. Configure Samba as follows:

[root@kvm-013732 ~]# cat /etc/samba/smb.conf
[global]
        workgroup = AD
        realm = AD.CITI.COM
        security = ADS
        kerberos method = secrets and keytab
        log file = /var/log/samba/log.%m
        max log size = 50
        idmap config * : backend = tdb
        cups options = raw

  6. Configure Kerberos as follows:

[root@kvm-013732 ~]# cat /etc/krb5.conf
[logging]
 default = FILE:/var/log/krb5libs.log
 kdc = FILE:/var/log/krb5kdc.log
 admin_server = FILE:/var/log/kadmind.log

[libdefaults]
 default_realm = AD.CITI.COM
 dns_lookup_realm = false
 dns_lookup_kdc = false
 ticket_lifetime = 24h
 renew_lifetime = 7d
 forwardable = true

[realms]
 AD.CITI.COM = {
  kdc = img-windows2008.ad.citi.com
  admin_server = img-windows2008.ad.citi.com
  default_domain = AD.CITI.COM
 }

[domain_realm]
 .ad.citi.com = AD.CITI.COM
 ad.citi.com = AD.CITI.COM
[root@kvm-013732 ~]#

  7. Configure sssd as follows:

[root@kvm-013732 ~]# cat /etc/sssd/sssd.conf 
[sssd]
config_file_version = 2
debug_level = 9
domains =  ad.citi.com
services = nss, pam, pac, ssh ,autofs

[domain/ad.citi.com]
debug_level = 9
id_provider = ad
auth_provider = ad
access_provider = ad
cache_credentials = false
ldap_id_mapping = false
ad_server = img-windows2008.ad.citi.com
override_homedir = /citi/home/%d/%u
default_shell = /bin/bash
enumerate = true
ldap_search_base = CN=Users,DC=ad,DC=citi,DC=com
entry_cache_timeout = 10
ad_hostname = img-windows2008.ad.citi.com
ad_domain = ad.citi.com

 

Note: Ensure that /etc/sssd/sssd.conf has permission 600: chmod 600 /etc/sssd/sssd.conf

  8. Join the host to the AD domain.

[root@kvm-013732 ~]# kinit administrator 
Password for administrator@AD.CITI.COM: 
[root@kvm-013732 ~]# net ads join -k 
Using short domain name -- AD
Joined 'KVM-013732' to dns domain 'ad.citi.com'
No DNS domain configured for kvm-013732. Unable to perform DNS Update.
DNS update failed: NT_STATUS_INVALID_PARAMETER
[root@kvm-013732 ~]# authconfig --enablesssdauth --enablesssd --enablemkhomedir --update 

  9. Check whether the host joined the AD domain successfully.

[root@kvm-013732 ~]# net ads info 
LDAP server: 10.0.14.136
LDAP server name: img-windows2008.ad.citi.com
Realm: AD.CITI.COM
Bind Path: dc=AD,dc=CITI,dc=COM
LDAP port: 389
Server time: Thu, 30 Nov 2017 11:43:45 CST
KDC server: 10.0.14.136
Server time offset: 2
Last machine account password change: Thu, 30 Nov 2017 11:42:10 CST

  10. Restart the related services.

[root@kvm-013648 ~]# service smb restart && service winbind restart && service sssd restart
Redirecting to /bin/systemctl restart winbind.service
Redirecting to /bin/systemctl restart smb.service
Redirecting to /bin/systemctl restart sssd.service

  11. Check whether the AD user is synchronized to Linux.

[root@kvm-013732 ~]# getent passwd  |grep test1
test1:*:10007:10000:test1:/citi/home/ad.citi.com/test1:/bin/sh
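Two points from the steps above lend themselves to a quick scripted check: the 600 permission on sssd.conf, and the "Server time offset" line printed by "net ads info". Kerberos normally rejects requests when client and KDC clocks differ by more than the configured tolerance, which defaults to five minutes. The sketch below is an illustration only: the function names are hypothetical, it assumes GNU stat, and it assumes the offset is reported in seconds.

```shell
# Check that a configuration file has mode 600 (default: /etc/sssd/sssd.conf).
check_conf_mode() {
    local file=${1:-/etc/sssd/sssd.conf} mode
    mode=$(stat -c '%a' "$file") || return 2   # GNU stat prints the octal mode
    if [ "$mode" = "600" ]; then
        echo "OK: $file has mode 600"
    else
        echo "WARN: $file has mode $mode, expected 600"
        return 1
    fi
}

# Read "net ads info" output on stdin and compare the reported
# "Server time offset" with a maximum skew (default: 300 seconds).
check_time_offset() {
    local max=${1:-300} offset
    offset=$(awk -F': *' '/^Server time offset:/ { print $2 }')
    if [ -z "$offset" ]; then
        echo "ERROR: no 'Server time offset' line found"
        return 2
    fi
    offset=${offset#-}   # drop a leading minus sign to compare the absolute value
    if [ "$offset" -le "$max" ]; then
        echo "OK: clock offset ${offset}s is within ${max}s"
    else
        echo "WARN: clock offset ${offset}s exceeds ${max}s"
        return 1
    fi
}
```

On a joined host, the two checks could be run as "check_conf_mode" and "net ads info | check_time_offset".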

  • Enable Kerberos authentication on all Linux servers

Note
1). The IBM Spectrum Conductor with Spark master host kvm-013732 is used as an example.
2). EGO_TOP is assumed to be /opt/alzhi/cws-daily2.
3). The vemkd cluster-wide principal is used as an example; a vemkd host-level principal is also supported.

  1. Update ego.conf as follows:

EGO_SEC_KRB_SERVICENAME=vemkd4cws/cluster1
EGO_SEC_PLUGIN=sec_ego_gsskrb
EGO_SEC_CONF="/opt/alzhi/cws-daily2/kernel/conf,0,DEBUG,/opt/alzhi/cws-daily2/kernel/log"

  2. Create the sec_ego_gsskrb.conf file in the plug-in configuration directory (specified in the EGO_SEC_CONF parameter) and define the following parameters in the file:

[root@kvm-013732 conf]# cat sec_ego_gsskrb.conf 
REALM=AD.CITI.COM
KERBEROS_ADMIN=egoadmin
KRB5_KTNAME=$EGO_TOP/kernel/conf/vemkd4cws.keytab 
KINITDIR=/usr/bin
ENABLE_AD_USERS_MANAGE=Y

  3. In Windows AD, create the user account "egoadmin" and the VEMKD service account "vemkd4cws".

Note: The example below uses a cluster-wide principal. For a host-level principal, use vemkd4cws/kvm-013732.novalocal instead.

image

image

  4. Use "Ktpass" on the Windows KDC to set up identity mapping for the service account and generate the keytab file. Then, distribute the vemkd4cws.keytab file to $EGO_TOP/kernel/conf/ on all IBM Spectrum Conductor with Spark management hosts.

Ktpass -princ vemkd4cws/cluster1@AD.CITI.COM -mapuser vemkd4cws -pass Letmein123 -out vemkd4cws.keytab

  5. Restart EGO.

  6. From a Linux management or compute host, log on with the "egoadmin" user account.

[root@kvm-013732 ~]# source ${EGO_TOP}/profile.platform
[root@kvm-013732 ~]# egosh user logon -u egoadmin -x Letmein123
Logged on successfully

  7. List all Windows AD users from the EGO database.
    [root@kvm-013732 ~]# egosh user list |grep test
    test1                                        test1
    test2                                        test2                
    test3                                        test3

  8. Assign the cluster administrator role to the "test1" user account.
    [root@kvm-013732 ~]# egosh user assignrole -u test1 -r "Cluster Admin"
    Role <Cluster Admin> is assigned to user <test1>.

Note
1). If a user exists in both EGO and AD, EGO treats them as the same user.
2). After you switch the EGO_SEC_PLUGIN parameter from sec_ego_gsskrb back to the default, AD users that were already loaded into the EGO database disappear.
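Before restarting EGO, it can help to confirm that the plug-in configuration file defines every parameter shown in the sec_ego_gsskrb.conf example above. The sketch below is a hypothetical helper (missing_conf_keys is not part of the product); whether each parameter is strictly mandatory is an assumption, since this article only shows them being set together.

```shell
# Print every KEY that has no "KEY=" line in the given file.
# Returns 0 if all keys are present, 1 otherwise.
missing_conf_keys() {
    local file=$1 key rc=0
    shift
    for key in "$@"; do
        if ! grep -q "^${key}=" "$file"; then
            echo "$key"
            rc=1
        fi
    done
    return $rc
}
```

For the example in this article, the call would be: missing_conf_keys $EGO_TOP/kernel/conf/sec_ego_gsskrb.conf REALM KERBEROS_ADMIN KRB5_KTNAME KINITDIR ENABLE_AD_USERS_MANAGE. No output means all five parameters are defined.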

 

Submit Spark applications to the cluster as an AD user

You can submit Spark batch applications as an AD user with Kerberos authentication, from the cluster management console or the CLI, by using the following 5 methods:

  1. By the cluster management console: Log on to the cluster management console by using the AD user's principal and password, then submit the application to the Spark instance group.

image

  2. By the AD user's principal and password using spark-submit.

  • client mode

[root@kvm-013732 bin]# ./spark-submit --master spark://kvm-013732.ad.citi.com:7078 --conf spark.ego.uname=test1 --conf spark.ego.passwd=Letmein123 --deploy-mode client  --class org.apache.spark.examples.SparkPi ../examples/jars/spark-examples_2.11-2.2.0.jar

  • cluster mode

[root@kvm-013732 bin]# ./spark-submit --master spark://kvm-013732.ad.citi.com:7078 --conf spark.ego.uname=test1 --conf spark.ego.passwd=Letmein123 --deploy-mode cluster --class org.apache.spark.examples.SparkPi ../examples/jars/spark-examples_2.11-2.2.0.jar

 

  3. By the AD user's principal and keytab file using spark-submit.

1). Create a keytab file named test1.keytab for the user test1 from the KDC, and copy it to $EGO_TOP/kernel/conf/ on the Linux host.

image

2). Submit a Spark application to the Spark instance group using spark-submit.

[root@kvm-013732 bin]# ./spark-submit --master spark://kvm-013732.ad.citi.com:7078 --conf spark.ego.uname=test1 --conf spark.ego.keytab=$EGO_TOP/kernel/conf/test1.keytab --deploy-mode client --class org.apache.spark.examples.SparkPi ../examples/jars/spark-examples_2.11-2.2.0.jar

 

  4. By the AD user principal's TGT using spark-submit.

1). Get the TGT for the user test1 principal.
[root@kvm-013732 bin]# kinit test1@AD.CITI.COM
Password for test1@AD.CITI.COM: 

2). Submit a Spark application to the Spark instance group.
[root@kvm-013732 bin]# ./spark-submit --master spark://kvm-013732.ad.citi.com:7078 --conf spark.ego.uname=test1 --deploy-mode client --class org.apache.spark.examples.SparkPi ../examples/jars/spark-examples_2.11-2.2.0.jar

 

  5. By ssh to Linux as an AD user.

1). Log on to Linux as the user test1.
[root@kvm-013650 ~]# ssh test1@kvm-013650
Last login: Mon Dec  4 10:30:45 2017 from kvm-013650.ad.citi.com
-sh-4.2$

2). Submit a Spark application to the Spark instance group.
[root@kvm-013732 bin]# ./spark-submit --master spark://kvm-013732.ad.citi.com:7078 --conf spark.ego.uname=test1 --deploy-mode client --class org.apache.spark.examples.SparkPi ../examples/jars/spark-examples_2.11-2.2.0.jar
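The spark-submit methods above differ only in the authentication-related --conf options. The sketch below summarizes that difference in one place. The function spark_ego_auth_args is a hypothetical helper introduced for illustration; the option names spark.ego.uname, spark.ego.passwd and spark.ego.keytab are the ones used in the commands above.

```shell
# Print the authentication options for spark-submit, one argument per line.
#   password: AD username and password        (method 2)
#   keytab:   AD username and keytab file     (method 3)
#   tgt:      AD username only; relies on an existing TGT or an
#             AD user login session            (methods 4 and 5)
spark_ego_auth_args() {
    local method=$1 user=$2 secret=${3:-}
    case $method in
        password) printf '%s\n' --conf "spark.ego.uname=$user" --conf "spark.ego.passwd=$secret" ;;
        keytab)   printf '%s\n' --conf "spark.ego.uname=$user" --conf "spark.ego.keytab=$secret" ;;
        tgt)      printf '%s\n' --conf "spark.ego.uname=$user" ;;
        *) echo "unknown method: $method" >&2; return 1 ;;
    esac
}
```

In bash, the output can be collected into an array with mapfile -t auth < <(spark_ego_auth_args keytab test1 "$EGO_TOP/kernel/conf/test1.keytab") and then passed to spark-submit as "${auth[@]}" alongside the --master, --deploy-mode and --class options shown above.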

 

 

