Originally posted by: xhu
Background
In many organizations, system administrators need to integrate Linux systems into an existing Microsoft Windows Active Directory (AD) domain environment. Windows AD then serves as the centralized account controller, so AD users can conveniently submit Spark applications to IBM Spectrum Conductor with Spark clusters. This approach has the following advantages:
- User accounts are located in Windows AD, with group membership also defined in Windows AD.
- Windows AD is the central authentication system. AD user accounts can be enabled for Linux systems on the network, and users can log in to those Linux systems using the same AD username and password.
- Change management is done only in Windows AD.
Solution
Because Windows AD implements the Kerberos authentication mechanism, the existing GSS-Kerberos security plug-in in IBM Spectrum Conductor with Spark was enhanced for Windows AD integration in order to consolidate Windows and Linux systems. Users are expected to log on to the IBM Spectrum Conductor with Spark cluster with their AD username and password, using either the CLI or the cluster management console. Authentication works as follows:

- IBM Spectrum Conductor with Spark management hosts get a management TGT from the AD server using a keytab.
- The client gets a TGT using an AD username and password. Using the TGT, the client gets a service ticket (TGS) for the IBM Spectrum Conductor with Spark management hosts.
- The client sends the service ticket to the management hosts.
- The management hosts verify the client's service ticket.
- The management hosts send the result to the client.
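
The first two steps of this flow can be reproduced by hand with the MIT Kerberos client tools (kinit, kvno, klist), which is a convenient way to check a keytab and an AD account before involving the cluster. The sketch below only prints the commands by default. The realm and service principal are the example values used later in this post; the function names and the DRY_RUN variable are made up for illustration.

```shell
#!/bin/sh
# Sketch only: realm and principal are the example values from this post.
REALM="AD.CITI.COM"
SERVICE="vemkd4cws/cluster1"   # cluster-wide vemkd service principal

# Run a command, or just print it when DRY_RUN is unset or not 0.
run() {
    if [ "${DRY_RUN:-1}" = "0" ]; then "$@"; else echo "$*"; fi
}

# Step 1 (management host): obtain a TGT from the keytab, no password prompt.
mgmt_host_tgt() {   # mgmt_host_tgt KEYTAB_PATH
    run kinit -k -t "$1" "${SERVICE}@${REALM}"
}

# Step 2 (client): obtain a TGT with the AD password, then ask the KDC
# for a service ticket for the management hosts and list the cache.
client_tickets() {   # client_tickets AD_USER
    run kinit "$1@${REALM}"
    run kvno "${SERVICE}@${REALM}"
    run klist
}
```

Calling `client_tickets test1` prints the three commands; `DRY_RUN=0 client_tickets test1` executes them and prompts for the AD password.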

AD users can log on to Linux hosts with Kerberos authentication through pam_sss. The Linux commands "id" and "getent" retrieve AD user information through nss_sss. This means that any application running on a configured Linux host can look up AD users by calling the standard glibc APIs.
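
Because the lookup goes through the standard name service switch, a script needs no AD-specific tooling to see these users. A minimal sketch, assuming the host is already configured as described later in this post (the helper names are hypothetical):

```shell
#!/bin/sh
# Resolve a user name to a UID through NSS. On a configured host the
# lookup typically consults local files and then SSSD (nss_sss), so AD
# users resolve the same way as local ones.
uid_of() {
    id -u "$1" 2>/dev/null
}

# Succeeds only if the name resolves to an account.
user_known() {
    uid_of "$1" >/dev/null
}
```

For example, `user_known test1 && echo "test1 has UID $(uid_of test1)"` prints the UID only when the AD user is visible on the host.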
Support table
| case | submit app | logon by GUI | logon by CLI | run workload |
|---|---|---|---|---|
| Linux management host | Y | Y | Y | Y |
| Linux client host | Y | N | Y | N |
| Linux compute host | Y | N | Y | Y |
| Windows client host | Y | N | Y | N |

| OS | role |
|---|---|
| Windows2008-x86_64_R2 | AD/KDC/DNS |
| Red Hat Enterprise Linux (RHEL) 7.1, 7.2, 7.3, 7.4; Ubuntu 16.04 LTS | IBM Spectrum Conductor with Spark master/compute/client host |
Integrate Active Directory and Linux environments

- Log on to the Windows Server.
- Navigate to Start > Run.
- Type ‘dcpromo’ to install the AD binaries.
- Create a new domain and set the FQDN of the new forest root domain.
- Set the functional level and install DNS for the domain controller.
- Set the administrator password and finish the installation. Then, restart the server.
- Go to Server Manager, then install the optional service to the Active Directory Controller.
- Create a group named egogroup.
Note: You must set the Unix attributes with the corresponding domain and GID.
- Create a user named test1 with the password Letmein123.
Note: You must set the corresponding Unix attributes.
- Install sssd:
yum install -y sssd-ad sssd-libwbclient oddjob-mkhomedir
- Install Kerberos:
yum install -y krb5-workstation krb5-libs krb5-auth-dialog sssd-krb5 krb5-devel
- Install Samba:
yum install -y samba samba-winbind-clients samba-client samba-common samba4-libs samba-winbind
- Update the DNS configuration as follows:
[root@kvm-013732 ~]# cat /etc/resolv.conf
# Generated by NetworkManager
search ad.citi.com
nameserver 10.0.14.136
- Configure Samba as follows:
[root@kvm-013732 ~]# cat /etc/samba/smb.conf
[global]
workgroup = AD
realm = AD.CITI.COM
security = ADS
kerberos method = secrets and keytab
log file = /var/log/samba/log.%m
max log size = 50
idmap config * : backend = tdb
cups options = raw
- Configure Kerberos as follows:
[root@kvm-013732 ~]# cat /etc/krb5.conf
[logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
[libdefaults]
default_realm = AD.CITI.COM
dns_lookup_realm = false
dns_lookup_kdc = false
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
[realms]
AD.CITI.COM = {
kdc = img-windows2008.ad.citi.com
admin_server = img-windows2008.ad.citi.com
default_domain = AD.CITI.COM
}
[domain_realm]
.ad.citi.com = AD.CITI.COM
ad.citi.com = AD.CITI.COM
- Configure sssd as follows:
[root@kvm-013732 ~]# cat /etc/sssd/sssd.conf
[sssd]
config_file_version = 2
debug_level = 9
domains = ad.citi.com
services = nss, pam, pac, ssh, autofs
[domain/ad.citi.com]
debug_level = 9
id_provider = ad
auth_provider = ad
access_provider = ad
cache_credentials = false
ldap_id_mapping = false
ad_server = img-windows2008.ad.citi.com
override_homedir = /citi/home/%d/%u
default_shell = /bin/bash
enumerate = true
ldap_search_base = CN=Users,DC=ad,DC=citi,DC=com
entry_cache_timeout = 10
ad_hostname = img-windows2008.ad.citi.com
ad_domain = ad.citi.com
Note: Ensure that /etc/sssd/sssd.conf has 600 permissions:
chmod 600 /etc/sssd/sssd.conf
- Join the host to AD.
[root@kvm-013732 ~]# kinit administrator
Password for administrator@AD.CITI.COM:
[root@kvm-013732 ~]# net ads join -k
Using short domain name -- AD
Joined 'KVM-013732' to dns domain 'ad.citi.com'
No DNS domain configured for kvm-013732. Unable to perform DNS Update.
DNS update failed: NT_STATUS_INVALID_PARAMETER
[root@kvm-013732 ~]# authconfig --enablesssdauth --enablesssd --enablemkhomedir --update
- Check whether the join to AD was successful.
[root@kvm-013732 ~]# net ads info
LDAP server: 10.0.14.136
LDAP server name: img-windows2008.ad.citi.com
Realm: AD.CITI.COM
Bind Path: dc=AD,dc=CITI,dc=COM
LDAP port: 389
Server time: Thu, 30 Nov 2017 11:43:45 CST
KDC server: 10.0.14.136
Server time offset: 2
Last machine account password change: Thu, 30 Nov 2017 11:42:10 CST
- Restart the related services.
[root@kvm-013648 ~]# service smb restart && service winbind restart && service sssd restart
Redirecting to /bin/systemctl restart winbind.service
Redirecting to /bin/systemctl restart smb.service
Redirecting to /bin/systemctl restart sssd.service
- Check whether the AD user is synchronized to Linux.
[root@kvm-013732 ~]# getent passwd |grep test1
test1:*:10007:10000:test1:/citi/home/ad.citi.com/test1:/bin/sh
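
The fields of that entry can be checked against the configuration above: the home directory should follow override_homedir = /citi/home/%d/%u, and the Kerberos realm in this setup is the upper-cased DNS domain. A small sketch of such a check, with hypothetical helper names and the sample entry from this post hard-coded:

```shell
#!/bin/sh
# Print field N (1-based) of a passwd-format line.
passwd_field() {   # passwd_field LINE N
    printf '%s\n' "$1" | cut -d: -f"$2"
}

# Expand the override_homedir template /citi/home/%d/%u from sssd.conf.
expected_home() {   # expected_home DOMAIN USER
    printf '/citi/home/%s/%s\n' "$1" "$2"
}

# The Kerberos realm used in this post is the upper-cased DNS domain.
realm_of() {   # realm_of DOMAIN
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
}

# Sample entry from this post; on a live host use: entry=$(getent passwd test1)
entry='test1:*:10007:10000:test1:/citi/home/ad.citi.com/test1:/bin/sh'

if [ "$(passwd_field "$entry" 6)" = "$(expected_home ad.citi.com test1)" ]; then
    echo "home directory matches override_homedir"
fi
```

Field 3 is the UID and field 4 the GID, so the same helper can be used to compare them with the Unix attributes set in AD.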
Note: 1) The following steps use the IBM Spectrum Conductor with Spark master host kvm-013732 as an example.
2) EGO_TOP is assumed to be /opt/alzhi/cws-daily2.
3) The steps use the vemkd cluster-wide principal as an example, but a vemkd host-level principal is also supported.
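
The two principal styles differ only in the instance part after the slash. As a small illustration (the helper name is hypothetical, and reading cluster1 as the cluster name is an assumption based on the values used in this post):

```shell
#!/bin/sh
REALM="AD.CITI.COM"
SERVICE_NAME="vemkd4cws"

# vemkd_principal INSTANCE
#   INSTANCE is assumed to be the cluster name for a cluster-wide
#   principal, or the management host name for a host-level principal.
vemkd_principal() {
    printf '%s/%s@%s\n' "$SERVICE_NAME" "$1" "$REALM"
}

vemkd_principal cluster1               # cluster-wide form
vemkd_principal kvm-013732.novalocal   # host-level form
```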
- Update ego.conf as follows:
EGO_SEC_KRB_SERVICENAME=vemkd4cws/cluster1
EGO_SEC_PLUGIN=sec_ego_gsskrb
EGO_SEC_CONF="/opt/alzhi/cws-daily2/kernel/conf,0,DEBUG,/opt/alzhi/cws-daily2/kernel/log"
- Create the sec_ego_gsskrb.conf file in the plug-in configuration directory (specified in the EGO_SEC_CONF parameter) and define the following parameters in the file:
[root@kvm-013732 conf]# cat sec_ego_gsskrb.conf
REALM=AD.CITI.COM
KERBEROS_ADMIN=egoadmin
KRB5_KTNAME=$EGO_TOP/kernel/conf/vemkd4cws.keytab
KINITDIR=/usr/bin
ENABLE_AD_USERS_MANAGE=Y
- In Windows AD, create the user account "egoadmin" and the VEMKD service account "vemkd4cws".
Note: This example uses a cluster-wide principal (vemkd4cws/cluster1). For a host-level principal, use vemkd4cws/kvm-013732.novalocal instead.
- Use "Ktpass" on the Windows KDC to set up identity mapping for the service account and generate the keytab file. Then, distribute the vemkd4cws.keytab file to all IBM Spectrum Conductor with Spark management hosts under $EGO_TOP/kernel/conf/.
Ktpass -princ vemkd4cws/cluster1@AD.CITI.COM -mapuser vemkd4cws -pass Letmein123 -out vemkd4cws.keytab
- Restart EGO.
- From a Linux management/compute host, log on as the "egoadmin" user account.
[root@kvm-013732 ~]# source ${EGO_TOP}/profile.platform
[root@kvm-013732 ~]# egosh user logon -u egoadmin -x Letmein123
Logged on successfully
- List the Windows AD users from the EGO database.
[root@kvm-013732 ~]# egosh user list |grep test
test1 test1
test2 test2
test3 test3
- Assign the cluster administrator role to the "test1" user account.
[root@kvm-013732 ~]# egosh user assignrole -u test1 -r "Cluster Admin"
Role <Cluster Admin> is assigned to user <test1>.
Note:
1) If a user exists in both EGO and AD, EGO treats them as the same user.
2) If you switch the EGO_SEC_PLUGIN parameter from sec_ego_gsskrb back to the default, the AD users already loaded into the EGO database disappear.
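
A quick way to double-check the plug-in configuration from this section is to confirm that sec_ego_gsskrb.conf defines every parameter shown in the example. The following sketch reads the file as plain KEY=VALUE lines. The helper names are hypothetical, and treating all five parameters as mandatory is an assumption made for the example, not a statement about what the product requires.

```shell
#!/bin/sh
# Print the value of KEY from a KEY=VALUE file (first match wins).
conf_get() {   # conf_get FILE KEY
    sed -n "s/^$2=//p" "$1" | head -n 1
}

# Print each expected parameter that is missing or empty; print nothing
# if all are present. The list mirrors the example file in this post.
missing_params() {   # missing_params FILE
    for key in REALM KERBEROS_ADMIN KRB5_KTNAME KINITDIR ENABLE_AD_USERS_MANAGE; do
        [ -n "$(conf_get "$1" "$key")" ] || echo "$key"
    done
}
```

For example, `missing_params "$EGO_TOP/kernel/conf/sec_ego_gsskrb.conf"` prints nothing for the file shown in this post.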
Submit Spark applications to the cluster as an AD user
You can submit Spark batch applications as an AD user with Kerberos authentication, from the cluster management console or the CLI, using any of the following five methods:
- By the cluster management console: Log on to the cluster management console using the AD user's principal and password, then submit the application to the Spark instance group.

- By the AD user's principal and password, using spark-submit.
[root@kvm-013732 bin]# ./spark-submit --master spark://kvm-013732.ad.citi.com:7078 --conf spark.ego.uname=test1 --conf spark.ego.passwd=Letmein123 --deploy-mode client --class org.apache.spark.examples.SparkPi ../examples/jars/spark-examples_2.11-2.2.0.jar
[root@kvm-013732 bin]# ./spark-submit --master spark://kvm-013732.ad.citi.com:7078 --conf spark.ego.uname=test1 --conf spark.ego.passwd=Letmein123 --deploy-mode cluster --class org.apache.spark.examples.SparkPi ../examples/jars/spark-examples_2.11-2.2.0.jar
- By the AD user's principal and keytab file, using spark-submit.
1) Create a keytab file for the user test1, named test1.keytab, from the KDC and copy it to $EGO_TOP/kernel/conf/ on the Linux host.
2) Submit a Spark application to the Spark instance group using spark-submit.
[root@kvm-013732 bin]# ./spark-submit --master spark://kvm-013732.ad.citi.com:7078 --conf spark.ego.uname=test1 --conf spark.ego.keytab=$EGO_TOP/kernel/conf/test1.keytab --deploy-mode client --class org.apache.spark.examples.SparkPi ../examples/jars/spark-examples_2.11-2.2.0.jar
- By the AD user principal's TGT, using spark-submit.
1) Get the TGT for the user test1 principal.
[root@kvm-013732 bin]# kinit test1@AD.CITI.COM
Password for test1@AD.CITI.COM:
2) Submit a Spark application to the Spark instance group.
[root@kvm-013732 bin]# ./spark-submit --master spark://kvm-013732.ad.citi.com:7078 --conf spark.ego.uname=test1 --deploy-mode client --class org.apache.spark.examples.SparkPi ../examples/jars/spark-examples_2.11-2.2.0.jar
- By SSH to a Linux host as the AD user.
1) Log on to the Linux host as the user test1.
[root@kvm-013650 ~]# ssh test1@kvm-013650
Last login: Mon Dec 4 10:30:45 2017 from kvm-013650.ad.citi.com
-sh-4.2$
2) Submit a Spark application to the Spark instance group.
[root@kvm-013732 bin]# ./spark-submit --master spark://kvm-013732.ad.citi.com:7078 --conf spark.ego.uname=test1 --deploy-mode client --class org.apache.spark.examples.SparkPi ../examples/jars/spark-examples_2.11-2.2.0.jar
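
The spark-submit variants above differ only in which spark.ego.* settings carry the credentials. A small wrapper can make that explicit. This is a sketch: the function name and the mode labels are invented for illustration, the master URL, class, and jar are the example values from this post, and the function prints the command rather than running it.

```shell
#!/bin/sh
MASTER="spark://kvm-013732.ad.citi.com:7078"
APP_CLASS="org.apache.spark.examples.SparkPi"
APP_JAR="../examples/jars/spark-examples_2.11-2.2.0.jar"

# spark_submit_cmd MODE USER [SECRET]
#   password: SECRET is the AD password
#   keytab:   SECRET is the path to the user's keytab file
#   tgt:      no SECRET; only the user name is passed, as in the
#             kinit and ssh methods
spark_submit_cmd() {
    case "$1" in
        password) auth="--conf spark.ego.uname=$2 --conf spark.ego.passwd=$3" ;;
        keytab)   auth="--conf spark.ego.uname=$2 --conf spark.ego.keytab=$3" ;;
        tgt)      auth="--conf spark.ego.uname=$2" ;;
        *)        echo "unknown mode: $1" >&2; return 1 ;;
    esac
    echo "./spark-submit --master $MASTER $auth --deploy-mode client --class $APP_CLASS $APP_JAR"
}
```

For example, `spark_submit_cmd keytab test1 "$EGO_TOP/kernel/conf/test1.keytab"` prints the keytab variant of the command.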