Hi guys-
I'm trying to deploy NovaLink to an 8286-42A I have to use as a lab system. In theory it should be supported with NovaLink 2.2.0 + Ubuntu 20.04. I have a netboot server set up running Ubuntu 20.04 on x86_64 and I can sort of boot the P8, but I'm seeing a number of issues.
First, I had to rebuild core.elf because the one distributed on the NovaLink 2.2.0 media didn't include all the tftp/http support. I did this by installing the ppc grub components:
sudo dpkg --add-architecture ppc64el
sudo dpkg -i --ignore-depends=libc6:ppc64el,libparted2:ppc64el,grub-common:ppc64el /var/lib/tftpboot/ubuntu2004/pool/main/g/grub2/grub-ieee1275-bin_2.04-1ubuntu26.13_ppc64el.deb
(/var/lib/tftpboot/ubuntu2004 is the mountpoint for the Ubuntu 20.04 media)
Then I rebuilt it using:
grub-mkimage --output=/var/lib/tftpboot/core.elf --format=powerpc-ieee1275 boot configfile echo elf http ieee1275_fb linux loadenv ls net normal ofnet reboot regexp serial sleep tftp time true date -p /
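Before rebuilding, it may be worth confirming that every module named on the grub-mkimage command line actually exists as a .mod file for the target platform. The following is only a sketch: check_grub_modules is a hypothetical helper name, and /usr/lib/grub/powerpc-ieee1275 is assumed to be where the grub-ieee1275-bin package unpacked its modules (override GRUB_MOD_DIR if yours landed elsewhere).

```shell
#!/bin/sh
# Report any requested GRUB module that has no .mod file in the given directory.
# Usage: check_grub_modules DIR module...
# Returns 0 if all modules are present, 1 if at least one is missing.
check_grub_modules() {
    dir=$1
    shift
    missing=0
    for mod in "$@"; do
        if [ ! -f "$dir/$mod.mod" ]; then
            echo "missing: $mod"
            missing=1
        fi
    done
    return "$missing"
}

# Assumed module location for the powerpc-ieee1275 platform; adjust as needed.
GRUB_MOD_DIR=${GRUB_MOD_DIR:-/usr/lib/grub/powerpc-ieee1275}

# Only run the check when the directory exists on this machine.
if [ -d "$GRUB_MOD_DIR" ]; then
    check_grub_modules "$GRUB_MOD_DIR" boot configfile echo elf http ieee1275_fb \
        linux loadenv ls net normal ofnet reboot regexp serial sleep tftp time \
        true date && echo "all modules present"
fi
```

If anything is reported missing, the module list passed to grub-mkimage and the installed package contents are out of step.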
That worked great, but this is where the real problems begin:
I used the HMC to update the system to the latest firmware (SV860_245), and then reset the server to factory defaults. At this point I should be able to toggle the system console type to IPMI, but no matter what I try, ipmitool can't connect. This seems like a big red flag that something is wrong with the firmware on this system. I spent a day or two trying to get it to work, including compiling the latest ipmitool source and trying that, but no dice. So I gave up and decided to use a serial console.
I can boot the server using the serial console, set up network boot in SMS, and boot to the grub menu. When I select the option to boot the NovaLink installer, a very odd sequence of events happens:
- The boot loader loads the Linux kernel, and then the system immediately reboots, does another BOOTP, and drops back into grub. I get the following output on the console:
OF stdout device is: /vdevice/vty@30000000
Preparing to boot Linux version 4.18.0-500.el8.ppc64le (mockbuild@ppc64le-04.stream.rdu2.redhat.com) (gcc version 8.5.0 20210514 (Red Hat 8.5.0-20) (GCC)) #1 SMP Wed Jun 28 00:07:17 UTC 2023
Detected machine type: 0000000000000101
command line: BOOT_IMAGE=//novalink/ppc/ppc64/vmlinuz ro inst.text ip=dhcp nameserver= inst.repo=http://10.0.4.73/novalink BOOTIF=98:be:94:68:32:9e live-installer/net-image=http://10.0.4.73/novalink/images/install.img pkgsel/language-pack-patterns= pkgsel/install-language-support=false netcfg/disable_dhcp=true netcfg/choose_interface=auto netcfg/get_ipaddress=10.0.4.74 netcfg/get_netmask=255.255.255.0 netcfg/get_gateway=10.0.4.1 netcfg/get_nameservers=10.0.4.8 netcfg/get_hostname=p8a netcfg/get_domain= debian-installer/locale=en_US.UTF-8 debian-installer/country=US pvm-repo=http://10.0.4.73/novalink_dev/repo pvm-viosdir=http://10.0.4.73/novalink-vios/
Max number of cores passed to firmware: 256 (NR_CPUS = 2048)
Calling ibm,client-architecture-support...
- I try to run the installer again. This time it boots, but I get a kernel panic. The system reboots and displays the full IBM IBM IBM screen. I choose Option 1 to get back into SMS, choose network boot, get back into grub, and run the installer a third time.
- This time, it actually boots. But the installer crashes out with the following messages:
-- restarting PVM services
[ 555.431875] rc_ctrmc[6763]: Created symlink /etc/systemd/system/multi-user.target.wants/ctrmc.service → /usr/lib/systemd/system/ctrmc.service.
[ 555.431997] rc_ctrmc[6763]: Created symlink /etc/systemd/system/graphical.target.wants/ctrmc.service → /usr/lib/systemd/system/ctrmc.service.
[ 618.697582] pvmmenu[8765]: Failed to find any ManagedSystem entries.
[ 618.841977] pvmmenu[8776]: [PVME0101000A-0222] Unable to communicate with the access process. Error 2 .
[ 618.851087] pvmmenu[8777]: [PVME0101000A-0222] Unable to communicate with the access process. Error 2 .
[ 619.754381] pvmmenu[8778]: Failed to find any ManagedSystem entries.
Unable to set master mode.
[ 619.859989] pvmmenu[8760]: Unable to continue the installation.
/ #
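From the "/ #" shell the installer drops to, one thing that can be checked is which parameters actually reached the kernel, since the command line in the first boot attempt is long and has several empty values (nameserver=, netcfg/get_domain=). A minimal sketch, assuming tr and awk are available in the installer environment; cmdline_get is a hypothetical helper name, not part of NovaLink.

```shell
# Print the value of one key=value parameter from a kernel command line file.
# Usage: cmdline_get KEY [FILE]   (FILE defaults to /proc/cmdline)
# Exits 0 if the key was found (even with an empty value), 1 otherwise.
cmdline_get() {
    key=$1
    file=${2:-/proc/cmdline}
    # Put one parameter per line, then match lines that start with "KEY=".
    tr ' ' '\n' < "$file" | awk -v k="$key" '
        index($0, k "=") == 1 { print substr($0, length(k) + 2); found = 1 }
        END { exit (found ? 0 : 1) }'
}
```

For example, "cmdline_get pvm-repo" and "cmdline_get inst.repo" would print the two repository URLs the installer was handed, which can then be compared against what the netboot server really serves.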
I can run some commands from this environment. There is a service called pvm-core that runs a daemon called pvm_apd. The daemon crashes, and if you try to run it from the command line, it exits immediately. I found that it writes debug logs to /var/log/pvm/pvm_apd.dbg. In pvm_apd.dbg, we can see that it's trying to talk to /dev/ibmvmc, which does exist. /dev/ibmvmc is created by the ibmvmc kernel module, which is loaded and can be seen with lsmod. Here is what appears to be the issue, as logged in pvm_apd.dbg:
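The checks described above can be gathered into a couple of small functions. This is only a sketch: lpar_id and vmc_status are hypothetical helper names, the awk expression is the one pvm_apd itself logs in the pvm_apd.dbg excerpt, and reading /proc/modules is assumed to be equivalent to looking at lsmod output.

```shell
# Print the partition ID from an lparcfg-style file, using the same awk
# expression that appears in the pvm_apd.dbg excerpt.
# Usage: lpar_id [FILE]   (FILE defaults to /proc/ppc64/lparcfg)
lpar_id() {
    awk -F = '/partition_id/ { print $2 }' "${1:-/proc/ppc64/lparcfg}"
}

# Summarize the state of the VMC device node and the ibmvmc kernel module.
# Usage: vmc_status [DEVICE]   (DEVICE defaults to /dev/ibmvmc)
vmc_status() {
    dev=${1:-/dev/ibmvmc}
    if [ -c "$dev" ]; then
        echo "$dev: character device present"
    else
        echo "$dev: not a character device or missing"
    fi
    # /proc/modules lists one loaded module per line, name first.
    if grep -q '^ibmvmc ' /proc/modules 2>/dev/null; then
        echo "ibmvmc module: loaded"
    else
        echo "ibmvmc module: not listed in /proc/modules"
    fi
}
```

Running "lpar_id" and "vmc_status" from the installer shell gives a quick snapshot to include alongside the log when asking for help.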
06/11/24 20:48:01.535.579 UTC DEBUG pvm_apd[12550.140735765179904]: (ApMain.cpp:940) Access process 12550 called with --clean
06/11/24 20:48:01.535.705 UTC DEBUG pvm_apd[12550.140735765179904]: (ApMain.cpp:944) Access process 12550 ended normally.
06/11/24 20:48:01.535.778 UTC DEBUG pvm_apd[12550.140735765179904]: (common/thread/HmclSynchronizedQueuePool.cpp:161) HmclSynchronizedQueuePool status:
Number of available queues: 1
Number of outstanding queues: 0
Total checkouts: 0
Max checked out at one time: 0
Outstanding queues:
06/11/24 20:48:01.549.923 UTC DEBUG pvm_apd[12552.140736333442560]: (ApMain.cpp:958) Access process 12552 started.
06/11/24 20:48:01.550.086 UTC DEBUG pvm_apd[12552.140736333442560]: (common/cmdcaller/HmclCommandCaller.cpp:179) HmclCommandCaller.run: Command = "awk -F = '/partition_id/ { print $2 }' /proc/ppc64/lparcfg"
06/11/24 20:48:01.556.154 UTC DEBUG pvm_apd[12552.140736333442560]: (common/cmdcaller/HmclCommandCaller.cpp:466) Done waiting for process 12553
06/11/24 20:48:01.556.223 UTC DEBUG pvm_apd[12552.140736333442560]: (common/util/HmclAlphaRules.cpp:154) Management Partition lparID is 1
06/11/24 20:48:01.556.296 UTC DEBUG pvm_apd[12552.140736333442560]: (ApVmcWrapper.cpp:449) VMC exists: 0, VMC driver state: 0
06/11/24 20:48:01.556.324 UTC DEBUG pvm_apd[12552.140736333442560]: (ApVmcWrapper.cpp:328) VMC does not exist, try to create it.
06/11/24 20:48:01.584.245 UTC DEBUG pvm_apd[12552.140736333442560]: (ApVmcWrapper.cpp:488) VMC device created in slot: 30000002
06/11/24 20:48:01.584.302 UTC DEBUG pvm_apd[12552.140736333442560]: (ApVmcWrapper.cpp:537) OFDT file /proc/device-tree/vdevice/ibm,drc-indexes stat failed with errno 2
06/11/24 20:48:01.584.339 UTC DEBUG pvm_apd[12552.140736333442560]: (ApVmcWrapper.cpp:537) OFDT file /proc/device-tree/vdevice/ibm,drc-info stat failed with errno 2
06/11/24 20:48:01.584.365 UTC DEBUG pvm_apd[12552.140736333442560]: (ApVmcWrapper.cpp:503) No DRC name for VMC
06/11/24 20:48:01.584.658 UTC DEBUG pvm_apd[12552.140736303722736]: (common/exceptions/ApException.cpp:48) Exception:
ApException:
At: ApVmcWrapper.cpp:211
Message: Tried to read message when VMC not open
Category: Access Process(1)
Error Code: 0x3
Data: 0
06/11/24 20:48:01.584.777 UTC DEBUG pvm_apd[12552.140736303722736]: (ApMain.cpp:601) ApMain readMsg failed. Tried to read message when VMC not open
06/11/24 20:48:01.584.810 UTC DEBUG pvm_apd[12552.140736303722736]: (ApControl.cpp:3508) sendVMCDeviceNotOpenNotification - VMC Device was Closed, accessprocess will exit.
06/11/24 20:48:01.584.875 UTC DEBUG pvm_apd[12552.140736303722736]: (ApMain.cpp:685) ApMain readFromVmc exiting.
06/11/24 20:48:02.584.655 UTC DEBUG pvm_apd[12552.140736333442560]: (ApMain.cpp:1057) Access process received Vmc rc -6.
06/11/24 20:48:02.585.256 UTC DEBUG pvm_apd[12552.140736333442560]: (common/thread/HmclSynchronizedQueuePool.cpp:161) HmclSynchronizedQueuePool status:
Number of available queues: 2
Number of outstanding queues: 0
Total checkouts: 2
Max checked out at one time: 2
Outstanding queues:
I'm particularly interested in "No DRC name for VMC" and "Tried to read message when VMC not open", but there isn't enough documentation on the web to know where to go from here.
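To pull just the suspicious lines out of a longer pvm_apd.dbg and to re-check the two device-tree properties that the stat calls failed on, something like the following could be used. It is a rough sketch: apd_errors and drc_props are hypothetical helper names, and the grep pattern is a guess based on the excerpt above, not an official list of pvm_apd error strings.

```shell
# Show log lines that look like failures, prefixed with their line numbers.
# Usage: apd_errors [LOGFILE]   (defaults to /var/log/pvm/pvm_apd.dbg)
apd_errors() {
    grep -nE 'Exception|failed|not open|No DRC|rc -[0-9]+' \
        "${1:-/var/log/pvm/pvm_apd.dbg}"
}

# Report whether the two OFDT properties named in the log exist.
# Usage: drc_props [DIR]   (defaults to /proc/device-tree/vdevice)
drc_props() {
    base=${1:-/proc/device-tree/vdevice}
    for prop in ibm,drc-indexes ibm,drc-info; do
        if [ -e "$base/$prop" ]; then
            echo "$prop: present"
        else
            echo "$prop: missing"
        fi
    done
}
```

If drc_props reports both properties missing, that matches the errno 2 results in the log and suggests the "No DRC name for VMC" message follows from the device tree contents rather than from pvm_apd itself.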
Again, this is an 8286-42A running the latest firmware (SV860_245), and it was factory reset using the ASMI interface immediately prior to trying the above.
Has anyone seen this, or tried to get NovaLink 2.2 running on Power 8? For the record, I tried the same with the 2.0.3 installer and got more-or-less the same behavior.
Thanks in advance!
------------------------------
Ben Huntsman
------------------------------