I will try to make this short and quick. Sorry, I have not used this site in many years, as I was moved off AIX duties. But I have been called back for this issue and I need some insights.

We have an AIX 5300-12 server that runs a lot of scripts in root's crontab. Some of these scripts interact with file lists on this AIX box as well as on its twin at our EMEA site. All of a sudden, as of last week (12/11/2021), the jobs started hanging. If you look at "ps -ef", you see various "rsh EMEA-Server <some command to run on the EMEA server>" processes, where the command is something like "mv some file" or "touch a file". You also see the script (the parent process) that was called from cron. They are just sitting there.

It turns out that the hung rsh process is the first one in each of the scripts. BUT if you go to the EMEA server, the action has been received and accomplished (be it moving a file, touching a file, or whatever). It is done and no longer shows in "ps -ef" on the EMEA AIX server.

If you come back to the USA AIX system and do a "kill -1" against the PID of the hung rsh command, the parent program moves on, but it hangs again if there is another "rsh" to EMEA.

But get this... If I run the same script that is scheduled in root's crontab at the command line, it runs through, including all the "rsh" calls to EMEA, and completes.

So, what changed? We had some WAN/LAN changes on 12/11/2021 that beefed up the encryption between the USA and EMEA sites. But our Linux and Windows servers do not have issues. And, again, these AIX scripts (ksh and Perl scripts) run fine at the command line of the USA AIX server.

Tried other AIX servers on both sides: same hang.
Tried the same command in crontab between two AIX 5.3 servers, both in the USA: it WORKS.

It happens only if fired from cron and only if it goes from a server in one site to a server in the other site.

Any insights would be appreciated.

(Oh, yes, I know SSH would be much better and safer than RSH. But I have to work, for now, with what I have.)

Kind regards,
Chris Baker
Christopher, did you try running it with stdin closed?
One thing that is frequently forgotten is that jobs run from cron do not have a terminal on stdin, and that sometimes causes problems (rsh and ssh are known to have issues with that).
On an interactive session, try to run the script like this:
(/path/to/script/file <&- > stdout 2> stderr)
Other things to look into:
- Make sure that the rsh command is called with '-n' (local stdin not passed to remote command).
- You can avoid having a separate connection for stderr by calling rsh with '-a'.
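A minimal sketch of how those two flags might be combined inside one of the cron scripts. The wrapper name remote_run, the RSH override variable, and the example host and file names are assumptions for illustration; they are not taken from the original scripts.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command on a remote host without passing
# local stdin to it (-n) and without a separate connection for stderr (-a).
# Flags are placed after the host name, as in the AIX rsh syntax.
# RSH can be overridden for a dry run, e.g. RSH=echo.
remote_run() {
    host=$1
    shift
    "${RSH:-rsh}" "$host" -n -a "$@" </dev/null
}

# Example call (host and path are placeholders):
# remote_run EMEA-Server touch /tmp/some.flag
```

Setting RSH=echo before calling remote_run prints the command line that would be executed, which is a cheap way to check the quoting before touching the remote side.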
> I know that scripts run as root in cron do not inherit the same environment as root does outside of cron.
There are two things that are different in a cronjob:
- It's not an interactive session (e.g., "tty -s" returns a nonzero exit status because there's no TTY associated with stdin).
- It's not a login shell (No sourcing of /etc/profile, $HOME/.profile, and $ENV)
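The first difference is easy to observe from inside a script. Here is a small sketch; the function name stdin_kind is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: report whether stdin is attached to a terminal.
# "tty -s" prints nothing and only sets the exit status.
stdin_kind() {
    if tty -s; then
        echo "tty"
    else
        echo "notty"
    fi
}

# With stdin redirected away from the terminal, as under cron,
# this reports "notty":
stdin_kind </dev/null
```

Logging this value at the top of a cron script shows quickly whether a command line test is really reproducing the conditions of the cron run.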
Capture the environment from within the script: at the head of the script, add "env | sort > ~root/script.env.out".
Run it manually and move script.env.out to manual.out, then wait for it to run from cron and move script.env.out to cron.out.
Check the differences: diff -u manual.out cron.out
If the script needs any variable that is missing under cron, it should be set explicitly in the script.
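The capture steps above could also be folded into a small helper so the two files do not have to be renamed by hand. This is only a sketch: the function name capture_env is invented here, and it assumes that "no terminal on stdin" means the run came from cron, which would mislabel a manual run with stdin redirected.

```shell
#!/bin/sh
# Hypothetical helper for the head of the script: dump the sorted
# environment to a file whose name records how the script was started.
capture_env() {
    outdir=${1:-$HOME}
    if tty -s; then
        mode=manual     # stdin is a terminal: assume an interactive run
    else
        mode=cron       # no terminal on stdin: assume a cron run
    fi
    env | sort > "$outdir/$mode.out"
    echo "$outdir/$mode.out"    # print the file that was written
}

# After one manual run and one cron run, compare the two:
# diff -u "$HOME/manual.out" "$HOME/cron.out"
```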