Originally posted by: Porktree
I've got an Oracle database running on a p5 with 64 GB of RAM and 32 CPUs. I'm running 5.3. Over the last couple of days, the Oracle DB appears to hang: it takes minutes to log in, and su'ing to users other than root takes 1-2 minutes, if it completes at all. If I shut down the database and reboot the server, everything is fine for the next 18 or so hours, and then it happens again. Looking at resources, I'm not paging, I've got about 10-20% idle, and none of the disks or dacs are saturated, i.e., nothing indicates a problem.
This morning, after I shut down the database, I tried to restart it without bouncing the server, and while it was starting I tried to 'tail -f $ALERT'. This command hung. I echo'd $ALERT and it immediately gave the path to the alert log. Running 'tail -f /u01/logs/alert.log' worked fine, so I tried the command again using the environment variable, and again it hung. What I noticed then is that su - to another user is slow because the .profile sources a lot of aliases and environment variables, and it hangs trying to access them.
All resources are free: the CPUs are 99% idle, there is no disk access, and memory is not paging. But any command (except echo) that references a $VAR of any kind hangs.
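For anyone who wants to try the same comparison on their own box, here is a minimal sketch. The path /tmp/alert_sketch.log is a made-up stand-in for the real alert log, and tail -n 1 is used instead of tail -f so the script exits on its own. It only separates the three cases (expansion alone, file access through the variable, file access through the literal path); it does not reproduce the hang itself.

```shell
#!/bin/sh
# Hypothetical stand-in for the real alert log, so the sketch runs anywhere.
echo "sample alert line" > /tmp/alert_sketch.log
ALERT=/tmp/alert_sketch.log
export ALERT

# Case 1: expansion only. echo is a shell builtin and never opens the file.
expanded=$(echo "$ALERT")

# Case 2: an external command reaches the file through the variable.
via_var=$(tail -n 1 "$ALERT")

# Case 3: the same external command with the path typed out literally.
via_path=$(tail -n 1 /tmp/alert_sketch.log)

echo "expanded: $expanded"
echo "via var:  $via_var"
echo "via path: $via_path"

# Clean up the stand-in file.
rm -f /tmp/alert_sketch.log
```

On a healthy system all three cases return immediately and cases 2 and 3 print the same line. Wrapping each case in the shell's time keyword would show which one stalls.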
I can't find anything about what this might be or how to fix it. At this point I'm taking the system down tonight to rebuild the kernel and apply a couple of LPARs that I had planned on doing next weekend.
Anyone have any ideas? Thanks.