Originally posted by: tony.evans
We run fsck on mounted filesystems without any issues. You can't make any repairs that way, but it can still hint at problems. I've never had it make a filesystem worse than it already was, but it is of course up to you!
Which user owns bad_dir? What processes are owned by that user?
Sounds to me like you've got something open in that filesystem which a process has started removing but not finished yet (either due to a fault with it, or with the filesystem).
You can use the kernel debugger to find out which processes have a known inode open.
The instructions I have are for finding out why a filesystem won't unmount, but they may be applicable.
<hr />
Another way to search for the process still holding an inode on that FS
is the following:
1) start the kdb
2) vfs
--> This will show us all the mounted FS of the machine
--> Find the one which fails to unmount (I call it N here)
.
3) run
(0)> vfs N | awk '{ if ($3 > 0) print $1 " " $3 " " $4 " " $6 }' | more
After some lines of statistics for that FS you should see something
like:
COUNT VFSP TYPE
2 1 F100009E2364A8B0 REG
3 1 F100009E2363A8B0 REG
4 1 F100009E2362A8B0 REG
5 1 F100009E2365A0B0 REG
...
.
The 2nd column is the number of users of that inode, and the 3rd
column is the address of the gnode related to the inode.
.
4) gnode <gnode_adr>
--> write down the gn_seg information and the gn_data information
.
5) scb 2 <gn_seg>
--> There you should search for the proc pointer (proc)
.
6) proc <proc>
--> There we have the process that has an inode open. Just
convert the PID from hex to decimal, exit the kdb and kill
the process.
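For the hex-to-decimal conversion in step 6, a small shell sketch may help. It assumes a POSIX-style shell whose printf accepts a 0x-prefixed argument (ksh and bash do). The function name hex_to_dec is made up for this example, and 1A2B is an invented value, not real kdb output.

```shell
#!/bin/sh
# Convert a hexadecimal PID (as displayed by kdb) to decimal.
# hex_to_dec is a hypothetical helper name, not part of kdb or AIX.
hex_to_dec() {
    hex=${1#0x}              # drop an optional leading "0x"
    printf '%d\n' "0x$hex"   # let printf parse the value as hex
}

# 1A2B is a made-up example value.
hex_to_dec 1A2B
```

Once you have left the kdb, the decimal result is what you would hand to ps or kill.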
<hr />
You may be able to use that info to find out what, if anything, has any inodes open.