When something like this happens, the odds are very low that you will be able to troubleshoot any further or execute any system-specific commands like '/usr/bin/free', 'ps -eaf | wc -l', etc.

I wish the error were more specific about which bloody resource in particular, but anyway, Google hints at the usual suspects (quick checks for each are sketched right after this list):
1) The process table got filled up completely (check mainly for defunct/zombie processes).
2) The server ran out of memory (IMHO, when memory becomes scarce it should invoke the OOM killer rather than throw an 'out of resource' error).
3) The system reached its maximum number of process PIDs (the default is 32768).
4) The number of open file descriptors got exhausted.
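For reference, these are roughly the checks I would run for each of the above, assuming you can still get a shell and fork anything at all (which, as noted earlier, is usually exactly the problem). Treat them as a sketch; exact paths and output format can differ per distro:

# 1) defunct/zombie processes sitting in the process table
ps -eaf | grep -c '[d]efunct'
# 2) free memory and swap
/usr/bin/free -m
# 3) current number of processes vs. the PID ceiling
ps -eaf | wc -l
cat /proc/sys/kernel/pid_max
# 4) file descriptors: allocated, free, and the system-wide max
cat /proc/sys/fs/file-nr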
Since I couldn't capture any of these details when the issue first happened, I kept quiet and waited for it to recur.
Fortunately (or unfortunately), it happened again last night, on one of our DB boxes running RedHat 5.
This time, rather than wasting time getting access to the server and then figuring out what was going wrong, I decided to just crash the system via the SysRq keys and grab a core.
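In case you are wondering what 'crash the system via the SysRq keys' means in practice, it is roughly this (it assumes kdump is already configured to catch the vmcore, and obviously don't try it on a box you can't afford to panic):

# make sure the magic SysRq interface is enabled
echo 1 > /proc/sys/kernel/sysrq
# force a kernel panic; kdump then writes the vmcore and the box reboots
echo c > /proc/sysrq-trigger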
Once the core was dumped, the host rebooted and the apps came up fine... thank God!
Soon the vmcore was fed to the crash tool, and to no surprise I could see more than 32,000 processes (started as the oracle user, all named 'orarootagent.bi') floating around.
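From memory, the crash session looked something like this (the vmlinux/vmcore paths are just placeholders; you need the debuginfo kernel that matches the crashed one):

# open the dump against a matching debug vmlinux
crash /usr/lib/debug/lib/modules/<kernel-version>/vmlinux /var/crash/<timestamp>/vmcore

crash> ps | wc -l                        # total tasks in the dump
crash> ps | grep orarootagent | wc -l    # how many belong to the rogue agent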


A-ha... so my system-wide limit on the maximum number of PIDs (/proc/sys/kernel/pid_max) had been nearly exhausted by some rogue process. Cool ;) What next?
The boss is going to tap my shoulder tomorrow for this finding, but the very next second I'll probably be asked... HOW DO WE AVOID THIS HAPPENING IN THE FUTURE???
Suggestions:
a) Increase the overall system limits (see the knobs sketched below).
b) Limit the 'oracle' user to something lower (probably half the overall system limit).
c) Just blame the DBAs for such a nasty process and relax!
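For (a) and (b), the knobs I have in mind are roughly these; the numbers are only placeholders, so size them for your own workload:

# (a) raise the system-wide PID limit, at runtime and persistently
sysctl -w kernel.pid_max=65536
echo "kernel.pid_max = 65536" >> /etc/sysctl.conf

# (b) cap the number of processes the oracle user may create,
#     in /etc/security/limits.conf:
oracle   soft   nproc   16384
oracle   hard   nproc   16384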
If { } any of the above hacks works, you can skip the rest of the section and close this page... else { } continue :)