openmpi/orte/mca/plm
Ralph Castain 629b95a2fe Afraid this has a couple of things mixed into the commit. Couldn't be helped - had missed one commit prior to running out the door on vacation.
Fix race conditions in abnormal terminations. We had done a first cut at this in a prior commit. However, the window remained partially open because the HNP has multiple paths leading to orte_finalize. Most of our frameworks don't care if they are finalized more than once, but one of them does, which meant we segfaulted if orte_finalize got called more than once. Besides, we really shouldn't be doing that anyway.

So we now introduce a set of atomic locks that prevent us from calling abort, orte_finalize, etc. more than once. My initial tests indicate this is working cleanly, but since it is a race condition issue, more testing will have to be done before we know for sure that this problem has been licked.
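
As a rough illustration of the idea (not the code in this commit), a "run at most once" guard can be built on a C11 atomic flag. The names finalize_guard, finalize_calls, do_finalize and finalize_once are hypothetical stand-ins, and the real ORTE locks may well use OPAL's own atomic primitives instead of <stdatomic.h>.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical guard; NOT an actual ORTE symbol. */
static atomic_flag finalize_guard = ATOMIC_FLAG_INIT;

/* Counts how often the teardown body really ran (for illustration only). */
static int finalize_calls = 0;

/* Stand-in for the real teardown work (assumed, not orte_finalize itself). */
static void do_finalize(void)
{
    finalize_calls++;
}

/*
 * Run the teardown at most once, no matter how many code paths reach here.
 * atomic_flag_test_and_set() atomically sets the flag and returns its
 * previous value, so only the first caller sees "false" and proceeds.
 */
bool finalize_once(void)
{
    if (atomic_flag_test_and_set(&finalize_guard)) {
        return false;   /* someone else already started finalizing */
    }
    do_finalize();
    return true;
}
```

With a guard like this, every abnormal-termination path can call finalize_once() unconditionally; only the first call does the work, and later calls return false without touching already-released state.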

Also, some updates relevant to the tool comm library snuck in here. Since those also touched the orted code (as did the prior changes), I didn't want to attempt to separate them out - besides, they are coming in soon anyway. More on them later as that functionality approaches completion.

This commit was SVN r17843.
2008-03-17 17:58:59 +00:00
alps Add default hostfile parameter plus --default-hostfile command line option. 2008-03-05 04:54:57 +00:00
base Afraid this has a couple of things mixed into the commit. Couldn't be helped - had missed one commit prior to running out the door on vacation. 2008-03-17 17:58:59 +00:00
ccp Select the windows CCP component at runtime by testing if we are on Windows cluster. 2008-03-07 01:31:53 +00:00
gridengine Add default hostfile parameter plus --default-hostfile command line option. 2008-03-05 04:54:57 +00:00
lsf Add default hostfile parameter plus --default-hostfile command line option. 2008-03-05 04:54:57 +00:00
md Silence some minor compiler warnings 2008-02-29 02:39:39 +00:00
poe Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. 2008-02-28 01:57:57 +00:00
process Add default hostfile parameter plus --default-hostfile command line option. 2008-03-05 04:54:57 +00:00
rsh Add default hostfile parameter plus --default-hostfile command line option. 2008-03-05 04:54:57 +00:00
slurm Add default hostfile parameter plus --default-hostfile command line option. 2008-03-05 04:54:57 +00:00
slurmd Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. 2008-02-28 01:57:57 +00:00
submit Add default hostfile parameter plus --default-hostfile command line option. 2008-03-05 04:54:57 +00:00
tm Cleanup recursions in ORTE caused by processing recv'd messages that can cause the system to take action resulting in receipt of another message. 2008-02-28 19:58:32 +00:00
tmd First cut at direct launch for TM. Able to launch non-ORTE procs and detect their completion for a clean shutdown. 2008-03-05 13:51:32 +00:00
xgrid Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. 2008-02-28 01:57:57 +00:00
Makefile.am Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. 2008-02-28 01:57:57 +00:00
plm_types.h Replace all occurrences of orte_pointer_array by opal_pointer_array. Remove the 2008-02-28 05:32:23 +00:00
plm.h Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. 2008-02-28 01:57:57 +00:00