..
base
When we can detect that a daemon has failed, we would like to terminate the system without having it lock up. The "hang" is currently caused by the system attempting to send messages to the daemons (specifically, ordering them to kill their local procs and then terminate). Unfortunately, without some idea of which daemon has died, the system hangs while attempting to send a message to a daemon that is no longer alive.
2007-10-15 18:00:30 +00:00
bproc
Per long threads on the mailing list and much confusion and discussion
2007-12-15 13:32:02 +00:00
cnos
Per long threads on the mailing list and much confusion and discussion
2007-12-15 13:32:02 +00:00
gridengine
Per long threads on the mailing list and much confusion and discussion
2007-12-15 13:32:02 +00:00
lsf
Per long threads on the mailing list and much confusion and discussion
2007-12-15 13:32:02 +00:00
poe
Per long threads on the mailing list and much confusion and discussion
2007-12-15 13:32:02 +00:00
process
These changes were mostly captured in a prior RFC (except for #2 below) and are aimed specifically at improving startup performance and setting up the remaining modifications described in that RFC.
2007-10-05 19:48:23 +00:00
proxy
Per long threads on the mailing list and much confusion and discussion
2007-12-15 13:32:02 +00:00
rsh
Per long threads on the mailing list and much confusion and discussion
2007-12-15 13:32:02 +00:00
slurm
Per long threads on the mailing list and much confusion and discussion
2007-12-15 13:32:02 +00:00
submit
Per long threads on the mailing list and much confusion and discussion
2007-12-15 13:32:02 +00:00
tm
Per long threads on the mailing list and much confusion and discussion
2007-12-15 13:32:02 +00:00
xcpu
Per long threads on the mailing list and much confusion and discussion
2007-12-15 13:32:02 +00:00
xgrid
Per long threads on the mailing list and much confusion and discussion
2007-12-15 13:32:02 +00:00
Makefile.am
Here is the major MAD-cure commit. I have written plenty about it, so I refer you to those messages for a description of everything that was done.
2006-09-14 21:29:51 +00:00
pls_types.h
Here is the major MAD-cure commit. I have written plenty about it, so I refer you to those messages for a description of everything that was done.
2006-09-14 21:29:51 +00:00
pls.h
Bring in the generalized xcast communication system along with the correspondingly revised orted launch. I will send a message out to developers explaining the basic changes. In brief:
2007-06-12 13:28:54 +00:00