openmpi/orte/mca/plm
Shiqing Fan 0259fa0b9c Correct a few variable names.
This commit was SVN r22401.
2010-01-14 10:55:15 +00:00
..
alps Courtesy of Ralph and Jeff: 2009-10-24 01:04:35 +00:00
base Detect the scenario where one or more procs fail to call orte/ompi_init while others in the job do. This scenario can cause the job to hang as MPI_Init contains a barrier operation that will not complete. Although ORTE does not contain such a barrier, it still will be considered as an error scenario so that we can detect the MPI case - otherwise, ORTE has no knowledge of OMPI and wouldn't know how to differentiate the use-cases. 2009-12-17 19:39:53 +00:00
bproc Courtesy of Ralph and Jeff: 2009-10-24 01:04:35 +00:00
ccp Clean up a little bit. 2009-10-06 07:52:43 +00:00
lsf Courtesy of Ralph and Jeff: 2009-10-24 01:04:35 +00:00
process Correct a few variable names. 2010-01-14 10:55:15 +00:00
rsh Don't cancel the recv unless it was issued or else we generate an error whenever we launch an app without having to launch daemons (e.g., a completely local launch to mpirun) 2009-12-03 04:28:43 +00:00
rshd Restore the original API to terminate individual processes instead of the entire job. This was originally removed as we didn't at that time know how to take advantage of it. Some of us are now working on proactive resilience methods that move procs prior to node failure, so this is now a required API. Modify the odls, plm, and orted functions to support this new functionality. 2009-07-13 02:29:17 +00:00
slurm Courtesy of Ralph and Jeff: 2009-10-24 01:04:35 +00:00
submit Restore the original API to terminate individual processes instead of the entire job. This was originally removed as we didn't at that time know how to take advantage of it. Some of us are now working on proactive resilience methods that move procs prior to node failure, so this is now a required API. Modify the odls, plm, and orted functions to support this new functionality. 2009-07-13 02:29:17 +00:00
tm Courtesy of Ralph and Jeff: 2009-10-24 01:04:35 +00:00
tmd Courtesy of Ralph and Jeff: 2009-10-24 01:04:35 +00:00
xgrid Santa's back! Fix all warnings about the deprecated usage of 2009-12-16 00:06:37 +00:00
Makefile.am Merge the ORTE devel branch into the main trunk. Details of what this means will be circulated separately. 2008-02-28 01:57:57 +00:00
plm_types.h Correct an error that causes the system to "bounce" when we order a job killed. We didn't use to discriminate between a process being ordered to die, and a process that was aborted by an external signal. Unfortunately, that means the error mgr gets called and told a process abnormally aborted when we order termination, thus causing the errmgr to send out a "kill procs" command again. 2009-10-14 22:49:56 +00:00
plm.h Restore the original API to terminate individual processes instead of the entire job. This was originally removed as we didn't at that time know how to take advantage of it. Some of us are now working on proactive resilience methods that move procs prior to node failure, so this is now a required API. Modify the odls, plm, and orted functions to support this new functionality. 2009-07-13 02:29:17 +00:00