openmpi/orte/mca
Ralph Castain 4e79a51395 Add a job_info segment to the system that holds a container for each job. Within each container is a keyval indicating the job state (i.e., all procs at stage1, finalized, etc.). This provides a rough state-of-health for the job.
This required a little fiddling with a number of areas. The biggest problem was that it uncovered a potential infinite loop in the registry. If a callback function modified the registry, the registry checked the triggers to see if anything had fired. If the original callback was itself due to a trigger firing, that condition hadn't changed, so the trigger fired again, which caused the callback to be called, which modified the registry, which checked the triggers, and so on.

Triggers are now checked and then "flagged" as being "in process" so that the registry will NOT recheck that trigger until all callbacks have been processed. I tried doing this with subscriptions as well, but that caused a problem - when we release processes from a stagegate, they (at the moment) immediately place data on the registry that should cause a subscription to fire. Unfortunately, the system will just hang if that subscription doesn't get processed. So, I have left the subscription system alone - any callback function that modifies the registry in a fashion that will fire a subscription will indeed fire that subscription. We'll have to see if this causes problems - it shouldn't, but a careless user could lock things up if the callback generates a callback to itself.
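
The "in process" flagging described above can be illustrated with a minimal sketch. This is a toy model written for illustration only, not the actual registry code: every name in it (toy_registry_t, toy_trigger_t, toy_put, toy_check_triggers, toy_on_fire, toy_demo) is hypothetical, and it assumes a single trigger that fires whenever a stored value reaches a threshold. The callback modifies the registry, which is the situation that previously caused the trigger to fire again without end.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical toy registry for illustration; NOT the ORTE registry API. */
typedef struct toy_registry toy_registry_t;
typedef void (*toy_callback_fn)(toy_registry_t *reg);

typedef struct {
    int threshold;            /* trigger condition: reg->value >= threshold */
    bool in_process;          /* true while this trigger's callback is running */
    int fire_count;           /* how many times the trigger has fired */
    toy_callback_fn callback; /* invoked when the trigger fires */
} toy_trigger_t;

struct toy_registry {
    int value;
    toy_trigger_t trigger;
};

/* Check the trigger, skipping it if its callback is still being processed. */
static void toy_check_triggers(toy_registry_t *reg)
{
    toy_trigger_t *t = &reg->trigger;

    if (t->in_process) {
        return; /* guard: do not recheck a trigger that is already firing */
    }
    if (reg->value >= t->threshold) {
        t->in_process = true;
        t->fire_count++;
        t->callback(reg);      /* may modify the registry */
        t->in_process = false; /* callback processed; trigger may be checked again */
    }
}

/* Every modification of the registry rechecks the triggers. */
static void toy_put(toy_registry_t *reg, int value)
{
    reg->value = value;
    toy_check_triggers(reg);
}

/* A callback that modifies the registry. Without the in_process guard this
 * would recurse forever, because the trigger condition still holds. */
static void toy_on_fire(toy_registry_t *reg)
{
    toy_put(reg, reg->value + 1);
}

/* Runs the scenario once and returns how many times the trigger fired. */
static int toy_demo(void)
{
    toy_registry_t reg = {
        .value = 0,
        .trigger = { .threshold = 3, .in_process = false,
                     .fire_count = 0, .callback = toy_on_fire }
    };

    toy_put(&reg, 3);        /* reaches the threshold, so the trigger fires */
    assert(reg.value == 4);  /* the callback's own put went through */
    printf("fired %d time(s), value=%d\n", reg.trigger.fire_count, reg.value);
    return reg.trigger.fire_count;
}
```

Calling toy_demo() from a main function exercises the guard: the put made inside the callback is stored, but it does not refire the trigger. In this sketch the flag is cleared as soon as the single callback returns, which stands in for "until all callbacks have been processed"; a later, separate modification would fire the trigger again if its condition still held.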

Also fixed the code that placed a process' RML contact info on the registry to eliminate the leading '/' from the string.

This commit was SVN r6684.
2005-07-29 14:11:19 +00:00
..
errmgr * add local hook to remove static-components.h in distclean target. The 2005-07-08 13:54:12 +00:00
gpr Add a job_info segment to the system that holds a container for each job. Within each container is a keyval indicating the job state (i.e., all procs at stage1, finalized, etc.). This provides a rough state-of-health for the job. 2005-07-29 14:11:19 +00:00
iof Fix a holdover mistake from the directory re-org: 2005-07-19 12:25:19 +00:00
ns Move set_my_name (NDS) functionality from ns_base and universe contact 2005-07-27 23:18:16 +00:00
oob Add a job_info segment to the system that holds a container for each job. Within each container is a keyval indicating the job state (i.e., all procs at stage1, finalized, etc.). This provides a rough state-of-health for the job. 2005-07-29 14:11:19 +00:00
pls - enabled new bproc components 2005-07-28 22:28:38 +00:00
ras - enabled new bproc components 2005-07-28 22:28:38 +00:00
rds * add local hook to remove static-components.h in distclean target. The 2005-07-08 13:54:12 +00:00
rmaps * add local hook to remove static-components.h in distclean target. The 2005-07-08 13:54:12 +00:00
rmgr Add a job_info segment to the system that holds a container for each job. Within each container is a keyval indicating the job state (i.e., all procs at stage1, finalized, etc.). This provides a rough state-of-health for the job. 2005-07-29 14:11:19 +00:00
rml * add a bunch of svn:ignored files 2005-07-28 06:23:34 +00:00
schema Add a job_info segment to the system that holds a container for each job. Within each container is a keyval indicating the job state (i.e., all procs at stage1, finalized, etc.). This provides a rough state-of-health for the job. 2005-07-29 14:11:19 +00:00
sds - enabled new bproc components 2005-07-28 22:28:38 +00:00
soh Add a job_info segment to the system that holds a container for each job. Within each container is a keyval indicating the job state (i.e., all procs at stage1, finalized, etc.). This provides a rough state-of-health for the job. 2005-07-29 14:11:19 +00:00
Makefile.am Move set_my_name (NDS) functionality from ns_base and universe contact 2005-07-27 23:18:16 +00:00