containers when one is requested.
Fix a bug in gpr_replica_del_index_api which doesn't preset num_tokens and
num_keys, but assumes they are 0.
Fix orte_ras_base_node_delete() function to operate properly to delete the
appropriate container in the 'orte-node' segment when requested.
This commit was SVN r6756.
This required a little fiddling with a number of areas. Biggest problem was that it uncovered a potential for an infinite loop to be created in the registry. If a callback function modified the registry, the registry checked the triggers to see if anything had fired. Well, if the original callback was due to a trigger firing, that condition hadn't changed - so the trigger fired again....which caused the callback to be called, which modified the registry, which checked the triggers, etc. etc.
Triggers are now checked and then "flagged" as being "in process" so that the registry will NOT recheck that trigger until all callbacks have been processed. Tried doing this with subscriptions as well, but that caused a problem - when we release processes from a stagegate, they (at the moment) immediately place data on the registry that should cause a subscription to fire. Unfortunately, the system will just hang if that subscription doesn't get processed. So, I have left the subscription system alone - any callback function that modifies the registry in a fashion that will fire a subscription will indeed fire that subscription. We'll have to see if this causes problems - it shouldn't, but a careless user could lock things up if the callback generates a callback to itself.
Also fixed the code that placed a process' RML contact info on the registry to eliminate the leading '/' from the string.
This commit was SVN r6684.
Haven't fully tested these yet (nobody is using them at the moment that I know of - good thing, since they haven't been working for a long time - though I know the MPI-2 stuff needs the functionality), but will do so shortly. For now, they compile.
This commit was SVN r6567.
1. Modify the registry to eliminate redundant data copying for startup messages.
2. Revise the subscription/trigger system to avoid redundant storage of triggers and subscriptions. This dramatically reduces the search time when a registry action occurs - to illustrate the point, there are now only a handful of triggers on the system for each job. Before, there were a handful of triggers for each PROCESS in the job, all of which had to be checked every time something happened on the registry. This is much, much faster now.
3. Update all subscriptions to the new format. There are now "named" subscriptions - this allows you to "name" a subscription that all the processes will be using. The first one to hit the registry actually defines the subscription. From then on, any subsequent "subscribes" to the same name just cause that process to "attach" to the existing subscription. This keeps the number of subscriptions being tracked by the registry to a minimum, while ensuring that each process still gets notified.
4. Do the same for triggers.
Also fixed a duplicate subscription problem that was causing people to receive data equal to the number of processes times the data they should have received from a trigger/subscription. Sorry about that... :-( ...but it's all better now!
Uncovered a situation where the modex data seems to be getting entered on the registry a second time - the latter time coming after the compound command has been "fired", thereby causing all the subscriptions to fire. Asked Tim and Jeff to look into this.
Second phase of the changes will involve modifying the xcast system so that the same message gets sent to all processes. This will further reduce the message traffic, and - once we have a true "broadcast" version of xcast - really speed things up and improve scalability.
This commit was SVN r6542.
1. Fix the reigstry's overwrite logic. It was only overwriting the first keyval specified in a value - the rest were just added on regardless of whether or not the keyval already existed. This was the source of the multiple keyvals some people were seeing - should be fixed now.
2. Change the orted command parsing options so it reports options that aren't recognized - should help reduce confusion
This commit was SVN r6536.
* rename ompi_basename to opal_basename
* rename ompi bitop functions to opal
* rename ompi_cmd_line to opal_cmd_line
* rename ompi_sizet2int to opal_sizet2int
* rename orte_daemon_init to opal_daemon_init
* rename ompi_few to opal_few
This commit was SVN r6330.