For anyone interested, the problem stemmed from two things:
1. A bug in the ompi_bitmap utility (which I copied to orte_bitmap to avoid unintentionally disturbing anything else): the bitmap does NOT expand unless the caller asks for a bit that lies more than one byte beyond the current end of the array. The unit test didn't catch this because it doesn't check that close to the boundary.
2. A "feature" in the ompi_bitmap utility: the array only expands if you try to SET a bit outside the current boundary, NOT if you try to CLEAR one. This appears intentional, since the unit test checks for this behavior, but I hadn't been expecting the asymmetry.
The orte_bitmap utility now expands correctly in both circumstances. I also added a function that expands the array so it "covers" a bit location without setting or clearing that bit. This lets you ensure the array is large enough to hold the specified bit while leaving the bit's value alone if it is already there (the set and clear functions would overwrite it).
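For illustration, the behavior described above can be sketched roughly as follows. This is a minimal standalone sketch, not the actual orte_bitmap code: the type sketch_bitmap_t and the sketch_bitmap_* function names are hypothetical, and the growth policy (grow to exactly the bytes needed) is an assumption. The point is that the size check uses "bytes needed <= current size", so the first bit past the end already triggers expansion, and that set and clear both go through the same "cover" step.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the bitmap type; not the real ORTE struct. */
typedef struct {
    unsigned char *bits;   /* backing array, one bit per flag */
    size_t size;           /* length of the array in bytes */
} sketch_bitmap_t;

/* Grow the array, if needed, so that `bit` falls inside it.
 * New bytes are zeroed; bits that already exist are left untouched. */
int sketch_bitmap_cover(sketch_bitmap_t *bm, size_t bit)
{
    assert(bm != NULL);
    size_t needed = bit / CHAR_BIT + 1;   /* bytes required to hold `bit` */
    if (needed <= bm->size) {
        return 0;                         /* already covered, nothing to do */
    }
    unsigned char *grown = realloc(bm->bits, needed);
    if (grown == NULL) {
        return -1;                        /* original array is still valid */
    }
    memset(grown + bm->size, 0, needed - bm->size);
    bm->bits = grown;
    bm->size = needed;
    return 0;
}

/* Set a bit, expanding the array first if the bit is out of range. */
int sketch_bitmap_set(sketch_bitmap_t *bm, size_t bit)
{
    if (sketch_bitmap_cover(bm, bit) != 0) {
        return -1;
    }
    bm->bits[bit / CHAR_BIT] |= (unsigned char)(1u << (bit % CHAR_BIT));
    return 0;
}

/* Clear a bit; expands exactly as set does, so the two are symmetric. */
int sketch_bitmap_clear(sketch_bitmap_t *bm, size_t bit)
{
    if (sketch_bitmap_cover(bm, bit) != 0) {
        return -1;
    }
    bm->bits[bit / CHAR_BIT] &= (unsigned char)~(1u << (bit % CHAR_BIT));
    return 0;
}

/* Query a bit without changing the array; out-of-range bits read as 0. */
int sketch_bitmap_is_set(const sketch_bitmap_t *bm, size_t bit)
{
    size_t index = bit / CHAR_BIT;
    if (index >= bm->size) {
        return 0;
    }
    return (bm->bits[index] >> (bit % CHAR_BIT)) & 1;
}
```

Starting from an empty bitmap ({ NULL, 0 }), setting bit 3 and then clearing bit 8 grows the array from one byte to two, and a later cover call on bit 3 leaves that bit set.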
I've tested the fix with up to 100 processes without problems.
This commit was SVN r5980.
split between OMPI and ORTE, added a lengthy comment to ompi_bitmap.h
explaining the reason why (and how it would be fine to re-merge them
-- if someone has the time) and references to it from all the other
relevant .h files.
This commit was SVN r5876.
1. Fixed the GPR search engine so that ANDing keys works, and so that multiple objects with the same key no longer break the search.
2. Added an orte_bitmap function based on the existing ompi_bitmap one, but minus the Fortran "pollution".
3. Added a new name service function called create_my_name to remove the duplicate name creation that was happening with the RML. Basically, the RML has to assign a name when a process makes first contact if the process doesn't already have a name. For processes that get a name passed into them, this was okay - the name was already assigned. For other processes (e.g., singletons), this was not okay - the first message to the seed daemon was to create a name, which caused the RML to assign one, and then the name service to assign another.
4. Changed orted so it gets its name the way everyone else does - during orte_init.
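As a rough illustration of the duplicate-name problem in item 3, one way to guarantee that a process ends up with a single name is to route every caller through one function that allocates at most once. This is only a sketch of that idea, not the actual create_my_name implementation: sketch_name_t, sketch_ns_allocate, sketch_create_my_name and sketch_names_allocated are all hypothetical names, and the local counter stands in for the real name service.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical process name; not the real ORTE name type. */
typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} sketch_name_t;

static uint32_t next_vpid = 0;     /* stand-in for the name service's allocator */
static sketch_name_t my_name;      /* this process's name, once assigned */
static bool have_name = false;

/* Stand-in for asking the name service for a fresh name. */
sketch_name_t sketch_ns_allocate(void)
{
    sketch_name_t name = { 0, next_vpid++ };
    return name;
}

/* Single entry point: allocate a name only if this process has none yet.
 * Later callers (for example, a first-contact path) get the same name back. */
const sketch_name_t *sketch_create_my_name(void)
{
    if (!have_name) {
        my_name = sketch_ns_allocate();
        have_name = true;
    }
    return &my_name;
}

/* How many names the stand-in allocator has handed out so far. */
uint32_t sketch_names_allocated(void)
{
    return next_vpid;
}
```

Calling sketch_create_my_name twice returns the same name and consumes only one allocation, which is the property the singleton case was missing.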
This commit was SVN r5842.