1. if the user has specified sched_yield, we simply do what we are told
2. if they didn't specify anything, try to get the number of processors on this node. Note that we already now get the number of local procs in our job that are sharing this node - that now comes in through the proc callback and is stored in the ompi_proc_t structures.
3. if we can get the number of processors, compare that to the number of local procs from my job that are sharing my node. If the number of local procs exceeds the number of processors, then set sched_yield to true. If not, then be a hog and set sched_yield to false
4. if we can't get the number of processors, default to conservative behavior and set sched_yield to true.
Note that I have not yet dealt with the need to dynamically adjust this setting as more processes are added via comm_spawn. So far, we are *only* looking within our own job. Given that we have now moved this logic to mpi_init (and away from the orteds), it isn't yet clear to me how a process will be informed about the number of procs in *other* jobs that are also sharing this node.
Something to continue to ponder.
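The four-step decision above can be sketched in C as follows. This is only an illustration: the function name, the tri-state enum, and the convention that a non-positive processor count means "unknown" are assumptions of the sketch, not names taken from the Open MPI sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical tri-state describing what the user asked for. */
typedef enum { YIELD_UNSET = -1, YIELD_OFF = 0, YIELD_ON = 1 } yield_request_t;

/*
 * Decide whether to yield the processor while idle.
 * num_processors <= 0 means the processor count could not be determined
 * (an assumption of this sketch).
 */
bool decide_sched_yield(yield_request_t user_request,
                        int num_processors, int num_local_procs)
{
    assert(num_local_procs >= 1);  /* we always count ourselves */

    /* 1. The user said something explicit: do what we are told. */
    if (user_request != YIELD_UNSET) {
        return user_request == YIELD_ON;
    }
    /* 4. Processor count unknown: be conservative and yield. */
    if (num_processors <= 0) {
        return true;
    }
    /* 3. Yield only if the node is oversubscribed by our own job. */
    return num_local_procs > num_processors;
}
```

Note that the sketch shares the limitation described above: it only sees the local procs of one job.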
This commit was SVN r13430.
* The real fix, don't leave the OOB in blocking mode during comm_dyn_init(),
as it means no progressing MPI events while the event library is waiting
for TCP stuff to come in.
* Add many comments explaining the reasons for the current ordering
This commit was SVN r13422.
* have the mpool size be based on MCW, not num procs
in other jobs we know about. Solves the problem of
the spawned job having a much bigger than needed
sm file
* Can't assume that "me" is in the list of procs
passed to addprocs, so need to use slightly different
logic and not go through all of add procs unless
there's a proc in my job that isn't me.
This seems to greatly improve the situation, although
there still seems to be more of a slowdown through
MPI_INIT for the children (if there is more than one
child) than MPI_INIT for the parent if there are 'n'
children compared to 'n' parents. Hopefully that
made sense ;)
This commit was SVN r13417.
of the first entry might not be the start of the user's buffer. This is
similar to what ompi_convertor_unpack does. This is the solution for
the test case attached to ticket #690.
Refs trac:690
This commit was SVN r13397.
The following Trac tickets were found above:
Ticket 690 --> https://svn.open-mpi.org/trac/ompi/ticket/690
not being able to take C function pointers for either of the
copy or the delete fn. Fix by overloading the Create_keyval methods.
Fix trac #737, #738. Reviewed by jsquyres.
* A couple of cxx tests in ompi-tests (winkeyval.cc & typekeyval.cc)
will be re-enabled to regression test for this fix.
This commit was SVN r13391.
completed successfully, Bad Things(tm) could happen.
* Now we explicitly check orte_initialized (a new global in ORTE
indicating whether we are between orte_init() and orte_finalize()
or not), and if so, react accordingly.
* If ORTE is initialized, use orte_system_info.nodename; otherwise,
use gethostname().
* Add loop protection to ensure that ompi_mpi_abort() is not invoked
multiple times recursively.
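A minimal sketch of the loop protection idea, assuming a single-threaded caller. The names below are hypothetical, and unlike a real abort path this sketch returns instead of terminating the process, so that the guard can be observed.

```c
#include <assert.h>
#include <stdbool.h>

static bool abort_in_progress = false;  /* set once the first abort starts */
static int  abort_bodies_run  = 0;      /* counter kept only for illustration */

/* Hypothetical cleanup step that fails and tries to abort again. */
static void failing_cleanup(void);

/* Returns true if this call did the abort work, false if it was a
 * nested call that was ignored. */
bool guarded_abort(void)
{
    if (abort_in_progress) {
        return false;               /* already aborting: do not recurse */
    }
    abort_in_progress = true;
    abort_bodies_run++;
    failing_cleanup();              /* may re-enter guarded_abort() */
    return true;
}

static void failing_cleanup(void)
{
    bool nested = guarded_abort();  /* the recursive invocation */
    assert(!nested);                /* the guard must have rejected it */
    (void)nested;
}

int abort_body_count(void)
{
    return abort_bodies_run;
}
```

A multi-threaded version would need an atomic test-and-set instead of a plain flag.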
This commit was SVN r13354.
know what my local rank is, and therefore set my paffinity ID as
appropriate. Specifically, we're no longer relying on the
special/secret mpi_paffinity_processor MCA parameter that the orted
would set for us.
This allows processor affinity to be used in environments where the
orted is not used (e.g., bproc, and someday in the hopefully not
too-distant future, SLURM).
This commit was SVN r13352.
The following SVN revision numbers were found above:
r13351 --> open-mpi/ompi@a338b7e533
Over to Jeff now for modifying mpi_init accordingly.
Until Jeff makes his changes, nobody should see anything different as the new info just isn't used by anything!
This commit was SVN r13351.
MPICH2 for "small" commutative operations in the reduce_scatter basic
implementation. "small" is currently pretty big, as it doesn't take
much to beat reduce/scatterv. Need to do much more than this for
better all around performance of MPI_Reduce_scatter, but this was enough
to solve the problems I was having.
This commit was SVN r13348.
Found another place where we were incorrectly casting a C++ MPI handle
array to the corresponding C array type and hoping for the best (which
won't work at all). This commit fixes things so that we now do the
proper conversion between C<-->C++ handles.
This commit was SVN r13346.
and George on these refinements):
* Rename the static OBJ initializer macro to be
OPAL_OBJ_STATIC_INIT(class)
* Ensure that all static OBJ initializations get a refcount of 1
(doesn't ''really'' matter, since they're static, it should never
get to the point where the OBJ is DESTRUCTed, but more correct
nonetheless)
* Add a "magic number" to the OBJ when compiling with debug support.
The magic number does some rudimentary support to ensure that
you're operating on a valid OBJ (and fails an assertion if you're
not). Check to ensure that the memory contains the magic number
when performing actions on OBJs. Also remove the magic number
when DESTRUCTing OBJs, so that if, for example, an OBJ is
DESTRUCTed more than once, we'll fail the magic number assert.
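The magic number check can be illustrated with the following sketch. The struct layout, the function names, and the magic value itself are made up for the example; in a real build the field would only exist when compiling with debug support.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical magic value (the ASCII bytes of "OBJMAGIC"). */
#define OBJ_MAGIC_ID UINT64_C(0x4F424A4D41474943)

typedef struct {
    uint64_t magic;     /* would be debug-only in a real build */
    int      refcount;
} obj_t;

/* True only for objects that were constructed and not yet destructed. */
bool obj_is_valid(const obj_t *obj)
{
    return obj != NULL && obj->magic == OBJ_MAGIC_ID;
}

void obj_construct(obj_t *obj)
{
    obj->magic = OBJ_MAGIC_ID;
    obj->refcount = 1;
}

void obj_destruct(obj_t *obj)
{
    assert(obj_is_valid(obj));  /* fires on a second destruct */
    obj->magic = 0;             /* remove the magic number */
    obj->refcount = 0;
}
```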
This commit was SVN r13338.
The following SVN revision numbers were found above:
r13227 --> open-mpi/ompi@96030de97b
r13228 --> open-mpi/ompi@c2e9075d29
- post isends in reverse order of posting irecvs.
if the messages arrive approximately in order, this should
minimize the time spent in matching the requests.
I did not see any performance difference over MX up to 64 nodes, but
the change makes sense and may have some impact when we have (many)
more nodes.
This commit was SVN r13337.
configured with --disable-mpi-cxx so that the default -I flags in the
wrapper compilers don't point to a directory that doesn't exist.
Thanks to Martin Audet for identifying the problem.
This commit was SVN r13296.
- Allreduce algorithms:
- Recursive doubling is used for small messages (up to 10KB) and can be used for
both commutative and non-commutative operations.
Recursive doubling passed OCC, IMB-3.2, Intel (Allreduce_c, Allreduce_loc_c, and
Allreduce_user_c), mpi_test_suite (Allreduce MIN/MAX, and Allreduce MIN/MAX with
MPI_IN_PLACE) tests on TCP up to 36 nodes and MX up to 64 nodes.
- The ring algorithm performs well for larger messages but cannot be used for
non-commutative operations. It passed the same tests as recursive doubling, except
some of the non-commutative tests in Intel benchmarks Allreduce_loc_c and Allreduce_user_c
(which was expected).
- MPI_Allreduce with new decision function passed all of the tests mentioned above.
- Cleaning up coll_tuned_util. Moving isendrecv to static inline just like sendrecv.
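To illustrate the recursive doubling pattern mentioned above, here is a single-process simulation. It is not the coll_tuned code: ranks are array slots, the operation is a plain sum, and only power-of-two rank counts are handled. All names are hypothetical.

```c
#include <assert.h>

#define MAX_RANKS 64

/*
 * Simulate a recursive doubling allreduce (sum) over values[0..nranks-1].
 * Returns the number of exchange steps, or -1 if nranks is not a power of
 * two in [1, MAX_RANKS] (a restriction of this sketch).
 */
int simulate_recursive_doubling_sum(long values[], int nranks)
{
    long next[MAX_RANKS];
    int steps = 0;

    if (nranks < 1 || nranks > MAX_RANKS || (nranks & (nranks - 1)) != 0) {
        return -1;
    }
    for (int mask = 1; mask < nranks; mask <<= 1) {
        /* Every rank exchanges its partial result with rank XOR mask. */
        for (int rank = 0; rank < nranks; rank++) {
            int peer = rank ^ mask;
            assert(peer >= 0 && peer < nranks);
            next[rank] = values[rank] + values[peer];
        }
        for (int rank = 0; rank < nranks; rank++) {
            values[rank] = next[rank];
        }
        steps++;
    }
    return steps;
}
```

With 8 ranks the loop runs 3 times, after which every slot holds the full sum.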
This commit was SVN r13252.
not the component. This potentially allows for a mix of HCAs that
support eager RDMA and those that do not on a port-by-port basis.
This commit was SVN r13242.
* If the text to cite where the problem occurred is "\n", prettyprint
something a little nicer so that it's clear that we're talking
about the end of line
* Add a missing help message ("ini file:unknown field"), and display
it a little better (i.e., show the erroneous field, not a
misleading "end of line" marker)
* It's "OpenIB", not "Open IB"
This commit was SVN r13241.
needlessly registered in multiple different places, and none of them
had a good help string. There was also an inconsistent check for
setting both mpi_leave_pinned and mpi_leave_pinned_pipeline (i.e., it
was only in ob1). This commit moves the registration of these params
to one central place (ompi/runtime/ompi_mpi_params.c, with all other
mpi_* MCA params) and uses globals to propagate the values as
relevant. The error check was also moved to the central location to
ensure that we check consistently everywhere.
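A sketch of what a single central check could look like, assuming that the error in question is setting both parameters at once. The function and the two globals are hypothetical stand-ins, not the real registration code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical globals that propagate the values to other components. */
bool leave_pinned = false;
bool leave_pinned_pipeline = false;

/* Returns 0 on success, -1 if both settings were requested together. */
int register_leave_pinned_params(bool want_leave_pinned, bool want_pipeline)
{
    if (want_leave_pinned && want_pipeline) {
        return -1;  /* the one and only place this conflict is detected */
    }
    leave_pinned = want_leave_pinned;
    leave_pinned_pipeline = want_pipeline;
    assert(!(leave_pinned && leave_pinned_pipeline));
    return 0;
}
```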
This commit was SVN r13226.
return the buffer address from Fortran. It is not expected
behavior. For MPI_Buffer_attach, adjust the address of
the buffer handed in so it is always aligned.
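The address adjustment for the attached buffer can be sketched like this. The function name and the choice to shrink the usable size by the skipped bytes are assumptions of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Advance *buf to the next multiple of `alignment` (a power of two) and
 * return the number of usable bytes left, or 0 if the buffer is too small.
 */
size_t align_attached_buffer(void **buf, size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    uintptr_t addr = (uintptr_t)*buf;
    /* Bytes to skip so that the address becomes aligned (0 if it already is). */
    size_t skip = (size_t)((alignment - (addr & (alignment - 1))) & (alignment - 1));

    if (skip >= size) {
        return 0;
    }
    *buf = (char *)*buf + skip;
    return size - skip;
}
```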
Refs trac:750
Buffer detach reviewed by Jeff Squyres
Buffer attach alignment reviewed by George Bosilca
This commit was SVN r13205.
The following Trac tickets were found above:
Ticket 750 --> https://svn.open-mpi.org/trac/ompi/ticket/750
OB1 always uses the first element of the array of BTLs available for RDMA. The patch
changes the array creation algorithm so that it puts a different BTL in the first element
in round-robin fashion.
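One way to get a different first element each time is to rotate the array by a cursor that advances on every call. The sketch below assumes that approach; the commit does not say how the array is actually built, and the names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Copy `count` BTL identifiers into `out`, starting at a position that
 * advances on every call. `next_start` is the caller-owned cursor.
 */
void build_rdma_btl_array(const int *btls, size_t count,
                          int *out, size_t *next_start)
{
    assert(count > 0);

    size_t start = *next_start % count;
    for (size_t i = 0; i < count; i++) {
        out[i] = btls[(start + i) % count];
    }
    *next_start = (start + 1) % count;  /* next caller starts one further */
}
```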
This commit was SVN r13174.
to the F90 binding for MPI_INITIALIZED was wrong (should have been
logical, not integer).
Fixes trac:782.
This commit was SVN r13170.
The following Trac tickets were found above:
Ticket 782 --> https://svn.open-mpi.org/trac/ompi/ticket/782
- removing static qualification on ompi_coll_tuned_sendrecv
- adding ompi_coll_tuned_isendrecv function which posts isend and irecv requests
These changes are separate from but necessary for new algorithms I am working on.
This commit was SVN r13161.
so there's no longer a race there (I used to do the unlock request last, after local completion of all the
requests completed, to try to avoid having the passive side reply to the active side, but I don't do that
anymore). The unlock side will not "unlock" the window until it actually receives the correct number of results,
so we're good there.
This fixes an issue where we would receive data on the remote side we weren't expecting that could cause
us to release a lock before it really should have been released to the requesting peer. It could also
cause a deadlock if one of the processes trying to unlock was "self", as that would result in the active
unlock never sending the unlock request, even though it sent the payload, which could cause a counter
that should always be positive to hit -1, causing an infinite loop that could only be solved by
popping up the stack, which was an impossibility.
Refs trac:785
This commit was SVN r13160.
The following Trac tickets were found above:
Ticket 785 --> https://svn.open-mpi.org/trac/ompi/ticket/785
during their send calls by dropping the loop through the list of pending control messages if any are marked
as completed.
Refs trac:784
This commit was SVN r13159.
The following Trac tickets were found above:
Ticket 784 --> https://svn.open-mpi.org/trac/ompi/ticket/784
MPI_Aint. On 64-bit big endian machines, these can have some unpleasant
issues.
Refs trac:734
This commit was SVN r13140.
The following Trac tickets were found above:
Ticket 734 --> https://svn.open-mpi.org/trac/ompi/ticket/734
Otherwise, we end up recursively calling into the progress functions
and corrupting a list that doesn't like to be corrupted.
Refs trac:561
This commit was SVN r13138.
The following Trac tickets were found above:
Ticket 561 --> https://svn.open-mpi.org/trac/ompi/ticket/561
The 2nd parameter in MPI_WIN_CREATE is actually an address integer,
not a regular integer. The F77 prototype for this function was wrong,
causing Bad Things on some 64 bit platforms (on other 64 bit
platforms, we just got lucky).
This commit was SVN r13133.
The following Trac tickets were found above:
Ticket 732 --> https://svn.open-mpi.org/trac/ompi/ticket/732
side and only let MPI_WIN_UNLOCK return when the passive side has actively
replied that the window is unlocked.
Refs trac:761
This commit was SVN r13118.
The following Trac tickets were found above:
Ticket 761 --> https://svn.open-mpi.org/trac/ompi/ticket/761
to pass some of the tests provided by Sun. These will, of course, greatly
slow down calls to MPI_ACCUMULATE, but there's no way to pass the test
suite without them :/.
Refs trac:760
This commit was SVN r13117.
The following Trac tickets were found above:
Ticket 760 --> https://svn.open-mpi.org/trac/ompi/ticket/760
and convert it to a pointer when finding the destination addr.
Refs trac:587
This commit was SVN r13116.
The following Trac tickets were found above:
Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
we are looking at subnet_id's and we are counting active ports per subnet.
move subnet count out of procs loop, no need to do it there...
This commit was SVN r13105.
memcpy() instead of assigning the struct's by value.
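As a small illustration of the workaround, a struct containing a bool can be copied with memcpy() rather than by assignment. The struct and its fields are invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical struct that contains a bool. */
typedef struct {
    bool enabled;
    int  mtu;
} hca_params_t;

/* Copy with memcpy() instead of `*dst = *src`. */
void hca_params_copy(hca_params_t *dst, const hca_params_t *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof(*dst));
}
```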
Fixes trac:739.
This commit was SVN r13081.
The following Trac tickets were found above:
Ticket 739 --> https://svn.open-mpi.org/trac/ompi/ticket/739
Sorry for the configure change -- hopefully it's early enough in the
morning that it won't affect people... (new approach won't have a
configure change).
Refs trac:739.
This commit was SVN r13080.
The following Trac tickets were found above:
Ticket 739 --> https://svn.open-mpi.org/trac/ompi/ticket/739
Let's minimize the disturbances and say that the configure system is right.
From now on it's OPAL_BOOL_STRUCT_COPY. This one is related to r13076 and
has to follow when r13076 goes in the 1.2.
This commit was SVN r13077.
The following SVN revision numbers were found above:
r13076 --> open-mpi/ompi@f0932a0701
been fixed in the 7.0 PGI series, but is unlikely to be fixed in the
6.2 series:
* Add a configure test looking for the bad behavior (the PGI compiler
chokes on C code where structs containing bool's are copied by
value)
* Set OMPI_BOOL_STRUCT_COPY to 1 if it's ok, 0 if it's not (i.e., PGI
6.2 series will have this value set to 0)
* In two places in the code base -- orte-clean and btl_openib_ini.h,
we have a struct that contains a bool that is copied by value. In
these two places, check OMPI_BOOL_STRUCT_COPY and if it's 0, use
the "int" type instead of "bool".
Fixes trac:739
This commit was SVN r13076.
The following Trac tickets were found above:
Ticket 739 --> https://svn.open-mpi.org/trac/ompi/ticket/739
- utilizing coll_tuned_util functions
- setting line length to 80.
This implementation uses standard send messages (instead of synchronous ones).
The change improved our performance over MX several-fold; however,
there is a small chance that the last message to be sent is delayed
(until the next MPI call, which means potentially indefinitely).
If this proves to be a problem, I will modify the algorithms to use a synchronous
send as the last operation (which will incur a performance penalty again).
This commit was SVN r13071.
- in the allgather algorithms I replaced the irecv-isend-waitall sequence with a
call to ompi_coll_tuned_sendrecv
- most of the functions in util code and allgather decision function conform to 80 character line width.
This commit was SVN r13069.
corrected in the MPI 2.0 errata.
* initialized some variables so that our sensitive Sun compiler does
not warn about them when user apps are compiling.
This commit was SVN r13058.
it may contain garbage and we will try to unregister it later in btl_free().
This commit was SVN r13054.
The following Trac tickets were found above:
Ticket 729 --> https://svn.open-mpi.org/trac/ompi/ticket/729
components that use configure.m4 for configuration or are always built.
The macro has not been needed since moving to configure types other than
configure.stub
Fixes trac:590
This commit was SVN r13031.
The following Trac tickets were found above:
Ticket 590 --> https://svn.open-mpi.org/trac/ompi/ticket/590
George wrote the initial patch, I extended it slightly and am responsible for all bugs found.
Refs trac:587
This commit was SVN r13023.
The following Trac tickets were found above:
Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
that a) STATUS[ES]_IGNORE *is* NULL, and b) ROMIO blindly sends its
status through to STATUS_SET_ELEMENTS, even if the status is IGNORE.
So we have some legal cases where IGNORE can be passed through here.
Well, that's what we get for trying to do good error checking. :-(
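A sketch of an error check that still tolerates IGNORE, under the assumption stated above that IGNORE is simply NULL. The status type and function are stand-ins, not the real MPI definitions.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int count; } status_t;    /* hypothetical status type */
#define STATUS_IGNORE ((status_t *)NULL)   /* IGNORE is NULL, as noted */

/* Returns 0 on success, -1 on a bad count; quietly accepts STATUS_IGNORE. */
int status_set_elements(status_t *status, int count)
{
    if (count < 0) {
        return -1;                 /* still a genuine argument error */
    }
    if (status == STATUS_IGNORE) {
        return 0;                  /* legal: nothing to fill in */
    }
    assert(status != NULL);
    status->count = count;
    return 0;
}
```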
This commit was SVN r12999.
- Fix some fprintf's in ompi_mpi_abort() that incorrectly assumed that
we were always being invoked from MPI_ABORT (ompi_mpi_abort() may be
invoked from a bunch of different places)
- Also try to opal_backtrace_print() if opal_backtrace_buffer() is not
supported.
- Print a message in MPI_ABORT if we're aborting.
This commit was SVN r12998.
This is somewhat limited currently. For example, if you have 3 ports on Node A and 5 ports
on Node B then the peers will use 3 ports to communicate with each other.
This is on a subnet basis, so for any pair of nodes we take the
intersection of the available ports within a subnet.
We use subnets to determine reachability for lazy connection establishment. So
if Node A and Node B each have two HCAs (on separate networks) then the
subnets must be distinct, otherwise we will try to wire up HCAs on separate
networks.
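The per-subnet rule can be sketched as follows, reading "intersection" as the smaller of the two active port counts on a subnet (which matches the 3-versus-5 example). The function names and the use of a flat array of subnet IDs per node are assumptions of the sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count how many of a node's active ports sit on the given subnet. */
static size_t count_ports_on_subnet(const uint64_t *port_subnets,
                                    size_t nports, uint64_t subnet_id)
{
    size_t n = 0;
    for (size_t i = 0; i < nports; i++) {
        if (port_subnets[i] == subnet_id) {
            n++;
        }
    }
    return n;
}

/* Ports a pair of nodes can use on one subnet: the smaller of the two
 * counts, and 0 if either node has no port on that subnet. */
size_t usable_ports_on_subnet(const uint64_t *ports_a, size_t na,
                              const uint64_t *ports_b, size_t nb,
                              uint64_t subnet_id)
{
    assert(ports_a != NULL && ports_b != NULL);

    size_t a = count_ports_on_subnet(ports_a, na, subnet_id);
    size_t b = count_ports_on_subnet(ports_b, nb, subnet_id);
    return a < b ? a : b;
}
```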
This commit was SVN r12978.
* Make sure that the pval always writes to the correct portion of the
lval. This only matters on 32 bit big endian machines.
* On 32 bit machines when assigning to pval, the other 4 bytes of lval
weren't being written, which could lead to bogus data
We use macros so that there aren't casts all over the code and the pval
assignment can occur to the correct 4 bytes. Refs trac:587
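For illustration only, here is a simpler variant of the same idea: instead of steering the pval write to the correct 4 bytes, every access goes through the 64-bit field, so all 8 bytes are always written regardless of pointer width or byte order. This is a different technique from the macros described above, and the type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit slot that carries a pointer in a message header. */
typedef union {
    uint64_t lval;
    uint32_t ival;
    void    *pval;
} wire_ptr_t;

/* Store a pointer by writing the full 64-bit field. */
void wire_ptr_set(wire_ptr_t *p, void *addr)
{
    assert(p != NULL);
    p->lval = (uint64_t)(uintptr_t)addr;  /* upper bytes become 0 on 32 bit */
}

/* Read the pointer back through the same 64-bit field. */
void *wire_ptr_get(const wire_ptr_t *p)
{
    assert(p != NULL);
    return (void *)(uintptr_t)p->lval;
}
```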
This commit was SVN r12974.
The following Trac tickets were found above:
Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
the value of __n holds. This is not a problem in the first case
because sizeof(int) == sizeof(MPI_Flogical), so no alloc is actually
performed (which is most compilers, and why we haven't been bitten by
this yet). But the second case -- where sizeof(int) !=
sizeof(MPI_Flogical) -- is definitely a problem and needs the "+1" in
the alloc, or Bad Things will happen.
This commit was SVN r12953.
The following SVN revision numbers were found above:
r12712 --> open-mpi/ompi@3e11c76d4c
is allocated on a per comm_world instance, with the lowest rank
in comm_world on the given host creating and initializing the file,
and then notifying the remaining processes via the OOB.
Reviewed: Ralph Castain, Brian Barrett
Addressing ticket #674.
This commit was SVN r12949.
completion of the RDMA operation associated with the fragment. The
PML will call the BML free which in turn will call the BTL free. The MX
BTL will not release the fragment if it is not tagged with 0xff.
This commit was SVN r12947.
* Added Create_errhandler for MPI::File
* Make errors_throw_exceptions a first-class predefined exception
handler, and make it work for Comm, File, and Win
* Deal with error handlers and attributes for Files, Types, and Wins
like we do with Comms - can't just cast the callbacks from C++
signatures to C signatures. Callbacks will then fire with the
C object, not the C++ object. That's bad.
Refs trac:455
This commit was SVN r12945.
The following Trac tickets were found above:
Ticket 455 --> https://svn.open-mpi.org/trac/ompi/ticket/455
* Add line about heterogeneous support to ompi_info output
* Print warning and abort if heterogeneous detected and
no heterogeneous support available.
Refs trac:587
This commit was SVN r12943.
The following Trac tickets were found above:
Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
inuse decrement for the increment that was at the start of the procs loop.
Otherwise, the inuse count can end up higher than it actually is and a btl
can end up in the progress loop when it isn't active to any peer.
Refs trac:543
This commit was SVN r12938.
The following Trac tickets were found above:
Ticket 543 --> https://svn.open-mpi.org/trac/ompi/ticket/543
the data was buffered by the MX library. If it's the case then we declare
the send as completed and disable the completion event for the mx request.
This commit was SVN r12935.
protocol over the MX BTL. Now, we have only one matching, the one in Open
MPI.
The problem is that when the unexpected handler is triggered, not all of the
message is in host memory. In the best case we get one MX fragment (internal
MX fragment), in the worst we get NULL. The only way to fit this with the
design of the PML is to force the eager protocol at the MX internal fragment
size, and to limit the send/receive protocol at the same size. Tests show
the outcome is not far from optimal (if the pipeline depth is increased
a little bit).
Set MX_PIPELINE_LOG in order to allow MX to use internal fragments of 4K.
This commit was SVN r12930.
performance on a 2G Myrinet card, as it looks like pipelining the messages
by 1M is faster than a simple send/receive. However, when using a 10G card
the send/receive will limit the maximum bandwidth to 2.5Gbs. The reason is
the scarce bus resources that have to be shared between the Myrinet hardware
and the memcpy operation. The PUT protocol removes the memcpy, so we now have a
true zero-copy mechanism. But there is no pipelining yet, as it looks like the
RDMA pipeline somehow disappeared from the OB1 PML ...
This commit was SVN r12925.
add_error_class and add_error_code files. Also fixed the update of the
lastusedcode attribute; all of it works fine according to my tests.
Please note: the testcode attached to the bug 683 still reports some bugs. I
am, however, pretty sure that the testcode is wrong at these points:
- the standard says that the attribute MPI_LASTUSEDCODE has to be updated for
a new error_class or a new error_code. The test currently assumes, that only
the add_error_code call changes the attribute value.
- you have to comment out the two lines 73 and 74 in order to make the
test finish, since these lines check for the error string of non-existent
codes.
- line 126: the error string of MPI_ERR_ARG is not "invalid argument" but a
little bit more, so the test thinks the output is wrong. So the test probably
has to be updated to match the actual error string of MPI_ERR_ARG.
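The first point can be made concrete with a toy registry in which both operations advance the last-used value. This is a sketch under assumptions: the starting value, the single shared counter, and all names are hypothetical and not the Open MPI implementation.

```c
#include <assert.h>

/* Hypothetical: first value available for user-defined classes and codes. */
#define FIRST_DYNAMIC_CODE 100

static int last_used_code = FIRST_DYNAMIC_CODE - 1;

/* Value a lookup of the last-used-code attribute would report. */
int get_last_used_code(void)
{
    return last_used_code;
}

/* Adding a class consumes a new value and updates the attribute. */
int add_error_class(void)
{
    return ++last_used_code;
}

/* Adding a code to an existing class updates the attribute as well. */
int add_error_code(int errclass)
{
    assert(errclass >= FIRST_DYNAMIC_CODE && errclass <= last_used_code);
    (void)errclass;
    return ++last_used_code;
}
```

A test that expects the attribute to change only after add_error_code would fail against this model.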
Fixes trac:682
This commit was SVN r12913.
The following Trac tickets were found above:
Ticket 682 --> https://svn.open-mpi.org/trac/ompi/ticket/682
It contains four algorithms:
Bruck (ceil(log P) steps), Recursive Doubling (log(P) steps for power-of-2 process counts), Ring (P-1 steps),
and Neighbor Exchange (P/2 steps for even number of processes).
All algorithms passed occ, IMB-2.3, and intel verification tests from ompi-tests/ for up to 56 processes.
The fixed decision function is based on results collected over MX on the Grig cluster at
the University of Tennessee at Knoxville.
I have also added (and commented out) copy of MPICH2 decision function for allgather
(from their IJHPCA 2005 paper).
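The step counts quoted for the four algorithms can be computed with a few helper functions. The sketch assumes base-2 logarithms and returns -1 where an algorithm's stated precondition (power-of-2 or even process count) does not hold. The function names are hypothetical.

```c
#include <assert.h>

/* Smallest s such that 2^s >= p, for p >= 1. */
static int ceil_log2(int p)
{
    int steps = 0;
    for (long span = 1; span < p; span <<= 1) {
        steps++;
    }
    return steps;
}

int bruck_steps(int p)
{
    assert(p >= 1);
    return ceil_log2(p);
}

int recursive_doubling_steps(int p)
{
    if (p < 1 || (p & (p - 1)) != 0) {
        return -1;              /* only defined for power-of-2 counts */
    }
    return ceil_log2(p);
}

int ring_steps(int p)
{
    assert(p >= 1);
    return p - 1;
}

int neighbor_exchange_steps(int p)
{
    if (p < 2 || p % 2 != 0) {
        return -1;              /* only defined for even counts */
    }
    return p / 2;
}
```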
This commit was SVN r12910.
which can cause segfaults on shutdown. Calling mx_finalize() isn't enough
to shutdown the thread, so must close endpoints as well.
Refs trac:513
This commit was SVN r12908.
The following Trac tickets were found above:
Ticket 513 --> https://svn.open-mpi.org/trac/ompi/ticket/513
if the remote architecture differs from the local architecture and the
btl doesn't support heterogeneous transport.
Refs trac:587
This commit was SVN r12879.
The following Trac tickets were found above:
Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
udapl/openib/vapi/gm mpools are deprecated. The rdma mpool has a parameter that limits
its size: mpool_rdma_rcache_size_limit (default is 0, meaning unlimited).
This commit was SVN r12878.
process when creating a datatype from an internal description.
Refs trac:640
This commit was SVN r12877.
The following Trac tickets were found above:
Ticket 640 --> https://svn.open-mpi.org/trac/ompi/ticket/640
Move the req_mtl structure back to the end of each of the structures in
the CM PML. The req_mtl structure is cast into a mtl_*_request_structure
for each MTL, which is larger than the req_mtl itself. The cast will cause
the *_request to overwrite parts of the heavy requests if the req_mtl
isn't the *LAST* thing on each structure (hence the comment). This was
moved as an optimization at some point, which caused buffer sends to fail...
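The layout constraint can be shown with simplified stand-in structs. None of these are the real CM PML or MTL types; the fields are invented, and only int members are used so that no trailing padding gets in the way of the check.

```c
#include <assert.h>
#include <stddef.h>

/* Generic request part that every MTL extends (stand-in). */
typedef struct { int completed; } mtl_request_t;

/* Hypothetical MTL-specific request: larger than the generic part. */
typedef struct {
    mtl_request_t super;
    int match_bits;
    int peer;
} mtl_foo_request_t;

/* Hypothetical heavy request: req_mtl must stay LAST, because the MTL
 * casts &req_mtl to its larger type and writes past the generic part. */
typedef struct {
    int heavy_state;
    int heavy_count;
    mtl_request_t req_mtl;
} pml_send_request_t;

/* Nonzero when a member ends exactly at the end of its struct. */
int member_is_last(size_t offset, size_t member_size, size_t struct_size)
{
    assert(offset + member_size <= struct_size);
    return offset + member_size == struct_size;
}
```

If req_mtl sat before heavy_state, the extra MTL fields would land on top of the heavy request's own members, which is the kind of overwrite described above.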
Refs trac:669
This commit was SVN r12873.
The following SVN revision numbers were found above:
r12871 --> open-mpi/ompi@597598b712
The following Trac tickets were found above:
Ticket 669 --> https://svn.open-mpi.org/trac/ompi/ticket/669
CM PML. The req_mtl structure is cast into a mtl_*_request_structure for
each MTL, which is larger than the req_mtl itself. The cast will cause
the *_request to overwrite parts of the heavy requests if the req_mtl
isn't the *LAST* thing on each structure (hence the comment). This was
moved as an optimization at some point, which caused buffer sends to
fail...
Refs trac:669
This commit was SVN r12871.
The following Trac tickets were found above:
Ticket 669 --> https://svn.open-mpi.org/trac/ompi/ticket/669
Add ability for ini files to recognize "use_eager_rdma" flag. Set the
default to "no" (because we should assume that HCAs cannot support the
property necessary for using RDMA for eager messages -- that the last
byte of the message is guaranteed to be written to memory last --
unless proven otherwise. For example, iWARP cards apparently do not
provide this guarantee), and then set all Mellanox and IBM HCAs to
override the default to enable this behavior on these cards.
This commit was SVN r12851.
The following Trac tickets were found above:
Ticket 366 --> https://svn.open-mpi.org/trac/ompi/ticket/366
I found only two places that were looking at the tokens:
1. the odls - we used the tokens to separately process the globals container data from everything else. In this case, I left the subscription that returned the globals data alone, but "stripped" the subscription that returned the launch data for the procs. These subscriptions have nothing to do with the xcast message.
2. the pml_base_modex - the callback function was getting process names from the returned tokens. Actually, this function was doing a very bad thing - it was assuming that the first token returned was *always* the process name. This is currently true, but is one of those assumptions that someone could have easily changed - and suddenly found the system inexplicably failing. I modified the function to (a) get the name sent back to us, (b) "strip" the value structures of tokens and segment strings, and (c) correctly obtain process names from the returned values. I also reindented the heck out of the code so it was legible (at least, to my old eyes).
This commit was SVN r12813.
This commit fixes several aspects regarding MPI conformance of requests.
* Eliminate the last argument of ompi_errhandler_request_invoke(); we
''always'' want to invoke the back-end exception handler with the
real error code.
* Make it clear in comments that we only invoke the ''first''
exception in a given array of requests, even if there's more than
one request with a non-MPI_SUCCESS value for MPI_ERROR.
* Defer the freeing of requests upon exception in the back-end
functions to MPI_WAIT* and MPI_TEST* until later; the requests are
kept so that we know what handler to invoke when we actually invoke
the exception. After figuring that out, ''then'' we free requests
with pending exceptions on them.
* Clean up return codes from the back-end MPI_TEST* and MPI_WAIT*
functions.
* Slightly modify ompi_errcode_get_mpi_code() to return unity if it
receives an MPI error code (vs. an OMPI error code).
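The "first exception only" rule from the list above amounts to a linear scan for the first request whose error field is not success. A minimal sketch, with a hypothetical request type in place of the real one:

```c
#include <assert.h>
#include <stddef.h>

#define REQ_SUCCESS 0

typedef struct { int error; } request_t;   /* hypothetical stand-in */

/* Index of the first request with a non-success error, or -1 if none.
 * Later failures in the same array are deliberately not reported. */
int first_failed_request(const request_t *reqs, size_t count)
{
    assert(reqs != NULL || count == 0);

    for (size_t i = 0; i < count; i++) {
        if (reqs[i].error != REQ_SUCCESS) {
            return (int)i;
        }
    }
    return -1;
}
```

The caller would look up the handler for that request first, and only then free the requests with pending errors.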
This commit was SVN r12810.
The following Trac tickets were found above:
Ticket 659 --> https://svn.open-mpi.org/trac/ompi/ticket/659
usually is ok on little-endian systems, as the upper 32 bits will likely
be ignored, but on 32-bit big-endian systems, lval is complete junk.
Use ival if 32 bit mode, lval if 64.
Mixing of 32 and 64 bit architectures won't work without more changes.
This commit was SVN r12802.
* Do not add new procs to the global list during modex callback or
when sharing orte names during accept/connect. For modex, we
cache the modex info for later, in case that proc ever does get
added to the global proc list. For accept/connect orte name
exchange between the roots, we only need the orte name, so no
need to add a proc structure anyway. The procs will be added
to the global process list during the proc exchange later in
the wireup process
* Rename proc_get_namebuf and proc_get_proclist to proc_pack
and proc_unpack and extend them to include all information
needed to build that proc struct on a remote node (which
includes ORTE name, architecture, and hostname). Change
unpack to call pml_add_procs for the entire list of new
procs at once, rather than one at a time.
* Remove ompi_proc_find_and_add from the public proc
interface and make it a private function. This function
would add a half-created proc to the global proc list, so
making it harder to call is a good thing.
This means that there are only two ways to add new procs into the global proc list at this time: during MPI_INIT via the call to ompi_proc_init, where my job is added to the list, and via ompi_proc_unpack using a buffer from a packed proc list sent to us by someone else. Currently, this is enough to implement MPI semantics. We can extend the interface more if we like, but that may require HNP communication to get the remote proc information and I wanted to avoid that if at all possible.
Refs trac:564
This commit was SVN r12798.
The following Trac tickets were found above:
Ticket 564 --> https://svn.open-mpi.org/trac/ompi/ticket/564
* don't load data into a buffer until we have the data, as
the data contains some header information needed to
properly load the data
This commit was SVN r12792.