at the top-level MPI API function. This allows two kinds of
scenarios:
1. MPI_Ireduce(..., op, ...);
MPI_Op_free(op);
MPI_Wait(...);
For the non-blocking collectives that we're someday planning -- to
make them analogous to non-blocking point-to-point stuff.
2. Thread 1:
MPI_Reduce(..., op, ...);
Thread 2:
MPI_Op_free(op);
Granted, for #2 to occur would tread a fine line between a correct and
erroneous MPI program, but it is possible (as long as the Op_free was
*after* MPI_reduce() had started to execute). It's more realistic
with case #1, where the Op_free() could be executed in the same thread
or a different thread.
This commit was SVN r7870.
stage fairly confident that
- it works in most scenarious (with symmetric hierarchies, with asymmetric
hierarchies, wihout hierarchies - it just removes itself)
- it does not create too many problems (I am not aware of any at least)
- it does not slow down startup anymore dramatically (thanks to the fixes of
Brian, Jeff, Tim and a significant reduction in the number of collective
operations in the comm_query)
Any feedback is highly welcome.
This commit was SVN r7868.
quering some of the collective components. Up to now, it just worked
somehow, but now with correct reference counting for ops in place, it
refused :-)
This commit was SVN r7866.
started to add static (fixed if) statement based decision rules based on gigE numbers
added mca params so that a user can force a certain algorithm/segment/topo on a per collective basis
(this is not in the fixed call path but only in the dynamic (at com create) call path).
(these params can be used by test suites such as OCC to choice which algorithm they are using).
This commit was SVN r7854.
--enable-mpi-{cxx,f77,f90} so that people aren't confused about what
they are actually disabling.
This should go to the 1.0 branch
This commit was SVN r7851.
This takes care of Troy's first segfault problem, and compile errors that will likely happen as soon as Ken applies George's patch and runs make again.
This commit was SVN r7833.
Note, we are failing the ring tests in the intel p2p test suite, but we seem
to fail the same tests under the current trunk.. will look into this further.
This commit was SVN r7823.
convertor (when prepared) increase the reference count on the used datatype. This reference count
will be released only when the OBJ_DESTRUCT is called on a convertor. However, having to call
OBJ_CONSTRUCT and OBJ_DESTRUCT on each request every time we want to use it (even when it come
from the cache) is an expensive operation. This can be avoided is the OBJ_DESTRUCT will leave the
convertor in exactly the same state as OBJ_CONSTRUCT. With this approach we just have to call
OBJ_CONSTRUCT for each convertor once when we initially create the request.
This commit was SVN r7813.
memory initialization, call opal_progress() to push any pending events
around and possibly yield the processor if nothing entertaining is happening.
This should probably go to the 1.0 branch.
This commit was SVN r7808.
--with-devel-headers is given to configure:
* allocator, rcache, and mpool were putting things in the wrong place
* timer wasn't installing the inline implementations at all
This commit was SVN r7805.
really crappy job of trying to emulate the inline assembly mode of GCC (and will
completely rewite the assembly, which seems to be bad in my opinion). GCC and
the AIX assembler don't see eye-to-eye on what GCC emits when doing inline
assembly. That's two compilers and no actual working support. So just punt and
fall back to XLC inline assembly or non-inlined assembly.
This commit was SVN r7800.
number of collective operations and simplifies the logic significantly.
- introducing a special case if size of comm == 1, avoiding thus collective
operations as well ( i.e. no need for hierarchies)
- fix for an unsymmetric case. Still to be tested.
This commit was SVN r7799.
its not needed and there could be multiple sources each w/ their
own sequence.
- if a write doesn't complete, need to check for non-blocking case..
This commit was SVN r7795.