Do not ignore the type and extent of the last optimized basic type in some special cases.
Update the last fake END_LOOP with the correct value for the first_elem_disp field.
This commit was SVN r8023.
never stored on the stack. It is partially stored on the stack depending on the loops,
but every time we pack/unpack a basic datatype we take its displacement into account again.
This approach makes the whole logic a lot simpler. At the same time I split the big functions
into several basic blocks.
This commit was SVN r8021.
- if the alignment of wchar is zero then wchar_t is not supported by the OS. We skip it.
- Now that the definition of end_loop has changed, compute the first_elem_description for all
predefined datatypes.
- In debug mode print a list of the datatypes that are not supported by the current
architecture.
This commit was SVN r8020.
We can compute the number of complete datatypes that we will advance, update the stack and
then compute the new position taking into account only the remaining bytes.
This commit was SVN r8019.
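The position computation described in r8019 can be sketched as follows. This is a
minimal illustration, assuming a datatype with a fixed packed size and a fixed extent;
the names position_t and compute_position are hypothetical and are not taken from the
Open MPI source.

```c
#include <stddef.h>

/* Hypothetical result type: how far a packed byte offset reaches into a
 * sequence of identical datatypes. */
typedef struct {
    size_t complete;   /* number of whole datatypes skipped */
    size_t remaining;  /* bytes left over inside the next datatype */
    size_t disp;       /* memory displacement of the start of that datatype */
} position_t;

/* Split a packed byte offset into whole datatypes plus a remainder.
 * dt_size is the packed size of one datatype, dt_extent its extent in
 * memory. Assumes dt_size > 0. */
static position_t compute_position(size_t offset, size_t dt_size, size_t dt_extent)
{
    position_t pos;
    pos.complete  = offset / dt_size;          /* complete datatypes advanced */
    pos.remaining = offset % dt_size;          /* only these bytes need a walk */
    pos.disp      = pos.complete * dt_extent;  /* jump over the complete ones */
    return pos;
}
```

Only the remaining bytes then have to be located by walking the datatype description,
which is the saving the commit message refers to.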
one that does not return any value. There are 2 exceptions: MPI_Wtick and MPI_Wtime.
For these 2 we can insert the bindings manually.
This commit was SVN r8016.
restoring the PMPI version. A variety of reasons for this:
- mpi.h was blindly using inline in a C header without the configure mojo
to properly set it, as mpi.h doesn't include ompi_config.h. This eventually
would have caused a borked build.
- mpi.h and mpif.h were never updated to not include PMPI_W{tick,time} as
a proper prototype
- The C++ and F90 bindings didn't do the right things when there was no
PMPI version of the C call, but profiling was enabled
- Since we only use gettimeofday, the function call overhead really doesn't
matter
This should probably go to the 1.0 branch
This commit was SVN r8014.
"MPI_OP_foo")
- Remove all the handlers for MPI_REPLACE for general reductions
(it's only defined for MPI_ACCUMULATE, and ACCUMULATE is handled
differently than the other reductions, so it's safe to make all the
maps for REPLACE be empty)
This commit was SVN r8008.
This file is not yet activated. It will become the default after the next
commit. (checkpoint to start testing on other clusters)
This commit was SVN r8006.
MPI_UNSIGNED_LONG_LONG, MPI_LONG_LONG, and MPI_LONG_LONG_INT --
although I already had implementations of all the relevant functions
for these types. Doh!
This commit was SVN r7944.
go through the dynamic decision rule interface.
(forced algorithms are set with MCA params)
fixed some silly verbose output with wrong func name in it etc
updates to fixed dec rules.
This commit was SVN r7940.
modules, if its priority is zero (the default value). Reason for that is
+ if there is no other module with a priority > 0, the hierarchical
collective module has a problem anyway, since it has to rely on the coll
modules of the subcommunicators. On the other hand, if its priority is
zero, it won't be chosen anyway, and we can simply save the
allreduce/allgather and comm_split operations which might occur during
hierarchy detection.
+ to improve the startup times until we have the modex thing which we
discussed with Jeff and Tim in Knoxville in place
- adding an mca parameter indicating a symmetric configuration. This can
speed up startup times, since each process can infer the data of the other
processes from its own -> no need for the allreduce operations. Per
default this parameter is set to "no".
This commit was SVN r7932.
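A rough sketch of the two shortcuts described in r7932, assuming a query function that
decides whether the hierarchical module offers itself. All names here (hier_comm_query,
detect_hierarchy, collective_calls) are hypothetical; the counter only stands in for the
allreduce/allgather and comm_split calls mentioned in the message.

```c
#include <stdbool.h>

/* Hypothetical counter standing in for the collective operations that
 * hierarchy detection would issue. */
static int collective_calls = 0;

static void detect_hierarchy(bool symmetric)
{
    collective_calls++;        /* e.g. the comm_split */
    if (!symmetric) {
        collective_calls++;    /* extra allreduce to learn the other processes' data */
    }
}

/* Returns true if the module offers itself for selection. */
static bool hier_comm_query(int priority, bool symmetric)
{
    if (priority <= 0) {
        return false;          /* would not be chosen anyway: skip detection */
    }
    detect_hierarchy(symmetric);
    return true;
}
```

With the default priority of zero no collective operation is issued at all, and with a
symmetric configuration the allreduce step is skipped.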
larger than 32K for inter-node transfers ... and then they do not support iovecs larger than
16K for inter-node transfers. Therefore we have to set the size of our first fragment to
16K to match both cases.
This commit was SVN r7926.
REDUCE_SCATTER to more thoroughly check the datatype/op combination
to see if it's valid or not. If it's not, print a meaningful error
message rather than "Invalid MPI_Op" indicating what specifically
was wrong (therefore hopefully helping users track down where in the
code the problem is, and/or telling us that there's a reduction
operation combo that we don't support that we should)
- The check for whether a datatype is intrinsic needed to be updated
-- it's not sufficient to check that dtype->id < DT_MAX_PREDEFINED;
you really need to check the PREDEFINED flag on the datatype.
Thanks to George for this fix (only intrinsics have a meaningful
value in dtype->id).
This commit was SVN r7923.
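The intrinsic check fixed in r7923 can be illustrated as below. This is only a sketch:
the struct layout, the flag name DT_FLAG_PREDEFINED and the numeric values are
assumptions made for the example, not the actual Open MPI definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical values and layout, for illustration only. */
#define DT_MAX_PREDEFINED  32
#define DT_FLAG_PREDEFINED 0x0001

typedef struct {
    uint16_t flags;
    uint16_t id;    /* only meaningful for intrinsic datatypes */
} dtype_t;

/* Not sufficient: a derived datatype can carry an id that happens to be
 * below the limit, because its id field has no meaning. */
static bool is_intrinsic_by_id(const dtype_t *dtype)
{
    return dtype->id < DT_MAX_PREDEFINED;
}

/* Sufficient: test the predefined flag itself. */
static bool is_intrinsic(const dtype_t *dtype)
{
    return (dtype->flags & DT_FLAG_PREDEFINED) != 0;
}
```

A derived datatype whose id field is zero passes the id-based test but is correctly
rejected by the flag-based one.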
the base send and receive request from the pml_base, we can solve our problem
if we construct the convertor attached to any request in the pml_base_construct
function. At the end of the lifetime of each request (here lifetime is
related to one utilisation, without taking the cache into account) we release
all information attached to the convertors in the _FINI macro by calling the
ompi_convertor_cleanup.
This commit was SVN r7910.
to be supported by Mellanox VAPI. Perhaps this will be supported in the near
future; for now it doesn't hurt to have it in the trunk.
Also clean up the receive descriptor posting macros.
This commit was SVN r7903.
at the top-level MPI API function. This allows two kinds of
scenarios:
  1. MPI_Ireduce(..., op, ...);
     MPI_Op_free(op);
     MPI_Wait(...);
     For the non-blocking collectives that we're someday planning -- to
     make them analogous to non-blocking point-to-point stuff.
  2. Thread 1:
       MPI_Reduce(..., op, ...);
     Thread 2:
       MPI_Op_free(op);
Granted, for #2 to occur would tread a fine line between a correct and
erroneous MPI program, but it is possible (as long as the Op_free was
*after* MPI_Reduce() had started to execute). It's more realistic
with case #1, where the Op_free() could be executed in the same thread
or a different thread.
This commit was SVN r7870.
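The reference counting behind scenario 1 of r7870 can be sketched as follows. The type
op_t and the functions below are hypothetical stand-ins, not the Open MPI object system,
and the plain integer counter is an assumption that ignores thread safety (scenario 2
would need atomic updates).

```c
#include <stdbool.h>

/* Hypothetical reference-counted op. */
typedef struct {
    int  refcount;
    bool destroyed;   /* set once the last reference is dropped */
} op_t;

static void op_init(op_t *op)   { op->refcount = 1; op->destroyed = false; }
static void op_retain(op_t *op) { op->refcount++; }

static void op_release(op_t *op)
{
    if (--op->refcount == 0) {
        op->destroyed = true;   /* a real implementation would free resources here */
    }
}

/* User-visible free: drops only the user's own reference. */
static void op_free(op_t *op) { op_release(op); }

/* The top-level reduction retains the op on entry and releases it on completion. */
static void reduce_start(op_t *op)    { op_retain(op); }
static void reduce_complete(op_t *op) { op_release(op); }
```

Because the reduction holds its own reference, freeing the op between the start and the
completion of the reduction does not destroy it; destruction happens when the reduction
completes.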
stage fairly confident that
- it works in most scenarios (with symmetric hierarchies, with asymmetric
hierarchies, without hierarchies - it just removes itself)
- it does not create too many problems (I am not aware of any at least)
- it no longer slows down startup dramatically (thanks to the fixes of
Brian, Jeff, Tim and a significant reduction in the number of collective
operations in the comm_query)
Any feedback is highly welcome.
This commit was SVN r7868.
querying some of the collective components. Up to now, it just worked
somehow, but now with correct reference counting for ops in place, it
refused :-)
This commit was SVN r7866.
started to add static (fixed if) statement based decision rules based on gigE numbers
added mca params so that a user can force a certain algorithm/segment/topo on a per collective basis
(this is not in the fixed call path but only in the dynamic (at comm create) call path).
(these params can be used by test suites such as OCC to choose which algorithm they are using).
This commit was SVN r7854.
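A sketch of the two decision paths described in r7854. The algorithm names, the
thresholds and both function names are made up for illustration; they are not the
measured gigE numbers nor the actual MCA parameter names. Falling back to the fixed
rules when nothing is forced is also an assumption.

```c
#include <stddef.h>

/* Hypothetical algorithm identifiers; ALG_AUTO means "nothing forced". */
enum { ALG_AUTO = 0, ALG_LINEAR = 1, ALG_BINOMIAL = 2, ALG_PIPELINE = 3 };

/* Fixed path: plain if statements on communicator size and message size.
 * The thresholds are placeholders. */
static int fixed_decision(int comm_size, size_t msg_bytes)
{
    if (comm_size < 4) {
        return ALG_LINEAR;
    }
    if (msg_bytes < 8192) {
        return ALG_BINOMIAL;
    }
    return ALG_PIPELINE;
}

/* Dynamic path: a user-forced algorithm wins; otherwise this sketch
 * falls back to the fixed rules. */
static int dynamic_decision(int forced, int comm_size, size_t msg_bytes)
{
    if (forced != ALG_AUTO) {
        return forced;
    }
    return fixed_decision(comm_size, msg_bytes);
}
```

Keeping the override out of fixed_decision mirrors the note that forced parameters are
only honoured on the dynamic call path.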