at the top-level MPI API function. This allows two kinds of
scenarios:
1. MPI_Ireduce(..., op, ...);
MPI_Op_free(op);
MPI_Wait(...);
For the non-blocking collectives that we're someday planning -- to
make them analogous to non-blocking point-to-point stuff.
2. Thread 1:
MPI_Reduce(..., op, ...);
Thread 2:
MPI_Op_free(op);
Granted, for #2 to occur would tread a fine line between a correct and
erroneous MPI program, but it is possible (as long as the Op_free was
*after* MPI_reduce() had started to execute). It's more realistic
with case #1, where the Op_free() could be executed in the same thread
or a different thread.
This commit was SVN r7870.
stage fairly confident that
- it works in most scenarious (with symmetric hierarchies, with asymmetric
hierarchies, wihout hierarchies - it just removes itself)
- it does not create too many problems (I am not aware of any at least)
- it does not slow down startup anymore dramatically (thanks to the fixes of
Brian, Jeff, Tim and a significant reduction in the number of collective
operations in the comm_query)
Any feedback is highly welcome.
This commit was SVN r7868.
quering some of the collective components. Up to now, it just worked
somehow, but now with correct reference counting for ops in place, it
refused :-)
This commit was SVN r7866.
started to add static (fixed if) statement based decision rules based on gigE numbers
added mca params so that a user can force a certain algorithm/segment/topo on a per collective basis
(this is not in the fixed call path but only in the dynamic (at com create) call path).
(these params can be used by test suites such as OCC to choice which algorithm they are using).
This commit was SVN r7854.
Note, we are failing the ring tests in the intel p2p test suite, but we seem
to fail the same tests under the current trunk.. will look into this further.
This commit was SVN r7823.
convertor (when prepared) increase the reference count on the used datatype. This reference count
will be released only when the OBJ_DESTRUCT is called on a convertor. However, having to call
OBJ_CONSTRUCT and OBJ_DESTRUCT on each request every time we want to use it (even when it come
from the cache) is an expensive operation. This can be avoided is the OBJ_DESTRUCT will leave the
convertor in exactly the same state as OBJ_CONSTRUCT. With this approach we just have to call
OBJ_CONSTRUCT for each convertor once when we initially create the request.
This commit was SVN r7813.
memory initialization, call opal_progress() to push any pending events
around and possibly yield the processor if nothing entertaining is happening.
This should probably go to the 1.0 branch.
This commit was SVN r7808.
--with-devel-headers is given to configure:
* allocator, rcache, and mpool were putting things in the wrong place
* timer wasn't installing the inline implementations at all
This commit was SVN r7805.
number of collective operations and simplifies the logic significantly.
- introducing a special case if size of comm == 1, avoiding thus collective
operations as well ( i.e. no need for hierarchies)
- fix for an unsymmetric case. Still to be tested.
This commit was SVN r7799.
originally suggested by Ralf Wildenhues, to try to speed autogen, configure,
and make (and possibly even make install). Use automake's include directive
to drastically reduce the number of Makefile files (although the number of
Makefile.am files is the same - most are just included in a top-level
Makefile.am). Also use an Automake SUBDIRs feature to eliminate the
dynamic-mca tree, which was no longer really needed. This makes adding
a framework easier (since you don't have to remember the dynamic-mca
tree) and makes building faster (as make doesn't have to recurse through
the dynamic-mca tree)
This commit was SVN r7777.
and for all root nodes and passed all tests.
First cut on barrier (which from my perspective does not make sense from the
performance point of view) and on allreduce (which might make sense),
This commit was SVN r7774.
done. This version also doesn't break ompi (at least if its not chosen :-) ).
New features compared to the version from last Thursday (where bcast and
reduce seemed to work in most scenarios):
- clearer internal infrastructure
- ability to handle all root processes with a (hopefully) minimal number of
local leader communicators.
This commit was SVN r7769.
the same (since those are both mandated by MPI and <31 characters),
but change some of the back-end subroutine names so that they are <31
characters and therefore obey the F90 standard. Remove an outdated /
useless (and confusing) script.
This commit was SVN r7764.
(actually a work around for an optimisation in the reduce for not saving ops on the first recv of each segment)
Minor change in topo.
This commit was SVN r7758.
- update the hierarch stuff to use btl's instead of ptl's
- start the new logic regarding how to handle local leader communicators
This commit was SVN r7691.
reduce_inorder() function -- we don't use the tree at all.
- Add more relevant "volatile"'s for the control buffers in the
fragment mpool (and associated casts where necessary)
This commit was SVN r7616.
that on some OSes the iovec struct is not POSIX complian, the iov_len is not a size_t but simply an int.
This patch, add a local variable (type size_t) to use with the memory allocation function, and then put
back the value in the iov_len field.
This commit was SVN r7615.
on others as an char*. Thus the right side of all assignment should be explicitly casted to an void* in
order to avoid any casting complaints from the compilers.
This commit was SVN r7607.
- corrected memory hook callback to catch all allocations (need to optimize this)
- don't attempt to consolidate allocations
This commit was SVN r7600.
- Move the "process 0" logic out of the main loop in reduce to make
the code a bit less complex (at the price of slight code
duplication, but it iss now significantly easier to read)
- Fix problem with uniquenes guarantee in the bootstrap mpool -- using
the CID alone was not sufficient enough to guarantee uniquenes; now
use (CID, rank 0 process name) tuple to check for uniqueness
- Made a few debugging help changes in coll_sm.h; especially helps
debugging on uniprocessors
This commit was SVN r7599.
- Move one base global to the basic component and make it an MCA
parameter
- Convert the basic component to use the new MCA param API
This commit was SVN r7598.
discussed and cleared with Edgar.
Ensure that only processes who will be in the new communicator call
the coll selection function. It is pointless (and Bad in some cases)
for processes who are not in the new communicator to try to select a
coll module for the new communicator.
This commit was SVN r7573.
On thor, if IB is down, we get NULL back from ibv_get_devices(), which then
caused segfaults in dlist_start().
* Pretty-print error message if no HCAs found
This commit was SVN r7557.
any NICs to use
* Make mvapi, gm, and mx components all publish information, even if there
are no NICs available so that modex_recv doesn't hang. If there are no
NICs available, don't set the reachable bit, but don't do anything
to fail. This unfortunately doesn't cover the hangs that will result if
different procs load different sets of components, but it's a start
This commit was SVN r7550.
lower the default priority to 0 so that it's not active unless you
specifically ask for it (this component needs more testing by people
other than me before we unleash it on the public).
This commit was SVN r7545.
Try to detect which form this version of Open IB uses, defaulting to the 5
version if we can't figure it out (the new version has 5 params)
* Only add -lcm if it exists on the system - some versions of Open IB
apparently don't need it.
This commit was SVN r7542.
bits to expose the low bits being set. We were casting from a size_t to a void*
which is not good when working with big endian machines.
This fix makes MPI 2 dynamics work on PPC 64 (tested with a Linux OS).
This commit was SVN r7538.
The following SVN revision numbers were found above:
r7537 --> open-mpi/ompi@fd45714c03
- If we have an NS error, don't return an error -- this function's
purpose is to abort :-)
- s/abort()/exit(1)/ so that we don't drop massive corefiles
This commit was SVN r7524.
a random string of characters as part of the version number (the really
soon to happen 1.0lanl release and the 1.1sc2005 release that we've
talked about). So rather than having alpha and beta fields that must
be numeric values, have a general field that can be any alphanumeric
value.
This commit was SVN r7511.
searching the tree. Modified the memory callback to search the tree at each
page boundary for registrations. This is necessary as an application may
malloc memory and send out of any portion of that memory, even discontiguous
regions.
This commit was SVN r7510.
add misc asserts to check for proper reference counting.
ugly hack 1 -- use mallopt to never release memory ala sbrk - this is
commented out in mca_btl_mvapi_component_init
ugly hack 2 -- test registrations comming out of the tree via rcache_find, for
an unknown reason the tree is returning registrations where the address is not
within the base or bound of the registration. If this happens, we return
NULL.
comment out code to enable mem hooks if leave_pinned is set, note we can do
this via an mca param and will default it to leave_pinned with mem_hooks when
we iron out these issues.
I am adding a unit test for the rcache. Note that we have a unit test for the
rb tree but the compare function is significantly different than that used for
registrations. After we have tracked down the issues with rcache_rb we will
remove the above hacks.
This commit was SVN r7499.
Red Storm is complaining about invalid memory pointer (need to go back
to Linux and look at this with valgrind)
* Turn off send in place for now, so I can run the tests on RS and see if
everything else is ok
This commit was SVN r7497.
* fix dumb bug in btl_portals_get where I using the dest descriptor key instead
of the source descriptor key for the match bits, resulting in a PtlGet() with
the wrong match bits
This commit was SVN r7496.
need to remove old ones from the front
- call deregister to actually remove items from the cache/mru
list and deregister the memory (if not being used)
This commit was SVN r7476.