These two macros set the prefix for the OPAL and ORTE libraries,
respectively. Specifically, the OPAL library will be named
libPREFIXopen-pal.la and the ORTE library will be named
libPREFIXopen-rte.la.
These macros must be called, even if the prefix argument is empty.
The intent is that Open MPI will call these macros with an empty
prefix, but other projects (such as ORCM) will call these macros with
a non-empty prefix. For example, ORCM libraries can be named
liborcm-open-pal.la and liborcm-open-rte.la.
This scheme is necessary to allow running Open MPI applications under
systems that use their own versions of ORTE and OPAL. For example,
when running MPI applications under ORTE, if the ORTE and OPAL
libraries between OMPI and ORCM are not identical (which, because they
are released at different times, are likely to be different), we need
to ensure that the OMPI applications link against their ORTE and OPAL
libraries, but the ORCM executables link against their ORTE and OPAL
libraries.
of the topology is higher than the communicator size
It is possible to have a topology degree higher than the size of the communicator.
For example, a periodic cartesian communicator on MPI_COMM_SELF. This will leave
the neighborhood collectives with a request buffer that is too small.
This commits introduces a semantic change :
from now, c_topo must be set before invoking coll_select
A problem was found with the libnbc MPI_Iallgather
routine when using intercommunicators. Special
thanks to Takahiro Kawashima(Fujitsu) for the patch
and a test case. Verified master fails without the
patch and the test passes with the patch applied.
fixes#219
of the topology is higher than the communicator size
It is possible to have a topology degree higher than the size of the communicator.
For example, a periodic cartesian communicator on MPI_COMM_SELF. This will leave
the neighborhood collectives with a request buffer that is too small. This commit
adds a call that will dynamically increase the size of the request buffer if it
is too small.
A better fix would be to create the topology *before* calling the coll_select
routine on a communicator. This will take some discussion and the solution will
not likely be ready anytime soon.
Thanks to Lisandro Dalcin for reporting this.
Original thread: http://www.open-mpi.org/community/lists/devel/2014/08/15713.php
cmr=v1.8.3:reviewer=jsquyres
This commit was SVN r32796.
reviewed by miked
cmr=v1.8.3:reviewer=ompi-rm1.8
This commit was SVN r32753.
The following SVN revision numbers were found above:
r32735 --> open-mpi/ompi@5fecf65daf
reviewed by miked
cmr=v1.8.3:reviewer=ompi-rm1.8
This commit was SVN r32740.
The following SVN revision numbers were found above:
r32735 --> open-mpi/ompi@5fecf65daf
when CHECK_AND_RECYCLE detects an error, a message is displayed
if the error occurs on an intrinsic communicator, then abort
the program (instead of trying to free the communicator)
cmr=v1.8.3:reviewer=hjelmn
This commit was SVN r32659.
The only user of this code was coll/sm. I implemented a basic replacement
for the removed code. This gets the trunk compiling again with
--disable-dlopen.
This commit was SVN r32333.
WHAT: Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL
All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have express interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose. UTK with support from Sandia, developped a version of Open MPI where the entire communication infrastucture has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to completing this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic.
This commit was SVN r32317.
new flag to ompi_info that allows a user to print all MCA variables of a specific type.
--type version_string
This command will print all MCA variables of type version_string.
This feature was developed by Elena Shipunova and was reviewed by Josh Ladd.
This commit was SVN r32166.
the other collective modules. If we endup without some of the
collective the code will raise an error anyway.
cmr=v1.8.2:reviewer=hjelmn
This commit was SVN r32096.
This changeset :
- always call the low/level implementation for :
* MPI_Alltoallv
* MPI_Neighbor_alltoallv
* MPI_Alltoallw
* MPI_Neighbor_alltoallv
- fix mca_coll_tuned_alltoallv_intra_basic_inplace
so zero size types are correctly handled
cmr=v1.8.2:reviewer=bosilca:ticket=4715
This commit was SVN r32013.
The following Trac tickets were found above:
Ticket 4715 --> https://svn.open-mpi.org/trac/ompi/ticket/4715
Correctly handle the corner case in MPI_Alltoallv when
some tasks have no data to transfer and some other tasks
do have data to transfer.
This test case is covered in ibm/collective/alltoallv_somezeros
from the ompi-tests repo.
cmr=v1.8.2:reviewer=bosilca
This commit was SVN r31985.
Avoid sending/receiving zero size messages in order to be compliant
with the top-level modification
cmr=v1.8.2:ticket=4651:reviewer=bosilca
This commit was SVN r31836.
The following Trac tickets were found above:
Ticket 4651 --> https://svn.open-mpi.org/trac/ompi/ticket/4651
This commit :
- Correctly retrieve the communicator size when
checking memory and parameters
- Ensure (sendtype,sendcount) and (recvtype,recvcount)
matches and return with MPI_ERR_TRUNCATE otherwise
- Return with MPI_SUCCESS without invoking the low level
if no data is going to be transferred
- Fixes trac:4506
cmr=v1.8.2:reviewer=bosilca
This commit was SVN r31815.
The following Trac tickets were found above:
Ticket 4506 --> https://svn.open-mpi.org/trac/ompi/ticket/4506
It is essential to call mca_base_framework_close for every framework
that is opened. coll/ml was not doing this so neither bcol nor sbgp
were getting cleaned up. This commit fixes this omission.
Also fixed a leak caused by calling OBJ_DESTRUCT for something created
with OBJ_NEW. With these changes coll/ml appears to be valgrind clean.
cmr=v1.8.2:reviewer=manjugv
This commit was SVN r31743.
MPI_Cart_Create/MPI_Graph_create/MPI_Dist_Graph
Fixes trac:4581
This commit was SVN r31716.
The following Trac tickets were found above:
Ticket 4581 --> https://svn.open-mpi.org/trac/ompi/ticket/4581
top_ompi_srcdir -> OMPI_TOP_SRCDIR
top_ompi_builddir -> OMPI_TOP_BUILDDIR
We also split the srcdir/builddir flags according to their local tree (e.g., OPAL_TOP_SRCDIR), and tied them all together in configure.ac. Renamed ompi_ignore and ompi_unignore to be opal_<foo> as these are agnostic markers.
Only thing left is ompilibdir being treated similar to what we dif for srcdir/builddir. Coming soon.
This commit was SVN r31678.
Patch from Gilles Gouaillardet on #4517 to fix handling 0-sized
messages in coll tuned with MPI_ALLTOALLV and MPI_IN_PLACE.
Reviewed by Jeff Squyres.
Fixes trac:4517
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r31521.
The following Trac tickets were found above:
Ticket 4517 --> https://svn.open-mpi.org/trac/ompi/ticket/4517
Patch from Gilles Gouaillardet on #4506 to correctly handle 0-sized
messages in coll/basic MPI_Alltoallv and MPI_Alltoallw.
Reviewed by Jeff Squyres.
Fixes trac:4506.
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r31519.
The following Trac tickets were found above:
Ticket 4506 --> https://svn.open-mpi.org/trac/ompi/ticket/4506
Ensure to also OBJ_RELEASE the neightbor and ineighbor modules.
Fixes trac:4444 (this patch is from that ticket).
This commit was SVN r31516.
The following Trac tickets were found above:
Ticket 4444 --> https://svn.open-mpi.org/trac/ompi/ticket/4444
The file coll_ml_ibarrier.c wasn't included in coll/ml's Makefile.am
and the setup code from coll_ml_hier_algorithms_ibarrier.c was not
being called. It looks like this code is stale and has long since been
replaced by the code in coll_ml_barrier.c
Once all these little CMRs are approved I may make it into one roll-up
CMR to make it easier on the RM.
cmr=v1.8.1:reviewer=manjugv
This commit was SVN r31418.
a segmentation fault in the reduce cleanup
Some of the changes address false warnings produced by scan-build. I
added asserts and changed some malloc calls to calloc to silence these
warnings.
The was one issue in cleanup for reduce since the component_functions
member is changed by the allreduce call. There may be other issues
with how this code works but releasing the allocated
component_functions after setting up the static functions addresses
the primary issue (SIGSEGV).
cmr=v1.8.1:reviewer=manjugv
This commit was SVN r31417.
some of the collective modules, the shared memory and the profiling
interface. I left out VT, dynamic fcoll and seq rmaps.
cmr=v1.8.1:reviewer=jsquyres:subject=silence Coverity reported warnings
This commit was SVN r31309.
Discussed this with Manju and we decided to back this one out until a later time.
This reverts commit r31188 and closes trac:4435
This commit was SVN r31282.
The following SVN revision numbers were found above:
r31188 --> open-mpi/ompi@f1dd589092
The following Trac tickets were found above:
Ticket 4435 --> https://svn.open-mpi.org/trac/ompi/ticket/4435
There were a couple of issues with the memory leak fixes and several more verbose
issues. This fixes those issues.
cmr=v1.8.1:ticket=trac:4473
This commit was SVN r31273.
The following Trac tickets were found above:
Ticket 4473 --> https://svn.open-mpi.org/trac/ompi/ticket/4473
Thanks to ggouaillardet for finding and fixing these issues.
Closes trac:4460
cmr=v1.8.1:reviewer=manjugv
This commit was SVN r31264.
The following Trac tickets were found above:
Ticket 4460 --> https://svn.open-mpi.org/trac/ompi/ticket/4460
The error doesn't prevent the user from running so there is no reason
to display it unless the user requested it (through coll_ml_verbose).
cmr=v1.8:reviewer=jsquyres
This commit was SVN r31242.
a hierarchy actually matches a bcol that is in use.
There was a bug in one of the paths to calculate the ml buffer size. I fixed
the bug and squashed all the paths together to avoid further issues (the
result was correct in another path that calculated the same value).
Additionally, the i_hier was being used as the bcol_index. This is not
correct in a couple of cases so I added a variable to keep track of the
real bcol_index.
cmr=v1.8:reviewer=pasha
This commit was SVN r31189.
bound.
This case is correctly handled by coll/ml so remove the check that diables
coll/ml in the not bound case.
cmr=v1.8:reviewer=manjugv
This commit was SVN r31188.
This patch fixes two leaks:
- Fix typo in fallback collective code that caused coll/ml to retain
the ibcast module twice but only release it once. One of those ibcast
saves was supposed to be bcast.
- Do not check for module initialization in the module destructor. It
is possible to destruct a module that is partially setup.
cmr=v1.8:reviewer=manjugv
This commit was SVN r31187.
This isn't causing any errors that I know about but it does fix an
annoying valgrind warning. Simple fix, no review required.
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r31130.
There are situations where coll/ml does not initialize properly. These will
eventually need to be fixed but in the meantime it is better to not always
print an error message because the collective framework can still fall back
on another collective module. This commit reduces the verbose output.
cmr=v1.7.5:reviewer=manjugv
This commit was SVN r31129.
It is usually not a good idea to assert when something is not implemented
or something goes wrong. Replace asserts with debug output and return.
cmr=v1.7.5:reviewer=manjugv
This commit was SVN r31128.
Also fixed spelling: IS_NOT_RECHABLE -> IS_NOT_REACHABLE.
Also mark a few places where opal_show_help() should have been used;
Manju will take care of these.
This commit was SVN r31104.
In r31071 I modified the logic to not increment the hierarchy level if
no processes were selected by that sbgp. That fixed a problem seen on
systems where we don't support process binding. The problem is there
is a case where we actually did select processes yet the number of
selected processes is 0. We need to increment the hierarchy in this case
as well.
This should fix the segmentation fault found by recent MTT runs. Once
this is committed to 1.7.5 remove the .ompi_ignore's from coll/ml and
bcol/ptpcoll. Tested with ompi-tests/ibm.
cmr=v1.7.5:reviewer=rhc
This commit was SVN r31081.
The following SVN revision numbers were found above:
r31071 --> open-mpi/ompi@1911d97044
This was causing JVMs to run out of stack space, and all manner of
badness ensued.
Instead, use the heap -- that's what it's there for.
cmr=v1.7.5:reviewer=rhc:subject=make coll/ml use the heap for large debug array
This commit was SVN r31073.
fails to select any processes on any nodes.
Also modified basesmsocket to only print debugging info to the framework
output.
cmr=v1.7.5:reviewer=jsquyres
This commit was SVN r31071.
- -check-shmem-params is OFF by default. It checks OSHMEM API params and will abort on bad input
- hcoll do not save fallback coll pointers for unsupported collectives.
fixed by Val, Roman, reviewed by Miked/Igor
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30995.