Thanks to Lisandro Dalcin for identifying the problem.
Fixes trac:4876
Submitted by George Bosilca, reviewed by Jeff Squyres.
cmr=v1.8.3:reviewer=ompi-rm1.8
This commit was SVN r32615.
The following Trac tickets were found above:
Ticket 4876 --> https://svn.open-mpi.org/trac/ompi/ticket/4876
WHAT: Merge the PMIx branch into the devel repo, creating a new
OPAL “pmix” framework to abstract PMI support for all RTEs.
Replace the ORTE daemon-level collectives with a new PMIx
server and update the ORTE grpcomm framework to support
server-to-server collectives.
WHY: We’ve had problems dealing with variations in PMI implementations,
and need to extend the existing PMI definitions to meet exascale
requirements.
WHEN: Mon, Aug 25
WHERE: https://github.com/rhc54/ompi-svn-mirror.git
Several community members have been working on a refactoring of the current PMI support within OMPI. Although the APIs are common, Slurm and Cray implement a different range of capabilities, and package them differently. For example, Cray provides an integrated PMI-1/2 library, while Slurm separates the two and requires the user to specify the one to be used at runtime. In addition, several bugs in the Slurm implementations have caused problems requiring extra coding.
All this has led to a slew of #if’s in the PMI code and bugs when the corner-case logic for one implementation accidentally traps the other. Extending this support to other implementations would have increased this complexity to an unacceptable level.
Accordingly, we have:
* created a new OPAL “pmix” framework to abstract the PMI support, with separate components for Cray, Slurm PMI-1, and Slurm PMI-2 implementations.
* Replaced the current ORTE grpcomm daemon-based collective operation with an integrated PMIx server, and updated the grpcomm APIs to provide more flexible, multi-algorithm support for collective operations. At this time, only the xcast and allgather operations are supported.
* Replaced the current global collective id with a signature based on the names of the participating procs. This allows an unlimited number of collectives to be executed by any group of processes, subject to the requirement that only one collective can be active at a time for a unique combination of procs. Note that a proc can be involved in any number of simultaneous collectives - it is the specific combination of procs that is subject to the constraint.
* removed the prior OMPI/OPAL modex code
* added new macros for executing modex send/recv to simplify use of the new APIs. The send macros allow the caller to specify whether or not the BTL supports async modex operations - if so, then the non-blocking “fence” operation is used, if the active PMIx component supports it. Otherwise, the default is a full blocking modex exchange as we currently perform.
* retained the current flag that directs us to use a blocking fence operation, but only to retrieve data upon demand
This commit was SVN r32570.
In core library portions of the configury (e.g., top-level
configure.ac itself), we were calling AC_CHECK_LIB and
OPAL_CHECK_FUNC_LIB to check for various libraries.
'''SIDENOTE:''' It turns out that modern Autoconf has AC_SEARCH_LIBS,
which does just about exactly what OPAL_CHECK_FUNC_LIB does. So this
commit effectively replaces OPAL_CHECK_FUNC_LIB with AC_SEARCH_LIBS.
However, we never bothered to add these found libraries to the wrapper
compiler list of libraries used for static linking (doh!). We've been
getting lucky for quite a while that components were adding the same
libraries to their wrapper compiler LIBS list.
This is problematic, however, if we don't build some of these
components. For example, Paul Hargrove noticed that if he configured
with --disable-shared --enable-static --disable-io-romio, ROMIO was no
longer adding some libraries to the wrapper LIBS list -- libraries
that just happened to also be needed by core OPAL/ORTE/OMPI layers.
The solution is not to use AC_CHECK_LIB or OPAL_CHECK_FUNC_LIB, but
use a pair of new macros:
* OPAL_SEARCH_LIBS_CORE: a wrapper around AC_SEARCH_LIBS. If we add
something to $LIBS, then also add it to the wrapper list of static
libraries. This is the main piece of functionality that was
wrong/missing.
* OPAL_SEARCH_LIBS_COMPONENT: similar to OPAL_SEARCH_LIBS_CORE, but
instead of directly adding it to the wrapper list of static
libraries, add it to <framework>_<component>_LIBS (which eventually
gets slurped up into the wrapper list of static libraries. See the
lengthy comment in config/opal_setup_wrappers.m4 near the beginning
of OPAL_SETUP_WRAPPER_INIT() for a more detailed explanation).
Most components did this correctly already, but one or two weren't
right, so I implemented this second macro quite similar to the
first and put it everywhere we already used AC_SEARCH_LIBS or
OPAL_CHECK_FUNC_LIB.
This needs to soak for a day or two on the trunk before moving to the
v1.8 branch.
Refs trac:4834
cmr=v1.8.2:reviewer=ggouaillardet
This commit was SVN r32447.
The following Trac tickets were found above:
Ticket 4834 --> https://svn.open-mpi.org/trac/ompi/ticket/4834
also replace the OMPI_CAST_RTE_NAME macro with
an inline function if OPAL_ENABLE_DEBUG, so we can
get warnings from the compiler if ampersand is missing.
Thanks to Paul Hargrove for reporting the bugs
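A minimal sketch of why an inline function helps here, assuming the generic name is an integer type (an assumption; the real definitions are not shown in this message). All type and function names below are hypothetical stand-ins, not the actual OMPI_CAST_RTE_NAME code. A cast macro accepts an integer where a pointer was intended, typically without any diagnostic, while a typed inline function lets the compiler complain when the ampersand is missing:

```c
#include <stdint.h>

/* Hypothetical stand-ins for the generic and RTE-specific name types. */
typedef uint64_t sketch_generic_name_t;
typedef struct {
    uint32_t jobid;
    uint32_t vpid;
} sketch_rte_name_t;

/* Hypothetical switch standing in for a debug-build flag. */
#ifndef SKETCH_ENABLE_DEBUG
#define SKETCH_ENABLE_DEBUG 1
#endif

#if SKETCH_ENABLE_DEBUG
/* Debug build: the parameter is typed, so calling this with an integer
 * (i.e., without '&') passes an integer where a pointer is expected and
 * the compiler diagnoses it. */
static inline sketch_rte_name_t *sketch_cast_rte_name(sketch_generic_name_t *name)
{
    return (sketch_rte_name_t *) name;
}
#else
/* Non-debug build: a plain cast, which also accepts an integer argument
 * and can therefore hide a missing '&'. */
#define sketch_cast_rte_name(a) ((sketch_rte_name_t *) (a))
#endif
```

With the debug variant, `sketch_cast_rte_name(n)` is flagged by the compiler, whereas `sketch_cast_rte_name(&n)` compiles cleanly in both variants.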
This commit was SVN r32408.
This fixes some duplicate symbols, once the .o files for the modules
were restored into the library (some compilers need the .o files, some
don't (!)).
Also, remove trailing whitespace. :-)
This commit was SVN r32386.
communication library should use to initialize itself.
Ralph will champion this change back with an RFC if there is a realistic
need/use case from the community.
This commit was SVN r32361.
The following SVN revision numbers were found above:
r32355 --> open-mpi/ompi@c903917f47
The only user of this code was coll/sm. I implemented a basic replacement
for the removed code. This gets the trunk compiling again with
--disable-dlopen.
This commit was SVN r32333.
common/ofacm is only used by the iboffload code in ompi. This code does
not currently work so it is safe to ignore these components until it is
fixed.
This commit was SVN r32331.
WHAT: Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL
All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have expressed interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose. UTK, with support from Sandia, developed a version of Open MPI where the entire communication infrastructure has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to complete this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic.
This commit was SVN r32317.
Let's not make the move to OPAL any harder than it has to be; this
commit can wait until after the BTL move.
This commit was SVN r32316.
The following SVN revision numbers were found above:
r32315 --> open-mpi/ompi@7b7ed8ed97
CMR'ing just to (try to) keep the differences between trunk and v1.8
branch (somewhat) small.
Reviewed by Dave Goodell
cmr=v1.8.3:reviewer=ompi-rm1.8
This commit was SVN r32315.
Previously, we were only checking connectivity upon first ''send'' to
a peer. But this ignores the case where the first communication to a
peer is actually an ACK -- i.e., we successfully received something
from the peer and we need to send an ACK back. So we need to verify
that the ACK will actually get there.
Specifically, certain asymmetric routing cases can lead to a hang if
we don't check the connectivity in both directions. E.g., if the
sender is able to get traffic to the receiver, but the receiver is
unable to get traffic back to the sender because it made a different
routing decision than the sender.
In this case, the connectivity checker from the sender could succeed
(because the connectivity checker will ACK along the same path in
which the ping was received), but sending a BTL ACK could fail
(because the BTL ACK will be sent back along the path chosen by the
graph algorithm, which, in an erroneous asymmetric routing scenario,
may be different/wrong).
Hence, we want to trigger the connectivity checker at the first
communication from A->B, which may either be a BTL send or an ACK.
Reviewed by Dave Goodell.
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r32309.
Ensure that target directories exist before creating symlinks.
cmr=v1.8.2:reviewer=jsquyres
Thanks to Jeff for stepping up as a reviewer.
This commit was SVN r32305.
- Portals4/OSC was unable to acquire an exclusive lock due to an invalid
local address in the atomic operation. This caused the reported hang.
- After fixing the hang, the test continued to fail because
ompi_datatype_is_contiguous_memory_layout() reports that MPI_EMPTY (the
origin datatype) is noncontiguous and Portals4/OSC does not support
noncontiguous datatypes at this time. However, in this case the origin
count is zero so contiguous/noncontiguous is irrelevant. Now we skip
the contiguous check if the count is zero.
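The zero-count rule described above can be sketched as follows. This is an illustration only: the function name and its boolean parameter are hypothetical and do not reproduce the Portals4/OSC code or the signature of ompi_datatype_is_contiguous_memory_layout().

```c
#include <stdbool.h>

/* Hypothetical helper: decide whether an origin buffer can be handled by
 * a path that only supports contiguous datatypes. */
static bool sketch_origin_is_supported(int origin_count, bool datatype_is_contiguous)
{
    if (0 == origin_count) {
        /* No data is transferred, so the datatype layout is irrelevant. */
        return true;
    }
    return datatype_is_contiguous;
}
```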
cmr=v1.8.3:reviewer=regrant:subject=Fix for "Portals4/MTL hangs in c_get_accumulate test"
This commit was SVN r32295.
The following Trac tickets were found above:
Ticket 4662 --> https://svn.open-mpi.org/trac/ompi/ticket/4662
Fix a copy-n-paste error: the ompi/pompi interfaces should not have
optional ierror arguments. Optional ierror arguments are only used in
the MPI_<foo> interfaces. The ompi/pompi interfaces are the actual
underlying routines (in C, incidentally, which is why they're declared
as BIND(C)), and do not have optional ierror arguments.
Also fix a typo in the BIND(C) name for pompi_win_shared_query_f().
cmr=v1.8.2:reviewer=ggouaillardet
This commit was SVN r32287.
Revert r32246, r32254, and r32255 -- they were fixing side-effects of
the real bug. Real fix coming after this one.
This commit was SVN r32286.
The following SVN revision numbers were found above:
r32246 --> open-mpi/ompi@08d2a1a48d
r32254 --> open-mpi/ompi@232d4dbb7b
QA ran across the case where the user can't write to the target
directory for the connectivity map file. In this case, we silently
continued. They requested that we at least warn in this case.
Fixes Cisco bug CSCup62821
Reviewed by Dave Goodell
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r32283.
Description:
This mod fixes a regression in the ugni btl eager get
path introduced in changeset 32196.
References:4800
Closes:4800
cmr=v1.8.2:reviewer=hjelmn
This commit was SVN r32264.
The logic was mishandling the case of a newer kernel and an older
libusnic_verbs. Simplify usnic_transport() to return constants in the
2 known cases (not a usNIC device and the TRANSPORT_USNIC_UDP case),
and call the magic probe in all other cases.
Reviewed-by: Dave Goodell <dgoodell@cisco.com>
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r32260.
If we don't explicitly tell qsort that (a == NULL && b == NULL) compares
as equal, we could end up with wonky sorting order. I.e.,
it's *possible* that some NULLs could end up in the middle of the
array.
Regardless of whether it will ever happen in practice, it makes the
code more clear to also handle the "both are NULL" case.
Also fix the 2-spacing indents.
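A minimal sketch of a qsort comparator that handles the "both are NULL" case explicitly. It sorts an array of possibly-NULL string pointers, which is an assumption made for illustration; the element type and the function names are hypothetical and are not the code touched by this commit.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical comparator: NULL entries sort after all non-NULL entries,
 * and two NULL entries compare as equal. */
static int sketch_compare_nullable(const void *pa, const void *pb)
{
    const char *a = *(const char *const *) pa;
    const char *b = *(const char *const *) pb;

    if (NULL == a && NULL == b) {
        return 0;   /* explicitly equal, so qsort sees a consistent order */
    }
    if (NULL == a) {
        return 1;   /* NULL sorts after any non-NULL entry */
    }
    if (NULL == b) {
        return -1;
    }
    return strcmp(a, b);
}

/* Sort the array in place; NULLs end up grouped at the end. */
static void sketch_sort_nullable(const char **items, size_t n)
{
    qsort(items, n, sizeof(items[0]), sketch_compare_nullable);
}
```

Without the first branch, comparing two NULL entries would return 1 regardless of argument order, which gives qsort an inconsistent ordering.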
Reviewed by Dave Goodell.
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r32259.
Simplify and fix the r32246
cmr=v1.8.2:ticket=trac:4792
This commit was SVN r32254.
The following SVN revision numbers were found above:
r32246 --> open-mpi/ompi@08d2a1a48d
The following Trac tickets were found above:
Ticket 4792 --> https://svn.open-mpi.org/trac/ompi/ticket/4792
ABSoft compilers cannot compile a Fortran subroutine
with the BIND(C, NAME="name") modifier *and* argument(s)
with the OPTIONAL modifier.
This patch detects this unsupported feature and uses
ad hoc wrappers if it is missing.
cmr=v1.8.2:reviewer=jsquyres
This commit was SVN r32246.
The wrong descriptor field was used when calculating the size received when
using the RDMA rendezvous protocol.
This commit was SVN r32232.
The following SVN revision numbers were found above:
r32196 --> open-mpi/ompi@a14e0f10d4
New automake requires the subdir-objects option. This commit resolves the following warning:
09:43:37 automake: warning: possible forward-incompatibility.
09:43:37 automake: At least a source file is in a subdirectory, but the 'subdir-objects'
09:43:37 automake: automake option hasn't been enabled. For now, the corresponding output
09:43:37 automake: object file(s) will be placed in the top-level directory. However,
09:43:37 automake: this behaviour will change in future Automake versions: they will
09:43:37 automake: unconditionally cause object files to be placed in the same subdirectory
09:43:37 automake: of the corresponding sources.
09:43:37 automake: You are advised to start using 'subdir-objects' option throughout your
09:43:37 automake: project, to avoid future incompatibilities.
09:43:37 tools/otfmerge/Makefile.common:13: warning: source file '$(OTFMERGESRCDIR)/otfmerge.c' is in a subdirectory,
09:43:37 tools/otfmerge/Makefile.common:13: but option 'subdir-objects' is disabled
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r32225.
Always include in the tarball (aka 'make dist'):
- mpi-f90-interfaces.h
- mpi-f90-cptr-interfaces.F90
cmr=v1.8.2:ticket=trac:4736
This commit was SVN r32215.
The following Trac tickets were found above:
Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
Handle OMPI_REQUEST_NOOP in MPI_Startall rather than PML
cmr=v1.8.2:reviewer=bosilca:ticket=4764
This commit was SVN r32213.
The following Trac tickets were found above:
Ticket 4764 --> https://svn.open-mpi.org/trac/ompi/ticket/4764
The connectivity map output routine needs to handle the case where
entries in the endpoints array are NULL (e.g., if one process has 2
endpoints and another process has only 1 endpoint).
Fixes Cisco bug CSCup83649.
cmr=v1.8.2
This commit was SVN r32211.
Older gfortran compilers (e.g., the gfortran that ships in RHEL5) do
not support ISO_C_BINDING, and therefore do not support the
TYPE(C_PTR) type. As such, they cannot support the overloaded
interfaces for MPI_WIN_ALLOCATE_SHARED and MPI_WIN_SHARED_QUERY that are
mandated in MPI-3.
So we separate those interfaces out into a separate .F90 file that is
#include'd in the tkr mpi.F90 file. In this separate .F90 file, we
use an #if to determine whether the compiler supports ISO_C_BINDING or
not.
Also re-jiggered the order of testing in ompi_setup_mpi_fortran.m4: we
now need to test whether the compiler supports ISO_C_BINDING even when
we're only building the mpi module (not strictly when we're building
the mpi_f08 module).
Finally, tweaked the use-mpi-tkr/Makefile.am to:
* Add some proper dependencies for mpi.F90
* Allow the general AM compilation to be used instead of supplying a
specific rule for compiling mpi.F90
cmr=v1.8.2:ticket=trac:4736
This commit was SVN r32204.
The following Trac tickets were found above:
Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
We have been getting several requests for new collectives that need to be inserted in various places of the MPI layer, all in support of either checkpoint/restart or various research efforts. Until now, this would require that the collective id's be generated at launch, which required modifications to ORTE and other places. We chose not to make collectives reusable as the race conditions associated with resetting collective counters are daunting.
This commit extends the collective system to allow self-generation of collective id's that the daemons need to support, thereby allowing developers to request any number of collectives for their work. There is one restriction: RTE collectives must occur at the process level - i.e., we don't currently have a way of tagging the collective to a specific thread. From the comment in the code:
* In order to allow scalable
* generation of collective id's, they are formed as:
*
* top 32-bits are the jobid of the procs involved in
* the collective. For collectives across multiple jobs
* (e.g., in a connect_accept), the daemon jobid will
* be used as the id will be issued by mpirun. This
* won't cause problems because daemons don't use the
* collective_id
*
* bottom 32-bits are a rolling counter that recycles
* when the max is hit. The daemon will cleanup each
* collective upon completion, so this means a job can
* never have more than 2**32 collectives going on at
* a time. If someone needs more than that - they've got
* a problem.
*
* Note that this means (for now) that RTE-level collectives
* cannot be done by individual threads - they must be
* done at the overall process level. This is required as
* there is no guaranteed ordering for the collective id's,
* and all the participants must agree on the id of the
* collective they are executing. So if thread A on one
* process asks for a collective id before thread B does,
* but B asks before A on another process, the collectives will
* be mixed and not result in the expected behavior. We may
* find a way to relax this requirement in the future by
* adding a thread context id to the jobid field (maybe taking the
* lower 16-bits of that field).
This commit includes a test program (orte/test/mpi/coll_test.c) that cycles 100 times across barrier and modex collectives.
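The id layout described in the quoted comment (jobid in the top 32 bits, rolling counter in the bottom 32 bits) can be sketched as below. The type and function names are hypothetical and chosen for illustration; they are not the identifiers used in ORTE.

```c
#include <stdint.h>

/* Hypothetical 64-bit collective id: jobid in the top 32 bits,
 * rolling counter in the bottom 32 bits. */
typedef uint64_t sketch_coll_id_t;

static inline sketch_coll_id_t sketch_make_coll_id(uint32_t jobid, uint32_t counter)
{
    return ((sketch_coll_id_t) jobid << 32) | (sketch_coll_id_t) counter;
}

static inline uint32_t sketch_coll_id_jobid(sketch_coll_id_t id)
{
    return (uint32_t) (id >> 32);
}

static inline uint32_t sketch_coll_id_counter(sketch_coll_id_t id)
{
    return (uint32_t) (id & 0xffffffffu);
}

/* Hand out the next id for a job. The counter is unsigned, so it wraps
 * back to 0 after UINT32_MAX, i.e., it "recycles when the max is hit". */
static sketch_coll_id_t sketch_next_coll_id(uint32_t jobid, uint32_t *counter)
{
    sketch_coll_id_t id = sketch_make_coll_id(jobid, *counter);
    *counter = *counter + 1u;
    return id;
}
```

Because every participant must compute the same id independently, the counter has to advance in the same order on all processes, which is the reason for the process-level restriction mentioned above.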
This commit was SVN r32203.
mca_btl_base_segment_t and replace them with des_local and des_remote
This change also updates the BTL version to 3.0.0. This commit does
not represent the final version of BTL 3.0.0. More changes are coming.
In making this change I updated all of the BTLs as well as BTL users
to use the new structure members. Please evaluate your component to
ensure the changes are correct.
RFC text:
This is the first of several BTL interface changes I am proposing for
the 1.9/2.0 release series.
What: Change naming of btl descriptor members. I propose we change
des_src and des_dst (and their associated counts) to be des_local and
des_remote. For receive callbacks the des_local member will be used to
communicate the segment information to the callback. The proposed change
will include updating all of the doxygen in btl.h as well as updating
all BTLs and BTL users to use the new naming scheme.
Why: My btl usage makes use of both put and get operations on the same
descriptor. With the current naming scheme I need to ensure that there
is consistency between the segments described in des_src and des_dst
depending on whether a put or get operation is executed. Additionally,
the current naming prevents BTLs that do not require prepare/RMA matched
operations (do not set MCA_BTL_FLAGS_RDMA_MATCHED) from executing
multiple simultaneous put AND get operations. At the moment the
descriptor can only be used with one or the other. The naming change
makes it easier for BTL users to setup/modify descriptors for RMA
operations as the local segment and remote segment are always in the
same member field. The only issue I foresee with this change is that it
will require a little more work to move BTL fixes to the 1.8 release
series.
This commit was SVN r32196.
Description: This mod fixes two name conflicts between the ugni and scif btls.
References:4771
Closes:4771
cmr=v1.8.2:reviewer=hjelmn
This commit was SVN r32183.
new flag to ompi_info that allows a user to print all MCA variables of a specific type.
--type version_string
This command will print all MCA variables of type version_string.
This feature was developed by Elena Shipunova and was reviewed by Josh Ladd.
This commit was SVN r32166.
Several problems with MPI_Win_allocate_shared and MPI_Win_shared_query
were discovered in a code review. This commit fixes them:
* Add _cptr versions of both subroutines in mpif-h, use-mpi-tkr, and
use-mpi-ignore-tkr directories
* Fix case of PMPI weak symbols for both C implementations
* Add MPI and PMPI f08 implementations of both subroutines (there is
no _cptr version in the mpi_f08 module)
* Fixed _f08 suffix on the f08 module of both subroutines
cmr=v1.8.2:ticket=trac:4736
This commit was SVN r32162.
The following Trac tickets were found above:
Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
mca_base_var_get now can return OPAL_ERR_NOT_FOUND if a variable no
longer exists. This commit updates the return code check to ensure
the correct MPI_T error code is returned to the user.
cmr=v1.8.2:reviewer=jsquyres
This commit was SVN r32161.
This commit adds a check to see if the target is in an access epoch. If
not we return OMPI_ERR_RMA_SYNC. This fixes test_start3 in the onesided
test suite. The cost of this extra check is 1 byte/peer for the boolean
flag indicating that the peer is in an access epoch.
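A minimal sketch of a per-peer access-epoch check of the kind described above. The structure, field names, and return codes are hypothetical placeholders (the real code returns OMPI_ERR_RMA_SYNC; the value used here is made up for the sketch).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return codes standing in for the real OMPI constants. */
#define SKETCH_SUCCESS        0
#define SKETCH_ERR_RMA_SYNC (-1)

/* Hypothetical window state: one boolean flag per peer. */
typedef struct {
    bool  *peer_in_access_epoch;
    size_t npeers;
} sketch_win_state_t;

/* Refuse an RMA operation unless the target is in an access epoch. */
static int sketch_check_target(const sketch_win_state_t *win, size_t target)
{
    if (target >= win->npeers || !win->peer_in_access_epoch[target]) {
        return SKETCH_ERR_RMA_SYNC;
    }
    return SKETCH_SUCCESS;
}
```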
I also fixed a problem where multiple unexpected post messages are not
correctly handled.
cmr=v1.8.2:reviewer=jsquyres
This commit was SVN r32160.
If the btl_usnic_connectivity_map MCA param is set to a non-NULL
value, then each MPI process will output a file named
<prefix>-<hostname>.pid<pid>.job<jobid>.mcwrank<MCW rank>.txt. Its
contents will detail which usNIC device(s) (and therefore which
link(s)) are being used to communicate with each peer MPI process.
Here is a sample output file (named
mpi005.pid26071.job1640759297.mcwrank0.txt):
{{{
device=usnic_0,interface=eth4,ip=10.10.0.5/16,mac=24:57:20:05:20:00,mtu=9000
device=usnic_1,interface=eth5,ip=10.2.0.5/16,mac=24:57:20:05:21:00,mtu=9000
device=usnic_2,interface=eth6,ip=10.3.0.5/16,mac=24:57:20:05:50:00,mtu=9000
peer=1,hostname=mpi006,device=usnic_0@peer_ip=10.10.0.6/16@peer_mac=24:57:20:06:20:00,device=usnic_1@peer_ip=10.2.0.6/16@peer_mac=24:57:20:06:21:00,device=usnic_2@peer_ip=10.3.0.6/16@peer_mac=24:57:20:06:50:00
peer=2,hostname=mpi007,device=usnic_0@peer_ip=10.10.0.7/16@peer_mac=24:57:20:07:20:00,device=usnic_1@peer_ip=10.2.0.7/16@peer_mac=24:57:20:07:21:00,device=usnic_2@peer_ip=10.3.0.7/16@peer_mac=24:57:20:07:50:00
peer=3,hostname=mpi008,device=usnic_0@peer_ip=10.10.0.8/16@peer_mac=24:57:20:08:20:00,device=usnic_1@peer_ip=10.2.0.8/16@peer_mac=24:57:20:08:21:00,device=usnic_2@peer_ip=10.3.0.8/16@peer_mac=24:57:20:08:50:00
}}}
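The file name pattern given above, `<prefix>-<hostname>.pid<pid>.job<jobid>.mcwrank<MCW rank>.txt`, can be assembled roughly as follows. This is a sketch only: the function name, parameter types, and the example prefix in the usage note are assumptions and do not come from the usnic BTL source.

```c
#include <stdio.h>

/* Hypothetical helper that formats the connectivity map file name.
 * Returns the number of characters that would have been written,
 * as snprintf() does. */
static int sketch_map_filename(char *buf, size_t len, const char *prefix,
                               const char *hostname, long pid,
                               unsigned long jobid, int mcw_rank)
{
    return snprintf(buf, len, "%s-%s.pid%ld.job%lu.mcwrank%d.txt",
                    prefix, hostname, pid, jobid, mcw_rank);
}
```

For example, a made-up prefix of `map` on host `mpi005` with pid 26071, jobid 1640759297, and MCW rank 0 gives `map-mpi005.pid26071.job1640759297.mcwrank0.txt`.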
Reviewed by Reese Faucette
cmr=v1.8.2
This commit was SVN r32156.
This corner case is now handled in the pml so the same code
is invoked for both MPI_Start and MPI_Startall.
This also correctly reports an error if MPI_Startall is invoked twice
on a MPI_PROC_NULL persistent request.
This commit was SVN r32139.
ibv_create_ah() can also return EHOSTUNREACH, which means that there
is no route to the peer. Treat that as a non-fatal warning.
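A sketch of the classification described above, written without calling the verbs library so that it stays self-contained. The enum and function names are hypothetical; only the EHOSTUNREACH behavior is taken from this message, and what happens for other errors is left to the existing error path.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical outcome of trying to create an address handle. */
typedef enum {
    SKETCH_AH_OK,              /* handle was created */
    SKETCH_AH_WARN_SKIP_PEER,  /* no route to this peer: warn, not fatal */
    SKETCH_AH_OTHER_ERROR      /* anything else: existing error handling */
} sketch_ah_action_t;

/* 'ah' is the pointer returned by the create call (NULL on failure) and
 * 'saved_errno' is errno captured immediately after that call. */
static sketch_ah_action_t sketch_classify_ah(const void *ah, int saved_errno)
{
    if (NULL != ah) {
        return SKETCH_AH_OK;
    }
    if (EHOSTUNREACH == saved_errno) {
        return SKETCH_AH_WARN_SKIP_PEER;
    }
    return SKETCH_AH_OTHER_ERROR;
}
```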
Reviewed by Reese Faucette.
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r32135.
There's no need for the port number (since usNIC has no port numbers),
and make the wording the same as other help messages.
Reviewed by Reese Faucette.
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r32134.
I recently found a case where ompi_mpi_abort() segv's:
{{{
$ mpirun --mca btl non_existent_btl_name ...
}}}
In this case, the BML init fails because we have no paths to any
peers. It calls ompi_mpi_abort(), but this is before ompi_comm_self
has been setup. ompi_mpi_abort() assumes that if the comm parameter
is != NULL, it can be used. But since we aborted so early in
MPI_INIT, that's a false assumption.
(note that this isn't happening on v1.8 because the check for
INIT/FINALIZE in ompi_mpi_abort() is a little different. Hence: this
is a trunk issue -- at least for now)
When fixing this problem, I noticed a few other problems in ompi_mpi_abort():
* the group access was incorrect (it didn't use accessor functions)
* it wasn't clear that ORTE's ompi_rte_abort_peers() returns
NOT_IMPLEMENTED and falls through down to ompi_rte_abort()
* the check for my proc in the communicator was a little more
complicated than necessary
* the logic for checking for aborts early in MPI_INIT wasn't right
* some comments were stale
* the hostname output in error messages would be NULL if MPI_FINALIZE
had been invoked
* it was possible to abort, but still exit with a 0 status
This commit fixes all of the above problems, and makes the logic a
little more straightforward. Thanks to Ralph Castain and George
Bosilca for the assists with this patch.
This commit was SVN r32125.
no need to #include <math.h> ...
cmr=v1.8.2:reviewer=miked:ticket=4759
This commit was SVN r32121.
The following Trac tickets were found above:
Ticket 4759 --> https://svn.open-mpi.org/trac/ompi/ticket/4759
The distances as returned by hwloc_get_whole_distance_matrix_by_type are of type float.
This patch handles all distances as float.
cmr=v1.8.2:reviewer=miked
This commit was SVN r32120.
cmr=v1.8.2:reviewer=tkordenbrock:subject=Portals4/MTL hanging fix
This commit was SVN r32113.
The following Trac tickets were found above:
Ticket 4681 --> https://svn.open-mpi.org/trac/ompi/ticket/4681
cmr=v1.8.2:reviewer=tkordenbrock:subject=Move r32112 to v1.8.2 branch
This commit was SVN r32112.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r32112
The following Trac tickets were found above:
Ticket 4682 --> https://svn.open-mpi.org/trac/ompi/ticket/4682
RHEL 7 has shipped with kernel support for the RDMA_TRANSPORT_USNIC
enum, but ''not'' the RDMA_TRANSPORT_USNIC_UDP enum. This means that
when you install usNIC drivers from cisco.com, the kernel will report
IBV_TRANSPORT_USNIC, even though the transport is actually using UDP.
Therefore, we have to modify the logic in common/verbs to do the
additional magic probe if the device reports either an
IBV_TRANSPORT_IWARP or IBV_TRANSPORT_USNIC (because both of those might
be lies -- do the probe to figure out the real transport).
The code changed by this patch is fairly trivial; it simply moves the
logic of the magic probe to its own short function, and then calls that
short function in both the IBV_TRANSPORT_(IWARP|USNIC) cases. It looks
longer because several lengthy comments were also updated.
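The control flow described above (probe whenever the reported transport is iWARP or usNIC, otherwise trust the report) can be sketched as follows. The enum, the callback type, and the fake probe are hypothetical stand-ins; the real code works with the IBV_TRANSPORT_* values and a probe that inspects the device.

```c
#include <stddef.h>

/* Hypothetical transport identifiers for the sketch. */
typedef enum {
    SKETCH_TRANSPORT_IB,
    SKETCH_TRANSPORT_IWARP,
    SKETCH_TRANSPORT_USNIC,
    SKETCH_TRANSPORT_USNIC_UDP
} sketch_transport_t;

/* Hypothetical "magic probe": inspects the device and returns the
 * transport that is really in use. */
typedef sketch_transport_t (*sketch_probe_fn_t)(void *device);

/* Both IWARP and USNIC reports may be inaccurate, so both cases call the
 * same short probe function; any other report is returned unchanged. */
static sketch_transport_t sketch_resolve_transport(sketch_transport_t reported,
                                                   sketch_probe_fn_t probe,
                                                   void *device)
{
    switch (reported) {
    case SKETCH_TRANSPORT_IWARP:
    case SKETCH_TRANSPORT_USNIC:
        return probe(device);
    default:
        return reported;
    }
}

/* Fake probe used only to exercise the sketch: always reports UDP. */
static sketch_transport_t sketch_fake_probe_udp(void *device)
{
    (void) device;
    return SKETCH_TRANSPORT_USNIC_UDP;
}
```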
Authored-by: Jeff Squyres <jsquyres@cisco.com>
Reviewed-by: Dave Goodell <dgoodell@cisco.com>
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r32098.
the other collective modules. If we end up without some of the
collectives, the code will raise an error anyway.
cmr=v1.8.2:reviewer=hjelmn
This commit was SVN r32096.
Based on extensive discussions before/at the June 2014 developer's
meeting, put a lengthy comment explaining a second reason why we
''must'' use an RTE barrier during MPI_FINALIZE and
MPI_COMM_DISCONNECT (i.e., unreliable transports). Also explain the
original reason why we do this in slightly more detail (BTLs can
lie/buffer a message without actually injecting it on the network).
This commit was SVN r32095.
rtnetlink doesn't check the source address when determining whether to
return route info for a query. So we need to check that the OIF matches
the OIF of the source interface name. Without this check, OMPI might
pair a local interface which does not have a route to a particular
remote interface.
Fixes Cisco bug CSCup55797.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r32090.
At the developer meeting today, the question was raised as to whether
the SCTP BTL was maintained any more. I emailed Alan Wagner to see if
he had any interest/resources to continue to maintain the SCTP BTL.
He indicated that he unfortunately did not have any resources to maintain it;
it would be fine to remove the SCTP BTL from the trunk.
So long, SCTP BTL... fare thee well...
This commit was SVN r32075.
This file has to be pre-emptively compiled to generate the module, but
then it also has to be included in libmpi_usempif08.
cmr=v1.8.2:ticket=trac:4736
This commit was SVN r32071.
The following Trac tickets were found above:
Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
Some parameters were omitted and compilation failed if
configured with --disable-weak-symbols
cmr=v1.8.2:ticket=trac:4736
This commit was SVN r32064.
The following Trac tickets were found above:
Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
1: find/create procs, and create associated endpoint for each
2: resolve peer addresses
The 2nd part is done as a separate loop so that the address lookups
can be parallelized.
The overall result is to split usnic_add_procs() into two smaller,
simpler parts.
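The two-part structure can be sketched as two separate loops. Everything here is hypothetical (the struct, the resolver callback, and the function names); it only illustrates the shape of the split, not the usnic_add_procs() code itself.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-peer record for the sketch. */
typedef struct {
    int  rank;
    bool has_endpoint;
    bool addr_resolved;
} sketch_peer_t;

/* Hypothetical address resolver: returns true if the peer is reachable. */
typedef bool (*sketch_resolve_fn_t)(int rank);

/* Part 1 creates an endpoint for every peer; part 2 resolves addresses in
 * its own loop, so the lookups are independent of endpoint creation.
 * Returns the number of peers whose address was resolved. */
static size_t sketch_add_procs(sketch_peer_t *peers, size_t n,
                               sketch_resolve_fn_t resolve)
{
    size_t resolved = 0;

    for (size_t i = 0; i < n; ++i) {          /* part 1 */
        peers[i].has_endpoint = true;
        peers[i].addr_resolved = false;
    }
    for (size_t i = 0; i < n; ++i) {          /* part 2 */
        if (resolve(peers[i].rank)) {
            peers[i].addr_resolved = true;
            ++resolved;
        }
    }
    return resolved;
}

/* Fake resolver used only to exercise the sketch. */
static bool sketch_resolve_even_ranks(int rank)
{
    return 0 == rank % 2;
}
```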
cmr=v1.8.2:ticket=trac:4734
This commit was SVN r32062.
The following Trac tickets were found above:
Ticket 4734 --> https://svn.open-mpi.org/trac/ompi/ticket/4734
When ibv_create_ah() fails due to an address resolution failure, it
really only means that we can't reach that one peer -- so we should
just ignore that one peer. If ibv_create_ah() fails for some other
reason, then give up on the entire usnic_X device.
Change the show_help() message that is displayed when ibv_create_ah()
fails due to address resolution failure; indicate that it's likely a
routing problem. Also opal_output_verbose() the same info, since
show_help() is de-duplicated (and this particular show_help() message
can be squelched).
Fixes Cisco bugs CSCup35851 and CSCup35872.
cmr=v1.8.2:ticket=trac:4734
This commit was SVN r32061.
The following Trac tickets were found above:
Ticket 4734 --> https://svn.open-mpi.org/trac/ompi/ticket/4734
Use the appropriate modules, don't use mpif-config.h.
cmr=v1.8.2:ticket=trac:4736
This commit was SVN r32052.
The following Trac tickets were found above:
Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
There is more comprehensive work regarding MPI_SIZEOF coming, but the
Fortran working group in the MPI Forum is debating this internally,
and I'm still doing more testing to get a final solution. So for the
moment, just remove real*16 and complex*32 support so that it compiles
properly with older compilers (that do not support real*16 and
complex*32).
This commit was SVN r32048.
Move them all to fortran/use-mpi-f08, since that's the only directory
that uses them (the use-mpi-f08-desc directory has been disabled).
cmr=v1.8.2:ticket=trac:4736
This commit was SVN r32045.
The following Trac tickets were found above:
Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
Thanks to Michael Rachner for pointing out the issue.
cmr=v1.8.2:ticket=trac:4736
This commit was SVN r32042.
The following Trac tickets were found above:
Ticket 4736 --> https://svn.open-mpi.org/trac/ompi/ticket/4736
This is part one of several Fortran improvements and fixes. This
first part removes the now-defunct scripts that are used to generate
the .f90 files in the use-mpi-tkr implementation, and just commits the
output from those scripts. This makes long-term maintenance of the
use-mpi-tkr implementation simpler.
cmr=v1.8.2:reviewer=jsquyres:subject=Various Fortran fixes/improvements
This commit was SVN r32040.
Move away from verbs-specific terms "device" and "port" in the usnic
BTL help messages. Replace them with "usNIC interface" (since usNIC
has no concept of a port).
cmr=v1.8.2:ticket=trac:4734
This commit was SVN r32029.
The following Trac tickets were found above:
Ticket 4734 --> https://svn.open-mpi.org/trac/ompi/ticket/4734
Move MACLEN and IPV4LEN into _util.h and rename them to be MACSTRLEN
and IPV4STRLEN, respectively.
cmr=v1.8.2:ticket=trac:4734
This commit was SVN r32028.
The following Trac tickets were found above:
Ticket 4734 --> https://svn.open-mpi.org/trac/ompi/ticket/4734
The post and start window calls are supposed to be matching. The code
did not check to see that an incoming post matched with the start call.
This commit fixes the bug by placing the post on a pending list that
will be checked by the next call to start.
cmr=v1.8.2:reviewer=dgoodell
This commit was SVN r32017.
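The pending-list idea described for r32017 can be sketched as a toy model. This is not the osc code itself: the class, method, and field names below are hypothetical, and the model only tracks which ranks have posted. A post that arrives before a matching start is parked on a pending list, and the next start drains that list.

```python
from collections import deque


class Window:
    """Toy model of one rank's post/start matching (all names hypothetical)."""

    def __init__(self):
        self.pending_posts = deque()  # posts that did not match an active start
        self.start_group = set()      # ranks named by the current start call
        self.matched = set()          # ranks whose post has been matched

    def _try_match(self, sender):
        # A post matches only if the sender is in the start group and
        # has not already been matched for this epoch.
        if sender in self.start_group and sender not in self.matched:
            self.matched.add(sender)
            return True
        return False

    def on_post(self, sender):
        # Incoming post: match it now, or park it for a later start.
        if not self._try_match(sender):
            self.pending_posts.append(sender)

    def start(self, group):
        # New access epoch: re-check every parked post against the new group.
        self.start_group = set(group)
        self.matched = set()
        still_pending = deque()
        for sender in self.pending_posts:
            if not self._try_match(sender):
                still_pending.append(sender)
        self.pending_posts = still_pending

    def start_complete(self):
        return self.matched == self.start_group
```

In this model a post from rank 2 that arrives early is held until start([1, 2]) is called, and a post from a rank outside the group stays on the pending list for a later epoch.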
The replace callback did not increment the incoming frag counter. This
leads to a hang during synchronization. This commit adds the increment
and also puts the request on the garbage collection list to fix a leak.
This fixes a hang found when running the mpich test suite.
cmr=v1.8.2:reviewer=bbenton
This commit was SVN r32016.
The wrong type was used when calculating the amount of space needed
for an accumulate fragment. Fixed the calculation and took the
opportunity to eliminate the get_acc header as it is identical to the
acc header.
This fixes trac:4719 and #4718
Tracking these fixes for 1.8.2 in this CMR.
Throwing this to Brad for review as he is the one who ran into the issue.
cmr=v1.8.2:reviewer=bbenton
This commit was SVN r32015.
The following Trac tickets were found above:
Ticket 4719 --> https://svn.open-mpi.org/trac/ompi/ticket/4719
This changeset:
- always calls the low-level implementation for:
  * MPI_Alltoallv
  * MPI_Neighbor_alltoallv
  * MPI_Alltoallw
  * MPI_Neighbor_alltoallw
- fixes mca_coll_tuned_alltoallv_intra_basic_inplace
  so zero-size types are correctly handled
cmr=v1.8.2:reviewer=bosilca:ticket=4715
This commit was SVN r32013.
The following Trac tickets were found above:
Ticket 4715 --> https://svn.open-mpi.org/trac/ompi/ticket/4715
in self optimization
This addresses an issue found with the MPICH pscw_ordering test. Eager sends
were not yet active, which is OK for the standard path but not for the
self optimization. Fixed by waiting for all post messages before checking
the sync state.
Fixes trac:4724
Tracking the 1.8.2 issue in this CMR.
cmr=v1.8.2:reviewer=bbenton
This commit was SVN r32012.
The following Trac tickets were found above:
Ticket 4724 --> https://svn.open-mpi.org/trac/ompi/ticket/4724
It is valid to lock/unlock MPI_PROC_NULL. It probably isn't worth tracking
whether MPI_PROC_NULL is locked for MPI_PROC_NULL RMA operations so this
is probably the permanent solution.
Closes trac:4720
Tracking the 1.8.2 issue with this CMR.
cmr=v1.8.2:reviewer=bbenton
This commit was SVN r32011.
The following Trac tickets were found above:
Ticket 4720 --> https://svn.open-mpi.org/trac/ompi/ticket/4720
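A minimal sketch of the MPI_PROC_NULL behavior described for r32011, assuming a simple per-window set of locked targets. The LockTable class is hypothetical, and the numeric value used for MPI_PROC_NULL is only a placeholder for this sketch. Lock and unlock on MPI_PROC_NULL succeed without recording any state.

```python
MPI_PROC_NULL = -2  # placeholder value for this sketch only


class LockTable:
    """Hypothetical per-window record of which targets are locked."""

    def __init__(self):
        self.locked = set()

    def lock(self, target):
        if target == MPI_PROC_NULL:
            return  # valid call, nothing to track
        if target in self.locked:
            raise RuntimeError("target %d is already locked" % target)
        self.locked.add(target)

    def unlock(self, target):
        if target == MPI_PROC_NULL:
            return  # valid call, nothing was tracked
        if target not in self.locked:
            raise RuntimeError("target %d is not locked" % target)
        self.locked.remove(target)
```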
Brad correctly pointed out that the total window size should not be an
int. Changed it to an unsigned long.
cmr=v1.8.2:reviewer=bbenton
This commit was SVN r32010.
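The reason an int is too small for a total window size can be illustrated with ctypes, which wraps values the way a C variable of that width would. The helper names are hypothetical and the segment sizes are made up for the illustration.

```python
import ctypes


def total_as_c_int(sizes):
    # ctypes does no overflow checking, so the sum wraps like a C int would.
    return ctypes.c_int(sum(sizes)).value


def total_as_c_ulong(sizes):
    # unsigned long is at least 32 bits wide and cannot go negative.
    return ctypes.c_ulong(sum(sizes)).value


# Three ranks each contributing a 1 GiB segment (illustrative numbers).
segments = [2**30, 2**30, 2**30]
wrapped = total_as_c_int(segments)
correct = total_as_c_ulong(segments)
```

With a 32-bit int the 3 GiB total wraps to a negative number, while the unsigned long total holds the true size.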
Only one field is valid for RMA requests: MPI_ERROR. This field is set
to the correct value in ompi_request_empty so there is no reason to
allocate and keep track of osc/sm requests because they are always
complete on return. Since we are no longer using the osc/sm request
structure or free list they are now removed.
Closes trac:4723
Tracking this issue with the CMR. Brad, can you verify the issue is indeed fixed?
cmr=v1.8.2:reviewer=bbenton
This commit was SVN r32009.
The following Trac tickets were found above:
Ticket 4723 --> https://svn.open-mpi.org/trac/ompi/ticket/4723
Correctly handle the corner case in MPI_Alltoallv when
some tasks have no data to transfer and some other tasks
do have data to transfer.
This test case is covered in ibm/collective/alltoallv_somezeros
from the ompi-tests repo.
cmr=v1.8.2:reviewer=bosilca
This commit was SVN r31985.
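The corner case named in r31985 can be pictured with a single-process simulation of MPI_Alltoallv semantics. This is a sketch of the expected result only, not of the coll component code. The function name and buffer layout are assumptions made for the illustration. A rank with nothing to send must still receive from peers that do have data.

```python
def simulate_alltoallv(sendbufs, sendcounts, sdispls):
    """Simulate MPI_Alltoallv for all ranks at once.

    sendbufs[r] is rank r's send buffer; sendcounts[r][p] and sdispls[r][p]
    give the length and offset of the slice rank r sends to peer p.
    Returns recvbufs, where recvbufs[p] holds the received items in rank order.
    """
    nranks = len(sendbufs)
    recvbufs = [[] for _ in range(nranks)]
    for dst in range(nranks):
        for src in range(nranks):
            count = sendcounts[src][dst]
            if count == 0:
                continue  # this pair moves no data; other pairs still proceed
            start = sdispls[src][dst]
            recvbufs[dst].extend(sendbufs[src][start:start + count])
    return recvbufs


# Rank 1 sends nothing at all but still receives from ranks 0 and 2.
bufs = [[10, 11], [], [20, 21, 22]]
counts = [[0, 1, 1], [0, 0, 0], [1, 1, 1]]
displs = [[0, 0, 1], [0, 0, 0], [0, 1, 2]]
result = simulate_alltoallv(bufs, counts, displs)
```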
is really special as the weights can be one of the following three
values (NULL, EMPTY or some legal value). As such, we need a complex
if to correctly convert the Fortran value to the corresponding C
value. Thus, always defining the c_ array is the simplest and most
straightforward approach.
cmr=v1.8.2:reviewer=jsquyres
This commit was SVN r31955.
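The three-way conversion mentioned for r31955 (NULL, EMPTY, or a legal weights array) can be sketched as follows. The sentinel names are hypothetical stand-ins, not the identifiers used in the Fortran bindings. The point shown is that the C-side variable is always defined up front and then set by one of three branches.

```python
# Hypothetical sentinels standing in for the Fortran-side special values.
F_WEIGHTS_NULL = object()
F_WEIGHTS_EMPTY = object()

# Hypothetical markers standing in for the corresponding C-side values.
C_WEIGHTS_NULL = "C_WEIGHTS_NULL"
C_WEIGHTS_EMPTY = "C_WEIGHTS_EMPTY"


def to_c_weights(f_weights):
    c_weights = None  # always defined, whichever branch is taken
    if f_weights is F_WEIGHTS_NULL:
        c_weights = C_WEIGHTS_NULL
    elif f_weights is F_WEIGHTS_EMPTY:
        c_weights = C_WEIGHTS_EMPTY
    else:
        # A legal value: copy element by element into a fresh C-side array.
        c_weights = [int(w) for w in f_weights]
    return c_weights
```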
Issue noted by Walter Spector on the user's mailing list.
Throwing to Craig Rasmussen for review.
cmr=v1.8.2:reviewer=jsquyres
This commit was SVN r31933.
This would be a really, really weird case if it ever happens (i.e.,
you have usnics but the agent process failed somewhere in MPI_INIT
such that the agent never appears), but having an infinite loop
doesn't seem like a good idea.
(does not need to go to v1.8 because v1.8 still uses RML for
communication for the connectivity checker)
This commit was SVN r31932.
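The message for r31932 does not say how the loop was bounded. One common way to avoid waiting forever is a deadline, sketched here under that assumption. The function name, the agent_is_up callback, and the timeout values are all hypothetical.

```python
import time


def wait_for_agent(agent_is_up, timeout_s=1.0, poll_s=0.01):
    """Poll until agent_is_up() returns True or the deadline passes.

    Returns True if the agent appeared, False if the wait timed out.
    """
    deadline = time.monotonic() + timeout_s
    while not agent_is_up():
        if time.monotonic() >= deadline:
            return False  # give up instead of spinning forever
        time.sleep(poll_s)
    return True
```

A caller would treat a False return as "the agent never appeared" and report an error rather than hang.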
This conservative fix tries to fetch info from both
opal_dstore_nonpeer and opal_dstore_peer.
This is required if task A spawns tasks B and C.
B was previously unable to find info from C, which caused locality
info not to be set and a hang in coll/ml init.
No CMR is required since v1.8 uses a unique dstore.
This commit was SVN r31923.
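The lookup described for r31923 amounts to trying one store and falling back to the other. A minimal sketch, modeling each dstore as a plain dictionary. The function name, key format, and lookup order are assumptions, since the message does not state which store is consulted first.

```python
def fetch_info(key, peer_store, nonpeer_store):
    """Return the value for key from whichever store has it, else None."""
    for store in (peer_store, nonpeer_store):
        if key in store:
            return store[key]
    return None


# B looks up locality info about C, which here lives only in the nonpeer store.
peer = {("B", "locality"): "node0"}
nonpeer = {("C", "locality"): "node1"}
found = fetch_info(("C", "locality"), peer, nonpeer)
```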
If eager RDMA is used, the endpoint reference_count is greater than one.
This commit is a temporary fix that calls OBJ_RELEASE on the endpoint as many times as needed.
Though this is likely correct, it can be suboptimal and hence needs to be reviewed.
cmr=v1.8.2:reviewer=hjelmn
This commit was SVN r31922.
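A toy model of the "release as much as needed" approach from r31922, assuming a simple reference-counted object. The Endpoint class and release_fully helper are hypothetical and only mimic the retain/release pattern. The sketch also shows why the message calls this suboptimal: the loop drops references without knowing who still holds them.

```python
class Endpoint:
    """Hypothetical reference-counted object, created with one reference."""

    def __init__(self):
        self.reference_count = 1
        self.destroyed = False

    def retain(self):
        self.reference_count += 1

    def release(self):
        self.reference_count -= 1
        if self.reference_count == 0:
            self.destroyed = True  # destructor would run here


def release_fully(endpoint):
    """Release until the object is destroyed; returns the number of releases."""
    releases = 0
    while not endpoint.destroyed:
        endpoint.release()
        releases += 1
    return releases
```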
http://www.open-mpi.org/community/lists/devel/2014/05/14822.php
Revamp the ORTE global data structures to reduce memory footprint and add new features. Add the ability to control/set CPU frequency, though this can only be done if the sysadmin has set up the system to support it (or you run as root).
This commit was SVN r31916.
We were still leaking 1) file descriptors for data files, and 2) some
control files. I fixed both of these leaks and everything is looking
good. This should fix the bug where we are running out of file
descriptors when running the loop_spawn test. I also took the
opportunity to refactor the code a bit to make the mapping/unmapping
simpler. This should help avoid these sorts of issues in the future.
Depends on #4678
cmr=v1.8.2:reviewer=manjugv
This commit was SVN r31893.
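The descriptor leak described for r31893 follows a familiar pattern: a file is opened to back a mapping and the descriptor is never closed. A sketch of a leak-free map/unmap pair, using Python's mmap module as a stand-in for the C calls. The function names and file layout are hypothetical and do not reflect the actual component code.

```python
import mmap
import os
import tempfile


def map_segment(path, size):
    """Create and map a backing file, closing the descriptor before returning."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        os.ftruncate(fd, size)
        # The mapping keeps its own reference to the file, so the
        # descriptor opened above is no longer needed afterwards.
        return mmap.mmap(fd, size)
    finally:
        os.close(fd)  # runs on success and on error, so fd never leaks


def unmap_segment(mapping, path):
    """Tear down the mapping and remove the backing file."""
    mapping.close()
    os.unlink(path)
```

Keeping the open, map, and close steps in one function makes it harder for a later change to skip the close on some path.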