Commit graph

5001 commits

Author SHA1 Message Date
Ralph Castain
552c9ca5a0 George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-)
WHAT:    Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL

All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have expressed interest in having a more generic communication infrastructure, without all the OMPI layer dependencies.  This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP.  Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose.  UTK, with support from Sandia, developed a version of Open MPI where the entire communication infrastructure has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to complete this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs.  A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic.

This commit was SVN r32317.
2014-07-26 00:47:28 +00:00
Jeff Squyres
0ab4eaa7d3 usnic: revert r32315 because the BTL move to opal is ongoing
Let's not make the move to OPAL any harder than it has to be; this
commit can wait until after the BTL move.

This commit was SVN r32316.

The following SVN revision numbers were found above:
  r32315 --> open-mpi/ompi@7b7ed8ed97
2014-07-25 14:17:20 +00:00
Jeff Squyres
7b7ed8ed97 usnic: minor cleanup / consolidation
CMR'ing just to (try to) keep the differences between trunk and v1.8
branch (somewhat) small.

Reviewed by Dave Goodell

cmr=v1.8.3:reviewer=ompi-rm1.8

This commit was SVN r32315.
2014-07-25 14:11:54 +00:00
Jeff Squyres
6ae45b34fc usnic: check connectivity on first communication to a peer
Previously, we were only checking connectivity upon first ''send'' to
a peer.  But this ignores the case where the first communication to a
peer is actually an ACK -- i.e., we successfully received something
from the peer and we need to send an ACK back.  So we need to verify
that the ACK will actually get there.

Specifically, certain asymmetric routing cases can lead to a hang if
we don't check the connectivity in both directions.  E.g., if the
sender is able to get traffic to the receiver, but the receiver is
unable to get traffic back to the sender because it made a different
routing decision than the sender.

In this case, the connectivity checker from the sender could succeed
(because the connectivity checker will ACK along the same path in
which the ping was received), but sending a BTL ACK could fail
(because the BTL ACK will be sent back along the path chosen by the
graph algorithm, which, in an erroneous asymmetric routing scenario,
may be different/wrong).

Hence, we want to trigger the connectivity checker at the first
communication from A->B, which may either be a BTL send or an ACK.
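
The idea of triggering the check on the first communication of either kind can be sketched as follows. This is a minimal illustration, not the usnic BTL's code: the endpoint_t type, its fields, and the function names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical endpoint state; not the usnic BTL's real structure. */
typedef struct {
    bool connectivity_checked;  /* set once a check has been started */
    int  checks_started;        /* counts started checks, for illustration */
} endpoint_t;

/* Start the connectivity check at most once per endpoint. */
static void ensure_connectivity_check(endpoint_t *ep)
{
    assert(ep != NULL);
    if (!ep->connectivity_checked) {
        ep->connectivity_checked = true;
        ep->checks_started++;  /* a real implementation would start a ping here */
    }
}

/* Both outgoing paths go through the same gate, so whichever one
 * happens first (a data send or an ACK) triggers the check. */
static void send_data(endpoint_t *ep) { ensure_connectivity_check(ep); }
static void send_ack(endpoint_t *ep)  { ensure_connectivity_check(ep); }
```

Routing both paths through one helper means an ACK-first peer is checked exactly like a send-first peer.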

Reviewed by Dave Goodell.

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32309.
2014-07-24 21:32:56 +00:00
Rolf vandeVaart
9bc8fbaefd Create new error message so we can better pinpoint where an error occurs.
This commit was SVN r32303.
2014-07-24 15:18:55 +00:00
Rolf vandeVaart
3f703afb97 Fix CUDA registration where we run out of memory being allocated.
This commit was SVN r32297.
2014-07-23 21:10:17 +00:00
Todd Kordenbrock
42a871efd4 This commit fixes trac:4662 - "Portals4/MTL hangs in c_get_accumulate test".
- Portals4/OSC was unable to acquire an exclusive lock due to an invalid
local address in the atomic operation.  This caused the reported hang.
- After fixing the hang, the test continued to fail because
ompi_datatype_is_contiguous_memory_layout() reports that MPI_EMPTY (the
origin datatype) is noncontiguous and Portals4/OSC does not support
noncontiguous datatypes at this time.  However, in this case the origin
count is zero so contiguous/noncontiguous is irrelevant.  Now we skip
the contiguous check if the count is zero.
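
A minimal sketch of the second fix, assuming a simplified datatype record: the datatype_t type and origin_is_supported() are hypothetical stand-ins, not the Portals4/OSC code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a datatype; only the layout flag matters here. */
typedef struct {
    bool contiguous;
} datatype_t;

/* Returns true if an operation with this origin datatype and count can
 * proceed when only contiguous layouts are supported. */
static bool origin_is_supported(const datatype_t *dt, size_t count)
{
    assert(dt != NULL);
    if (count == 0) {
        return true;  /* nothing is transferred, so the layout is irrelevant */
    }
    return dt->contiguous;
}
```

Checking the count first keeps a zero-length operation from being rejected over a layout that is never read.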

cmr=v1.8.3:reviewer=regrant:subject=Fix for "Portals4/MTL hangs in c_get_accumulate test"

This commit was SVN r32295.

The following Trac tickets were found above:
  Ticket 4662 --> https://svn.open-mpi.org/trac/ompi/ticket/4662
2014-07-23 19:13:07 +00:00
Edgar Gabriel
d4f83ab929 clean up of the MCA parameters of the fcoll framework. Most parameters are now
set/retrieved in ompio instead of the fcoll components.

This commit was SVN r32294.
2014-07-23 19:03:14 +00:00
Jeff Squyres
ac2621debf usnic: show_help if we can't create the connectivity map file
QA ran across the case where the user can't write to the target
directory for the connectivity map file.  In this case, we silently
continued.  They requested that we at least warn in this case.

Fixes Cisco bug CSCup62821

Reviewed by Dave Goodell

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32283.
2014-07-22 20:50:59 +00:00
Devendar Bureddy
74852b4d21 HCOLL: fix misplaced hcoll_init return value check.
cmr=v1.8.2:reviewer=jladd

This commit was SVN r32282.
2014-07-22 18:47:34 +00:00
Rolf vandeVaart
63d6a08283 Fix set-but-unused-warning noticed by jsquyres.
cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r32281.
2014-07-22 18:37:40 +00:00
Howard Pritchard
828a4a29b7 Subject: fix regression in ugni btl eager get path
Description:
This mod fixes a regression in the ugni btl eager get
path introduced in changeset 32196.
References:4800
Closes:4800

cmr=v1.8.2:reviewer=hjelmn

This commit was SVN r32264.
2014-07-22 15:42:11 +00:00
Rolf vandeVaart
8778418da2 Remove some debug #ifdefs (oops). Other lock support.
This commit was SVN r32263.
2014-07-22 02:09:06 +00:00
Rolf vandeVaart
1a61dd3078 Add some more locks where needed.
This commit was SVN r32262.
2014-07-22 00:29:57 +00:00
Rolf vandeVaart
7897d2a828 Improve verbose message which says which device:ports are being used. Also move where message is generated.
This commit was SVN r32261.
2014-07-21 20:38:52 +00:00
Jeff Squyres
da18eb1b8b common/verbs: fix usnic detection
The logic was mishandling the case of a newer kernel and an older
libusnic_verbs.  Simplify usnic_transport() to return constants in the
2 known cases (not a usNIC device and the TRANSPORT_USNIC_UDP case),
and call the magic probe in all other cases.

Reviewed-by: Dave Goodell <dgoodell@cisco.com>

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32260.
2014-07-21 19:52:29 +00:00
Jeff Squyres
b6075ea775 usnic: explicitly handle case when both endpoints are NULL
If we don't explicitly tell qsort that two NULL entries compare as
equal (a == NULL && b == NULL), we could end up with a wonky sorting
order.  I.e., it's *possible* that some NULLs could end up in the
middle of the array.

Regardless of whether it will ever happen in practice, it makes the
code more clear to also handle the "both are NULL" case.
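
A comparator that handles every NULL combination might look like the sketch below. The endpoint_t type and its priority sort key are assumptions made for illustration; the actual sort key in the usnic BTL may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical endpoint type; only the sort key is modeled. */
typedef struct {
    int priority;
} endpoint_t;

/* qsort comparator over an array of endpoint_t pointers.
 * NULL entries sort after all non-NULL entries; two NULLs compare equal. */
static int compare_endpoints(const void *a, const void *b)
{
    const endpoint_t *ea = *(const endpoint_t *const *)a;
    const endpoint_t *eb = *(const endpoint_t *const *)b;

    if (ea == NULL && eb == NULL) return 0;
    if (ea == NULL) return 1;
    if (eb == NULL) return -1;
    return (ea->priority > eb->priority) - (ea->priority < eb->priority);
}

static void sort_endpoints(endpoint_t **eps, size_t n)
{
    assert(eps != NULL || n == 0);
    if (n == 0) return;
    qsort(eps, n, sizeof(eps[0]), compare_endpoints);
}
```

Returning 0 for the both-NULL case keeps the comparator a consistent ordering, which qsort requires in order to produce a well-defined result.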

Also fix the 2-spacing indents.

Reviewed by Dave Goodell.

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32259.
2014-07-21 16:22:48 +00:00
Rolf vandeVaart
947a4e14b4 Add a lock and clean up handling of some error conditions.
This commit was SVN r32258.
2014-07-17 19:33:10 +00:00
Mike Dubman
da8df859b3 MXM: use bulk connection establishment API
fixed by Vasily, reviewed by Yossi/Miked

cmr=v1.8.2:reviwer=ompi-rm1.8

This commit was SVN r32256.
2014-07-17 08:35:55 +00:00
Rolf vandeVaart
26e3282a18 One more minor movement for easier reading. No functional change.
This commit was SVN r32252.
2014-07-16 20:59:07 +00:00
Rolf vandeVaart
c332ca75ff Change function name for clarity.
This commit was SVN r32251.
2014-07-16 20:46:10 +00:00
Rolf vandeVaart
a2dd4ca226 Remove hack that is no longer needed.
This commit was SVN r32250.
2014-07-16 14:00:17 +00:00
Rolf vandeVaart
61821adf2f Fix self deadlock bugs.
This commit was SVN r32249.
2014-07-15 20:50:41 +00:00
Ralph Castain
6c5e592785 Revert r32222, r32210, and r32203 as they created a problem when daemon collectives did not involve app procs on every node. Instead, modify the ompi/mca/rte/orte/rte_orte.h to add a new function that allows apps to request new daemon collective ids for use in barrier and modex operations. This will only appear in ORTE-based installations, but it is only being used by a couple of researchers at the moment.
Update the orte/test/mpi/coll_test.c test to show the revised example.

This commit was SVN r32234.

The following SVN revision numbers were found above:
  r32203 --> open-mpi/ompi@a523dba41d
  r32210 --> open-mpi/ompi@2ce11ed5c4
  r32222 --> open-mpi/ompi@d55f16db50
2014-07-15 03:48:00 +00:00
Nathan Hjelm
f960e4273e Fix typo in r32196
The wrong descriptor field was used when calculating the size received when
using the RDMA rendezvous protocol.

This commit was SVN r32232.

The following SVN revision numbers were found above:
  r32196 --> open-mpi/ompi@a14e0f10d4
2014-07-14 21:00:53 +00:00
Ralph Castain
3d1b32a2c6 Silence warning
cmr=v1.8.2:reviewer=hjelmn

This commit was SVN r32231.
2014-07-14 19:27:30 +00:00
Mike Dubman
e342a11c2e opal envlist mca: implement Jeff's quibbles
fixed by Elena, reviewed by Miked

This commit was SVN r32216.
2014-07-11 07:23:20 +00:00
Gilles Gouaillardet
77184b5c4c Fix a corner case with MPI_PROC_NULL persistent requests
Handle OMPI_REQUEST_NOOP in MPI_Startall rather than PML

cmr=v1.8.2:reviewer=bosilca:ticket=4764

This commit was SVN r32213.

The following Trac tickets were found above:
  Ticket 4764 --> https://svn.open-mpi.org/trac/ompi/ticket/4764
2014-07-11 04:37:01 +00:00
Gilles Gouaillardet
d3ff5d77e1 scif: Fix compile error related to r32196
This commit was SVN r32212.

The following SVN revision numbers were found above:
  r32196 --> open-mpi/ompi@a14e0f10d4
2014-07-11 04:32:25 +00:00
Jeff Squyres
7384ee9e44 usnic: handle NULL endpoints in connectivity map
The connectivity map output routine needs to handle the case where
entries in the endpoints array are NULL (e.g., if one process has 2
endpoints and another process has only 1 endpoint).

Fixes Cisco bug CSCup83649.

cmr=v1.8.2

This commit was SVN r32211.
2014-07-11 00:43:45 +00:00
Ralph Castain
a523dba41d NOTE: this modifies the MPI-RTE interface
We have been getting several requests for new collectives that need to be inserted in various places of the MPI layer, all in support of either checkpoint/restart or various research efforts. Until now, this would require that the collective id's be generated at launch, which required modifications to ORTE and other places. We chose not to make collectives reusable as the race conditions associated with resetting collective counters are daunting.

This commit extends the collective system to allow self-generation of collective id's that the daemons need to support, thereby allowing developers to request any number of collectives for their work. There is one restriction: RTE collectives must occur at the process level - i.e., we don't currently have a way of tagging the collective to a specific thread. From the comment in the code:

 * In order to allow scalable
 * generation of collective id's, they are formed as:
 *
 * top 32-bits are the jobid of the procs involved in
 * the collective. For collectives across multiple jobs
 * (e.g., in a connect_accept), the daemon jobid will
 * be used as the id will be issued by mpirun. This
 * won't cause problems because daemons don't use the
 * collective_id
 *
 * bottom 32-bits are a rolling counter that recycles
 * when the max is hit. The daemon will cleanup each
 * collective upon completion, so this means a job can
 * never have more than 2**32 collectives going on at
 * a time. If someone needs more than that - they've got
 * a problem.
 *
 * Note that this means (for now) that RTE-level collectives
 * cannot be done by individual threads - they must be
 * done at the overall process level. This is required as
 * there is no guaranteed ordering for the collective id's,
 * and all the participants must agree on the id of the
 * collective they are executing. So if thread A on one
 * process asks for a collective id before thread B does,
 * but B asks before A on another process, the collectives will
 * be mixed and not result in the expected behavior. We may
 * find a way to relax this requirement in the future by
 * adding a thread context id to the jobid field (maybe taking the
 * lower 16-bits of that field).
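
The id layout described in that comment can be sketched as a 64-bit value. The type and function names below are hypothetical and are not taken from the ORTE sources; only the split into a 32-bit jobid and a 32-bit rolling counter comes from the comment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t collective_id_t;  /* hypothetical name */

/* Build an id: jobid in the top 32 bits, rolling counter in the bottom 32.
 * The counter is advanced on every call and wraps to 0 after UINT32_MAX. */
static collective_id_t make_collective_id(uint32_t jobid, uint32_t *counter)
{
    assert(counter != NULL);
    uint32_t seq = (*counter)++;  /* unsigned arithmetic wraps around */
    return ((uint64_t)jobid << 32) | seq;
}

/* Extract the two fields again. */
static uint32_t collective_id_jobid(collective_id_t id)
{
    return (uint32_t)(id >> 32);
}

static uint32_t collective_id_counter(collective_id_t id)
{
    return (uint32_t)(id & 0xffffffffu);
}
```

Because the counter is per process and unsynchronized in this sketch, all participants must request ids in the same order, which is the process-level restriction the comment describes.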

This commit includes a test program (orte/test/mpi/coll_test.c) that cycles 100 times across barrier and modex collectives.

This commit was SVN r32203.
2014-07-10 18:53:12 +00:00
Nathan Hjelm
1b9621eeb0 Fix typo in r32196
This commit was SVN r32202.

The following SVN revision numbers were found above:
  r32196 --> open-mpi/ompi@a14e0f10d4
2014-07-10 18:43:49 +00:00
Nathan Hjelm
32ab6f850e osc/rdma: fix warning
cmr=v1.8.2:reviewer=rhc

This commit was SVN r32201.
2014-07-10 18:42:55 +00:00
Jeff Squyres
3c4674484d usnic: Fix compile errors related to r32196
This commit was SVN r32198.

The following SVN revision numbers were found above:
  r32196 --> open-mpi/ompi@a14e0f10d4
2014-07-10 17:18:03 +00:00
Nathan Hjelm
a14e0f10d4 Per RFC: Remove des_src and des_dst members from the
mca_btl_base_segment_t and replace them with des_local and des_remote

This change also updates the BTL version to 3.0.0. This commit does
not represent the final version of BTL 3.0.0. More changes are coming.

In making this change I updated all of the BTLs as well as BTL users
to use the new structure members. Please evaluate your component to
ensure the changes are correct.

RFC text:

This is the first of several BTL interface changes I am proposing for
the 1.9/2.0 release series.

What: Change naming of btl descriptor members. I propose we change
des_src and des_dst (and their associated counts) to be des_local and
des_remote. For receive callbacks the des_local member will be used to
communicate the segment information to the callback. The proposed change
will include updating all of the doxygen in btl.h as well as updating
all BTLs and BTL users to use the new naming scheme.

Why: My btl usage makes use of both put and get operations on the same
descriptor. With the current naming scheme I need to ensure that there
is consistency between the segments described in des_src and des_dst
depending on whether a put or get operation is executed. Additionally,
the current naming prevents BTLs that do not require prepare/RMA matched
operations (do not set MCA_BTL_FLAGS_RDMA_MATCHED) from executing
multiple simultaneous put AND get operations. At the moment the
descriptor can only be used with one or the other. The naming change
makes it easier for BTL users to set up/modify descriptors for RMA
operations as the local segment and remote segment are always in the
same member field. The only issue I foresee with this change is that it
will require a little more work to move BTL fixes to the 1.8 release
series.
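
The benefit of the local/remote naming can be illustrated with a toy model in which both "sides" live in one process and the transfer is a memcpy. Everything here (segment_t, descriptor_t, sketch_put, sketch_get) is hypothetical and far simpler than the real BTL descriptor; only the idea that one descriptor serves both put and get is taken from the RFC.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified segment and descriptor types. */
typedef struct {
    void  *addr;
    size_t len;
} segment_t;

typedef struct {
    segment_t local;   /* always the caller's own buffer */
    segment_t remote;  /* always the peer's buffer */
} descriptor_t;

static size_t transfer_len(const descriptor_t *d)
{
    return d->local.len < d->remote.len ? d->local.len : d->remote.len;
}

/* put: copy local -> remote. Returns the number of bytes moved. */
static size_t sketch_put(const descriptor_t *d)
{
    assert(d != NULL);
    size_t n = transfer_len(d);
    memcpy(d->remote.addr, d->local.addr, n);
    return n;
}

/* get: copy remote -> local, using the very same descriptor. */
static size_t sketch_get(const descriptor_t *d)
{
    assert(d != NULL);
    size_t n = transfer_len(d);
    memcpy(d->local.addr, d->remote.addr, n);
    return n;
}
```

With source/destination naming, the two segments would have to swap roles between a put and a get; with local/remote naming the descriptor stays unchanged.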

This commit was SVN r32196.
2014-07-10 16:31:15 +00:00
Howard Pritchard
0bc7405e07 Subject: fix name conflict when both ugni and scif installed on system
Description: This mod fixes two name conflicts between the ugni and scif btls.
References:4771
Closes:4771

cmr=v1.8.2:reviewer=hjelmn

This commit was SVN r32183.
2014-07-09 19:33:58 +00:00
Nathan Hjelm
56ad231b7c coll/ml: temporarily disable binding check
This commit was SVN r32178.
2014-07-09 14:39:49 +00:00
Joshua Ladd
057370364d Opal: Add a new MCA variable type "version_string". Also add a
new flag to ompi_info that allows a user to print all MCA variables of a specific type.  

 --type version_string

This command will print all MCA variables of type version_string.

This feature was developed by Elena Shipunova and was reviewed by Josh Ladd.

This commit was SVN r32166.
2014-07-09 01:37:23 +00:00
Nathan Hjelm
b6abe68972 osc/rdma: check for more types of window access violations
This commit adds a check to see if the target is in an access epoch. If
not we return OMPI_ERR_RMA_SYNC. This fixes test_start3 in the onesided
test suite. The cost of this extra check is 1 byte/peer for the boolean
flag indicating that the peer is in an access epoch.
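
A per-peer flag check of this kind could be sketched as below. The window_t type, the error codes, and check_target_access() are hypothetical; the real osc/rdma module keeps considerably more state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical return codes standing in for the real OMPI constants. */
enum { SKETCH_SUCCESS = 0, SKETCH_ERR_RMA_SYNC = -1 };

/* Hypothetical window state: one boolean flag per peer. */
typedef struct {
    size_t num_peers;
    bool  *in_access_epoch;
} window_t;

/* Refuse an RMA operation unless an access epoch is open for the target. */
static int check_target_access(const window_t *win, size_t target)
{
    assert(win != NULL && target < win->num_peers);
    return win->in_access_epoch[target] ? SKETCH_SUCCESS : SKETCH_ERR_RMA_SYNC;
}
```

On common platforms sizeof(bool) is 1, which matches the 1 byte/peer cost mentioned above.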

I also fixed a problem where multiple unexpected post messages are not
correctly handled.

cmr=v1.8.2:reviewer=jsquyres

This commit was SVN r32160.
2014-07-08 21:11:12 +00:00
Jeff Squyres
d63cf04d2e btl_usnic_map.c: Arrgh! Forgot to svn add this file.
cmr=v1.8.2:ticket=trac:4773

This commit was SVN r32159.

The following Trac tickets were found above:
  Ticket 4773 --> https://svn.open-mpi.org/trac/ompi/ticket/4773
2014-07-08 20:09:31 +00:00
Jeff Squyres
1e17ab461b usnic: add btl_usnic_connectivity_map MCA param to output link information
If the btl_usnic_connectivity_map MCA param is set to a non-NULL
value, then each MPI process will output a file named
<prefix>-<hostname>.pid<pid>.job<jobid>.mcwrank<MCW rank>.txt.  Its
contents will detail which usNIC device(s) (and therefore which
link(s)) are being used to communicate with each peer MPI process.

Here is a sample output file (named
mpi005.pid26071.job1640759297.mcwrank0.txt):

{{{
device=usnic_0,interface=eth4,ip=10.10.0.5/16,mac=24:57:20:05:20:00,mtu=9000
device=usnic_1,interface=eth5,ip=10.2.0.5/16,mac=24:57:20:05:21:00,mtu=9000
device=usnic_2,interface=eth6,ip=10.3.0.5/16,mac=24:57:20:05:50:00,mtu=9000
peer=1,hostname=mpi006,device=usnic_0@peer_ip=10.10.0.6/16@peer_mac=24:57:20:06:20:00,device=usnic_1@peer_ip=10.2.0.6/16@peer_mac=24:57:20:06:21:00,device=usnic_2@peer_ip=10.3.0.6/16@peer_mac=24:57:20:06:50:00
peer=2,hostname=mpi007,device=usnic_0@peer_ip=10.10.0.7/16@peer_mac=24:57:20:07:20:00,device=usnic_1@peer_ip=10.2.0.7/16@peer_mac=24:57:20:07:21:00,device=usnic_2@peer_ip=10.3.0.7/16@peer_mac=24:57:20:07:50:00
peer=3,hostname=mpi008,device=usnic_0@peer_ip=10.10.0.8/16@peer_mac=24:57:20:08:20:00,device=usnic_1@peer_ip=10.2.0.8/16@peer_mac=24:57:20:08:21:00,device=usnic_2@peer_ip=10.3.0.8/16@peer_mac=24:57:20:08:50:00
}}}
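
The file name pattern given above can be assembled with snprintf, as in this sketch. The function name and parameter types are assumptions; only the pattern <prefix>-<hostname>.pid<pid>.job<jobid>.mcwrank<MCW rank>.txt comes from the description.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Build "<prefix>-<hostname>.pid<pid>.job<jobid>.mcwrank<rank>.txt".
 * Returns 0 on success, -1 if the buffer is too small or formatting fails. */
static int map_filename(char *buf, size_t len, const char *prefix,
                        const char *hostname, long pid,
                        unsigned long jobid, int mcw_rank)
{
    assert(buf != NULL && prefix != NULL && hostname != NULL);
    int n = snprintf(buf, len, "%s-%s.pid%ld.job%lu.mcwrank%d.txt",
                     prefix, hostname, pid, jobid, mcw_rank);
    return (n >= 0 && (size_t)n < len) ? 0 : -1;
}
```

Checking the snprintf return value against the buffer length catches silent truncation of long prefixes or host names.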

Reviewed by Reese Faucette

cmr=v1.8.2

This commit was SVN r32156.
2014-07-08 19:14:46 +00:00
Nathan Hjelm
309a6cf951 coll/ml: set n_resources to 0 when destructing an lmngr
Also keep track of the allocation base so we free the correct pointer
when cleaning up.

cmr=v1.8.2:reviewer=manjugv

This commit was SVN r32151.
2014-07-07 15:11:26 +00:00
Gilles Gouaillardet
8d3bea2771 Fix the corner case with MPI_PROC_NULL persistent requests.
This corner case is now handled in the pml so the same code
is invoked for both MPI_Start and MPI_Startall.
This also correctly reports an error if MPI_Startall is invoked twice
on an MPI_PROC_NULL persistent request.

This commit was SVN r32139.
2014-07-04 04:58:52 +00:00
Edgar Gabriel
a16e4c5bf9 As discussed during the Open MPI meeting, make ompio the default parallel I/O
library on the trunk in order to expose it to more testing.

This commit was SVN r32138.
2014-07-03 20:04:58 +00:00
Jeff Squyres
e022dd30bc usnic: EHOSTUNREACH means there is no route
ibv_create_ah() can also return EHOSTUNREACH, which means that there
is no route to the peer.  Treat that as a non-fatal warning.

Reviewed by Reese Faucette.

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32135.
2014-07-03 17:19:30 +00:00
Jeff Squyres
81edddff61 usnic: make this show_help message like the others
There's no need for the port number (since usNIC has no port numbers),
and make the wording the same as other help messages.

Reviewed by Reese Faucette.

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32134.
2014-07-03 17:17:34 +00:00
George Bosilca
843ef1fcb0 ompi_mpi_abort had one extra argument that was never used. Clean it up.
This commit was SVN r32124.
2014-07-03 00:34:44 +00:00
George Bosilca
2883adcdf3 Remove useless variables.
This commit was SVN r32123.
2014-07-03 00:30:54 +00:00
Gilles Gouaillardet
8a2a0293fd fix sort_devs_by_distance in btl/openib
no need to #include <math.h> ...

cmr=v1.8.2:reviewer=miked:ticket=4759

This commit was SVN r32121.

The following Trac tickets were found above:
  Ticket 4759 --> https://svn.open-mpi.org/trac/ompi/ticket/4759
2014-07-02 08:08:10 +00:00
Gilles Gouaillardet
134eee1c4f fix sort_devs_by_distance in btl/openib
The distances as returned by hwloc_get_whole_distance_matrix_by_type are of type float.
This patch handles all distances as float.
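
Why the float type matters when sorting can be shown with a small sketch. The dev_distance_t record and the function names are hypothetical and are not the btl/openib code; the point is only that fractional distances must be compared as floats, since truncating them to integers would make 0.5, 0.7 and 0.9 tie.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical record pairing a device index with its distance. */
typedef struct {
    int   dev_index;
    float distance;
} dev_distance_t;

/* Compare distances as floats; no cast to int, so fractions are preserved. */
static int compare_distance(const void *a, const void *b)
{
    float da = ((const dev_distance_t *)a)->distance;
    float db = ((const dev_distance_t *)b)->distance;
    return (da > db) - (da < db);
}

/* Sort devices from nearest to farthest. */
static void sort_by_distance(dev_distance_t *devs, size_t n)
{
    assert(devs != NULL || n == 0);
    if (n == 0) return;
    qsort(devs, n, sizeof(devs[0]), compare_distance);
}
```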

cmr=v1.8.2:reviewer=miked

This commit was SVN r32120.
2014-07-02 07:56:40 +00:00