1
1

69 Коммитов

Автор SHA1 Сообщение Дата
Jeff Squyres
34897cee9f usnic: unify teardown between trunk and v1.8 branches
Make the del_procs, module finalize, and endpoint destructors be the
same between trunk and v1.8, with one exception: the very beginning of
v1.8 module_finalize calls del_procs for each proc to simulate/pretend
the trunk/v1.9 PML behavior of calling del_procs before module_finalize.

This commit was SVN r32437.
2014-08-05 22:31:55 +00:00
Jeff Squyres
1a8d72119f usnic: Fix configure.ac typo
This commit was SVN r32436.
2014-08-05 22:31:07 +00:00
Dave Goodell
13b104bdef usnic: fix endpoint destruction on the trunk
Fixes an assertion failure in --enable-debug builds and SEGVs in normal
builds.

I'm not 100% sure I like this model, but it at least seems to be
consistent.  Some variation on this scheme will need to be adapted to
the trunk, where usnic_del_procs() is called by the PML instead of
internally in usnic_finalize().

A related bug (but with different mechanics) is #4832.

This commit was SVN r32424.
2014-08-04 21:30:21 +00:00
Dave Goodell
490c484f8c usnic: fix uninitialized param to accept(2)
This commit was SVN r32423.
2014-08-04 21:30:08 +00:00
Dave Goodell
61a9b49d5b usnic: fix usnic breakage in ORCM repo
This commit was SVN r32416.
2014-08-04 19:34:55 +00:00
Jeff Squyres
ff4717b727 usnic: cagent now checks that incoming pings are expected
Previously, the connectivity agent was pretty dumb: it took whatever
pings it got and ACKed them.  Then we added an agent check to ensured
that the ping actually came from the source interface that it said it
came from.  Now we add another check such that when a ping is received
on interface X that corresponds to usnic module Y, we ensure that the
source interface of the ping is on the all_endpoints list for module Y
(i.e., module Y expects to be able to talk to that peer interface).

This detects cases where peers have come to different conclusions
about which interfaces should be used to communicate (which is bad!).
This usually reflects a network misconfiguration.

Fixes CSCuq05389.

This commit was SVN r32383.
2014-07-31 22:30:20 +00:00
Ralph Castain
db89071dc2 Cleanup the moved component's Makefile.am to use the opal instead of ompi directories
This commit was SVN r32370.
2014-07-31 04:41:04 +00:00
Jeff Squyres
959bdace3c usnic: check that connectivity pings came from where they said they came from
Ensure that incoming "ping" messages came from the IP address that
they think they came from.  If they don't, drop them (because it is
probably routing error), which will likely eventually cause the
connectivity checker to timeout, and therefore cause the job to abort.

This commit was SVN r32368.
2014-07-30 21:03:56 +00:00
Jeff Squyres
20349da03b usnic: minor cleanup
This commit was SVN r32367.
2014-07-30 20:56:49 +00:00
Jeff Squyres
f1fb4970a5 usnic: remove all trailing whitespace
Style cleanup only; no code changes.

This commit was SVN r32366.
2014-07-30 20:56:15 +00:00
Jeff Squyres
a6c6aa2815 usnic: remove compatibility with v1.6 series
Update compat.h to only handle compatability between v1.7/v1.8 and
v1.9/2.0 (i.e., the current trunk).  Remove what seems to be the last
vestiages of OMPI/ORTE pollution in the now-OPAL-ized usnic BTL.

Currently use a hard-coded constant for the MCW size (i.e.,
MPI_COMM_WORLD size) for some initialization values in the v1.9/2.0
series; still need to figure out something better there.

This commit was SVN r32365.
2014-07-30 20:55:26 +00:00
Jeff Squyres
aed8b43db2 usnic: remove last vestiges of ORTE process names
These were missed when the BTLs moved to OPAL.

This commit was SVN r32364.
2014-07-30 20:53:41 +00:00
Jeff Squyres
d195b8caf4 usnic: fix a bunch of OMPI_* and ORTE_* constant usage
Somehow a bunch of OMPI_* and ORTE_* constants didn't get renamed in
the BTL move to OPAL.  This commit fixes that.

This commit was SVN r32363.
2014-07-30 20:52:54 +00:00
Jeff Squyres
2447c8479f usnic: do not call ompi_rte_abort()
Adapt to moving down to OPAL: find a PML-registered error callback,
and use that when we don't have a module context and we need to abort.

Failing that, just call exit().

This commit was SVN r32362.
2014-07-30 20:52:06 +00:00
Jeff Squyres
10a58992af usnic: add --with-usnic configure switch
If --with-usnic is specified and we can't build the usnic BTL, abort.
If --without-usnic is specified, gracefully skip building the usnic
BTL.  If neither is specified, do the OMPI-default behavior: try to
configure/build the usnic BTL, and if we can't, skip it.
    
Fixes CSCuq13889.

This commit was SVN r32349.
2014-07-30 02:11:04 +00:00
George Bosilca
3d5578d6df No vestiges of the RTE or OMPI left.
This commit was SVN r32327.
2014-07-26 22:19:49 +00:00
George Bosilca
f217661ee0 Use opal_process_info whenever possible. Some other minor cleanups.
This commit was SVN r32325.
2014-07-26 21:48:23 +00:00
Ralph Castain
fbed15072c Add missing rename of file
This commit was SVN r32318.
2014-07-26 01:21:07 +00:00
Ralph Castain
552c9ca5a0 George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-)
WHAT:    Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL

All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have express interest in having a more generic communication infrastructure, without all the OMPI layer dependencies.  This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP.  Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose.  UTK with support from Sandia, developped a version of Open MPI where the entire communication infrastucture has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to completing this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs.  A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic.

This commit was SVN r32317.
2014-07-26 00:47:28 +00:00