Commit Graph

25930 Commits

Author SHA1 Message Date
Gilles Gouaillardet
d1e1ec51b6 ompio: correctly fix a memory plug
as newly reported by Coverity with CID 1372660
2016-09-08 18:50:18 +09:00
Artem Polyakov
84e178ce94 Merge pull request #1821 from artpol84/fix_waitsome_v2
MPI_Waitsome performance improvement (version #2)
2016-09-08 13:55:37 +07:00
Gilles Gouaillardet
b2a2be0e5a odls: fix memory leak plug
This fixes commit open-mpi/ompi@e2c343cdfc.
2016-09-08 10:02:52 +09:00
Nathan Hjelm
63d73a5dd0 Merge pull request #2061 from hjelmn/cid_inter
comm/cid: use ibcast to distribute result in intercomm case
2016-09-07 16:36:00 -06:00
Jeff Squyres
fd829ac389 Merge pull request #1982 from jsquyres/pr/fix-pkg-config-static
pkg-config: fix static linking
2016-09-07 14:55:50 -04:00
Nathan Hjelm
54cc829aab comm/cid: use ibcast to distribute result in intercomm case
This commit updates the intercomm allgather to do a local comm bcast
as the final step. This should resolve a hang seen in intercomm
tests.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-09-07 10:49:04 -06:00
Jeff Squyres
b811b0a15c Merge pull request #2060 from jsquyres/pr/remove-unused-var
orte proc_info.c: remove unused variable
2016-09-07 06:33:26 -04:00
Artem Polyakov
9eba1b0b75 Merge pull request #2042 from artpol84/pmix_sdirs
Several fixes related to session directories:
2016-09-07 14:15:47 +07:00
Artem Polyakov
a9a7f39773 ess/pmi: fix the comments about MCA/PMIx setting conflict resolution.
2016-09-07 07:47:35 +03:00
Gilles Gouaillardet
be41b120d0 orted: plug misc memory leaks
as reported by Coverity with CID 1362603 and 1362606
2016-09-07 10:08:44 +09:00
Gilles Gouaillardet
cd2b5a82ed hwloc: plug memory leak
as reported by Coverity with CID 1270441
2016-09-07 10:08:44 +09:00
Gilles Gouaillardet
e2c343cdfc odls: plug memory leak
as reported by Coverity with CID 710645
2016-09-07 10:08:44 +09:00
Gilles Gouaillardet
213a981041 io/ompio: plug memory leaks
as reported by Coverity with CIDs 1369022 and 1369023
2016-09-07 10:08:44 +09:00
Gilles Gouaillardet
c09899f6af plm: plug resource leaks
as reported by Coverity with CIDs 72274 and 1196733
2016-09-07 10:08:44 +09:00
Gilles Gouaillardet
44a66e208c threads: fix WAIT_SYNC_INIT with a zero count
WAIT_SYNC_INIT(sync,0); WAIT_SYNC_RELEASE(sync);
hung because sync->signaled was initialised to true, and
there is no reason to invoke WAIT_SYNC_SIGNALED(sync) before
WAIT_SYNC_RELEASE(sync).
This commit initializes sync->signaled to true unless the count is zero.

Thanks George for the review and guidance.
2016-09-07 10:03:40 +09:00
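
The zero-count case described in the commit above can be illustrated with a minimal single-threaded sketch. The type and function names (`sketch_sync_t`, `sketch_sync_init`, and so on) are hypothetical stand-ins, not the Open MPI macros, and the flag is named `signaled` only because the commit message uses that name. The real code is concurrent; this sketch leaves out atomics and waiting entirely and only shows why the initial flag value has to depend on the count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the sync object described in the
 * commit message above; this is not the Open MPI definition. */
typedef struct {
    int  count;     /* completions still outstanding */
    bool signaled;  /* while true, the release path keeps waiting */
} sketch_sync_t;

/* Initialise the object. With a zero count nobody will ever signal, so the
 * flag must start out false or a later release would wait forever. */
static void sketch_sync_init(sketch_sync_t *sync, int count)
{
    assert(sync != NULL && count >= 0);
    sync->count = count;
    sync->signaled = (0 != count);
}

/* Record one completion; the last one clears the flag. */
static void sketch_sync_complete_one(sketch_sync_t *sync)
{
    assert(sync != NULL && sync->count > 0);
    if (0 == --sync->count) {
        sync->signaled = false;
    }
}

/* True if a release at this point would have to wait. */
static bool sketch_sync_release_would_wait(const sketch_sync_t *sync)
{
    return sync->signaled;
}
```

With a count of zero, `sketch_sync_release_would_wait` reports false immediately after initialisation, which corresponds to the init-then-release sequence from the commit message no longer hanging.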
Jeff Squyres
722d5eecf1 orte proc_info.c: remove unused variable
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-09-06 16:38:15 -07:00
Nathan Hjelm
08d08e6c69 Merge pull request #2048 from hjelmn/pgi_asm
config: re-enable GCC inline ASM check for PGI
2016-09-06 16:24:15 -06:00
rhc54
39861ee987 Merge pull request #2058 from rhc54/topic/sync
Fix typo on the COLL_SYNC macro
2016-09-06 17:17:29 -05:00
Nathan Hjelm
27a2509fec Merge pull request #2051 from hjelmn/ppc_asm
opal/asm: updates to powerpc assembly
2016-09-06 15:13:28 -06:00
Ralph Castain
7f3fac48ab Fix typo on the COLL_SYNC macro
2016-09-06 12:43:07 -07:00
Josh Hursey
f6337f9eae Merge pull request #2047 from jjhursey/topic/mixed-host2
orte: !FQDN implementation to use opal_net_isaddr
2016-09-06 13:08:54 -05:00
rhc54
8b46118e87 Merge pull request #2057 from rhc54/topic/cid
Coverity fixes
2016-09-06 11:19:03 -05:00
Todd Kordenbrock
a17dff281d Merge pull request #1900 from PDeveze/mtl-portals4-short_msg-split_msg
Mtl portals4 short msg split msg
2016-09-06 11:14:19 -05:00
Ralph Castain
f85dcaee2a Fixes CID 1369067 and CID 1196684
Fixes CID 1369648

Fixes CID 1372409
2016-09-06 08:43:15 -07:00
Jeff Squyres
527efec4fb Merge pull request #2050 from jsquyres/pr/btl-tcp-help-messages
Add a show_help message to TCP BTL when peer unexpectedly disconnects
2016-09-06 09:40:31 -04:00
Jeff Squyres
1953e3406f btl/tcp: add show_help message when peer hangs up
We commonly see messages on the users list where a peer has hung up
because it has crashed.  Instead of having just a BTL_ERROR message,
make this a real opal_show_help() message that tells the user that the
peer unexpectedly hung up, and they should look into *why* that peer
hung up.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-09-06 09:40:03 -04:00
Gilles Gouaillardet
894be7860a gcc_builtin/atomic: Silence numerous warnings from Studio compilers
This commit adds selective use of a compiler-specific pragma to
silence the numerous warnings the Sun/Oracle/Studio compilers emit for
the GNU-style inline asm used in atomic.h.

Thanks Paul Hargrove for the initial patch and the guidance.
2016-09-06 09:07:16 +09:00
Gilles Gouaillardet
7b39d9065c Merge pull request #2054 from ggouaillardet/topic/mca_btl_tcp_proc_insert
btl/tcp: make mca_btl_tcp_proc_insert re-entrant
2016-09-06 08:54:38 +09:00
Gilles Gouaillardet
91e1200c14 ompi/request: correctly handle zero count in ompi_request_default_wait_{all,any,some}
2016-09-05 17:19:30 +09:00
Gilles Gouaillardet
4b208e4463 btl/tcp: make mca_btl_tcp_proc_insert re-entrant
otherwise bad things happen with
 --mca btl_tcp_progress_thread 1 (non default)
and
 --mca mpi_add_procs_cutoff 0 (default)
2016-09-05 15:57:34 +09:00
Artem Polyakov
74a11d7832 Fix session dir cleanup code.
2016-09-05 07:53:55 +03:00
Artem Polyakov
dc0ab674de Add PMIx key to provide RM with ability to indicate that it will clean up
session directories provided through OPAL_PMIX_TMPDIR,
OPAL_PMIX_NSDIR, OPAL_PMIX_PROCDIR
2016-09-05 07:48:44 +03:00
Artem Polyakov
81195ab724 Several fixes related to session directories:
* enable OMPI to retrieve paths from RM through PMIx
* cleanups related to tempdirs.
2016-09-05 07:48:44 +03:00
Ralph Castain
fb51d65049 Minor change: check for NULL before using the job map to avoid segfault when erroring out prior to creating the map
2016-09-04 07:53:12 -07:00
Alex Mikheev
439456ae96 OSHMEM: spml ikrit: fixes zero copy
Allow mxm to use zero copy in put() and get() for the large messages.
2016-09-04 12:16:09 +03:00
Nathan Hjelm
a36bdfe69f opal/asm: updates to powerpc assembly
This commit contains the following changes:

 - There is a bug in the PGI 16.x betas for ppc64 that causes them to
   emit the incorrect instruction for loading 64-bit operands. If not
   cast to void * the operands are loaded with lwz (load word and
   zero) instead of ld. This does not affect optimized mode. The
   workaround is to cast to void * and was implemented similarly to a
   workaround for an xlc bug.

 - Actually implement 64-bit add/sub. These functions were missing and
   fell back to the less efficient compare-and-swap implementations.

Thanks to @PHHargrove for helping to track this down. With this update
the GCC inline assembly works as expected with pgi and ppc64.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-09-02 23:47:47 -06:00
Jeff Squyres
95c6f6cfc0 btl/tcp: fix help message
It looks like one help message was accidentally pasted in the middle
of another.  Disentangle the two messages from each other, and
slightly tweak the one message to say that the job may also crash (in
addition to hanging).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-09-02 17:14:22 -04:00
rhc54
9c496f767b Merge pull request #1602 from rhc54/topic/psm
Enable PSM to support dynamic processes
2016-09-02 14:41:19 -05:00
Nathan Hjelm
795833bfac config: re-enable GCC inline ASM check for PGI
We disabled this support a long time ago. Probably safe to assume
whatever bug we were working around no longer exists.

Closes open-mpi/ompi#2044

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-09-02 12:44:08 -06:00
Joshua Hursey
fe937d1e82 orte: !FQDN implementation to use opal_net_isaddr
* Switch to use opal_net_isaddr() for checking if a name is an IP
   address - as it is a bit cleaner, and uses common functionality.
2016-09-02 13:31:49 -05:00
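
As an illustration of the kind of check the commit above delegates to `opal_net_isaddr()`, here is a minimal sketch that tests whether a name is a literal IPv4 or IPv6 address. It is not the Open MPI implementation: the helper name `sketch_is_ip_literal` is hypothetical, and the use of POSIX `inet_pton` is an assumption made for this example only.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical helper: returns true if `name` parses as a numeric IPv4 or
 * IPv6 address, false if it looks like a host name. Not Open MPI code. */
static bool sketch_is_ip_literal(const char *name)
{
    /* Large enough for either address family. */
    unsigned char buf[sizeof(struct in6_addr)];

    assert(name != NULL);
    return inet_pton(AF_INET, name, buf) == 1 ||
           inet_pton(AF_INET6, name, buf) == 1;
}
```

A caller could use such a check to treat `192.168.0.1` or `::1` as addresses while still handling `node01.example.com` as a host name.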
Ralph Castain
4e0788e9ad Enable PSM to support dynamic processes
Fix comm_spawn to correctly reference the actual parent process that requested the spawn when looking for the parent job object
2016-09-02 10:22:04 -07:00
Nathan Hjelm
3274203f8a Merge pull request #2046 from hjelmn/ugni_fix
btl/ugni: fix erroneous warning message
2016-09-02 10:32:28 -06:00
Nathan Hjelm
f93c1f2106 btl/ugni: fix erroneous warning message
This commit prevents the connection code from trying to connect an
endpoint if the directed datagram has been posted but not received.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-09-02 09:17:44 -06:00
Ralph Castain
5c9ea565b6 Update NEWS
2016-09-01 18:10:21 -07:00
Ralph Castain
34f04a7924 Remove spurious Makefile.am line
2016-09-01 15:31:09 -07:00
Nathan Hjelm
1ce5847e8b osc/rdma: add support for network AMOs
This commit adds support for using network AMOs for MPI_Accumulate,
MPI_Fetch_and_op, and MPI_Compare_and_swap. This support is only
enabled if the ompi_single_intrinsic info key is specified or the
acc_single_intrinsic MCA variable is set. This configuration
indicates to this implementation that no long accumulates will be
performed since these do not currently mix with the AMO
implementation.

This commit also cleans up the code somewhat. This includes removing
unnecessary struct keywords where the type is also typedef'd.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-09-01 15:47:33 -06:00
rhc54
fde6e6c6f8 Merge pull request #2043 from rhc54/topic/notifycomplete
Implement notification of completion on comm_spawn'd child jobs.
2016-09-01 16:42:30 -05:00
Ralph Castain
0ea1cff733 Implement notification of completion on comm_spawn'd child jobs. Add a configure flag to enable PMIx 3's shared memory datastore, and set it to disabled by default so that comm_spawn functions again. Will reverse the default once that feature is fully functional.
2016-09-01 13:10:10 -07:00
Nathan Hjelm
43b2e3a844 Merge pull request #2041 from hjelmn/osc_pt2pt_fix
osc/pt2pt: do not use frag send to send lock request
2016-09-01 13:02:45 -06:00
Nathan Hjelm
cb1cb5ffed osc/pt2pt: do not use frag send to send lock request
This commit cleans up some code in the passive target path. The code
used the buffered frag control send path but it is more appropriate to
use the unbuffered one. This avoids checking structures that
should not be in use in this path.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-09-01 09:57:27 -06:00