Ralph Castain
96cd42699e
Cleanup warnings for uninitialized vars and convert bare debug output to verbose
2015-05-21 07:41:26 -07:00
Jeff Squyres
0d4a6d7326
Merge pull request #588 from jsquyres/pr/keepalive-madness
...
Keepalive cleanups
2015-05-20 21:38:44 -04:00
Jeff Squyres
3069daa015
oob_tcp_listener: slightly refactor EAGAIN/EWOULDBLOCK
...
Have only a single level of "if" conditionals. Also, slightly change
the logic such that we only die/break out of the loop if we get EMFILE
-- all other errors are ok to go on to the next fd.
Finally, use a real show_help() message to warn when other errors occur.
2015-05-20 21:10:11 -04:00
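The error handling described in the commit above can be sketched as a single flat decision on `errno` after a failed `accept()`. This is only an illustration: the enum, the function name, and the assumption that EAGAIN/EWOULDBLOCK are skipped without a warning are hypothetical and not taken from oob_tcp_listener.

```c
#include <errno.h>

/* Hypothetical outcome codes for a failed accept(); not from the real source. */
enum sketch_action {
    SKETCH_NEXT_FD_QUIET,  /* nothing pending on this fd: move on silently */
    SKETCH_NEXT_FD_WARN,   /* unexpected error: warn, then move on to the next fd */
    SKETCH_STOP            /* out of file descriptors: break out of the loop */
};

/* One level of conditionals: only EMFILE stops the loop. */
enum sketch_action sketch_classify_accept_errno(int err)
{
    if (EAGAIN == err || EWOULDBLOCK == err) {
        return SKETCH_NEXT_FD_QUIET;
    } else if (EMFILE == err) {
        return SKETCH_STOP;
    }
    /* The real code is described as emitting a show_help() message here. */
    return SKETCH_NEXT_FD_WARN;
}
```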
Jeff Squyres
e43c8dc291
oob tcp: label a few #endif's
...
Only bother labeling the ones that are a little far away from their
corresponding #if statements.
2015-05-20 21:10:11 -04:00
Jeff Squyres
4b2f0d4827
oob tcp: reset MCA params from level 9
...
Set various MCA param levels
2015-05-20 21:10:11 -04:00
Jeff Squyres
1a4c9960e1
oob tcp: set KEEPALIVE timeout 60s, retry interval 5s
...
The timeout is the frequency at which to send keepalive pings; the retry
interval is how often to send successive pings once a keepalive ping has
gone unanswered.
Also update comments and MCA param help strings.
60 seconds -- squashme
2015-05-20 21:08:37 -04:00
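As a rough illustration of the two settings described in the commit above, here is a minimal sketch of enabling keepalive on a TCP socket. The helper name `set_keepalive_sketch` and its parameters are hypothetical and not taken from the oob/tcp source. `TCP_KEEPIDLE` and `TCP_KEEPINTVL` are assumed to be available (as on Linux), so those calls are guarded.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Hypothetical helper, not the actual oob/tcp code.
 * idle_secs:  idle time before the first keepalive probe (e.g. 60).
 * intvl_secs: gap between successive probes once one goes unanswered (e.g. 5).
 * Returns 0 on success, -1 if any setsockopt() call fails. */
int set_keepalive_sketch(int sd, int idle_secs, int intvl_secs)
{
    int on = 1;

    /* Turn keepalive probing on for this socket. */
    if (setsockopt(sd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0) {
        return -1;
    }
#if defined(TCP_KEEPIDLE)
    if (setsockopt(sd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_secs, sizeof(idle_secs)) < 0) {
        return -1;
    }
#endif
#if defined(TCP_KEEPINTVL)
    if (setsockopt(sd, IPPROTO_TCP, TCP_KEEPINTVL, &intvl_secs, sizeof(intvl_secs)) < 0) {
        return -1;
    }
#endif
    return 0;
}
```

A caller would apply this to connected sockets only, for example `set_keepalive_sketch(sd, 60, 5)`.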
Nathan Hjelm
108f55a963
btl/vader: clean up progress of waiting endpoints
...
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-20 16:14:58 -06:00
Jeff Squyres
c95215dfc2
oob_tcp: do not set KEEPALIVE on listening sockets
2015-05-20 17:28:45 -04:00
Jeff Squyres
32d81af35f
oob tcp: re-enable keepalive option for Mac
...
Plus very minor #if/#endif reduction.
2015-05-20 17:28:45 -04:00
Nathan Hjelm
69e70776aa
btl/vader: fix double unlock
...
References #594
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-20 14:35:22 -06:00
Nathan Hjelm
ce48eabd84
pml/ob1: use c99 flexible array members instead of size 1 arrays
...
This commit updates several ob1 structures to take advantage of C99's
flexible array member.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-20 10:31:35 -06:00
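For readers unfamiliar with the idiom named in the commit above, a minimal sketch of a C99 flexible array member follows. The structure and function names are hypothetical and are not the actual ob1 types.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header type for illustration; not an actual ob1 structure. */
struct sketch_hdr {
    size_t count;
    int segs[];  /* C99 flexible array member, replacing the old "int segs[1]" idiom */
};

/* Allocate a header with room for n trailing elements copied from src.
 * Returns NULL if the allocation fails. */
struct sketch_hdr *sketch_hdr_alloc(const int *src, size_t n)
{
    /* sizeof(*h) does not include the array, so no "- 1" adjustment is needed. */
    struct sketch_hdr *h = malloc(sizeof(*h) + n * sizeof(h->segs[0]));
    if (NULL == h) {
        return NULL;
    }
    h->count = n;
    if (n > 0) {
        memcpy(h->segs, src, n * sizeof(h->segs[0]));
    }
    return h;
}
```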
rhc54
95c40e64b9
Merge pull request #584 from nkogteva/oob_ud_stress_test
...
oob ud: fixed a bug that prevented it from working with the QoS framework
2015-05-20 09:56:08 -06:00
Gilles Gouaillardet
b6c67e051d
io/ompio: fix misc memory leaks
...
as reported by Coverity with CIDs 72147-72149,72187,72188,731274,731275,741356,
1269889,1269893,1271535 and 1269872
2015-05-20 17:19:39 +09:00
Gilles Gouaillardet
dd28b1f680
orted/dfs: fix misc memory leaks
...
as reported by Coverity with CIDs 739887, 747706, 1196707-1196709 and 1269849
2015-05-20 13:09:46 +09:00
Howard Pritchard
62a278d29c
Merge pull request #590 from hppritcha/topic/coverity_133
...
pmix/base: fix coverity error
2015-05-18 06:52:37 -06:00
Gilles Gouaillardet
69f900ab9d
libfabric: check the psm_epconn_t type is available before building the PSM provider
...
embedded libfabric configury does it its own way, so "backport" ofiwg/libfabric#1031
2015-05-18 14:04:41 +09:00
Howard Pritchard
0980423c5f
pmix/base: fix coverity error
...
Remove some obviously dead code and thus fix a coverity
error - CID #133
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-05-16 13:24:03 -06:00
Howard Pritchard
4d77897d70
Merge pull request #589 from hppritcha/topic/fix_gni_common_symbol
...
btl/ugni: silence common symbol squawk
2015-05-16 12:38:56 -06:00
Howard Pritchard
d9f080b0c7
btl/ugni: silence common symbol squawk
...
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-05-16 10:23:06 -05:00
Howard Pritchard
00dafb39f6
Merge pull request #586 from hppritcha/topic/pmix_cray_loc_fix
...
pmix/cray: fix locality setting
2015-05-15 16:34:50 -06:00
Howard Pritchard
a1d65cfd8b
pmix/cray: fix locality setting
...
Code for setting proc node locality
was absent after the removal of Cray
PMI KVS usage. This commit puts that
functionality back in place.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-05-15 12:17:15 -07:00
Ralph Castain
7929387b4c
Merge branch 'master' of https://github.com/open-mpi/ompi
2015-05-15 07:14:27 -06:00
Ralph Castain
d3d3e73099
Per request from George, use defined(__APPLE__) instead of OPAL_HAVE_MAC. Don't try to close a negative socket
2015-05-15 07:13:42 -06:00
Gilles Gouaillardet
c05b271c68
man: fix a trivial typo in MPI_Neighbor_allgather.3in
2015-05-15 16:02:01 +09:00
George Bosilca
675dccf9d9
Print the port in host byte order.
2015-05-15 00:14:28 -04:00
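A minimal sketch of what printing a port in host byte order involves: `sin_port` is stored in network byte order, so it is converted with `ntohs()` before formatting. The function name is hypothetical and not taken from the commit above.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write the port of addr into buf as decimal text.
 * Returns the snprintf() result. */
int sketch_format_port(const struct sockaddr_in *addr, char *buf, size_t len)
{
    /* Convert from network to host byte order before printing. */
    return snprintf(buf, len, "%u", (unsigned) ntohs(addr->sin_port));
}
```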
Ralph Castain
0a345d34e6
Plug the memory leak identified by George
2015-05-14 21:33:48 -06:00
Howard Pritchard
578430c36d
oob/alps: remove comment with personal reference
...
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-05-14 20:06:21 -07:00
Ralph Castain
8e30579e6e
The Mac appears to have problems with the keepalive support - once keepalive starts, the memory footprint soars. So disable keepalive on the Mac
2015-05-14 18:09:13 -06:00
Gilles Gouaillardet
1488e82efd
osc/pt2pt: enable heterogeneous support
2015-05-14 16:42:48 +09:00
Gilles Gouaillardet
c4ebdba035
always align words if heterogeneous support is enabled
2015-05-14 15:54:21 +09:00
Gilles Gouaillardet
973a9ec247
configury: fix the error message of the --enable-mpi-fortran option
2015-05-14 15:39:43 +09:00
MPI Team
6dce955e55
update github wiki script: remove debugging comment
2015-05-13 19:05:01 -04:00
Jeff Squyres
bd0a4f0f8b
Script to update Github wiki ComponentOwners page
...
First cut.
2015-05-13 18:58:02 -04:00
Jeff Squyres
ccfee0cd2d
check-owner.pl: fix comments
2015-05-13 18:31:12 -04:00
Todd Kordenbrock
c42e277385
mtl-portals4: thread multiple updates
...
When activating short receive blocks on the overflow list, remove
the PTL_ME_EVENT_LINK_DISABLE flag so the event gets generated.
Without PTL_EVENT_LINK, the block status can't reach the activated
state.
Replace #ifdef with #if for Open MPI configure booleans, because
Open MPI configure booleans are always defined and the value must
be checked.
2015-05-13 17:06:18 -05:00
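The `#ifdef` versus `#if` point in the commit above can be shown with a small sketch. The macro name `SKETCH_ENABLE_FEATURE` is hypothetical; it stands in for a configure boolean that is always defined as either 0 or 1.

```c
/* Hypothetical configure-style boolean: always defined, here with value 0. */
#define SKETCH_ENABLE_FEATURE 0

/* #ifdef only asks "is it defined?", so this returns 1 even though the value is 0. */
int sketch_ifdef_result(void)
{
#ifdef SKETCH_ENABLE_FEATURE
    return 1;
#else
    return 0;
#endif
}

/* #if tests the value, so this returns 0 as intended. */
int sketch_if_result(void)
{
#if SKETCH_ENABLE_FEATURE
    return 1;
#else
    return 0;
#endif
}
```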
Nadezhda Kogteva
d9dcf8352e
oob ud: fixed a bug that prevented it from working with the QoS framework (oob_stress_channel test)
2015-05-13 11:40:01 +03:00
Yohann Burette
27f1884cf8
mtl/ofi: Reworked header files. Added compat to ease maintenance.
2015-05-12 15:47:50 -07:00
rhc54
b59fa14004
Merge pull request #583 from rhc54/topic/mallocwarnings
...
Silence malloc(0) warnings reported by Lisandro
2015-05-12 13:37:38 -07:00
Ralph Castain
9a70765f27
Silence malloc(0) warnings reported by Lisandro
2015-05-12 12:38:58 -07:00
Jeff Squyres
8e8d104520
oob ud: ibv_get_device_list()==NULL can mean no devices present
...
...which is not an error. Don't complain about it.
2015-05-12 10:54:39 -07:00
Gilles Gouaillardet
5142194058
oshmem: there are no Fortran sentinels in oshmem
...
Turns out that this is just copy-n-pasted code from OMPI. To be
clear: there's no need for the oshmem layer to instantiate sentinels
like mpi_fortran_bottom.
Thanks @jsquyres for pointing this out.
2015-05-12 13:01:16 +09:00
Jeff Squyres
8f941a6613
oob ud: better error msgs, tolerate systems without UD devices
...
It is perfectly ok to be on a system without UD devices.
Also, make some of the error messages better -- so that the user has a
clue about where the error messages are coming from, and what they
should do.
2015-05-11 13:11:51 -07:00
Jeff Squyres
e95010b095
common verbs: only install fake usnic driver when relevant
...
Only install the fake usnic libibverbs driver when there are actually
usnic kernel devices present. This prevents some run-time weirdness
on the Cray verbs emulation environment, where apparently
ibv_register_driver() either is not implemented or does not work
properly.
2015-05-11 12:57:06 -07:00
Ryan Grant
bbeaf41a52
Merge pull request #580 from tkordenbrock/topic/mtl.add.status.to.short.recv.blocks
...
mtl-portals4: add status to short recv blocks to coordinate out of or…
2015-05-11 13:44:45 -06:00
Ryan Grant
265682bdb9
Merge pull request #581 from tkordenbrock/topic/remove.overlapping.multiMD.code
...
portals4: use a single Memory Descriptor to cover all of memory
2015-05-11 13:20:32 -06:00
George Bosilca
78f5f0f8a9
Show the name of the collective that failed to get initialized.
2015-05-11 15:10:37 -04:00
Mike Dubman
894ba28390
Merge pull request #559 from nkogteva/oob_ud
...
oob ud: made component more user adaptive; opal outputs were replaced by...
2015-05-11 21:09:28 +03:00
Todd Kordenbrock
9df163f116
portals4: use a single Memory Descriptor to cover all of memory
...
In days past, some implementations of Portals4 could not cover all
of memory with a single Memory Descriptor so multiple large
overlapping Memory Descriptors were created. Because none of the
current implementations have this limitation (and no future
implementations should either), this commit removes the overlapping
Memory Descriptors code.
2015-05-11 11:49:41 -05:00
Todd Kordenbrock
074583060d
mtl-portals4: add status to short recv blocks to coordinate out of order events
...
If OMPI is initialized as thread multiple, then it is possible for
Portals events to be processed out of order by different threads.
Out of order events could lead to reactivation of the block
(PTL_EVENT_AUTO_FREE) before the block is removed from the active
list (PTL_EVENT_AUTO_UNLINK). This commit adds a status field to
ompi_mtl_portals4_recv_short_block_t that coordinates these events.
2015-05-11 11:48:25 -05:00
Ralph Castain
3cee4152fc
Fix the intercommunicator issue reported by Gilles. Instead of directly checking the reachability bitmap, ask the component whether the proc is reachable when doing a send, as the component is the final arbiter in such cases. Recirculate any messages that a daemon is trying to send to avoid race conditions. Clean up listener sockets so we don't leak them
2015-05-11 09:16:25 -07:00