Commit Graph

22948 Commits

Author SHA1 Message Date
Jeff Squyres
0d4a6d7326 Merge pull request #588 from jsquyres/pr/keepalive-madness
Keepalive cleanups
2015-05-20 21:38:44 -04:00
Jeff Squyres
3069daa015 oob_tcp_listener: slightly refactor EAGAIN/EWOULDBLOCK
Have only a single level of "if" conditionals.  Also, slightly change
the logic such that we only die/break out of the loop if we get EMFILE
-- all other errors are ok to go on to the next fd.

Finally, use a real show_help() message to warn when other errors occur.
2015-05-20 21:10:11 -04:00
Jeff Squyres
e43c8dc291 oob tcp: label a few #endif's
Only bother labeling the ones that are a little far away from their
corresponding #if statements.
2015-05-20 21:10:11 -04:00
Jeff Squyres
4b2f0d4827 oob tcp: reset MCA params from level 9
Set various MCA param levels
2015-05-20 21:10:11 -04:00
Jeff Squyres
1a4c9960e1 oob tcp: set KEEPALIVE timeout 60s, retry interval 5s
The timeout is the frequency at which to send keepalive pings; the retry
interval is how often to send successive pings once a keepalive has
gone unanswered.

Also update comments and MCA param help strings.

60 seconds -- squashme
2015-05-20 21:08:37 -04:00
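A minimal sketch of how the two values described in the commit above could be applied to a socket. It assumes the Linux-style `TCP_KEEPIDLE` and `TCP_KEEPINTVL` socket options, and the helper name `example_set_keepalive` is hypothetical; this is not the code from the Open MPI tree.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the Open MPI source).
 * idle_secs:     idle time before the first keepalive probe is sent
 * interval_secs: time between successive probes once one goes unanswered
 * Returns 0 on success, -1 if any setsockopt() call fails. */
static int example_set_keepalive(int sd, int idle_secs, int interval_secs)
{
    int on = 1;

    /* Turn keepalive on for this (connected, not listening) socket. */
    if (setsockopt(sd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0) {
        return -1;
    }
#if defined(TCP_KEEPIDLE)
    /* Assumption: platform exposes TCP_KEEPIDLE (Linux does). */
    if (setsockopt(sd, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle_secs, sizeof(idle_secs)) < 0) {
        return -1;
    }
#else
    (void) idle_secs;
#endif /* TCP_KEEPIDLE */
#if defined(TCP_KEEPINTVL)
    /* Assumption: platform exposes TCP_KEEPINTVL. */
    if (setsockopt(sd, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval_secs, sizeof(interval_secs)) < 0) {
        return -1;
    }
#else
    (void) interval_secs;
#endif /* TCP_KEEPINTVL */
    return 0;
}
```

With the values named in the commit subject, a call would look like `example_set_keepalive(sd, 60, 5)`.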
Nathan Hjelm
108f55a963 btl/vader: clean up progress of waiting endpoints
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-20 16:14:58 -06:00
Jeff Squyres
c95215dfc2 oob_tcp: do not set KEEPALIVE on listening sockets
2015-05-20 17:28:45 -04:00
Jeff Squyres
32d81af35f oob tcp: re-enable keepalive option for Mac
Plus very minor #if/#endif reduction.
2015-05-20 17:28:45 -04:00
Nathan Hjelm
69e70776aa btl/vader: fix double unlock
References #594

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-20 14:35:22 -06:00
Nathan Hjelm
ce48eabd84 pml/ob1: use c99 flexible array members instead of size 1 arrays
This commit updates several ob1 structures to take advantage of C99's
flexible array member.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-20 10:31:35 -06:00
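To illustrate the change the commit above describes, here is a minimal sketch of a C99 flexible array member replacing a size-1 trailing array. The `example_hdr` type and its allocator are hypothetical and are not the actual ob1 structures.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header type; not one of the real ob1 structures. */
struct example_hdr {
    size_t seg_count;
    int    segs[];   /* C99 flexible array member; previously "int segs[1];" */
};

/* Allocate a header with room for seg_count trailing elements.
 * With a flexible array member, sizeof(*hdr) no longer counts a
 * phantom first element, so no "- 1" adjustment is needed. */
static struct example_hdr *example_hdr_alloc(size_t seg_count)
{
    struct example_hdr *hdr =
        malloc(sizeof(*hdr) + seg_count * sizeof(hdr->segs[0]));
    if (NULL == hdr) {
        return NULL;
    }
    hdr->seg_count = seg_count;
    memset(hdr->segs, 0, seg_count * sizeof(hdr->segs[0]));
    return hdr;
}
```

The caller releases the whole object, header and trailing elements together, with a single `free()`.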
rhc54
95c40e64b9 Merge pull request #584 from nkogteva/oob_ud_stress_test
oob ud: fixed a bug that prevented the work with QoS framework
2015-05-20 09:56:08 -06:00
Gilles Gouaillardet
b6c67e051d io/ompio: fix misc memory leaks
as reported by Coverity with CIDs 72147-72149,72187,72188,731274,731275,741356,
1269889,1269893,1271535 and 1269872
2015-05-20 17:19:39 +09:00
Gilles Gouaillardet
dd28b1f680 orted/dfs: fix misc memory leaks
as reported by Coverity with CIDs 739887, 747706, 1196707-1196709 and 1269849
2015-05-20 13:09:46 +09:00
Howard Pritchard
62a278d29c Merge pull request #590 from hppritcha/topic/coverity_133
pmix/base: fix coverity error
2015-05-18 06:52:37 -06:00
Gilles Gouaillardet
69f900ab9d libfabric: check the psm_epconn_t type is available before building the PSM provider
embedded libfabric configury does it its own way, so "backport" ofiwg/libfabric#1031
2015-05-18 14:04:41 +09:00
Howard Pritchard
0980423c5f pmix/base: fix coverity error
Remove some obviously dead code and thus fix a coverity
error - CID #133

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-05-16 13:24:03 -06:00
Howard Pritchard
4d77897d70 Merge pull request #589 from hppritcha/topic/fix_gni_common_symbol
btl/ugni: silence common symbol squawk
2015-05-16 12:38:56 -06:00
Howard Pritchard
d9f080b0c7 btl/ugni: silence common symbol squawk
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-05-16 10:23:06 -05:00
Howard Pritchard
00dafb39f6 Merge pull request #586 from hppritcha/topic/pmix_cray_loc_fix
pmix/cray: fix locality setting
2015-05-15 16:34:50 -06:00
Howard Pritchard
a1d65cfd8b pmix/cray: fix locality setting
Code for setting proc node locality
was absent after the removal of Cray
PMI KVS usage.  This commit puts that
functionality back in place.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-05-15 12:17:15 -07:00
Ralph Castain
7929387b4c Merge branch 'master' of https://github.com/open-mpi/ompi
2015-05-15 07:14:27 -06:00
Ralph Castain
d3d3e73099 Per request from George, use defined(__APPLE__) instead of OPAL_HAVE_MAC. Don't try to close a negative socket
2015-05-15 07:13:42 -06:00
Gilles Gouaillardet
c05b271c68 man: fix a trivial typo in MPI_Neighbor_allgather.3in
2015-05-15 16:02:01 +09:00
George Bosilca
675dccf9d9 Print the port in host byte order.
2015-05-15 00:14:28 -04:00
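As a rough illustration of the fix named in the commit above: a port stored in a `struct sockaddr_in` is in network byte order and needs `ntohs()` before it is printed. The helper name `example_format_port` is hypothetical and the code is not taken from the Open MPI tree.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>

/* Hypothetical helper: write the port of addr into buf as decimal text.
 * sin_port is kept in network byte order, so convert it with ntohs()
 * first; printing it raw shows a byte-swapped value on little-endian
 * hosts. Returns the snprintf() result. */
static int example_format_port(const struct sockaddr_in *addr,
                               char *buf, size_t len)
{
    return snprintf(buf, len, "%d", (int) ntohs(addr->sin_port));
}
```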
Ralph Castain
0a345d34e6 Plug the memory leak identified by George
2015-05-14 21:33:48 -06:00
Howard Pritchard
578430c36d oob/alps: remove comment with personal reference
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-05-14 20:06:21 -07:00
Ralph Castain
8e30579e6e The Mac appears to have problems with the keepalive support - once keepalive starts, the memory footprint soars. So disable keepalive on the Mac
2015-05-14 18:09:13 -06:00
Gilles Gouaillardet
1488e82efd osc/pt2pt: enable heterogeneous support
2015-05-14 16:42:48 +09:00
Gilles Gouaillardet
c4ebdba035 always align words if heterogeneous support is enabled
2015-05-14 15:54:21 +09:00
Gilles Gouaillardet
973a9ec247 configury: fix the error message of the --enable-mpi-fortran option
2015-05-14 15:39:43 +09:00
MPI Team
6dce955e55 update github wiki script: remove debugging comment
2015-05-13 19:05:01 -04:00
Jeff Squyres
bd0a4f0f8b Script to update Github wiki ComponentOwners page
First cut.
2015-05-13 18:58:02 -04:00
Jeff Squyres
ccfee0cd2d check-owner.pl: fix comments
2015-05-13 18:31:12 -04:00
Todd Kordenbrock
c42e277385 mtl-portals4: thread multiple updates
When activating short receive blocks on the overflow list, remove
the PTL_ME_EVENT_LINK_DISABLE flag so the event gets generated.
Without PTL_EVENT_LINK, the block status can't reach the activated
state.

Replace #ifdef with #if for Open MPI configure booleans, because
Open MPI configure booleans are always defined and the value must
be checked.
2015-05-13 17:06:18 -05:00
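The `#ifdef` versus `#if` point in the commit above can be sketched as follows. The macro name `EXAMPLE_ENABLE_FEATURE` is hypothetical and merely stands in for a configure boolean that is always defined as 0 or 1.

```c
/* Hypothetical configure-style boolean: always defined, value 0 or 1.
 * Here the feature is disabled. */
#define EXAMPLE_ENABLE_FEATURE 0

/* Wrong test: #ifdef only asks whether the macro exists, and it always does. */
static int example_enabled_via_ifdef(void)
{
#ifdef EXAMPLE_ENABLE_FEATURE
    return 1;   /* taken even though the value is 0 */
#else
    return 0;
#endif /* EXAMPLE_ENABLE_FEATURE */
}

/* Right test: #if evaluates the macro's value. */
static int example_enabled_via_if(void)
{
#if EXAMPLE_ENABLE_FEATURE
    return 1;
#else
    return 0;   /* taken, because the value is 0 */
#endif /* EXAMPLE_ENABLE_FEATURE */
}
```

With the macro defined as 0, the `#ifdef` variant still reports the feature as enabled, which is the kind of mistake the commit describes fixing.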
Nadezhda Kogteva
d9dcf8352e oob ud: fixed a bug that prevented the work with QoS framework (oob_stress_channel test)
2015-05-13 11:40:01 +03:00
Yohann Burette
27f1884cf8 mtl/ofi: Reworked header files. Added compat to ease maintenance.
2015-05-12 15:47:50 -07:00
rhc54
b59fa14004 Merge pull request #583 from rhc54/topic/mallocwarnings
Silence malloc(0) warnings reported by Lisandro
2015-05-12 13:37:38 -07:00
Ralph Castain
9a70765f27 Silence malloc(0) warnings reported by Lisandro
2015-05-12 12:38:58 -07:00
Jeff Squyres
8e8d104520 oob ud: ibv_get_device_list()==NULL can mean no devices present
...which is not an error.  Don't complain about it.
2015-05-12 10:54:39 -07:00
Nathan Hjelm
427aebbaca Fix cuda support MCA variables
This commit fixes some issues with the cuda support parameters. There
were a couple of duplicate registrations and an incorrect synonym (one
variable was made a synonym of mpi_preconnect_mpi).

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-12 09:52:51 -06:00
Nathan Hjelm
9caffa5dd8 mca/base: fix source file name bug for synonyms
This commit fixes synonyms so the source file is correctly printed out
by ompi_info. This commit also adds support for printing out the line
number where the variable is set.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-05-12 09:52:31 -06:00
Gilles Gouaillardet
5142194058 oshmem: there's no fortran sentinels in oshmem
Turns out that this is just copy-n-pasted code from OMPI.  To be
clear: there's no need for the oshmem layer to instantiate sentinels
like mpi_fortran_bottom.

Thanks @jsquyres for pointing this out.
2015-05-12 13:01:16 +09:00
Jeff Squyres
8f941a6613 oob ud: better error msgs, tolerate systems without UD devices
It is perfectly ok to be on a system without UD devices.

Also, make some of the error messages better -- so that the user has a
clue about where the error messages are coming from, and what they
should do.
2015-05-11 13:11:51 -07:00
Jeff Squyres
e95010b095 common verbs: only install fake usnic driver when relevant
Only install the fake usnic libibverbs driver when there are actually
usnic kernel devices present.  This prevents some run-time weirdness
on the Cray verbs emulation environment, where apparently
ibv_register_driver() either is not implemented or does not work
properly.
2015-05-11 12:57:06 -07:00
Ryan Grant
bbeaf41a52 Merge pull request #580 from tkordenbrock/topic/mtl.add.status.to.short.recv.blocks
mtl-portals4: add status to short recv blocks to coordinate out of or…
2015-05-11 13:44:45 -06:00
Ryan Grant
265682bdb9 Merge pull request #581 from tkordenbrock/topic/remove.overlapping.multiMD.code
portals4: use a single Memory Descriptor to cover all of memory
2015-05-11 13:20:32 -06:00
George Bosilca
78f5f0f8a9 Show the name of the collective that failed to get initialized.
2015-05-11 15:10:37 -04:00
Mike Dubman
894ba28390 Merge pull request #559 from nkogteva/oob_ud
oob ud: made component more user adaptive; opal outputs were replaced by...
2015-05-11 21:09:28 +03:00
Todd Kordenbrock
9df163f116 portals4: use a single Memory Descriptor to cover all of memory
In days past, some implementations of Portals4 could not cover all
of memory with a single Memory Descriptor so multiple large
overlapping Memory Descriptors were created.  Because none of the
current implementations have this limitation (and no future
implementations should either), this commit removes the overlapping
Memory Descriptors code.
2015-05-11 11:49:41 -05:00
Todd Kordenbrock
074583060d mtl-portals4: add status to short recv blocks to coordinate out of order events
If OMPI is initialized as thread multiple, then it is possible for
Portals events to be processed out of order by different threads.
Out of order events could lead to reactivation of the block
(PTL_EVENT_AUTO_FREE) before the block is removed from the active
list (PTL_EVENT_AUTO_UNLINK).  This commit adds a status field to
ompi_mtl_portals4_recv_short_block_t that coordinates these events.
2015-05-11 11:48:25 -05:00