Commit Graph

26829 Commits

Author SHA1 Message Date
Nathan Hjelm
5683e7836f Merge pull request #2965 from hjelmn/deprecated_fix
mca/base: fix deprecated variable help message
2017-02-14 12:22:11 -07:00
Nathan Hjelm
1df6bdd30e schizo/alps: set orte_bound_at_launch when launched with aprun
Set the orte_bound_at_launch MCA variable. This resolves a launch
performance bug when using aprun to launch Open MPI processes. If
this variable is not set it can take minutes longer to launch with
high ppn.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-02-14 11:13:48 -07:00
Nathan Hjelm
3b912ea2a7 pmix/cray: performance improvements and cleanup
Do not use opal_output_verbose inside O(n) loops. This was causing us
to make O(n) calls to snprintf which was greatly slowing launch at
scale.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-02-14 11:13:10 -07:00
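The idea behind the pmix/cray change above can be sketched in plain C: check the verbosity level once, outside the O(n) loop, and format a single summary message instead of one message per iteration. This is a minimal illustration only. `process_ranks` and its `verbose_level` parameter are hypothetical names and are not taken from the Open MPI sources.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: walks n ranks and returns how many messages
 * were formatted with snprintf. The verbosity check sits outside the
 * loop, so the per-rank path never formats a string. */
static size_t process_ranks(size_t n, int verbose_level)
{
    size_t formatted = 0;

    for (size_t i = 0; i < n; ++i) {
        /* per-rank work would go here; no logging inside the loop */
        (void) i;
    }

    if (verbose_level > 0) {
        char buf[64];
        /* one summary message, formatted only when verbosity is enabled */
        snprintf(buf, sizeof(buf), "processed %zu ranks", n);
        fprintf(stderr, "%s\n", buf);
        ++formatted;
    }

    return formatted;
}
```

With this shape the number of formatting calls is constant, regardless of how many ranks are processed.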
Nathan Hjelm
cc4a0fabcf Merge pull request #2727 from hjelmn/osc_rdma
osc/rdma: fix typo in check for MPI_MODE_NOCHECK
2017-02-14 10:50:33 -07:00
Nathan Hjelm
9e692ce264 mca/base: add new base enumerator (auto_bool)
This commit adds a new base enumerator type for variables that take one of
the values -1, 0, and 1. These values are mapped to the strings auto,
false, true. This commit updates the mpi_leave_pinned MCA variable to
use the new enumerator.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-02-14 10:21:45 -07:00
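The mapping described in the auto_bool commit above (-1, 0, 1 to auto, false, true) can be sketched as a small lookup table. This is a standalone illustration, not the MCA base enumerator API: the names `enum_entry_t`, `auto_bool_from_string`, and `auto_bool_to_string` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical value/name pair; not the real MCA enumerator type. */
typedef struct {
    int value;
    const char *name;
} enum_entry_t;

static const enum_entry_t auto_bool_values[] = {
    { -1, "auto"  },
    {  0, "false" },
    {  1, "true"  },
};

#define AUTO_BOOL_COUNT (sizeof(auto_bool_values) / sizeof(auto_bool_values[0]))

/* Look up the integer for a name; returns 0 on success, -1 if unknown. */
static int auto_bool_from_string(const char *s, int *out)
{
    for (size_t i = 0; i < AUTO_BOOL_COUNT; ++i) {
        if (strcmp(s, auto_bool_values[i].name) == 0) {
            *out = auto_bool_values[i].value;
            return 0;
        }
    }
    return -1;
}

/* Look up the name for an integer; returns NULL if out of range. */
static const char *auto_bool_to_string(int value)
{
    for (size_t i = 0; i < AUTO_BOOL_COUNT; ++i) {
        if (auto_bool_values[i].value == value) {
            return auto_bool_values[i].name;
        }
    }
    return NULL;
}
```

A variable such as mpi_leave_pinned could then accept either the string or the integer form, with anything outside the three known values rejected.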
Nathan Hjelm
33676c9960 mca/base: fix deprecated variable help message
Actually print out the original variable name.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-02-14 09:55:43 -07:00
Joshua Hursey
78006f93a4 coll: Move reduce_local into the coll framework
* Since we are adding a new function to `mca_coll_base_module_2_1_0_t`
   we need to increase the version of the module structure to `2_2_0`.
 * Add a comment just above the PREDEFINED_COMMUNICATOR_PAD describing
   its purpose and when it should change, to help future developers
   trying to answer the question noted in the comment.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-02-14 08:56:07 -06:00
Joshua Hursey
843fcca03c plm/rsh: Fix signal handling for rsh launcher
* Similar to the other launchers (i.e., slurm, alps) we need to put the
   children in a separate process group to prevent SIGINT (from a CTRL-C)
   from being delivered to the whole process group and prematurely
   killing the rsh/ssh connections to the remote daemons.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-02-14 08:54:17 -06:00
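The technique named in the plm/rsh commit above, putting a child in its own process group so that a terminal SIGINT is not delivered to it, can be sketched with the POSIX `fork` and `setpgid` calls. This is a minimal sketch under assumptions, not the launcher's actual code: `child_runs_in_own_group` is a hypothetical helper, and a real launcher would exec rsh/ssh where the child exits here.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child, moves it into a new process
 * group, and reports the result.
 * Returns 1 if the child ended up in a different process group than
 * the parent, 0 if it did not, and -1 on error. */
static int child_runs_in_own_group(void)
{
    pid_t parent_pgid = getpgrp();
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }

    if (pid == 0) {
        /* Child: become the leader of a new process group, so signals
         * sent to the parent's group (e.g. SIGINT from CTRL-C) are not
         * delivered here. */
        if (setpgid(0, 0) != 0) {
            _exit(2);
        }
        /* A real launcher would exec the rsh/ssh agent at this point. */
        _exit(getpgrp() != parent_pgid ? 0 : 1);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status)) {
        return -1;
    }
    if (WEXITSTATUS(status) == 0) {
        return 1;
    }
    return WEXITSTATUS(status) == 1 ? 0 : -1;
}
```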
Ralph Castain
9ecfbba2a1 Merge pull request #2963 from rhc54/topic/pmixupdate
Update to latest PMIx master
2017-02-14 05:38:32 -08:00
Ralph Castain
35578b4009 Update to latest PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-02-13 23:19:26 -08:00
Ralph Castain
68a384a5fd Merge pull request #2962 from rhc54/topic/rsh
Fix plm/rsh runtime check
2017-02-13 17:48:09 -08:00
Gilles Gouaillardet
9ea743960a Merge pull request #2953 from ggouaillardet/topic/libnbc_ialltoallvw_zero
coll/libnbc: fix and optimize zero size ialltoall{v,w}
2017-02-14 10:08:56 +09:00
Ralph Castain
dee2d8646d Fix plm/rsh runtime check
Fix the check for rsh/ssh so we allow the check for SGE and LoadLeveler to occur if the user doesn't specify their own launch agent. Also fix a Coverity warning.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-02-13 16:54:03 -08:00
Gilles Gouaillardet
e70a30cca4 coll/libnbc: optimize zero size ialltoall{v,w} with MPI_IN_PLACE
and incidentally avoids malloc(0)

Thanks Lisandro Dalcin for the report

Fixes open-mpi/ompi#2945

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-13 15:21:28 +09:00
Gilles Gouaillardet
12949547f4 coll/libnbc: fix a2aw_sched_linear() with zero size datatype or zero count
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-13 15:21:28 +09:00
Gilles Gouaillardet
bf0fc4a84c opal/datatype: correctly handle zero size datatype or zero count
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-13 15:21:28 +09:00
Jeff Squyres
81e57bb7db nightly-tarball scripts: more quoting fixes
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-12 04:19:35 +00:00
Jeff Squyres
2d4fc45429 nightly-tarball scripts: fix quoting
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-11 22:40:10 +00:00
Jeff Squyres
b385ac4f09 nightly-tarball scripts: more debugging and robustness
Check the exit status of major commands, as well as (optionally)
output the pwd and command being executed (when debugging).  Also,
read the $debug variable from the environment; if it's set, go into
debugging mode (vs. requiring a modification to the script to enable
debugging mode).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-11 21:50:10 +00:00
Jeff Squyres
0178307d36 openmpi-nightly-tarball: remove spurious echo statement
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-11 21:50:10 +00:00
Jeff Squyres
704d6a0309 create_tarball: read $debug from environment
If $debug is set in the environment, use that.  This allows enabling
debug mode without requiring an edit to the script.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-11 21:50:10 +00:00
Jeff Squyres
a8247a76c9 Merge pull request #2948 from jsquyres/pr/update-warn-component-unused
help btl base: tell how to disable the warning
2017-02-09 21:10:01 -05:00
Jeff Squyres
e272250531 help btl base: tell how to disable the warning
As reported in
https://www.mail-archive.com/users@lists.open-mpi.org/msg30607.html,
give instructions in the show_help message how to disable the
warning.  Thanks to Susan Schwarz for reporting the issue.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-09 15:51:30 -08:00
Jeff Squyres
51def91003 nightly tarballs: compare the hashes to know if they're new
The filenames contain date/timestamps; if you compare those, the
tarball generated every night will *always* be new.  Instead, separate
out the git hash from the old and new tarballs, and compare those.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-09 16:56:00 +00:00
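The comparison described in the nightly-tarball commit above can be sketched as follows: pull the git hash out of each filename and compare only that, ignoring the timestamp. The filename layout assumed here (`<name>-<timestamp>-<hash>.tar.<ext>`) is a guess for illustration. The real nightly scripts are shell, and `tarball_hash` and `tarball_is_new` are hypothetical names.

```c
#include <stddef.h>
#include <string.h>

/* Copy the hash portion of a tarball name into out.
 * Assumes the hash sits between the last '-' before ".tar" and the
 * ".tar" suffix. Returns 0 on success, -1 if the name does not match. */
static int tarball_hash(const char *name, char *out, size_t outlen)
{
    const char *ext = strstr(name, ".tar");
    const char *dash = NULL;

    if (ext == NULL) {
        return -1;
    }
    for (const char *p = name; p < ext; ++p) {
        if (*p == '-') {
            dash = p;
        }
    }
    if (dash == NULL) {
        return -1;
    }

    size_t len = (size_t) (ext - dash - 1);
    if (len == 0 || len >= outlen) {
        return -1;
    }
    memcpy(out, dash + 1, len);
    out[len] = '\0';
    return 0;
}

/* Returns 1 if the two tarballs carry different hashes (or if either
 * name cannot be parsed), 0 if the hashes match. */
static int tarball_is_new(const char *old_name, const char *new_name)
{
    char old_hash[64];
    char new_hash[64];

    if (tarball_hash(old_name, old_hash, sizeof(old_hash)) != 0 ||
        tarball_hash(new_name, new_hash, sizeof(new_hash)) != 0) {
        return 1;
    }
    return strcmp(old_hash, new_hash) != 0;
}
```

Two tarballs built on different nights from the same commit then compare as unchanged, which is the behavior the commit message asks for.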
Josh Hursey
cad4c03e5c Merge pull request #2812 from jjhursey/fix/ibm/basic-neighbor
coll/basic: Expand check for negative input values
2017-02-09 08:53:16 -06:00
Joshua Hursey
383330a50d coll/basic: Expand check for negative input values
* Negative values are parameter errors for neighborhood collectives
   - Add checks to the mpi/c interface `MPI_PARAM_CHECK`
 * Fix a success check for neighbor_alltoallw with dist_graph

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-02-08 14:26:32 -06:00
Gilles Gouaillardet
be26152839 Merge pull request #2939 from ggouaillardet/topic/pmix2x_6ed27be839e3f17a2b93885321e15fb26d802e93
pmix2x: Update to latest PMIx master
2017-02-08 16:40:57 +09:00
Gilles Gouaillardet
3d0541f2bf mpool/memkind: add a missing include file
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-08 16:06:22 +09:00
Gilles Gouaillardet
7acef4833e pmix2x: Update to latest PMIx master
pmix/master@6ed27be839

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-08 13:23:27 +09:00
KAWASHIMA Takahiro
4b2eba34a6 Merge pull request #2933 from kawashima-fj/pr/dstore-config-desc
pmix/pmix2x: Correct configure option description
2017-02-08 13:03:27 +09:00
George Bosilca
bc2890ed11
Upon a new connection go over all available ifaces.
Add verbose output to show all the failed attempts to match the
remote interfaces based on the modex info.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-02-07 19:15:49 -05:00
Jeff Squyres
0bf5ece4d5 Merge pull request #2935 from jsquyres/pr/fix-pmix-zlib-protection
pmix: fix zlib protection macro usage
2017-02-07 16:33:41 -05:00
Artem Polyakov
4018409b8c Merge pull request #2925 from artpol84/spawn/master
orte: Fix MPI_Spawn
2017-02-07 11:50:27 -08:00
Nathan Hjelm
9f073d76dc Merge pull request #2926 from Zzzoom/amd64_timer_perf
Improve x86-64 timer performance
2017-02-07 10:54:23 -07:00
Jeff Squyres
100b112d3c pmix: fix zlib protection macro usage
It's possible to have zlib.h but still not have zlib support.
Use the correct macro to protect calls to zlib functions.

This fixes 32-bit MTT builds at Cisco (e.g.,
https://mtt.open-mpi.org/index.php?do_redir=2389).

Submitted upstream to PMIX: https://github.com/pmix/master/pull/290

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-07 05:52:32 -08:00
KAWASHIMA Takahiro
750406f67b pmix/pmix2x: Correct configure option description
`--enable-pmix-dstore` option was enabled by default in f4a5511.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-02-07 11:52:56 +09:00
Gilles Gouaillardet
c62498ab3d btl/tcp: remove reference to just removed tcp_local
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-07 09:32:09 +09:00
Jeff Squyres
368ab4d9a5 Merge pull request #2684 from bosilca/topic/tcp_fixes
Remove the tcp_local field from the TCP component.
2017-02-06 16:32:06 -05:00
Nathan Hjelm
2c1980ae39 Merge pull request #2923 from hjelmn/oob_fix
oob/tcp: cleanup peers before event bases
2017-02-06 09:34:10 -07:00
Nathan Hjelm
3c18f2f1d9 Merge pull request #2924 from hjelmn/ras_slurm
ras/slurm: fix compile error due to missing header
2017-02-06 09:33:58 -07:00
Gilles Gouaillardet
d4d4cab5bf orte/util: fix OPAL_HAVE_ZLIB usage
use #if instead of #ifdef

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-05 16:24:10 +01:00
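The difference between `#if` and `#ifdef` that the commit above relies on shows up when a macro is always defined and carries 0 or 1 as its value. The sketch below uses a hypothetical macro, `HAVE_FEATURE`, standing in for a configure-generated one. Whether OPAL_HAVE_ZLIB is defined exactly this way is an assumption.

```c
/* Hypothetical configure-style macro: always defined, with 0 meaning
 * "feature not available". */
#define HAVE_FEATURE 0

/* #ifdef only asks whether the macro exists, so this returns 1 even
 * though the feature is disabled. */
static int seen_by_ifdef(void)
{
#ifdef HAVE_FEATURE
    return 1;
#else
    return 0;
#endif
}

/* #if evaluates the macro's value, so this returns 0 as intended. */
static int seen_by_if(void)
{
#if HAVE_FEATURE
    return 1;
#else
    return 0;
#endif
}
```

With a macro defined to 0, code guarded by `#ifdef` is still compiled, which is the kind of mistake the commit fixes.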
Geoff Paulsen
4917e44a7d Merge pull request #2832 from jjhursey/topic/ibm/osc-base-dt-abort
osc/base: Detect unsupported data types and abort
2017-02-05 04:26:04 -06:00
Carlos Bederián
ccea3de44c amd64 timers: use lfence instead of cpuid for serialization
Signed-off-by: Carlos Bederián <bc@famaf.unc.edu.ar>
2017-02-04 18:50:29 -03:00
Carlos Bederián
4009ba6b94 opal_progress: use usec native timer only when a native cycle counter isn't available
Signed-off-by: Carlos Bederián <bc@famaf.unc.edu.ar>
2017-02-04 18:31:14 -03:00
Howard Pritchard
f4ad119693 Merge pull request #2914 from hppritcha/topic/nbc_compiler_warning
swat some compiler warnings
2017-02-04 11:56:52 -05:00
Artem Polyakov
9f7e2098ac orte: Fix MPI_Spawn
Register the namespace even if there are no node-local processes that
belong to it. We need this for the MPI_Spawn case.

Addressing https://github.com/open-mpi/ompi/issues/2920.
The problem was introduced in be3ef77739.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-02-04 12:07:00 +07:00
Nathan Hjelm
b928a6b9ea ras/slurm: fix compile error due to missing header
On some systems this component fails to build due to the missing
netdb.h header.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-02-03 15:22:34 -07:00
Nathan Hjelm
1c4b735f5f oob/tcp: cleanup peers before event bases
This commit fixes an error in teardown where the event bases are torn
down before the peer structures are released. This causes us to call
event_del on an invalid event base. At best this makes valgrind
complain and at worst this causes aborts or segvs.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-02-03 15:18:41 -07:00
Howard Pritchard
acaecb2448 swat some compiler warnings
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-02-03 08:28:15 -07:00
Ralph Castain
ead453ee8e Merge pull request #2911 from rhc54/topic/retry
For performance, try to send the oob/tcp message a few times before dropping back into the event library
2017-02-02 12:57:18 -08:00