Commit graph

29296 commits

Author SHA1 Message Date
Ralph Castain
12790e8ec6 Protect PMIx from bad configure entry
Ignore with-hwloc=internal or external as those are meaningless to pmix
(will upstream)

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit c498a7e77a)
2018-10-09 03:45:58 -07:00
Ralph Castain
3e2cc6f46a Fail configure if pmix won't build
If we are using the internal PMIx component and the embedded library fails to configure, then fail - don't silently fail to build and then fail in execution

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit f379ba9c8e)
2018-10-09 03:45:37 -07:00
Ralph Castain
4aa11ec763 Strip --with-foo=internal from opal_subdir_args
Our components that have a --with-foo configure option won't know what
to do with a value of "internal". This scenario only occurs with hwloc
and libevent, both of which are statically contained in libopen-pal

Thanks to @jsquyres for the diff

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit e836dbd506)
2018-10-09 03:41:48 -07:00
Gilles Gouaillardet
2e2366d193 util/hostfile: fix a double free error
As reported at https://stackoverflow.com/questions/52707242/mpirun-segmentation-fault-whenever-i-use-a-hostfile
mpirun crashes when the hostfile contains a "user@host" line.
The root cause is that username was not strdup'ed, so it was freed twice: once by opal_argv_free() and once by free()
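
The ownership rule behind this fix can be sketched as follows. This is a minimal illustration, not the actual hostfile parser: `example_node_t` and `example_parse_user_host()` are hypothetical names, and a plain `strdup()`/`free()` pair stands in for the argv helpers. The point is that the record receives its own copy of the user name, so releasing the token buffer and releasing the record never touch the same allocation.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <string.h>

/* Hypothetical record; the real parser uses its own node type. */
typedef struct {
    char *username;   /* owned by the record, released with free() */
} example_node_t;

/* Split "user@host" and give the node its own copy of the user name.
 * Returns 0 on success, -1 if there is no '@' or on allocation failure. */
int example_parse_user_host(const char *entry, example_node_t *node)
{
    char *tokens = strdup(entry);        /* scratch buffer for splitting */
    if (NULL == tokens) {
        return -1;
    }
    char *at = strchr(tokens, '@');
    if (NULL == at) {
        free(tokens);
        return -1;
    }
    *at = '\0';                          /* tokens now holds just "user" */
    node->username = strdup(tokens);     /* independent copy for the node */
    free(tokens);                        /* frees the scratch buffer only */
    return (NULL == node->username) ? -1 : 0;
}
```

The caller later releases `node->username` with a single `free()`; no other owner holds that pointer.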

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

(cherry picked from commit open-mpi/ompi@5803385d44)
2018-10-09 13:06:55 +09:00
Geoff Paulsen
212419290e
Merge pull request #5859 from amaslenn/mlnx-no-verbs-v4
platform/mellanox: disable openib/verbs — v4.0.x
2018-10-08 14:13:12 -05:00
Nathan Hjelm
fba5eda436 btl/vader: fix race condition in writing header
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
(cherry picked from commit 8291f6722d)
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2018-10-08 08:49:01 -06:00
Andrey Maslennikov
7a930039fb platform/mellanox: disable openib/verbs
Signed-off-by: Andrey Maslennikov <andreyma@mellanox.com>
(cherry picked from commit 7180ab144a)
2018-10-08 15:36:56 +03:00
Geoff Paulsen
499ddedd7c
Merge pull request #5844 from kawashima-fj/pr/v4.0.x/pcollreq-f08-signatures
v4.0.x: mpiext/pcollreq: Correct f08 routine signatures
2018-10-05 13:42:35 -05:00
Geoff Paulsen
ab7cf1095d
Merge pull request #5845 from kawashima-fj/pr/v4.0.x/pcollreq-man
v4.0.x: mpiext/pcollreq: Add Fortran bindings in man
2018-10-05 13:40:03 -05:00
KAWASHIMA Takahiro
4dd21111f0 mpiext/pcollreq: Add Fortran bindings in man
Fortran bindings were added to persistent collectives in 9e0115c980
but man was not updated.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
(cherry picked from commit 43d85dbc81)
2018-10-05 09:43:39 +09:00
KAWASHIMA Takahiro
092cf1937d man: Correct markup of MPI_Neighbor_allgather
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
(cherry picked from commit 994b345253)
2018-10-05 09:43:39 +09:00
KAWASHIMA Takahiro
080c52f906 mpiext/pcollreq: Add missing f08 asynchronous
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
(cherry picked from commit be91a26fd8)
2018-10-05 09:33:17 +09:00
KAWASHIMA Takahiro
fcc698f27f mpiext/pcollreq: Correct f08 routine signatures
Changes of nonblocking collectives in e98d794e8b and f750c6932c
are applied to persistent collectives.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
(cherry picked from commit 357531847e)
2018-10-05 09:33:16 +09:00
KAWASHIMA Takahiro
b9316d3136 fortran/use-mpi-f08: Correct f08 routine signatures
Following the commit f750c6932c, I compared
`ompi/mpi/fortran/use-mpi-f08/*.F90` and
`ompi/mpi/fortran/use-mpi-f08/profile/p*.F90`, and
`ompi/mpi/fortran/use-mpi-f08/mod/mpi-f08-interfaces.F90` and
`ompi/mpi/fortran/use-mpi-f08/mod/pmpi-f08-interfaces.F90`.

There are many differences. Some are bugs of `MPI_*`, some are
bugs of `PMPI_*`. I'm not sure how these bugs affect applications.

To make it easy to compare these files in the future, I also removed
editorial differences.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
(cherry picked from commit cf6d28cb66)
2018-10-05 09:04:17 +09:00
Geoff Paulsen
c0796664b1
Merge pull request #5780 from jsquyres/pr/v4.0.x/moar-fortran-fixes
v4.0.x: Fortran 08 bindings fixes
2018-10-04 16:08:30 -05:00
Geoff Paulsen
704349746d
Merge pull request #5835 from gpaulsen/topic/v4.0.x/rc4
Updating VERSION to v4.0.0rc4
2018-10-04 15:20:33 -05:00
Geoff Paulsen
4e2d26e08b
Merge pull request #5834 from jsquyres/pr/v4.0.x/2-more-vader-fixes
v4.0.x: 2 more vader fixes
2018-10-03 14:09:45 -05:00
Geoffrey Paulsen
ed2bd82075 Updating VERSION to v4.0.0rc4
Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
2018-10-03 11:49:32 -05:00
Nathan Hjelm
fa768d748f btl/vader: work around Oracle compiler bug
This commit works around an Oracle C compiler bug in 5.15 (not sure
when it was introduced). The bug is triggered when we chain
assignments of atomic variables. Ex:

_Atomic intptr_t x, y;
intptr_t z = 0;

x = y = z;

Will produce a compiler error of the form:

operand cannot have void type: op "="
assignment type mismatch:
	long "=" void

To work around the issue we are removing the chain assignment and
setting the head and tail on different lines.
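
A minimal sketch of the workaround described above, assuming a queue with atomic head and tail fields. The type `example_fifo_t` and the function `example_fifo_reset()` are hypothetical names chosen for illustration, not the vader data structures.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical queue with atomic head and tail indices. */
typedef struct {
    _Atomic intptr_t head;
    _Atomic intptr_t tail;
} example_fifo_t;

/* Reset both ends of the queue to the same value. */
void example_fifo_reset(example_fifo_t *fifo, intptr_t value)
{
    /* Chained form that the affected compiler rejects:
     *     fifo->head = fifo->tail = value;
     * Two separate statements store the same values. */
    fifo->tail = value;
    fifo->head = value;
}
```

Each statement is still an atomic store; only the chaining is removed.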

Fixes #5814

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit dfa8d3a81a)
2018-10-03 11:36:17 -04:00
Nathan Hjelm
df6dd69db8 btl/vader: ensure the fast box tag is always read first
On some platforms reading a 64-bit value is non-atomic and it is
possible that the two 32-bit values are read in the wrong order. To
ensure the tag is always read first this commit reads the tag before
reading the full 64-bit value.
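
The ordering idea can be sketched with C11 atomics. This is an illustration under stated assumptions, not the vader code: the header layout, the names `example_hdr_t`, `example_hdr_publish()` and `example_hdr_poll()`, and the convention that a tag of 0 means "empty" are all hypothetical. It also simplifies the commit's approach by reading the two 32-bit fields separately instead of re-reading the full 64-bit value after the tag.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical message header made of two 32-bit halves. */
typedef struct {
    _Atomic uint32_t tag;    /* 0 means "no message yet" (assumption) */
    _Atomic uint32_t size;
} example_hdr_t;

/* Writer: fill in the size, then publish the tag with release ordering. */
void example_hdr_publish(example_hdr_t *hdr, uint32_t tag, uint32_t size)
{
    atomic_store_explicit(&hdr->size, size, memory_order_relaxed);
    atomic_store_explicit(&hdr->tag, tag, memory_order_release);
}

/* Reader: load the tag first with acquire ordering, and read the rest
 * of the header only once a nonzero tag has been observed. */
bool example_hdr_poll(example_hdr_t *hdr, uint32_t *tag, uint32_t *size)
{
    uint32_t seen = atomic_load_explicit(&hdr->tag, memory_order_acquire);
    if (0 == seen) {
        return false;
    }
    *tag = seen;
    *size = atomic_load_explicit(&hdr->size, memory_order_relaxed);
    return true;
}
```

Because the tag is written last and read first, a reader that sees a nonzero tag also sees the size stored before it.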

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
(cherry picked from commit 66a7dc4c72)
2018-10-03 11:36:17 -04:00
Geoff Paulsen
5cae0ec25b
Merge pull request #5794 from bwbarrett/v4.0.x-ofi-mtl-selection
mtl ofi: Change from opt-in to opt-out provider selection
2018-10-03 08:31:07 -05:00
Geoff Paulsen
1844d87e4c
Merge pull request #5802 from jsquyres/pr/v4.0.x/misc-updates
mpi.h: remove MPI_UB/MPI_LB when not enabling MPI-1 compat
2018-10-03 08:30:46 -05:00
Geoff Paulsen
593d652077
Merge pull request #5823 from jsquyres/pr/v4.0.x/fix-tcp-btl-show-help-ip-address
v4.0.x: btl/tcp: output the IP address correctly
2018-10-03 08:29:37 -05:00
Geoff Paulsen
9d1a6db1a0
Merge pull request #5826 from jsquyres/pr/v4.0.x/tcp-btl-socklen-fix
v4.0.x: TCP BTL socklen fix
2018-10-02 16:14:13 -05:00
George Bosilca
b63bee5da4 Small pedantic fixes.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
(cherry picked from commit a3a492b42c)
2018-10-02 14:37:31 -04:00
George Bosilca
d450e460d6 Provide the correct socklen to bind.
Get Brian's patch from #5825 and his log message:
Fix a failure in binding the initiating side of a connection
on MacOS. MacOS doesn't like passing the size of the storage
structure (sockaddr_storage) instead of the expected size of
the structure (sockaddr_in or sockaddr_in6), which was causing
bind() failures. This patch simply changes the structure size
to the expected size.
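
A minimal sketch of choosing the length from the address family, assuming the address is kept in a `struct sockaddr_storage`. The helper name `example_addr_len()` is hypothetical; the patch itself may compute the size differently.

```c
#include <sys/socket.h>
#include <netinet/in.h>

/* Return the length that matches the address family actually stored,
 * rather than the size of the generic storage structure. */
socklen_t example_addr_len(const struct sockaddr_storage *ss)
{
    switch (ss->ss_family) {
    case AF_INET:
        return (socklen_t) sizeof(struct sockaddr_in);
    case AF_INET6:
        return (socklen_t) sizeof(struct sockaddr_in6);
    default:
        /* Unknown family: fall back to the full storage size. */
        return (socklen_t) sizeof(*ss);
    }
}
```

A caller would then pass the result to bind(), for example `bind(fd, (struct sockaddr *) &ss, example_addr_len(&ss))`.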

Add a more clear error message in debug mode.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
(cherry picked from commit 9164e26e2f)
2018-10-02 14:37:31 -04:00
Jeff Squyres
81f2f19398 btl/tcp: output the IP address correctly
Per
https://github.com/open-mpi/ompi/issues/3035#issuecomment-426085673,
it looks like the IP address for a given interface is being stashed in
two places: on the endpoint and on the module.

1. On the endpoint, it is storing the moral equivalent of a
   (struct sockaddr_in.sin_addr).
2. On the module, it is storing a full (struct sockaddr_storage).

The call to opal_net_get_hostname() expects a full (struct sockaddr*)
-- not just the stripped-down (struct sockaddr_in.sin_addr).  Hence,
when the original code was passing in the endpoint's (struct
sockaddr_in.sin_addr) and opal_net_get_hostname() was treating it
like a (struct sockaddr), hilarity ensued (i.e., we got the wrong
output).

This commit eliminates the call to opal_net_get_hostname() and just
calls inet_ntop() directly to convert the (struct
sockaddr_in.sin_addr) to a string.
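
For illustration, converting a bare `struct in_addr` with inet_ntop() looks roughly like this. The wrapper name `example_format_ipv4()` is hypothetical and the sketch covers IPv4 only.

```c
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <stddef.h>

/* Format an IPv4 address held as a bare struct in_addr (not a full
 * struct sockaddr).  Returns buf on success, NULL on failure. */
const char *example_format_ipv4(const struct in_addr *addr,
                                char *buf, size_t len)
{
    return inet_ntop(AF_INET, addr, buf, (socklen_t) len);
}
```

A buffer of INET_ADDRSTRLEN bytes is large enough for any IPv4 address in dotted-decimal form.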

NOTE: Per the github comment cited above, there can be a disparity
between the IP address cached on the endpoint vs. the IP address
cached on the module.  This only happens with interfaces that have
more than one IP address.  This commit does not fix that issue.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 5dae086f7e)
2018-10-02 10:49:51 -04:00
Geoff Paulsen
1ae1f7d3c6
Merge pull request #5790 from yosefe/topic/shmem-lock-progress-v4.0.x
shmem/lock: progress communications while waiting for shmem_lock
2018-10-01 09:13:19 -05:00
Geoff Paulsen
e6b8738132
Merge pull request #5806 from gpaulsen/topic/v4.0.x/NEWS/mtl_ofi
Topic/v4.0.x/news/mtl ofi
2018-10-01 09:06:24 -05:00
Geoff Paulsen
a4666dc008
Merge pull request #5805 from gpaulsen/topic/v4.0.x/NEWS/vers
Topic/v4.0.x/news/vers
2018-10-01 09:05:40 -05:00
Geoffrey Paulsen
0f984be381 NEWS: PR5794 - change MTL OFI selection
Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
2018-09-28 16:54:04 -05:00
Geoffrey Paulsen
8138b5b04a NEWS: updated versions of included hwloc and PMIx
Updated versions of included hwloc and PMIx to match
   corresponding VERSION files.

   Updated the spelling of "Open SHMEM" to "OpenSHMEM".

Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
2018-09-28 16:47:26 -05:00
Geoff Paulsen
a7e275cf3e
Merge pull request #5771 from jsquyres/pr/v4.0.x/readme-update-configure-cli-with-options
v4.0.x: README: Add note about --with-foo and RPATH
2018-09-28 16:20:47 -05:00
Jeff Squyres
46dd266e45 mpi.h: remove MPI_UB/MPI_LB when not enabling MPI-1 compat
When --enable-mpi1-compatibility was not specified, the ompi_mpi_ub/lb
symbols were #if'ed out of mpi.h.  But the #defines for MPI_UB/LB
still remained.  This commit also #if's out the MPI_UB/LB macros when
--enable-mpi1-compatibility is not specified.
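
The pattern is to put the symbol declaration and the macro that refers to it behind the same preprocessor switch, so that neither can exist without the other. The sketch below uses hypothetical names throughout (`EXAMPLE_ENABLE_MPI1_COMPAT`, `example_ub_symbol`, `EXAMPLE_UB`); it is not the contents of mpi.h.

```c
/* Hypothetical build-time switch; 0 unless the build sets it. */
#ifndef EXAMPLE_ENABLE_MPI1_COMPAT
#define EXAMPLE_ENABLE_MPI1_COMPAT 0
#endif

#if EXAMPLE_ENABLE_MPI1_COMPAT
/* Symbol and macro are guarded together. */
extern int example_ub_symbol;
#define EXAMPLE_UB (&example_ub_symbol)
#endif

/* Report whether the legacy macro is visible in this build. */
int example_ub_available(void)
{
#ifdef EXAMPLE_UB
    return 1;
#else
    return 0;
#endif
}
```

If the macro were left outside the guard, code using it would compile against a symbol that is no longer declared.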

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 7223334d4d)
2018-09-28 10:01:48 -07:00
Brian Barrett
10d0a430c4 mtl ofi: Change from opt-in to opt-out provider selection
Change default provider selection logic for the OFI MTL.  The
old logic was whitelist-only, so any new HPC NIC provider would
have to ask users to do extra work or wait for an OMPI release
to be whitelisted.  The reason for the logic was to avoid
selecting a "generic" provider like sockets or shm that would
frequently have worse performance than the optimized BTL options
Open MPI supports.

With the change, we blacklist the (small, relatively static) list
of providers that duplicate internal capabilities.  Users can use
one of these blacklisted providers in two ways: first, they can
explicitly request the provider in the include list (which will
override the default exclude list) and second, they can set a new
empty exclude list.
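
The include/exclude rule described above can be sketched as a small predicate. This is an assumption-laden illustration, not the OFI MTL code: the function names are hypothetical, lists are modelled as comma-separated strings, and the exclude list "shm,sockets" used in the usage note is taken only from the two generic providers named in this message.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if name is an exact entry of a comma-separated list. */
static bool example_in_list(const char *list, const char *name)
{
    size_t name_len = strlen(name);
    const char *p = list;
    while (NULL != p && '\0' != *p) {
        const char *end = strchr(p, ',');
        size_t len = (NULL != end) ? (size_t) (end - p) : strlen(p);
        if (len == name_len && 0 == strncmp(p, name, name_len)) {
            return true;
        }
        p = (NULL != end) ? end + 1 : NULL;
    }
    return false;
}

/* Opt-out selection: a non-empty include list overrides the exclude
 * list; otherwise every provider not in the exclude list is allowed. */
bool example_provider_allowed(const char *name,
                              const char *include, const char *exclude)
{
    if (NULL != include && '\0' != include[0]) {
        return example_in_list(include, name);
    }
    return !example_in_list(exclude, name);
}
```

With an exclude list of "shm,sockets" and no include list, "sockets" is rejected and any other provider name is accepted. Passing "sockets" as the include list, or passing an empty exclude list, makes "sockets" selectable again.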

Since most HPC networks require special libraries and therefore
an explicit build of libfabric, it is highly unlikely that this
change will cause users to use libfabric when they didn't want to
do so.  It does, however, solve the whitelisting problem.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
(cherry picked from commit c5eaa38491)
2018-09-27 18:41:47 +00:00
Geoff Paulsen
0fc5034802
Merge pull request #5791 from hoopoepg/topic/update-function-macro-v4.0
OPAL/COMMON/UCX: used __func__ macro instead of __FUNCTION__ - v4.0
2018-09-27 07:55:16 -05:00
Sergey Oblomov
68d3baffd5 OPAL/COMMON/UCX: used __func__ macro instead of __FUNCTION__
- used __func__ macro instead of __FUNCTION__ to unify
  macro usage with other components
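
For context, `__func__` is the predefined identifier standardized in C99, while `__FUNCTION__` is a compiler extension with the same meaning. A minimal sketch of a logging macro built on `__func__` follows; `EXAMPLE_LOG` and `example_connect()` are hypothetical names, not the UCX component's macros.

```c
#include <stdio.h>
#include <stddef.h>

/* Prefix a message with the name of the enclosing function.
 * __func__ expands inside whichever function uses the macro. */
#define EXAMPLE_LOG(buf, len, msg) \
    snprintf((buf), (len), "%s: %s", __func__, (msg))

/* Writes "example_connect: starting" into buf and returns the number
 * of characters snprintf() would have written. */
int example_connect(char *buf, size_t len)
{
    return EXAMPLE_LOG(buf, len, "starting");
}
```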

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
(cherry picked from commit 9a51e257d1)
2018-09-27 12:04:07 +03:00
Yossi Itigin
cda310733f shmem/lock: progress communications while waiting for shmem_lock
(cherry picked from commit 4101150)

Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-09-27 11:42:34 +03:00
Geoff Paulsen
2b471430af
Merge pull request #5787 from gpaulsen/v4.0.x
Updating VERSION to v4.0.0rc3
2018-09-26 20:06:44 -05:00
Geoffrey Paulsen
27f3262403 Updating VERSION to v4.0.0rc3
Signed-off-by: Geoffrey Paulsen <gpaulsen@us.ibm.com>
2018-09-26 18:38:29 -05:00
Jeff Squyres
37a9cf5c82 Squash a bunch of harmless compiler warnings.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 6bb356ab87)
2018-09-26 14:42:13 -07:00
Gilles Gouaillardet
ce5959ba6c fortran/use-mpi-f08: Corrections to PMPI signatures of collectives
Corrected the signatures of the collectives used by the Fortran 2008
interface to state correct intent for inout arguments and use the
ASYNCHRONOUS attribute in non-blocking collective calls.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
(cherry picked from commit f750c6932c)
2018-09-26 12:34:46 -07:00
Philipp Otte
e98eae3da6 fortran/use-mpi-f08: Corrections to Fortran08 signatures of collectives
Corrected the signatures of the collectives used by the Fortran 2008
interface to state correct intent for inout arguments and use the
ASYNCHRONOUS attribute in non-blocking collective calls. Also corrected
the C-bindings in Fortran accordingly

Signed-off-by: Philipp Otte <philipp.j.otte@googlemail.com>
(cherry picked from commit e98d794e8b)
2018-09-26 12:34:46 -07:00
Jeff Squyres
6b91855ecc README: additional clarification about --with-<foo>-libdir.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 36c9f92117)
2018-09-25 08:56:51 -07:00
Howard Pritchard
ec2e6eb9b1
Merge pull request #5766 from jsquyres/pr/v4.0.x/fix-ompi-info-mca-var-settable-output
v4.0.x: mca_base_var: fix output bug about settable vars
2018-09-25 09:25:48 -06:00
Jeff Squyres
02c5838cdf README: Add note about --with-foo and RPATH
Specifically mention our intended behavior about /usr and /usr/lib
(and why we don't add /usr/lib[64] and /usr/local/lib[64] to RPATH).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 9367440e32)
2018-09-25 06:35:34 -07:00
Jeff Squyres
8bdf4553d9 mca_base_var: fix output bug about settable vars
Fix the test that determined whether we output "writeable" or
"read-only" for MCA vars (it was checking the wrong flag).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
(cherry picked from commit 176da51aec)
2018-09-24 14:14:51 -07:00
Geoff Paulsen
9d9ae9286c
Merge pull request #5753 from gpaulsen/man-page-script-abstraction-break
Fix script abstraction break: mv make_manpage.pl to config
2018-09-23 09:01:19 -05:00
Geoff Paulsen
d03ee166cd
Merge pull request #5761 from amaslenn/platform-mellanox-v4
platform/mellanox: cleanup autodetect config — v4.0.x
2018-09-23 09:00:36 -05:00
Geoff Paulsen
97d7b004bb
Merge pull request #5762 from amaslenn/platform-mellanox-conf-v4
platform/mellanox: update default configuration — v4.0.x
2018-09-23 09:00:04 -05:00