1
1
Граф коммитов

3752 Коммитов

Автор SHA1 Сообщение Дата
Austen Lauria
0d4004cc3c Fix miscellaneous compiler warnings.
Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
2019-10-01 16:27:25 -04:00
Gilles Gouaillardet
1c4a3598d0 pmix/pmix4x: refresh to the latest open PMIx master
refresh to openpmix/openpmix@ea3b29b1a4

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-10-01 14:27:22 +09:00
Jeff Squyres
8038fac8f9
Merge pull request from adrianreber/check_for_user_ns
Do not use CMA in user namespaces
2019-09-20 22:10:42 -04:00
Nathan Hjelm
ae91b11de2 btl/vader: when using single-copy emulation fragment large rdma
This commit changes how the single-copy emulation in the vader btl
operates. Before this change the BTL set its put and get limits
based on the max send size. After this change the limits are unset
and the put or get operation is fragmented internally.

References 

Signed-off-by: Nathan Hjelm <hjelmn@google.com>
2019-09-05 23:08:53 -07:00
Adrian Reber
fc68d8a90f
Do not use CMA in user namespaces
Trying out to run processes via mpirun in Podman containers has shown
that the CMA btl_vader_single_copy_mechanism does not work when user
namespaces are involved.

Creating containers with Podman requires at least user namespaces to be
able to do unprivileged mounts in a container

Even if running the container with user namespace user ID mappings which
result in the same user ID on the inside and outside of all involved
containers, the check in the kernel to allow ptrace (and thus
process_vm_{read,write}v()), fails if the same IDs are not in the same
user namespace.

One workaround is to specify '--mca btl_vader_single_copy_mechanism none'
and this commit adds code to automatically skip CMA if user namespaces
are detected and fall back to MCA_BTL_VADER_EMUL.

Signed-off-by: Adrian Reber <areber@redhat.com>
2019-09-05 20:15:19 +02:00
Ralph Castain
9a2902c047
Force use of pmix/preg/native component
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-09-03 17:54:44 -07:00
Ralph Castain
2ebc1fa1b7
Update to current PMIx 4.0.0 (master)
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-09-03 14:46:55 -07:00
Howard Pritchard
71e1fad4a9
Merge pull request from hkuno/hkuno/mmap_loop
Fix mmap infinite recurse in memory patcher
2019-08-26 14:28:53 -06:00
Howard Pritchard
5a3646fd1d
Merge pull request from guserav/reintroduce-common-ofi
Reintroduce common ofi
2019-08-23 13:12:25 -06:00
Sergey Oblomov
182023febb SPML/UCX: fixed hang in SHMEM_FINALIZE
- used MPI _Barrier to synchronize processes

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-08-21 12:04:46 +03:00
guserav
56c3d9a238 common/ofi: Set HPE as owner of component
Signed-off-by: guserav <erik.zeiske@web.de>
2019-08-20 10:13:02 -07:00
guserav
8a67a95c99 common/ofi: Fix
As discussed in  the common component does not depend
on libfabric yet. This commit introduces this dependency by just calling
fi_version().

Signed-off-by: guserav <erik.zeiske@hpe.com>
2019-08-12 16:17:59 -07:00
guserav
0e25c95eae common/ofi: Fix check for OFI in build files
The changes made in f5e1a672cc
have been done after the common/ofi component was removed and thus the
component doesn't reflect the changes made their.

Namely f5e1a672cc changed:
- How to call OPAL_CHECK_OFI (It sets opal_ofi_happy to yes now)
- Dropped the common part in the build flags for ofi

Signed-off-by: guserav <erik.zeiske@web.de>
2019-08-12 16:15:23 -07:00
guserav
4ad78aaa15 Revert "Remove opal/mca/common/ofi."
This reverts commit dd20174532.

Signed-off-by: guserav <erik.zeiske@web.de>
2019-08-09 10:33:21 -07:00
Yossi Itigin
ec9def1406
Merge pull request from hoopoepg/topic/ucx-ppn-hint
UCX: added PPN hint for UCX context
2019-08-07 13:45:38 +03:00
Brian Barrett
827a2bcc3d
Merge pull request from wckzhang/opalifnamesize
opal/util: Change opal/util/if.h macro IF_NAMESIZE to OPAL_IF_NAMESIZE
2019-08-05 16:16:05 -07:00
Sergey Oblomov
43186e494b UCX: added PPN hint for UCX context
- added PPN hint for UCX context init

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-08-05 18:07:06 +03:00
Harumi Kuno
fca8436064 Fix mmap infinite recurse in memory patcher
This commit fixes issue  by removing
MacOS/Darwin-specific logic from intercept_mmap.
It also opportunistically converts tabs to spaces.

Signed-off-by: Harumi Kuno <harumi.kuno@hpe.com>
2019-08-01 10:18:25 -07:00
William Zhang
4ebb37a26c opal/util: Change opal/util/if.h macro IF_NAMESIZE to OPAL_IF_NAMESIZE
Due to IF_NAMESIZE being a reused and conditionally defined macro,
issues could arise from macro mismatches. In particular, in cases where
opal/util/if.h is included, but net/if.h is not, IF_NAMESIZE will be 32.
If net/if.h is included on Linux systems, IF_NAMESIZE will be 16. This
can cause a mismatch when using the same macro on a system. Thus
different parts of the code can have differring ideas on the size of a
structure containing a char name[IF_NAMESIZE]. To avoid this error case,
we avoid reusing the IF_NAMESIZE macro and instead define our own as
OPAL_IF_NAMESIZE.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-07-29 21:24:39 +00:00
Ralph Castain
c5c93e3391
Update to PMIx master
Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-07-29 12:20:20 -07:00
Ralph Castain
d202e10c14
Provide locality for all procs on node
Update PMIx to latest master to get supporting updates. For
connect/accept (part of comm_spawn as well), lookup locality for all
participating procs on the node and compute the relative locality so it
can be used for MPI operations.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-07-22 09:23:38 -07:00
Brian Barrett
41c2007af5
Merge pull request from wckzhang/cleanup
btl tcp: Fix error path memory leak
2019-07-16 15:50:32 -07:00
Gilles Gouaillardet
06c6325bc8
Merge pull request from ggouaillardet/topic/pmix_refresh
pmix/pmix4x: refresh to the latest PMIx master
2019-07-16 12:56:31 +09:00
Gilles Gouaillardet
4510711e95 pmix/pmix4x: refresh to the latest PMIx master
refresh to pmix/pmix@03a8b5daab

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-07-16 11:31:20 +09:00
William Zhang
8c3b8a87c5 btl tcp: Fix error path memory leak
After the OPAL_MODEX_RECV call, remote_addrs was not freed in the error
path. Moved the free call into cleanup to ensure we always free this
memory before leaving the function.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-07-15 22:35:04 +00:00
William Zhang
c0c3e1b540 reachable: Update documentation on reachable function
Added information on the type of objects provided in the list as well as
the required fields for them.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-07-11 21:36:12 +00:00
William Zhang
c9214cc53e reachable: Change list name from *_if to *_ifs
The parameter names were misleading due to implying a single interface
instead of a list. This will provide more clarity in distinguishing the
list of interfaces from each individual interface.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2019-07-11 21:33:24 +00:00
Jeff Squyres
650dd3e4cf opal/if/linux_ipv6: unify gethostname() call behavior
Unfortunately, https://github.com/open-mpi/ompi/pull/6797 was merged
before all feedback was received (39b799d936).  This PR is a minor
addendum to that commit.

This PR simply removes a meaningless `= {0}` operation.

The use of gethostname() here -- and many other places in the code
base -- is technically unsafe.  See
https://github.com/open-mpi/ompi/issues/6801 for a further description
of the issue and a suggested fix.  But the risk is quite low;
real-world hostnames are usually much shorter than
OPAL_MAXHOSTNAMELEN.  Hence, this PR just removes the meaningless
operation and leaves a real fix for gethostname() usage to a potential
future PR.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-07-09 10:38:24 -07:00
Orivej Desh
39b799d936 Fix if_linux_ipv6 verbose output of interface addresses
Previously the verbose output of if_linux_ipv6_open looked like this:

    found interface ab c: 0ab: a b: abc: 0 0: a 0🔡 0 0 scope 0

This changes the output to:

    found interface eth0 inet6 ab0c🆎a0b🔤0:a00:abcd:0 scope 0

Signed-off-by: Orivej Desh <orivej@gmx.fr>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-07-09 05:45:15 -07:00
Gilles Gouaillardet
63aa156bb0 pmix/pmix4x: refresh to the latest PMIx master
refresh to pmix/pmix@99971222ce

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-06-27 09:35:49 +09:00
Gilles Gouaillardet
5679a88867 pmix/pmix4x: refresh to the latest PMIx master
refresh to pmix/pmix@f67efc835c

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-06-24 10:17:23 +09:00
Ralph Castain
d4070d5f58
Fix finalize of flux component
Per patches from @SteVwonder and @garlick

Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-06-18 21:14:04 -07:00
Gilles Gouaillardet
d9326ff2ca pmix/pmix4x: refresh to the latest PMIx master
refresh to pmix/pmix@186dca196c

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-06-10 15:17:43 +09:00
markalle
008ab98946
Merge pull request from markalle/patcher_additions
shmat/shmdt additions for patcher
2019-05-30 12:16:05 -05:00
Nathan Hjelm
b78066720c btl/uct: add support for UCX 1.6.x
This commit updates the uct btl to support the v1.6.x release of
UCX. This release breaks API.

Signed-off-by: Nathan Hjelm <hjelmn@cs.unm.edu>
2019-05-21 04:31:57 -06:00
Yossi Itigin
84ae05c7bc
Merge pull request from hoopoepg/topic/ucx-common-init-patcher-on-hooks-used-only
COMMON/UCX: init memhooks infra on external hooks only
2019-05-16 22:35:32 +03:00
Sergey Oblomov
ebc457baf5 COMMON/UCX: removed ucs stuff
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-16 20:56:30 +03:00
Sergey Oblomov
a0a9306066 COMMON/UCX: init memhooks infra on external hooks only
- initialize memory hooks infrastructure only in case
  if external memory hooks are requested

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-16 20:13:16 +03:00
Nathan Hjelm
3e1dd36241 btl/uct: check for support before disabling UCX memory hooks
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2019-05-15 13:49:10 -06:00
Jeff Squyres
df5f7afb14 usnic: fix Coverity false positives
Add some Coverity inline notation to tell Coverity that these
functions never return.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-05-14 13:53:25 -07:00
Jeff Squyres
566e6f1ca3 btl/usnic: remove legacy code
Remove compatibility code for multiple versions of BTL_IN_OPAL,
BTL_VERSION, and RCACHE_VERSION.  This stuff was really only necessary
when we were actively swapping code between multiple release branches
that had large variations in core OMPI infrastructure.  These large
variations have now been around for quite a while, so the need for
this "compat" layer is significantly reduced.  It hasn't been removed
simply because a few of the "compat" names a slightly more friendly
than the real names (e.g., the SEND/RECV/PUT names).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-05-11 05:19:36 -07:00
Jeff Squyres
8a2441603f btl/usnic: remove all calls to abort()
Inspired by https://github.com/open-mpi/ompi/pull/5205, finally remove
all calls to abort() from the usnic BTL.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2019-05-11 05:17:29 -07:00
Nathan Hjelm
b82a08254f btl/ugni: fix 32-bit compare-and-swap atomics
This commit fixes an error in the 32-bit compare-and-swap atomic support
for Aries networks. The code was incorrectly using the non-fetching
version of cswap which was causing the routing to return
OPAL_ERR_BAD_ARG.

Signed-off-by: Nathan Hjelm <hjelmn@cs.unm.edu>
2019-05-10 09:59:54 -06:00
Yossi Itigin
5d2200a7d6
Merge pull request from brminich/topic/shmem_all2all_put
SPML/UCX: Add shmemx_alltoall_global_nb routine to shmemx.h
2019-05-01 12:00:21 +03:00
Mikhail Brinskii
2ef5bd8b36 SPML/UCX: Add shmemx_alltoall_global_nb routine to shmemx.h
The new routine transfers the data asynchronously from the source PE to all
PEs in the OpenSHMEM job. The routine returns immediately. The source and
target buffers are reusable only after the completion of the routine.
After the data is transferred to the target buffers, the counter object
is updated atomically. The counter object can be read either using atomic
operations such as shmem_atomic_fetch or can use point-to-point synchronization
routines such as shmem_wait_until and shmem_test.

Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2019-04-26 14:47:58 +03:00
Gilles Gouaillardet
562809fca1 pmix/pmix4x: refresh to the latest PMIx master
refrest pmi4x to pmix/pmix@bde4a8a54f

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-04-23 09:31:43 +09:00
Ralph Castain
f4aa783848 Remove all the linkages back to libpmix in pmix components
This link-back seems to be breaking OMPI for some reason. I'm not sure we need it in PMIx anyway, but we'll investigate over there.

Signed-off-by: Ralph Castain <rhc@pmix.org>
2019-04-16 13:30:20 -07:00
Mark Allen
bdd92a7a64 -cpu-set as a constraint rather than as a binding
The first category of issue I'm addressing is that recent code changes
seem to only consider -cpu-set as a binding option. Eg a command like
this
  % mpirun -np 2 --report-bindings --use-hwthread-cpus \
      --bind-to cpulist:ordered --map-by hwthread --cpu-set 6,7 hostname
which just round robins over the --cpu-set list.

Example output which seems fine to me:
> MCW rank 0: [..../..B./..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]
> MCW rank 1: [..../...B/..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]

It should also be possible though to pass a --cpu-set to most other
map/bind options and have it be a constraint on that binding. Eg
  % mpirun -np 2 --report-bindings \
      --bind-to hwthread --map-by hwthread --cpu-set 6,7 hostname
  % mpirun -np 2 --report-bindings \
      --bind-to hwthread --map-by ppr:2:node,pe=2 --cpu-set 6,7,12,13 hostname

The first command above errors that
> Conflicting directives for mapping policy are causing the policy
> to be redefined:
>   New policy:   RANK_FILE
>   Prior policy:  BYHWTHREAD

The error check in orte_rmaps_rank_file_open() is likely too aggressive.
The intent seems to be that any option like "--map-by whatever" will
check to see if a rankfile is in use, and report that mapping via rmaps
and using an explicit rankfile is a conflict.

But the check has been expanded to not just check
    NULL != orte_rankfile
but also errors out if
    (NULL != opal_hwloc_base_cpu_list &&
    !OPAL_BIND_ORDERED_REQUESTED(opal_hwloc_binding_policy))
which seems to be only recognizing -cpu-set as a binding option and
ignoring -cpu-set as a constraint on other binding policies.

For now I've changed the
    NULL != opal_hwloc_base_cpu_list
to
    OPAL_BIND_TO_CPUSET == OPAL_GET_BINDING_POLICY(opal_hwloc_binding_policy)
so it hopefully only errors out if -cpu-set is being used as a binding
policy.  Whether I did that right or not it's enough to get to the next
stage of testing the example commands I have above.

Another place similar logic is used is hwloc_base_frame.c where it has
    /* did the user provide a slot list? */
    if (NULL != opal_hwloc_base_cpu_list) {
        OPAL_SET_BINDING_POLICY(opal_hwloc_binding_policy, OPAL_BIND_TO_CPUSET);
    }
where it used to (long ago) only do that if
    !OPAL_BINDING_POLICY_IS_SET(opal_hwloc_binding_policy)
I think the new code is making it impossible to use --cpu-set as anything
other than a binding policy.

That brings us past the error detection and into the real functionality, some of
which has been stripped out, probably in moving to hwloc-2:
  % mpirun -np 2 --report-bindings \
      --bind-to hwthread --map-by hwthread --cpu-set 6,7 hostname
> MCW rank 0: [B.../..../..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]
> MCW rank 1: [.B../..../..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]

The rank_by() function in rmaps_base_ranking.c makes an array out of objects
returned from
    opal_hwloc_base_get_obj_by_type(,,,i,)
which uses df_search().  That function changed quite a bit from hwloc-1 to 2
but it used to include a check for
    available = opal_hwloc_base_get_available_cpus(topo, start)
which is where the bitmask from --cpu-set goes.  And it used to skip objs that
had hwloc_bitmap_iszero(available).

So I restored that behavior in ds_search() by adding a "constrained_cpuset" to
replace start->cpuset that it was otherwise processing.  With that change in
place the first command works:
  % mpirun -np 2 --report-bindings \
      --bind-to hwthread --map-by hwthread --cpu-set 6,7 hostname
> MCW rank 0: [..../..B./..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]
> MCW rank 1: [..../...B/..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]

The other command uses a different path though that still ignored the
available mask:
  % mpirun -np 2 --report-bindings \
      --bind-to hwthread --map-by ppr:2:node:pe=2 --cpu-set 6,7,12,13 hostname
> MCW rank 0: [BB../..../..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]
> MCW rank 1: [..BB/..../..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]
In bind_generic() the code used to call
opal_hwloc_base_find_min_bound_target_under_obj() which used
opal_hwloc_base_get_ncpus(), and that's where it would
intersect objects with the available cpuset and skip over ones
that were't available. To match the old behavior I added a few
lines in bind_generic() to skip over objects that don't intersect
the available mask. After that we get
  % mpirun -np 2 --report-bindings \
      --bind-to hwthread --map-by ppr:2:node:pe=2 --cpu-set 6,7,12,13 hostname
> MCW rank 0: [..../..BB/..../..../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]
> MCW rank 1: [..../..../..../BB../..../..../..../..../..../..../..../....][..../..../..../..../..../..../..../..../..../..../..../....]

I think the above changes are improvements, but I don't feel like they're
comprehensive.  I only traced through enough code to fix the two specific
bugs I was dealing with.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2019-04-12 15:33:56 -04:00
Gilles Gouaillardet
9ce8d7b568 pmix/pmix4x: refresh to the latest PMIx
refrest pmi4x to pmix/pmix@2531c0c3d1

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-04-09 14:03:00 +09:00
KAWASHIMA Takahiro
77286a41aa opal/sys: Introduce OPAL_HAVE_SYS_TIMER_GET_FREQ macro
... to avoid using an architecture name macro in
`opal/mca/timer/linux/timer_linux_component.c`.

The function name `opal_sys_timer_freq` is also changed for
consistency with `opal_sys_timer_get_cycles`.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2019-04-04 11:48:02 +09:00