1
1
Граф коммитов

28765 Коммитов

Автор SHA1 Сообщение Дата
Nathan Hjelm
d0d59b1d7d osc/rdma: add support for controlling location of backing store
This commit adds a new MCA variable to set the location of the backing
store: osc_rdma_backing_directory. The default on Linux has been
changed to use /dev/shm to improve performance in cases where /tmp is
not a tmpfs.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-05-29 21:43:33 -06:00
Sergey Oblomov
daad71f036 MCA/UCX: switch/case optimization
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-29 22:33:33 +03:00
Jeff Squyres
2e8ab41ba5
Merge pull request #5200 from jsquyres/pr/disable-osc-pt2pt-for-thread-multiple
osc/pt2pt: disable when THREAD_MULITPLE
2018-05-29 15:33:15 -04:00
Sergey Oblomov
6be4066e23 MCA/UCX: cswap call if updated to non-blocking API
- minor fixes

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-29 20:07:51 +03:00
Sergey Oblomov
b668e19cd1 Merge remote-tracking branch 'wg/master' into topic/amo-non-blocking-ucp
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-29 19:08:48 +03:00
Howard Pritchard
5b7c866f59 osc/pt2pt: disable when THREAD_MULTIPLE.
Per discussion at
https://github.com/open-mpi/ompi/issues/2614#issuecomment-392815654,
do not allow for selection of the OSC PT2PT when creating an MPI RMA
window when THREAD_MULTIPLE is active.  Print a helpful message and
return a not-supported error.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>

(cherry picked from commit d0ffd660841623c02d1dfa3151e7f7afd3327698)
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-05-29 08:59:53 -07:00
bosilca
4ebed21b6d
Merge pull request #4670 from ggouaillardet/topic/opal_bitmap
opal/bitmap: fix opal_bitmap_set_bit()
2018-05-29 10:50:21 -04:00
Yossi Itigin
976cd5e307
Merge pull request #5186 from hoopoepg/topic/ucx-amo-error-msg
MCA/UCX: fixed error messages for incorrect msg size
2018-05-29 15:54:02 +03:00
Sergey Oblomov
0c3ed93ef0 MCA/UCX: added opal progress call to wait request routine
- added opal_progress call to wait function to avoid
  possible [dead]lock issues
- wait call is declared as inline
- minor fixes

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-29 11:34:40 +03:00
bosilca
9adc38c963
Merge pull request #5197 from mkurnosov/reduce-scatter-block-butterfly
coll: reduce_scatter_block: add butterfly algorithm
2018-05-27 18:12:27 -04:00
Yossi Itigin
705c8a7b9b
Merge pull request #5198 from brminich/shmem_fence
OSHMEM/SMPL/UCX: Add real fence support
2018-05-27 11:25:42 +03:00
Mikhail Kurnosov
28d5837dd9 coll: reduce_scatter_block: add butterfly algorithm
Implements butterfly algorithm for MPI_Reduce_scatter_block.
The algorithm can be used both by commutative and non-commutative
operations, for power-of-two and non-power-of-two number of processes.

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2018-05-27 14:17:41 +07:00
Artem Polyakov
66e774d959
Merge pull request #4638 from karasevb/oshmem/spec_1.3/c11
oshmem: remove `shmem_put/get` when not the C11 case in accordance with the spec v1.3
2018-05-26 17:29:51 -07:00
Jeff Squyres
731fcc86cb
Merge pull request #5188 from sjeaugey/libcuda_warning
cuda: add option to remove warning about missing libcuda.
2018-05-26 15:53:01 -04:00
Mikhail Brinskii
8e9d401938 OSHMEM/SMPL/UCX: Add real fence support
+ Add quiet method to SPML, so it can have different implementation with
fence.
+ Use ucp_worker_fence for spml_fence method of UCX SPML

Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2018-05-25 22:43:06 +03:00
Brian Barrett
9fff40647d oshmem: disable if no spmls build
This patch disables the oshmem layer if there are no SPMLs that
will build.  With the limited set of SPMLs available to support
oshmem, many builds end up installing an oshmem library that we
know will not work.  There has been a bit of customer confusion
over oshmem, hopefully this will lead customers in the right
direction.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-05-25 08:48:50 -07:00
Brian Barrett
22bdf85299 dist: Add infrastructre for prjects to not build
Two related changes to allow projects to not build based on
configure test results, as opposed to only reacting to
user configure options today.  Use case is disabling a project
like oshmem because no communication channels can be built.

First, Move PROJECT_* AM_CONDITIONALs from the top of configure to
the bottom, so that we can change the results during configure.
Second, add a DIST_SUBDIRS to Makefile.am (and populate it in
opal_mca) so that "make dist" will work even when a project is
disabled.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-05-25 08:48:50 -07:00
Brian Barrett
70b154f7ff oshmem: Update config code to match OMPI usage
enable_oshmem holds the result of a customer decision and, like
most user options, can have the values "yes" (user wants us to
build feature), "no" (user wants us not to build feature),
"" (user wants us to figure it out), and "<something>" (user
wants us to build feature, with <something> turned on).

This change updates oshmem to not lose this data by not overwriting
enable_oshmem with a yes/no and leaving the original customer
intent in place.  Aside from fixing one bug (below) there are no
customer visible changes in this patch, but it makes it possible
to do the right thing in the upcoming work to allow oshmem to be
disabled based on test results.

There was a cosmetic bug in the existing code where specifying
a feature argument (like --enable-oshmem=awesome) would result
in the "checking if want oshmem" test reporting no, but oshmem
being built anyway.  With this cleanup, the "checking if want
oshmem" test, the final output summary, and what actually happens
will all match.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-05-25 08:48:50 -07:00
Edgar Gabriel
09f73f1cd5
Merge pull request #5194 from edgargabriel/pr/condition-fix
io/ompio: fix an erroneous condition when selecting aggregator selection algorithm
2018-05-24 19:18:37 -05:00
Sylvain Jeaugey
4eb75623ef cuda: add option to remove warning about missing libcuda.
Signed-off-by: Sylvain Jeaugey <sjeaugey@nvidia.com>
2018-05-24 14:56:46 -07:00
Edgar Gabriel
6b03cee7f1 io/ompio: erroneous condition in selecting aggregator selection logic
fix the logic in the decision which aggregator selection algorithm
to use.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-05-24 15:52:19 -05:00
Brice Goglin
847f2e9933 opal/hwloc: remove now unused available field from opal_hwloc_obj_data_t
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2018-05-24 11:53:07 +02:00
Brice Goglin
b260600450 opal/hwloc: simplify df_search() and make it work with hwloc 2.x NUMA nodes
Don't do a recursive search (hence no need for *idx anymore).
Find the level depth, to hide cache-issues first.
Then iterate over that level to find the objects we want.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2018-05-24 11:53:07 +02:00
Brice Goglin
a06fc74664 opal/hwloc: remove an obsolete comment about offlines CPUs etc
Only online/available objects are enabled in OMPI now.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2018-05-24 11:53:07 +02:00
Brice Goglin
369a7ea279 opal/hwloc: remove df_search_cores and fix things for hwloc 2.x NUMA nodes
Just iterate over cores inside the given object cpuset.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2018-05-24 11:53:07 +02:00
Brice Goglin
0cd0c12111 opal/hwloc: remove min_bound() functions
df_search_min_bound() would need to be fixed for hwloc 2.0,
but it's only used in opal_hwloc_base_find_min_bound_target_under_obj()
which isn't used anymore. So just remove all of them.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2018-05-24 11:53:07 +02:00
Brice Goglin
c4dffa1d0f rmaps: simplify the lookup for the binding object and fix for hwloc 2.0
Don't bother doing a lookup upwards or downwards for the target object type.
Just use the target depth, iterate over the level until we find the min_bound
object that intersects the locale cpuset.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2018-05-24 11:53:07 +02:00
Brice Goglin
d12ef324c9 hwloc 2.0 doesn't have hwloc/myriexpress.h anymore
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2018-05-24 11:53:07 +02:00
Brice Goglin
33ea2f0de4 fix OPAL_HWLOC_WANT_SHMEM management in opal/mca/hwloc/external/external.h
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2018-05-24 11:53:07 +02:00
Brice Goglin
bd08a6ead9 hwloc: fix hwloc/shmem.h in the external case
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2018-05-24 11:53:07 +02:00
Jeff Squyres
af4299ebc5 hwloc: updates for hwloc 2.0.x API
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-05-24 11:53:07 +02:00
Brice Goglin
77cc3fcda5 hwloc: update to hwloc 2.0.1
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2018-05-24 11:52:59 +02:00
Jeff Squyres
36cde21602
Merge pull request #5191 from benmenadue/master
configure: use AC_LINK_IFELSE instead of AC_COMPILE_IFELSE for C11 tests
2018-05-23 11:33:16 -04:00
Joshua Ladd
8231706ad8
Merge pull request #5180 from bwbarrett/feature/remove-mxm
mtl: remove MXM MTL
2018-05-23 09:31:42 -04:00
Ben Menadue
53fda14680 configure: use AC_LINK_IFELSE instead of AC_COMPILE_IFELSE when testing for C11 features to prevent e.g. _Static_assert being treated as an implicitly-defined function.
Signed-off-by: Ben Menadue <ben.menadue@nci.org.au>
2018-05-23 15:25:47 +10:00
Sergey Oblomov
bbaffd3681 MCA/UCX: atomic add/swap are moved to new UCX atomic API
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-22 22:23:31 +03:00
Sergey Oblomov
4495da5cb9 MCA/UCX: fixed error messages for incorrect msg size
- supported 4 or 8 bytes only

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-22 19:53:23 +03:00
Brian Barrett
09e4c40ce9 mtl: remove MXM MTL
Remove the MXM MTL, which has been deprecated in preference for
the Yalla PML.  This was discussed at the last developers meeting
and somehow I ended up with the action item to do the removal.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2018-05-21 14:18:30 -07:00
Yossi Itigin
385d91bbd2
Merge pull request #5175 from hoopoepg/topic/reset-stack-on-unordered
PML/UCX: reset converter stack on unordered messages
2018-05-21 20:41:18 +03:00
Sergey Oblomov
5ec26914a6 PML/UCX: do not set offset on ordered data recv
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-21 19:40:07 +03:00
Jeff Squyres
22bfdb194d
Merge pull request #5174 from hoopoepg/topic/typo-in-comment
CONVERTOR: fixed typos in comments
2018-05-17 12:15:02 -04:00
Sergey Oblomov
19607daa32 PML/UCX: create convertor clone instead of stack reset
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-17 16:39:13 +03:00
Sergey Oblomov
7c5de01c57 PML/UCX: reset converter stack on unordered messages
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-17 13:11:02 +03:00
Sergey Oblomov
52d5ca048e CONVERTOR: fixed typos in comments
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-05-16 22:02:39 +03:00
Jeff Squyres
74257aaee5
Merge pull request #5170 from jsquyres/pr/moar-java-updates-but-mostly-cosmetic-this-time
java: clean up MPI Java configury
2018-05-16 09:42:29 -04:00
Jeff Squyres
9f21ea437c java: clean up MPI Java configury
The Java configury is split into two parts:

1. Determine if we want MPI Java bindings.
2. Find the Java compiler (and related).

This commit does a few things:

- Move the "Find the Java compiler" step from OPAL to OMPI (because
  there is no Java in OPAL, and there doesn't appear to be any
  immanent danger that there will be).
  - As a direct consequence, remove the --enable-java CLI option
    (--enable-mpi-java still remains).  Enabling the MPI Java bindings
    and enabling Java are now considered the same thing (since there
    is no Java elsewhere in the code base, the different was
    meaningless).
- Only invoke the "Find the Java compiler" step if we actually want
  the MPI Java bindings.
- A few miscellaneous Java-related cleanups in configury (E.g., change
  testing "$foo" == "1" to $foo -eq 1, etc.

This commit is mostly s/opal/ompi/gi in many places in configury and
shifting code around.  But it looks bigger than it actually is because
of two reasons:

1. Some files were renamed:
   * ompi_setup_java.m4 -> ompi_setup_mpi_java.m4 (setup MPI Java bindings)
   * opal_setup_java.m4 -> ompi_setup_java.m4 (setup Java compiler)
2. Indenting level changed in (the new) ompi_setup_java.m4.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-05-15 15:15:22 -07:00
bosilca
d72c4e9d20
Merge pull request #5171 from bosilca/topic/fix_merge
Fix merge conflict related to function renaming.
2018-05-15 14:14:10 -04:00
George Bosilca
7191ea120c
Fix merge conflict related to function renaming.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2018-05-15 11:34:20 -04:00
bosilca
2ab628b92e
Merge pull request #5074 from bosilca/topic/remove_warnings
Remove warnings identified by clang.
2018-05-15 11:15:23 -04:00
bosilca
d13b9a2e25
Merge pull request #5156 from ggouaillardet/topic/reduce_scatter_block
coll: reduce_scatter_block: rename and MCA parameter description fix
2018-05-15 11:13:26 -04:00