
27658 Commits

Author SHA1 Message Date
bosilca
dc538e9675 Merge pull request #1177 from bosilca/topic/large_msg
Topic/large msg
2017-09-05 13:30:19 -04:00
Mike Dubman
62739c6513 Merge pull request #4165 from alinask/topic/spml-ucx-estimated-num-eps
SPML_UCX: use ompi_proc_world_size() to set the estimated_num_eps value
2017-09-05 18:36:42 +03:00
Alina Sklarevich
007b1803ec SPML_UCX: use ompi_proc_world_size() to set the estimated_num_eps value
Before this fix, mca_spml_ucx_component_open was using
oshmem_num_procs() to set the value of params.estimated_num_eps for UCX.
The oshmem_num_procs() function uses oshmem_group_all, which is
initialized only after the call to mca_spml_ucx_component_open and therefore
cannot be used there.

Signed-off-by: Alina Sklarevich <alinas@mellanox.com>
2017-09-04 14:46:00 +03:00
Gilles Gouaillardet
3b8b8c52c5 Merge pull request #1432 from ggouaillardet/topic/memchecker
Fix misc memchecker issues
2017-09-04 13:14:40 +09:00
Gilles Gouaillardet
ecb6b81a05 mpi: correctly handle MPI_IN_PLACE by memchecker in neighborhood collectives
MPI_IN_PLACE is not a valid send buffer for neighborhood collectives, so do not
invoke memchecker in this case.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-04 11:21:32 +09:00
Gilles Gouaillardet
66c9485e77 MPI_Isend: do not have memchecker mark the send buffer as inaccessible after pml isend invocation
Today's MPI standard mandates that the send buffer remain accessible during the send operation.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-04 11:21:32 +09:00
Gilles Gouaillardet
af8242a121 pml/ob1: have memchecker make recv buffer defined again when mca_pml_ob1_recv completes
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-04 11:18:05 +09:00
Gilles Gouaillardet
6ee9366243 MPI_Wait: correctly handle MPI_STATUS_IGNORE in MEMCHECKER
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-04 11:18:05 +09:00
Ralph Castain
c1ce233eaf Merge pull request #4143 from aravindksg/psm2_cuda
Add support for GPU buffers for PSM2 MTL
2017-09-01 21:09:55 -07:00
Ralph Castain
7b22207599 Merge pull request #4163 from rhc54/topic/pmix21
Roll to track PMIx master
2017-09-01 17:36:20 -07:00
Aravind Gopalakrishnan
2e83cf15ce Add support for GPU buffers for PSM2 MTL
PSM2 supports GPU buffers and CUDA managed memory: it can
directly recognize GPU buffers and handle copies between HFIs and GPUs.
Therefore, OMPI is not required to handle GPU buffers for pt2pt cases.
In this patch, we allow the PSM2 MTL to specify when
it does not require CUDA convertor support. This allows us to skip CUDA
convertor init phases and lets PSM2 handle the memory transfers.

This translates to improvements in latency.
The patch enables blocking collectives and workloads with GPU contiguous
and GPU non-contiguous memory.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
2017-09-01 16:59:03 -07:00
George Bosilca
d10522a01c Set a hard limit on the TCP max fragment size.
Some OSes have hardcoded limits to prevent overflowing an int32_t.
We can either detect this at configure (which might be a nicer but
incomplete solution), or always force the pipelined protocol over TCP.
As it only covers data larger than 1GB, no performance penalty is to be
expected.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-01 18:52:48 -04:00
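The commit above caps the TCP fragment size so that very large messages always go through the pipelined protocol. A minimal Python sketch of that idea follows; the actual change is in Open MPI's C sources, and the 1 GiB cap, `MAX_FRAGMENT`, and `fragment_sizes` are illustrative assumptions rather than Open MPI identifiers.

```python
MAX_FRAGMENT = 1 << 30  # hypothetical cap (1 GiB); not taken from Open MPI


def fragment_sizes(total_bytes, max_fragment=MAX_FRAGMENT):
    """Yield fragment lengths for a message, none larger than max_fragment."""
    if total_bytes < 0 or max_fragment <= 0:
        raise ValueError("total_bytes must be >= 0 and max_fragment > 0")
    remaining = total_bytes
    while remaining > 0:
        chunk = min(remaining, max_fragment)
        yield chunk
        remaining -= chunk


if __name__ == "__main__":
    # A 2.5 GiB message is split so that no single fragment exceeds the cap.
    print(list(fragment_sizes(5 * (1 << 29))))
```

With the assumed cap, a 2.5 GiB message yields two 1 GiB fragments followed by one 512 MiB fragment, so no single transfer length approaches the int32_t range.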
George Bosilca
866899e836 Always abide by the RDMA pipeline limit.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-01 18:52:48 -04:00
George Bosilca
050bd3b6d7 Make the pipeline depth an int instead of a size_t. While
they are supposed to be unsigned, casting them to a signed
value for all atomic operations is as error-prone as handling
them as signed entities.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-01 18:52:48 -04:00
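To illustrate why the signedness of such a counter matters: an unsigned value decremented past zero wraps to a very large number instead of going negative, so a "below zero" check never fires. The Python sketch below uses `ctypes` to mimic C integer storage; the `decrement` helper is hypothetical and does not model Open MPI's atomic operations.

```python
import ctypes


def decrement(counter_type, value, by=1):
    """Return value - by as it would be stored in the given C integer type."""
    return counter_type(value - by).value


if __name__ == "__main__":
    print(decrement(ctypes.c_int, 0))      # signed: goes negative
    print(decrement(ctypes.c_size_t, 0))   # unsigned: wraps around
```

The signed counter reports -1, while the `size_t` counter wraps to the largest value the type can hold (2^64 - 1 where `size_t` is 64 bits wide).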
George Bosilca
c340da2586 A first cut at the large data problem with TCP. As long
as writev and readv support a sum larger than a uint32_t,
this version will work. For other OSes a different patch
is required. This patch is a slight modification of the one
proposed by @ggouaillardet.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-01 18:52:48 -04:00
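Where `writev`/`readv` cannot accept a total larger than a `uint32_t`, one possible workaround is to cap the number of bytes handed to each call. This is only an assumption about what such a patch could look like, not the approach taken in Open MPI; `batch_buffers` and its `limit` parameter are hypothetical names.

```python
def batch_buffers(buffers, limit):
    """Group byte buffers into batches whose total size is at most limit.

    A buffer that does not fit in the remaining room is split using
    memoryview slices, so no payload bytes are copied.
    """
    if limit <= 0:
        raise ValueError("limit must be > 0")
    batch, room = [], limit
    for buf in buffers:
        view = memoryview(buf)
        while len(view) > 0:
            piece = view[:room]          # at most `room` bytes
            batch.append(piece)
            room -= len(piece)
            view = view[len(piece):]
            if room == 0:                # batch is full: hand it out
                yield batch
                batch, room = [], limit
    if batch:
        yield batch


if __name__ == "__main__":
    for group in batch_buffers([b"abcdef", b"gh"], 4):
        print([bytes(piece) for piece in group])
```

On a POSIX system each batch could then be passed to a single `os.writev(fd, batch)` call, with the caller still responsible for handling short writes.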
George Bosilca
4db3730a25 Be consistent for atomic operations and add an entity
of the same type.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-01 18:52:48 -04:00
Ralph Castain
2c723f4338 Roll to track PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-01 12:30:34 -07:00
Nathan Hjelm
79fc9d54dc Revert "* Some recent versions of GCC try very hard to make it impossible to"
This reverts commit b5ea5e0994a827915107e03d6744e73156534a04

This commit reverts a change that is hopefully not necessary. If that
is the case, this will fix #4146.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-09-01 08:47:29 -06:00
Jeff Squyres
89c860b4fc Merge pull request #4154 from ggouaillardet/topic/opal_setup_wrappers
configury: revamp opal_setup_wrappers.m4
2017-09-01 10:03:46 -04:00
Gilles Gouaillardet
59b9602c0b Merge pull request #2102 from ggouaillardet/topic/oshCC
oshmem: add C++ wrapper compilers
2017-09-01 15:24:36 +09:00
Gilles Gouaillardet
77f30a4378 oshmem_info: cleanup oshmem_info output
- there are no C++ bindings in OpenSHMEM
- the only Fortran binding is shmem.fh

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-01 13:25:19 +09:00
Gilles Gouaillardet
d1740a679c oshmem: add C++ wrappers
Although there are no C++ bindings for oshmem, we need C++ wrappers
since a C compiler might not be able to compile C++ source.
The C++ wrappers are:
- shmemc++ / oshc++
- shmemcxx / oshcxx
- shmemCC / oshCC (on case-sensitive filesystems)

Also add the examples/hello_oshmem_cxx.cc example.

Thanks to Bert Wesarg for bringing this to our attention.

Fixes open-mpi/ompi#2097

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-01 13:24:34 +09:00
Gilles Gouaillardet
cf2a80d215 configury: fix indentation in opal_setup_wrappers.m4
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-01 09:09:01 +09:00
Gilles Gouaillardet
0fe7757097 configury: revamp opal_setup_wrappers.m4
Define OPAL_EVAL_LIBTOOL() macro and factorize some code

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-01 09:08:26 +09:00
Gilles Gouaillardet
d8c795e312 configury: revamp opal_setup_wrappers.m4
Define OPAL_EVAL_LIBTOOL() macro and factorize some code

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-01 08:59:56 +09:00
Howard Pritchard
fb34a2104e Merge pull request #4081 from hppritcha/topic/readme_freebsd11.1_issue
README: Add a blurb about FreeBSD 11.1
2017-08-31 12:02:20 -06:00
Howard Pritchard
083e6e6f5e README: Add a blurb about FreeBSD 11.1
The clang 4.0 compiler that ships with FreeBSD 11.1 doesn't
work well with Open MPI. The workaround is to use a GNU compiler.

Related to #3992.
[skip ci]

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-08-31 09:25:12 -06:00
Howard Pritchard
97204b8620 Merge pull request #4150 from hppritcha/topic/ofi_swat_compi_warn
rml/ofi: swat a compiler warning
2017-08-30 15:44:44 -06:00
Howard Pritchard
5db9416724 rml/ofi: swat a compiler warning
On the path to -Werror passing builds!

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-08-30 09:16:49 -06:00
Ralph Castain
49d68f4343 Merge pull request #3873 from ggouaillardet/topic/pmix_info_create_zero
pmix: do not invoke PMIX_INFO_CREATE() with a zero size
2017-08-30 07:40:29 -07:00
Gilles Gouaillardet
c9cca771cc pmix/ext2x: automatically generate ext2x component from pmix2x sources
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-08-30 09:41:31 +09:00
Geoff Paulsen
0716a1276f Merge pull request #4119 from markalle/nm_test_fix2
remove nmcheck_prefix.pl test due to false positives
2017-08-29 12:24:36 -05:00
Mike Dubman
a84af675ff Merge pull request #4141 from yosefe/topic/pml-ucx-tag-context-bits
pml_ucx: fix tag/context_id layout and upper bounds.
2017-08-28 08:55:21 +02:00
Gilles Gouaillardet
fd08b923d5 pmix: do not invoke PMIX_INFO_CREATE() with a zero size
Thanks to Lisandro Dalcin for the report.

Fixes open-mpi/ompi#3854

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-08-28 11:25:58 +09:00
Yossi Itigin
14a93a5992 pml_ucx: fix tag/context_id layout and upper bounds.
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2017-08-27 17:15:48 +03:00
Josh Hursey
ad87aa2674 Merge pull request #4121 from jjhursey/explore/dlopen-local
mca: Dynamic components link against project lib
2017-08-25 13:15:51 -05:00
Gilles Gouaillardet
cc41c48026 Merge pull request #4125 from ggouaillardet/topic/flang
configury: patch configure in order to correctly support flang compilers
2017-08-25 19:58:31 +09:00
Gilles Gouaillardet
6f8010c685 configury: add support for flang.
flang is currently not supported by libtool, so once configure has been invoked,
it is necessary to manually hack the generated libtool as described at
https://developer.arm.com/products/software-development-tools/hpc/resources/porting-and-tuning/building-openmpi-with-arm-compiler

This commit hacks the generated configure automatically in autogen.pl

The libtool patch has been submitted upstream and is available at https://savannah.gnu.org/patch/index.php?9442

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-08-25 14:53:17 +09:00
Mark Allen
9b029c1be3 removing nmcheck_prefix.pl due to false positives
This test has proven to produce too many false positives so far. I hope
to re-enable it in the future, but until it has a longer history of not
producing false positives it doesn't need to produce false nuisance
failures for everybody.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-08-24 13:01:39 -04:00
Joshua Hursey
49c40f05d4 mpi/java: Remove dlopen() workaround
* See discussion on Issue #3705 regarding why this is no longer needed.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-08-24 11:56:17 -04:00
Joshua Hursey
e1d079544b mca: Dynamic components link against project lib
* Resolves #3705
 * Components should link against the project level library to better
   support `dlopen` with `RTLD_LOCAL`.
 * Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am`
   with the appropriate project level library:
```
MCA components in ompi/
       $(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la
MCA components in orte/
       $(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la
MCA components in opal/
       $(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la
MCA components in oshmem/
       $(top_builddir)/oshmem/liboshmem.la
```

Note: The changes in this commit were automated by the
`libadd_mca_comp_update.py` script added in the commit that
precedes it. Some components were not included in this change because
they are statically built only.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-08-24 11:56:16 -04:00
Joshua Hursey
7a3f1ff75e contrib: Script to automate LIBADD changes for components
* This script will search for all of the `Makefile.am` files in each
   of the project-level components. Then it adds the project-level
   library to `mca_FRAMEWORK_COMPONENT_la_LIBADD`.
   - If the library is already in the LIBADD list then it's skipped.
     So it is safe to run multiple times on the same codebase.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-08-24 11:56:16 -04:00
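The idempotent update described in the commit above can be sketched in a few lines of Python. This is not the contents of `libadd_mca_comp_update.py`: the function name, the regular expression, and the sample `Makefile.am` line (including `$(btl_tcp_LIBS)`) are assumptions made for illustration, and the sketch ignores `+=` assignments and backslash continuations.

```python
import re


def add_to_libadd(makefile_text, library):
    """Append library to each mca_*_la_LIBADD assignment that lacks it."""
    pattern = re.compile(r"^(mca_\w+_la_LIBADD\s*=)(.*)$", re.MULTILINE)

    def fix(match):
        head, value = match.group(1), match.group(2)
        if library in value.split():
            return match.group(0)  # already listed: leave the line untouched
        return f"{head}{value.rstrip()} {library}"

    return pattern.sub(fix, makefile_text)


if __name__ == "__main__":
    sample = "mca_btl_tcp_la_LIBADD = $(btl_tcp_LIBS)\n"  # made-up input
    lib = "$(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la"
    once = add_to_libadd(sample, lib)
    print(once, end="")
    print(add_to_libadd(once, lib) == once)  # second run changes nothing
```

Checking for the library before appending is what makes repeated runs over the same tree safe.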
Ralph Castain
d0e3bfe213 Merge pull request #4137 from rhc54/topic/tools
Fix the orte-dvm operations so that orterun can connect and execute an application.
2017-08-23 20:05:47 -07:00
Ralph Castain
68029b27e4 Fix the orte-dvm operations so that orterun can connect and execute an application.
There is a lingering problem, though. The first invocation of orterun succeeds every time. However, subsequent invocations have a high probability of hanging in the OOB connection handshake.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-23 17:31:08 -07:00
Ralph Castain
2e23fba5c4 Merge pull request #4136 from rhc54/topic/pmixup
Continue tracking PMIx v2.1.0
2017-08-23 11:16:39 -07:00
Ralph Castain
0561d64748 Continue tracking PMIx v2.1.0
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-23 09:38:27 -07:00
Ralph Castain
f6fd699d44 Merge pull request #4133 from rhc54/topic/modex
Optimize discovery of HWLOC topology
2017-08-22 21:00:49 -07:00
Ralph Castain
e02c39385a Merge branch 'master' into topic/modex
2017-08-22 20:06:35 -07:00
George Bosilca
50f471e31e Cleanup a set of warnings reported by Ralph.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-08-22 23:00:18 -04:00
Gilles Gouaillardet
565b516dae hwloc/base: fix opal_output() usage
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-08-23 10:24:47 +09:00