Commit graph

27675 commits

Author SHA1 Message Date
Ralph Castain
bbd83fd4c0 Add a new launcher "prun" for starting applications against the ORTE DVM.
Unlike "orterun", "prun" is a PMIx-only program that discovers the DVM connection instead of requiring that we explicitly provide it. Only build "prun" if PMIx v2.x is available.

This gets the DVM working again, but still is showing problems for multiple executions. I'll detail those in a separate issue. Thus, the DVM should still be considered "broken".

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-12 21:40:41 -07:00
Ralph Castain
d41069795f Merge pull request #4200 from rhc54/topic/cov
Silence coverity warnings
2017-09-12 10:29:32 -07:00
Ralph Castain
88eac797fb Silence coverity warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-12 09:14:36 -07:00
Brian Barrett
637ebf60f9 atomics: Remove requirement of 64 bit atomics
Remove two of the three instances of components requiring
64 bit atomics, even on 32 bit systems.  The SM OSC component
also uses 64 bit atomics, but is a more complicated fix that
will follow this one.  Currently, no one is testing on
platforms that don't provide 64 bit atomics (even in 32 bit
mode), but with the removal of the non-inline assembly for
IA32, the older compilers on Absoft's test systems now
result in no practical way to call cmpxchg8 in 32 bit mode.
At that point, these failures started popping up.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-11 19:50:10 -07:00
Ralph Castain
6775b2a9c6 Merge pull request #4198 from rhc54/topic/dvmrepair
Repair the ORTE DVM
2017-09-11 18:40:06 -07:00
Ralph Castain
3477079804 Repair the ORTE DVM
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-11 17:38:21 -07:00
Nathan Hjelm
7cdda24206 osc/sm: do not require 64-bit atomic math
This commit fixes a compile issue on 32-bit systems that do not
support 64-bit atomic math. The active target path was using 64-bit
atomics exclusively to support PSCW. This commit updates the code to
use either 32 or 64-bit atomic math depending on what is available.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-09-11 14:10:38 -10:00
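
The width selection described in the commit above can be illustrated with a minimal sketch. It assumes C11 `<stdatomic.h>` as a stand-in for the project's own atomics layer; `counter_value_t`, `sync_counter_t`, `counter_add`, `counter_load`, and `counter_bits` are hypothetical names and do not come from the osc/sm sources.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stddef.h>

/* Assumption: C11 atomics stand in for the project's atomics layer.
 * Pick the widest counter type whose atomic operations are lock-free. */
#if ATOMIC_LLONG_LOCK_FREE == 2
typedef unsigned long long counter_value_t;   /* 64-bit atomic math available */
#else
typedef unsigned int counter_value_t;         /* fall back to 32-bit atomic math */
#endif

/* Hypothetical counter type used by the helpers below. */
typedef _Atomic counter_value_t sync_counter_t;

/* Atomically add delta and return the value held before the addition. */
static inline counter_value_t counter_add(sync_counter_t *counter,
                                          counter_value_t delta)
{
    assert(counter != NULL);
    return atomic_fetch_add(counter, delta);
}

/* Atomically read the current value. */
static inline counter_value_t counter_load(sync_counter_t *counter)
{
    assert(counter != NULL);
    return atomic_load(counter);
}

/* Width, in bits, of the counter type chosen at compile time. */
static inline unsigned counter_bits(void)
{
    return (unsigned)(sizeof(counter_value_t) * CHAR_BIT);
}
```

Callers only ever see `counter_value_t`, so the same code builds whether or not the platform offers lock-free 64-bit atomics.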
Brian Barrett
29a53b0269 git: Ignore OSHMEM C++ wrapper artifacts
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-08 08:54:08 -07:00
Josh Hursey
392129063b Merge pull request #4191 from jjhursey/fix/global_rank
orte/pmix: Always seed environment with global rank
2017-09-08 09:39:50 -05:00
Joshua Hursey
420ca65f4f orte/pmix: Always seed environment with global rank
* Even if we are only launching one app context, we might call spawn
   later and the remote groups might want their global rank information.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-09-08 08:53:49 -05:00
Brian Barrett
5602d3b9c2 atomics: Remove cmpset_64 on IA32
The recent changes to remove non-inline atomics have caused
a cascade of issues with cmpset_64 on IA32.  cmpxchg8 requires
the use of a bunch of registers (2 for every operand, 3 operands),
and one of them is ebx, which is used by the compiler to do
shared library things.  Some compilers don't deal well with
ebx being clobbered (I'm looking at you, gcc 4.1).  Rather than
continue trying to fight, remove cmpset_64 from the supported
atomic operations on IA32.  Other 32 bit platforms (MIPS32,
SPARC32, ARM, etc.) already don't support a 64 bit compare-and-
swap, so while this might slightly reduce performance, it will
at least be correct.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-07 12:19:34 -07:00
Ralph Castain
afe7f6983b Merge pull request #4184 from rhc54/topic/pmix
Update to track PMIx master
2017-09-06 15:19:01 -07:00
Brian Barrett
ff3ff28a00 NEWS: Remove duplicate "master" items
Both the C++ and Vampir notes appear in release branch notes
already, so remove from the "not on release branch" section.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-06 13:31:30 -07:00
Nathan Hjelm
4bba8774f4 monitoring: fix MPI_T regression
The monitoring code causes MPI_T based tools to segfault when
monitoring is disabled. This happens because the performance
variables remain registered after the common/monitoring
component is dlclosed due to a missing variable registration
flag. This commit adds the necessary flag to all the registered
performance variables.

The issue on github is #4162. Close when applied to master.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-09-06 14:24:35 -06:00
Ralph Castain
cbc114e923 Update to track PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-06 13:15:24 -07:00
Jeff Squyres
41c7230bc4 Merge pull request #4179 from jsquyres/pr/opal-path-nfs-razzem-frazzem
opal_path_nfs: ensure arrays are always long enough
2017-09-06 11:16:44 -04:00
Jeff Squyres
dee8cfbfd0 opal_path_nfs: ensure arrays are always long enough
This test used to have fixed-sized arrays for the mounts that it was
checking.  However, we periodically run across machines with more
mounts than can fit into those fixed-size arrays.  Rather than
periodically increasing the size of those arrays (after re-discovering
that the error is due to fixed-size arrays), just count how many
entries there are and make arrays that are big enough.

Additionally, add a check to ensure that we don't go over the max size
of the array when reading/filling them.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-09-06 07:01:45 -07:00
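
The count-then-allocate approach described in the commit above might look roughly like the following sketch. It is not the code from the opal_path_nfs test: the newline-separated input format and the names `count_entries`, `collect_entries`, and `free_entries` are assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: count newline-terminated entries in a listing. */
static size_t count_entries(const char *text)
{
    size_t n = 0;
    for (const char *p = text; *p != '\0'; ++p) {
        if (*p == '\n') {
            ++n;
        }
    }
    return n;
}

/* First pass counts the entries, second pass fills an array sized to fit.
 * The explicit bounds check keeps the fill loop inside the array even if
 * the two passes ever disagree. A trailing entry with no newline is ignored.
 * Returns NULL if the array itself cannot be allocated. */
static char **collect_entries(const char *text, size_t *out_count)
{
    assert(text != NULL && out_count != NULL);

    size_t capacity = count_entries(text);
    char **entries = calloc(capacity + 1, sizeof(*entries));
    size_t used = 0;
    const char *start = text;

    *out_count = 0;
    if (entries == NULL) {
        return NULL;
    }
    for (const char *p = text; *p != '\0'; ++p) {
        if (*p != '\n') {
            continue;
        }
        if (used >= capacity) {
            break;                       /* never write past the array */
        }
        size_t len = (size_t)(p - start);
        entries[used] = malloc(len + 1);
        if (entries[used] == NULL) {
            break;                       /* keep what was stored so far */
        }
        memcpy(entries[used], start, len);
        entries[used][len] = '\0';
        ++used;
        start = p + 1;
    }
    *out_count = used;
    return entries;
}

/* Release everything collect_entries() allocated. */
static void free_entries(char **entries, size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        free(entries[i]);
    }
    free(entries);
}
```

Sizing the array from a first pass removes the need to guess a fixed upper bound, at the cost of reading the input twice.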
bosilca
dc538e9675 Merge pull request #1177 from bosilca/topic/large_msg
Topic/large msg
2017-09-05 13:30:19 -04:00
Mike Dubman
62739c6513 Merge pull request #4165 from alinask/topic/spml-ucx-estimated-num-eps
SPML_UCX: use ompi_proc_world_size() to set the estimated_num_eps value
2017-09-05 18:36:42 +03:00
Alina Sklarevich
007b1803ec SPML_UCX: use ompi_proc_world_size() to set the estimated_num_eps value
Before this fix, mca_spml_ucx_component_open was using
oshmem_num_procs() to set the value of params.estimated_num_eps for UCX.
The oshmem_num_procs() function uses oshmem_group_all which will be
initialized after the call to mca_spml_ucx_component_open and therefore,
cannot be used there.

Signed-off-by: Alina Sklarevich <alinas@mellanox.com>
2017-09-04 14:46:00 +03:00
Gilles Gouaillardet
3b8b8c52c5 Merge pull request #1432 from ggouaillardet/topic/memchecker
Fix misc memchecker issues
2017-09-04 13:14:40 +09:00
Gilles Gouaillardet
ecb6b81a05 mpi: correctly handle MPI_IN_PLACE by memchecker in neighborhood collectives
MPI_IN_PLACE is not a valid send buffer for neighborhood collectives, so do not
invoke memchecker in this case.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-04 11:21:32 +09:00
Gilles Gouaillardet
66c9485e77 MPI_Isend: memchecker do not mark send buffer as inaccessible after pml isend invocation
Today's MPI standard mandates the send buffer remains accessible during the send operation.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-04 11:21:32 +09:00
Gilles Gouaillardet
af8242a121 pml/ob1: have memchecker make recv buffer defined again when mca_pml_ob1_recv completes
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-04 11:18:05 +09:00
Gilles Gouaillardet
6ee9366243 MPI_Wait: correctly handle MPI_STATUS_IGNORE in MEMCHECKER
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-04 11:18:05 +09:00
Ralph Castain
c1ce233eaf Merge pull request #4143 from aravindksg/psm2_cuda
Add support for GPU buffers for PSM2 MTL
2017-09-01 21:09:55 -07:00
Ralph Castain
7b22207599 Merge pull request #4163 from rhc54/topic/pmix21
Roll to track PMIx master
2017-09-01 17:36:20 -07:00
Aravind Gopalakrishnan
2e83cf15ce Add support for GPU buffers for PSM2 MTL
PSM2 enables support for GPU buffers and CUDA managed memory; it can
directly recognize GPU buffers and handle copies between HFIs and GPUs.
Therefore, it is not required for OMPI to handle GPU buffers for pt2pt cases.
In this patch, we allow the PSM2 MTL to specify when
it does not require CUDA convertor support. This allows us to skip CUDA
convertor init phases and lets PSM2 handle the memory transfers.

This translates to improvements in latency.
The patch enables blocking collectives and workloads with GPU contiguous,
GPU non-contiguous memory.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
2017-09-01 16:59:03 -07:00
George Bosilca
d10522a01c Set a hard limit on the TCP max fragment size.
Some OSes have hardcoded limits to prevent overflowing over an int32_t.
We can either detect this at configure (which might be a nicer but
incomplete solution), or always force the pipelined protocol over TCP.
As it only covers data larger than 1GB, no performance penalty is to be
expected.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-01 18:52:48 -04:00
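
A hard cap on fragment size, as described in the commit above, can be sketched as follows. The 1 GiB value of `MAX_FRAGMENT_BYTES` is an assumption taken from the "larger than 1GB" remark, and `next_fragment_len` and `count_fragments` are hypothetical helpers, not functions from the TCP BTL.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: cap each fragment at 1 GiB so that a single length always
 * fits in an int32_t. */
#define MAX_FRAGMENT_BYTES (UINT64_C(1) << 30)

/* Length of the next fragment to send when `remaining` bytes are left. */
static uint64_t next_fragment_len(uint64_t remaining)
{
    return remaining < MAX_FRAGMENT_BYTES ? remaining : MAX_FRAGMENT_BYTES;
}

/* Number of fragments needed to move `total` bytes under the cap. */
static uint64_t count_fragments(uint64_t total)
{
    uint64_t fragments = 0;
    while (total > 0) {
        uint64_t len = next_fragment_len(total);
        assert(len <= INT32_MAX);   /* the property the cap is meant to give */
        total -= len;
        ++fragments;
    }
    return fragments;
}
```

Messages at or below the cap still go out as a single fragment, so only transfers above 1 GiB are split.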
George Bosilca
866899e836 Always abide by the RDMA pipeline limit.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-01 18:52:48 -04:00
George Bosilca
050bd3b6d7 Make the pipeline depth an int instead of a size_t. While
they are supposed to be unsigned, casting them to a signed
value for all atomic operations is as error-prone as handling
them as signed entities.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-01 18:52:48 -04:00
George Bosilca
c340da2586 A first cut at the large data problem with TCP. As long
as the writev and readv support a sum larger than a uint32_t
this version will work. For the other OSes a different patch
is required. This patch is a slight modification of the one
proposed by @ggouaillardet.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-01 18:52:48 -04:00
George Bosilca
4db3730a25 Be consistent for atomic operations and add an entity
of the same type.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-01 18:52:48 -04:00
Ralph Castain
2c723f4338 Roll to track PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-01 12:30:34 -07:00
Nathan Hjelm
79fc9d54dc Revert "* Some recent versions of GCC try very hard to make it impossible to"
This reverts commit b5ea5e0994

This commit reverts a change that is hopefully not necessary. If this
is the case this will fix #4146.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-09-01 08:47:29 -06:00
Jeff Squyres
89c860b4fc Merge pull request #4154 from ggouaillardet/topic/opal_setup_wrappers
configury: revamp opal_setup_wrappers.m4
2017-09-01 10:03:46 -04:00
Gilles Gouaillardet
59b9602c0b Merge pull request #2102 from ggouaillardet/topic/oshCC
oshmem: add C++ wrapper compilers
2017-09-01 15:24:36 +09:00
Gilles Gouaillardet
77f30a4378 oshmem_info: cleanup oshmem_info output
- there are no C++ bindings in OpenSHMEM
- the only Fortran binding is shmem.fh

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-01 13:25:19 +09:00
Gilles Gouaillardet
d1740a679c oshmem: add C++ wrappers
Though there are no C++ bindings for oshmem, we need C++ wrappers
since a C compiler might not be able to compile a C++ source.
The C++ wrappers are:
- shmemc++ / oshc++
- shmemcxx / oshcxx
- shmemCC / oshCC (on case sensitive filesystems)

Also add the examples/hello_oshmem_cxx.cc example

Thanks Bert Wesarg for bringing this to our attention

Fixes open-mpi/ompi#2097

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-01 13:24:34 +09:00
Gilles Gouaillardet
cf2a80d215 configury: fix indentation in opal_setup_wrappers.m4
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-01 09:09:01 +09:00
Gilles Gouaillardet
0fe7757097 configury: revamp opal_setup_wrappers.m4
Define OPAL_EVAL_LIBTOOL() macro and factorize some code

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-01 09:08:26 +09:00
Gilles Gouaillardet
d8c795e312 configury: revamp opal_setup_wrappers.m4
Define OPAL_EVAL_LIBTOOL() macro and factorize some code

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-01 08:59:56 +09:00
Howard Pritchard
fb34a2104e Merge pull request #4081 from hppritcha/topic/readme_freebsd11.1_issue
README: Add a blurb about FreeBSD 11.1
2017-08-31 12:02:20 -06:00
Howard Pritchard
083e6e6f5e README: Add a blurb about FreeBSD 11.1
The clang 4.0 compiler that ships with FreeBSD 11.1 doesn't
work well with OpenMPI.  Workaround is to use a GNU compiler.

Related to #3992.
[skip ci]

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-08-31 09:25:12 -06:00
Howard Pritchard
97204b8620 Merge pull request #4150 from hppritcha/topic/ofi_swat_compi_warn
rml/ofi: swat a compiler warning
2017-08-30 15:44:44 -06:00
Howard Pritchard
5db9416724 rml/ofi: swat a compiler warning
On the path to -Werror passing builds!

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-08-30 09:16:49 -06:00
Ralph Castain
49d68f4343 Merge pull request #3873 from ggouaillardet/topic/pmix_info_create_zero
pmix: do not invoke PMIX_INFO_CREATE() with a zero size
2017-08-30 07:40:29 -07:00
Gilles Gouaillardet
c9cca771cc pmix/ext2x: automatically generate ext2x component from pmix2x sources
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-08-30 09:41:31 +09:00
Geoff Paulsen
0716a1276f Merge pull request #4119 from markalle/nm_test_fix2
remove nmcheck_prefix.pl test due to false positives
2017-08-29 12:24:36 -05:00
Mike Dubman
a84af675ff Merge pull request #4141 from yosefe/topic/pml-ucx-tag-context-bits
pml_ucx: fix tag/context_id layout and upper bounds.
2017-08-28 08:55:21 +02:00