- remove stale s390 and MIPS atomics
- ensure envars from spawn are propagated
- fix make tarball
- ensure cleanup of default hostfile
Signed-off-by: Ralph Castain <rhc@pmix.org>
Per the developer's meeting, add detection of the deprecated --with-pmi
(and its associated --with-pmi-libdir) configure option and error out
with a polite note of the change in support
Since "--with-pmi" now shows in the configure help output, mark the help
string with a giant *DEPRECATED* to warn users not to use it
Signed-off-by: Ralph Castain <rhc@pmix.org>
Ma
Signed-off-by: Ralph Castain <rhc@pmix.org>
* `AUTOMAKE_JOBS` can improve the performance of `autogen.pl`
* The user can set this envar in the environment before calling
`autogen.pl` or use the new `-j #` option to set it.
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
- fix LSF compile issue
- protect against NULL user home directory
- update reachable component in PRRTE (still unused)
Signed-off-by: Ralph Castain <rhc@pmix.org>
This commit removes the code specific to MIPS. This architecture
has been unsupported for some time. Open MPI will continue to work
on MIPS with C11 and __atomic but will no longer use CMA for
shared memory.
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
This commit removes the CMA support for s390 and s390x. These
architectures have been unsupported for a while and no one has
verified that CMA actually works with Open MPI on these systems.
s390 and s390x will continue to work with Open MPI without CMA.
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
This commit fixes an issue with non-debug builds where adding an
attachment to the attachment list doesn't actually happen. This
causes all MPI_Win_detach calls to fail. The call was within an
assert which is optimized out in optimized builds.
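
The failure mode described above can be sketched as follows. This is
only an illustration, not the osc/rdma code: `attach_list`,
`attach_list_add`, `attach_region` and `MAX_ATTACH` are hypothetical
names. The point is that the call with the side effect must run
outside of assert(), so that defining NDEBUG removes only the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ATTACH 4  /* hypothetical limit for this sketch */

struct attach_list {
    const void *bases[MAX_ATTACH];
    size_t count;
};

/* Hypothetical helper: record a base address; returns false when full. */
static bool attach_list_add(struct attach_list *list, const void *base)
{
    if (list->count >= MAX_ATTACH) {
        return false;
    }
    list->bases[list->count++] = base;
    return true;
}

/* Buggy pattern: with NDEBUG defined the whole call disappears, so the
 * attachment is never recorded:
 *
 *     assert(attach_list_add(list, base));
 */

/* Safer pattern: always perform the call, assert only on its result. */
static bool attach_region(struct attach_list *list, const void *base)
{
    bool added = attach_list_add(list, base);
    assert(added);  /* compiled out under NDEBUG, but the add already ran */
    return added;
}
```

With this shape a later detach can find the recorded base in both debug
and optimized builds.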
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
This commit fixes an issue in the MCA base variable system. The
code was retrieving the user home directory (from HOME) and
attempting to use it to build a search path for config files
without checking that the lookup succeeded. When user-level
configuration directories are enabled and no home directory is
available, the appropriate thing to do is to print an error
message and return. This commit makes that change. It does not
ensure that HOME is set correctly.
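
A minimal sketch of the kind of guard described above, assuming the
failure case is a missing or empty home directory. The helper name
`build_user_config_path` and its signature are hypothetical and are
not the MCA base variable code. The caller is assumed to pass the
result of getenv("HOME").

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical helper: write "<home>/<subdir>" into buf.
 * Returns 0 on success, -1 if home is NULL/empty or buf is too small. */
static int build_user_config_path(const char *home, const char *subdir,
                                  char *buf, size_t buflen)
{
    if (NULL == home || '\0' == home[0]) {
        /* Report the problem and bail out instead of using a NULL pointer. */
        fprintf(stderr, "error: no home directory available; "
                        "cannot build the user-level config path\n");
        return -1;
    }

    int n = snprintf(buf, buflen, "%s/%s", home, subdir);
    if (n < 0 || (size_t) n >= buflen) {
        return -1;  /* encoding error or truncated output */
    }
    return 0;
}
```

A caller would use it as
`build_user_config_path(getenv("HOME"), ".openmpi", buf, sizeof(buf))`
and skip the user-level search path when it returns -1.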
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
This commit increases the osc_rdma_max_attach variable from 32
to 64. The new default is kept low due to the small number
of registration resources on some systems (Cray Aries). A
larger max attachment value can be set by the user on other
systems.
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
This commit addresses two issues in osc/rdma:
1) It is erroneous to attach regions that overlap. This was being
allowed but the standard does not allow overlapping attachments.
2) Overlapping registration regions (4k alignment of attachments)
appear to be allowed. Add attachment bases to the bookkeeping
structure so we can keep better track of what can be detached.
It is possible that the standard did not intend to allow #2. If that
is the case then #2 should fail in the same way as #1. There should
be no technical reason to disallow #2 at this time.
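
The difference between the two cases can be sketched as below. This is
an illustration only, not the osc/rdma bookkeeping: `struct region`,
`regions_overlap`, `page_align` and the fixed 4k page size are
assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_ASSUMED ((uintptr_t) 4096)  /* assumed 4k alignment */

struct region {
    uintptr_t base;
    size_t len;
};

/* Case 1: true if the half-open ranges [base, base + len) intersect. */
static bool regions_overlap(struct region a, struct region b)
{
    return a.base < b.base + b.len && b.base < a.base + a.len;
}

/* Expand a region to page boundaries, as a registration would cover it. */
static struct region page_align(struct region r)
{
    uintptr_t mask = ~(PAGE_SIZE_ASSUMED - 1);
    uintptr_t start = r.base & mask;
    uintptr_t end = (r.base + r.len + PAGE_SIZE_ASSUMED - 1) & mask;
    struct region out = { start, (size_t) (end - start) };
    return out;
}
```

Two attachments at 0x1000 and 0x1100, each 0x100 bytes long, do not
overlap (case 1 does not apply), but their page-aligned registration
regions are identical (case 2). Keeping the original attachment base
alongside the aligned region is what lets a detach tell them apart.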
References #7384
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
If you ran autogen.pl --without-prrte, we wouldn't configure or build PRRTE
support. However, configuring with --disable-internal-rte wasn't working
as it was being ignored. This led to some false errors when compiling
with an earlier PMIx v2.2 release.
That said, there were a couple of places that needed protection against
PMIx v2.2.
Signed-off-by: Ralph Castain <rhc@pmix.org>
We do not want to escape $, because the resulting quoted string ends
up in C code, and "\$" is not recognized by printf (and some compilers
warn about it).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Fix a couple of spots in OMPI to resolve warnings. The one in comm_cid
in particular may be responsible for some/all of the comm_spawn issues
as it was passing an incorrect pointer to a macro, thus causing memory
corruption.
Update PRRTE and PMIx to deal with v3/v4 differences.
Signed-off-by: Ralph Castain <rhc@pmix.org>
- fix a typo `alloc_shared_contig` to `alloc_shared_noncontig`
- correct the value of `blocking_fence`
Signed-off-by: Tsubasa Yanagibashi <fj2505dt@aa.jp.fujitsu.com>
When a job fails very quickly, it is possible that the spawning tool
won't get notified that the spawn completed prior to being told that the
job terminated. This can cause the tool to "hang" in PMIx_Spawn. Ensure
that PRRTE handles this case by guaranteeing we notify the spawner.
Track both PRRTE and PMIx masters as both have changed, though only the
PRRTE one is involved in this particular fix.
Signed-off-by: Ralph Castain <rhc@pmix.org>
- Ensure we accurately handle node name aliases
- Apply the local launch environ to apps prior to spawn
- Add error check if PMIx_Spawn fails
- Fix compiler warning
- Fix PGI vendor check
- Prevent mpirun from hanging if prte segfaults
- Fix absolute/relative path names to ensure "prte" and
"prted" are taken from same distribution as "mpirun"
Signed-off-by: Ralph Castain <rhc@pmix.org>
- pgcc18 defines __GNUC__ similar to Intel compilers. So we must
check for pgi higher up, or else configury will mistake
it for gcc.
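
The ordering problem can be illustrated with plain preprocessor
checks. This is a sketch of the idea rather than the configury itself;
`COMPILER_VENDOR` and `compiler_vendor` are hypothetical names. The
vendor-specific macros are tested before __GNUC__ because several
non-GNU compilers also define __GNUC__ for compatibility.

```c
/* Test vendor-specific macros first; __GNUC__ alone does not prove gcc. */
#if defined(__PGI)
#  define COMPILER_VENDOR "pgi"
#elif defined(__INTEL_COMPILER)
#  define COMPILER_VENDOR "intel"
#elif defined(__clang__)
#  define COMPILER_VENDOR "clang"
#elif defined(__GNUC__)
#  define COMPILER_VENDOR "gnu"
#else
#  define COMPILER_VENDOR "unknown"
#endif

/* Return the vendor string selected at compile time. */
static const char *compiler_vendor(void)
{
    return COMPILER_VENDOR;
}
```

If the __GNUC__ branch came first, a PGI build would be reported as
"gnu", which is the misdetection described above.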
Signed-off-by: Austen Lauria <awlauria@us.ibm.com>
This commit fixes a potential deadlock that can occur between the
memory hooks and region registration. This deadlock occurs because
of a hold and wait error between two locks. The first is the VMA
lock used to protect internal rcache/grdma structures and the
second is the reader/writer lock in the interval tree.
In the case of the memory hooks a reader lock is obtained on the
interval tree then the VMA lock is obtained to remove the
registration from the LRU. In the case of LRU evictions the VMA
lock is obtained then the writer lock on the interval tree is
obtained. This leads to the deadlock.
To fix the issue the code that evicts from the LRU has been
updated to only invalidate the registration while the VMA lock
is held then remove the registration from the VMA after the
lock is released. This should completely eliminate the above
deadlock.
Signed-off-by: Nathan Hjelm <hjelmn@google.com>
Update PRRTE submodule pointer to track changes in master that impact
OMPI behavior plus provide a new capability
Signed-off-by: Ralph Castain <rhc@pmix.org>