Revamp the event notification integration to rely on the PMIx event chaining and remove the duplicate chaining in OPAL. This ensures we get system-level events that target non-default handlers.
Restore the hostname entries for MPI-level error messages, but provide an MCA param (orte_hostname_cutoff) to remove them for large clusters where the memory footprint is problematic. Set the default at 1000 nodes in the job (not the allocation).
Begin first cut at memory profiler
Some minor cleanups of memprobe
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
This is news to me: I didn't know that some distros do not set $HOME.
So use "~" instead, and only try to grep ~/.rpmmacros if it exists.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
* Point to local libfabric v1.4 install
* Add MPI C++ bindings
* Remove PSM support (if someone can install PSM/PSM2 libraries on the
build server, let's re-enable this)
Also change from -j8 to -j4 (the new AWS build instance only has 1
core / 2 hyperthreads).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Nightly snapshots will now be named:
openmpi-${BRANCHNAME}-${YYYYMMDDHHMM}-${SHORTHASH}.tar.${COMPRESSION}.
Fixes#2337
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Still not completely done as we need a better way of tracking the routed module being used down in the OOB - e.g., when a peer drops connection, we want to remove that route from all conduits that (a) use the OOB and (b) are routed, but we don't want to remove it from an OFI conduit.
Multiple conduits can exist at the same time, and can even point to the same base transport. Each conduit can have its own characteristics (e.g., flow control) based on the info keys provided to the "open_conduit" call. For ease during the transition period, the "legacy" RML interfaces remain as wrappers over the new conduit-based APIs using a default conduit opened during orte_init - this default conduit is tied to the OOB framework so that current behaviors are preserved. Once the transition has been completed, a one-time cleanup will be done to update all RML calls to the new APIs and the "legacy" interfaces will be deleted.
While we are at it: Remove oob/usock component to eliminate the TMPDIR length problem - get all working, including oob_stress
* In open-mpi/ompi@f6f24a4f67 I missed
updating the library references for the wrapper compilers.
* Fixes the CXX wrapper compiler and CXX library is renamed as needed.
* Fixes the Java wrapper compiler and the Java library is renamed as needed.
Update the script to auto-generate the entire AUTHORS file from two
sources:
1. The existing AUTHORS file
2. The output from "git log --format=tformat:=tformat:'%aN <%aE>'"
Merge these two together (which will preserve organization
affiliations) and warn in two cases:
1. If a person has no organization affiliation
1. If the same email address appears for more than one person
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>