Samples are taken after MPI_Init, and then again after MPI_Barrier. This allows the user to see memory consumption caused by add_procs, as well as any modex contribution from forming connections if pmix_base_async_modex is given.
Using the probe simply involves executing it via mpirun, with however many copies you want per node. Example:
$ mpirun -npernode 2 ./mpi_memprobe
Sampling memory usage after MPI_Init
Data for node rhc001
Daemon: 12.483398
Client: 6.514648
Data for node rhc002
Daemon: 11.865234
Client: 4.643555
Sampling memory usage after MPI_Barrier
Data for node rhc001
Daemon: 12.520508
Client: 6.576660
Data for node rhc002
Daemon: 11.879883
Client: 4.703125
Note that the client value on node rhc001 is larger - this is where rank=0 is housed, and apparently it gets a larger footprint for some reason.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Revamp the event notification integration to rely on the PMIx event chaining and remove the duplicate chaining in OPAL. This ensures we get system-level events that target non-default handlers.
Restore the hostname entries for MPI-level error messages, but provide an MCA param (orte_hostname_cutoff) to remove them for large clusters where the memory footprint is problematic. Set the default at 1000 nodes in the job (not the allocation).
Begin first cut at memory profiler
Some minor cleanups of memprobe
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Plug a minor memory leak. Tell the PMIx server not to create a dstore memory region for the daemon job as there is nobody to share it with.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Protect users of hwloc membind functions
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Update PMIx to include NULL string protection
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Update to PMIx master to include key overwrite protection
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
There are only five places in the non-daemon code paths where opal_hwloc_topology is currently referenced:
* shared memory BTLs (sm, smcuda). I have added a code path to those components that uses the location string
instead of the topology itself, if available, thus avoiding instantiating the topology
* openib BTL. This uses the distance matrix. At present, I haven't developed a method
for replacing that reference. Thus, this component will instantiate the topology
* usnic BTL. Uses the distance matrix.
* treematch TOPO component. Does some complex tree-based algorithm, so it will instantiate
the topology
* ess base functions. If a process is direct launched and not bound at launch, this
code attempts to bind it. Thus, procs in this scenario will instantiate the
topology
Note that instantiating the topology on complex chips such as KNL can consume
megabytes of memory.
Fix pernode binding policy
Properly handle the unbound case
Correct pointer usage
Do not free static error messages!
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Looks like we missed one place where we needed to swap OMPI_UNIQUE for
OMPI_FLAGS_UNIQUE. Thanks to Phil Tooley (@Telemin) for reporting the
issue.
Fixes#2635.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
MPI_T_pvar_get_index was returning an incorrect index. The index
was never set correctly while registering the performance variables.
Additionally fix a missing case in the mca_base_var_type_t to MPI
datatype conversion. This type is currently used for control variables
registered by mxm, fca and hcoll components.
Signed-off-by: Nysal Jan K.A <jnysal@in.ibm.com>
- Fix capitolization typos
- Make comment more correct / flow better
- Use AM_CPPFLAGS, not DEFAULT_INCLUDES
- Remove extra "hwloc/" from external hwloc.h specification
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
This is news to me: I didn't know that some distros do not set $HOME.
So use "~" instead, and only try to grep ~/.rpmmacros if it exists.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
--with-libltdl is now added (via AC_ARG_WITH) in
opal/mca/dl/libltdl/configure.m4 -- it no longer belongs up here in
this top-level m4 file. Plus, the help string in this stale entry is
also stale/incorrect.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Fixed the case were only part of the nodes in the allocation
are used by the applicaton proccesses.
Force PMIx nodemap key to only contain nodes that are actually
used by the application proccesses.
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
- simply #include "hwloc.h" to use the external hwloc header
- do use the external hwloc header instead of opal/mca/hwloc/hwloc.h
Thanks Orion Poplawski for the report
Fixesopen-mpi/ompi#2616
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>