openmpi/orte/mca
Howard Pritchard 39367ca0bf plm/alps: only use srun for Native SLURM
Turns out that the way the SLURM plm works
is not compatible with the way MPI processes
on Cray XC obtain RDMA credentials to use
the high speed network.  Unlike with ALPS,
the mpirun process is on the first compute
node in the job.  With the current PLM launch
system, mpirun (HNP daemon) launches the MPI
ranks on that node rather than relying on
srun.

This will probably require a significant amount
of effort to rework to support Native SLURM
on Cray XC's.  As a short term alternative,
have the alps plm (which gets selected by default
again on Cray systems regardless of the launch system)
check whether or not srun or alps is being used on the
system.  If alps is not being used, print a helpful
message for the user and abort the job launch.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-12-22 11:03:42 -08:00
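
The short-term check described in the commit message above can be sketched roughly as follows. This is only an illustration, not the code from the commit: the names wlm_kind_t, classify_wlm and alps_plm_precheck, the "ALPS"/"SLURM" strings, and the wording of the message are all hypothetical. How the active launch system is actually detected on a Cray system is not shown here, so the sketch takes its name as a parameter.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical classification of the launch system; not an Open MPI type. */
typedef enum { WLM_ALPS, WLM_SLURM, WLM_UNKNOWN } wlm_kind_t;

/* Hypothetical helper: map a workload-manager name to a kind. */
static wlm_kind_t classify_wlm(const char *name)
{
    if (name == NULL) {
        return WLM_UNKNOWN;
    }
    if (strcmp(name, "ALPS") == 0) {
        return WLM_ALPS;
    }
    if (strcmp(name, "SLURM") == 0) {
        return WLM_SLURM;
    }
    return WLM_UNKNOWN;
}

/*
 * Hypothetical pre-launch check for the alps plm.
 * Returns 0 if the launch may proceed (ALPS is in use) and -1 if the
 * job launch should be aborted, after writing a hint for the user to err.
 */
static int alps_plm_precheck(const char *active_wlm, FILE *err)
{
    if (classify_wlm(active_wlm) == WLM_ALPS) {
        return 0;
    }
    /* Illustrative wording only; the real message may differ. */
    fprintf(err,
            "mpirun cannot launch this job because ALPS is not the active "
            "launch system (detected: %s).\n"
            "With Native SLURM, launch the application directly with srun.\n",
            active_wlm != NULL ? active_wlm : "unknown");
    return -1;
}
```

A caller would run alps_plm_precheck() before any launch work and stop on a non-zero return. Treating anything other than ALPS as a reason to abort follows the commit message ("If alps is not being used ... abort the job launch").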
..
common configury: clean up .so version numbers 2015-12-18 12:50:23 -05:00
dfs mca/base: add priority output to mca_base_select 2015-10-19 12:32:41 -06:00
errmgr mca/base: add priority output to mca_base_select 2015-10-19 12:32:41 -06:00
ess Do not override any external settings for PMIx component selection 2015-12-21 08:36:12 -08:00
filem mca/base: add priority output to mca_base_select 2015-10-19 12:32:41 -06:00
grpcomm If an executable isn't found, it's possible for the state machine to hit the grpcomm with a zero-node map before we actually terminate with error. Silence the annoying malloc warning about zero-byte requests. 2015-11-11 14:24:13 -08:00
iof mca/base: add priority output to mca_base_select 2015-10-19 12:32:41 -06:00
notifier ORTE: update for the new opal_progress_thread API 2015-08-07 10:13:40 -07:00
odls Fix some messages in the process. 2015-11-09 18:03:26 -05:00
oob oob_tcp: fix peer->state wrong check 2015-10-29 16:43:58 +01:00
plm plm/alps: only use srun for Native SLURM 2015-12-22 11:03:42 -08:00
qos Cleanup warnings in opal and orte layers when building optimized on Mac 2015-12-17 07:51:24 -08:00
ras Cleanup warnings in opal and orte layers when building optimized on Mac 2015-12-17 07:51:24 -08:00
rmaps Fix the default slot mapping in rank file mapper 2015-12-21 09:47:27 -08:00
rml Cleanup warnings in opal and orte layers when building optimized on Mac 2015-12-17 07:51:24 -08:00
routed mca/base: add priority output to mca_base_select 2015-10-19 12:32:41 -06:00
rtc Remove OPAL_HAVE_HWLOC qualifier and error out if --without-hwloc is given 2015-09-04 16:54:40 -07:00
schizo Do not override any external settings for PMIx component selection 2015-12-21 08:36:12 -08:00
snapc mca/base: add priority output to mca_base_select 2015-10-19 12:32:41 -06:00
sstore mca/base: add priority output to mca_base_select 2015-10-19 12:32:41 -06:00
state Work on cleaning up memory leaks that are causing orte-dvm to eventually run out of memory. Still don't have everything plugged, but getting better. Sync to the PMIx master that includes removal of the pmix_common.h.in file that really didn't need to be generated, and update to the PMIx_server_init API. 2015-11-06 14:15:30 -08:00
Makefile.am Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
mca.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00