openmpi/orte/mca
Artem Polyakov 55ac3b0be3 orte/schizo: fix binding detection in slurm component
in SLURM 16.05 the SLURM_CPU_BIND_TYPE is equal to "mask_cpu:"
instead of "mask_cpu". Account for that.
2016-08-26 09:55:52 +03:00
common configury: clean up .so version numbers 2015-12-18 12:50:23 -05:00
dfs mca/base: add priority output to mca_base_select 2015-10-19 12:32:41 -06:00
errmgr Ensure we properly convert pmix status to ORTE state before activating an error state upon notification. Cleanup some conversion issues on notification info. Add a new orte_notify.c test program 2016-08-12 21:14:29 -07:00
ess ess/singleton: push all PMIX_* environment variables, regardless how many there are 2016-08-23 09:46:55 +09:00
filem Shorten the session directory name as some OS's are now providing unusually long temp directory names, causing us to overflow the sockaddr field 2016-07-05 14:59:50 -07:00
grpcomm Extend the schizo framework to allow definition of CLI options by environment. Refactor orterun to mesh with the orted_submit code, thus improving code reuse. Eliminate the orte-submit tool as orterun can now meet that need. 2016-05-01 11:30:25 -07:00
iof Remove stale map-reduce support 2016-06-12 07:41:57 -07:00
notifier ORTE: update for the new opal_progress_thread API 2015-08-07 10:13:40 -07:00
odls Cleanup the forced termination a bit by restoring the delay before issuing the sigkill, and eliminating the large time loss spent checking if the proc died. The latter is responsible for a large number of test timeouts in MTT 2016-06-02 17:48:21 -07:00
oob Update the session dir structure. Restore the creation of a top-level dir based on userid so that everything is contained under the user's top-level dir. Make the next level down (the "job family" level) be either the pid (indicated by a name of "pid.N") or the job family if not launched by mpirun. This allows for proper rendezvous by direct-launched procs. 2016-08-15 22:46:46 -05:00
plm rsh: robustify the check for plm_rsh_agent default value 2016-08-16 06:58:20 -05:00
ras Enable simulation of large-scale clusters by allowing multiple daemons/node. Specifying the ras_base_multiplier parameter to be greater than 1 will cause ORTE to replicate each allocated node by that factor. A daemon will be spawned for each replica, thus letting ORTE function as if it were on a much larger cluster. 2016-05-29 18:56:18 -07:00
rmaps Merge pull request #1995 from rhc54/topic/pe-per-rank 2016-08-25 14:38:12 -05:00
rml Add a timeout cmd line option and an option to report state info upon timeout to assist with debugging Jenkins tests 2016-05-28 08:36:25 -07:00
routed Release child object when we are recording someone's relatives. 2016-02-15 20:50:42 -08:00
rtc Remove OPAL_HAVE_HWLOC qualifier and error out if --without-hwloc is given 2015-09-04 16:54:40 -07:00
schizo orte/schizo: fix binding detection in slurm component 2016-08-26 09:55:52 +03:00
snapc mca/base: add priority output to mca_base_select 2015-10-19 12:32:41 -06:00
sstore mca/base: add priority output to mca_base_select 2015-10-19 12:32:41 -06:00
state Add a new --continuous flag to mpirun that directs ORTE to let a job continue running as app procs terminate. Don't attempt to restart them. Add event notification of abnormally terminating procs, and demonstrate that in the mpi_spin test program. 2016-07-13 15:28:33 -07:00
Makefile.am Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
mca.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00