openmpi/orte/runtime
Ralph Castain 8f496b01b7 Try automatically adding local spawn threads to parallelize the fork/exec process to speed up the launch on large SMPs. Harvest the threads after initial spawn to minimize any impact on running jobs.
Determine the number of spawn threads from the number of local procs in the first job being spawned. Someone can look at an optimization that handles subsequent dynamic spawns that might be larger in size.

Leave the threads running, but blocked, for the life of the daemon, and use them to harvest the local procs as they terminate. This helps short-lived jobs in particular.

Add MCA params to set:
  * max number of spawn threads (default: 4)
  * a specific number of spawn threads (default: -1, indicating no set number)
  * cutoff - minimum number of local procs before using spawn threads (default: 32)

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-11-29 19:54:00 -08:00
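
The three MCA params in the commit message above interact when the daemon decides how many spawn threads to start. The following is only a minimal sketch of one plausible policy, not the code in orte_wait.c: the struct, its field names, and the function pick_spawn_threads are hypothetical, and the scaling rule (one thread per `cutoff` local procs, capped at the maximum) is an assumption made for illustration. Only the defaults (4, -1, 32) come from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bundle of the three settings; these are NOT the
 * actual ORTE MCA variable names. */
typedef struct {
    int max_threads;  /* upper bound on spawn threads (default: 4) */
    int num_threads;  /* explicit count, or -1 for "no set number" */
    int cutoff;       /* min local procs before threads are used (default: 32) */
} spawn_params_t;

/* Assumed policy (illustrative only):
 *   1. an explicit thread count, if set, is used as given;
 *   2. below the cutoff, no spawn threads are used;
 *   3. otherwise use one thread per `cutoff` local procs,
 *      capped at max_threads. */
static int pick_spawn_threads(const spawn_params_t *p, int num_local_procs)
{
    assert(p != NULL);
    assert(p->cutoff > 0);

    if (p->num_threads >= 0) {
        return p->num_threads;
    }
    if (num_local_procs < p->cutoff) {
        return 0;
    }
    int n = num_local_procs / p->cutoff;
    return (n > p->max_threads) ? p->max_threads : n;
}
```

Under these assumptions and the default values, a job with 16 local procs would use no spawn threads, while a job with 64 local procs would use two.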
data_type_support Fix the backend mapper algorithm for comm_spawn. The front and back ends need to get the nodes into the job map in the same order so that the ranking algorithms will reach the same results 2017-06-08 08:00:52 -07:00
help-orte-runtime.txt Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
Makefile.am Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
orte_cr.c Several fixes related to session directories: 2016-09-05 07:48:44 +03:00
orte_cr.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
orte_data_server.c Cleanup some issues in connect/accept support across jobs started by different mpirun commands. Still not fully operational, but someone else will have to finish debugging it 2017-08-17 11:58:48 -07:00
orte_data_server.h Ensure that data from a job that was stored in ompi-server is purged once that job completes. Cleanup a few typos. Silence a Coverity warning 2017-05-30 09:43:01 -07:00
orte_finalize.c Also need to avoid calling destruct on the opal_process_info struct after finalize 2017-06-23 07:49:14 -07:00
orte_globals.c Bring the ofi/rml component online by completing the wireup protocol for the daemons. Cleanup the current confusion over how connection info gets created and 2017-07-20 21:01:57 -07:00
orte_globals.h Update the connect/accept support so we check to see if we have the proper infrastructure and RTE support, including whether we have ompi-server available if the connect/accept spans multiple applications. Print pretty help messages in all cases where we do not have support 2017-05-27 10:47:08 -07:00
orte_info_support.c Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
orte_info_support.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
orte_init.c Add verbose output to nidmap code for debugging as this is a new, and sometimes fragile, feature 2017-05-10 12:40:02 -07:00
orte_locks.c opal: rename opal_atomic_init to opal_atomic_lock_init 2017-08-07 14:15:11 -06:00
orte_locks.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
orte_mca_params.c Remove fortran support from platform file 2017-07-20 21:02:30 -07:00
orte_quit.c Update OPAL and ORTE for thread safety 2017-06-06 12:30:57 -07:00
orte_quit.h Refactor the ORTE DVM code so that external codes can submit multiple jobs using only a single connection to the HNP. 2016-02-13 08:10:44 -08:00
orte_wait.c Try automatically adding local spawn threads to parallelize the fork/exec process to speed up the launch on large SMPs. Harvest the threads after initial spawn to minimize any impact on running jobs. 2017-11-29 19:54:00 -08:00
orte_wait.h Try automatically adding local spawn threads to parallelize the fork/exec process to speed up the launch on large SMPs. Harvest the threads after initial spawn to minimize any impact on running jobs. 2017-11-29 19:54:00 -08:00
runtime_internals.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
runtime.h Purge whitespace from the repo 2015-06-23 20:59:57 -07:00