openmpi

Author	SHA1	Message	Date
Joshua Ladd	3e23380bba	Merge pull request #2675 from artpol84/orte/state/exit_1_fix orte/odls: Fix ORTE state machine for the non-zero exit case	2017-01-09 12:32:37 -05:00
Ralph Castain	67fce2861b	Merge pull request #2685 from rhc54/topic/cov Resolve Coverity issues	2017-01-07 13:11:40 -08:00
Ralph Castain	84ce7eed2a	Merge pull request #2683 from rhc54/topic/nits Cleanup some configure stuff for static builds	2017-01-07 11:33:20 -08:00
Ralph Castain	e25e69dc2f	Resolve Coverity issues Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-07 10:45:52 -08:00
Ralph Castain	822e2680ba	Cleanup some configure stuff for static builds - still can't get wrapper extra libs to be recognized Signed-off-by: Ralph Castain <rhc@open-mpi.org> pmix2x: minor configure updates Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-01-07 08:37:36 -08:00
Nathan Hjelm	5b70ae3ec0	amd64: save/restore all 64 bits of rbx around cpuid This commit fixes a bug in the timer check. When -fPIC is used we need to save/restore ebx. The code copied from patcher was meant for 32-bit systems and did not work correctly on 64-bit systems. This commit updates the save/restore to use rbx instead of ebx. Fixes #2678 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2017-01-06 18:54:20 -07:00
Joshua Ladd	dc7d2f5b6a	Merge pull request #2571 from alex-mikheev/topic/sshmem_prio_fix oshmem: sshmem: make mmap allocator a default instead of verbs	2017-01-06 17:39:37 -05:00
Joshua Ladd	7fc9f9bbac	Merge pull request #2620 from karasevb/fix_rmaps_mindist rmaps/mindist: fix pmix errors	2017-01-06 17:26:48 -05:00
Ralph Castain	ca16f3f9ed	Merge pull request #2676 from rhc54/topic/alps Minor cleanups to eliminate warnings	2017-01-06 12:43:42 -08:00
Ralph Castain	684e69695f	Minor cleanups to eliminate warnings Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-06 08:44:10 -08:00
Ralph Castain	39d880f65d	Merge pull request #2673 from rhc54/topic/usock Raise the priority of the usock component so it gets preferentially picked	2017-01-06 08:24:35 -08:00
Artem Polyakov	3eb6c98542	orte/odls: Fix ORTE state machine for the non-zero exit case This commit fixes a rare race condition that occurs when a process calling `exit(-1)` has a delay between fd cleanup and the actual OS-level exit. This may happen if the process has work to do in `on_exit()` callbacks. Problem description: Consider an application process that has called `exit(nonzero)`: its fds were closed, but its actual termination at the OS level is delayed by some cleanups (e.g. in callbacks registered via `on_exit()`). The observed sequence of events was the following: * orted gets the stdio disconnection and activates the `IOF COMPLETE` state. * a parallel OOB disconnection causes the `COMMUNICATION FAILURE` state to be activated. * during `COMMUNICATION FAILURE` processing, `odls_base_default_wait_local_proc` is called even though the real waitpid wasn't yet called (the code mentions that waitpid might not be called for an unspecified reason). Because of that, the real exit code is unknown and set to 0. The `odls_base_default_wait_local_proc` callback sees the `IOF COMPLETE` flag and, in conjunction with the 0 exit code, activates the `WAITPID FIRED` state. * processing of `WAITPID FIRED` leads to `NORMALLY TERMINATED` being activated. * the `NORMALLY TERMINATED` state in particular causes the `ORTE_PROC_FLAG_ALIVE` flag for this proc to be dropped. * when the application process finally exits, `wait_signal_callback` is launched. It sets the real exit code and calls `odls_base_default_wait_local_proc` again, but this time, since the process has the `ORTE_PROC_FLAG_ALIVE` flag dropped, the `WAITPID FIRED` state is activated (instead of `EXITED WITH NON-ZERO`), leading to the hang that was observed. Signed-off-by: Artem Polyakov <artpol84@gmail.com>	2017-01-06 11:12:55 +02:00
Gilles Gouaillardet	189d7b9480	opal/dss: revamp opal_value_unload() to keep valgrind happy reorder tests to avoid valgrind complaining about uninitialized variables Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 17:10:39 +09:00
Ralph Castain	444f5fa35d	Raise the priority of the usock component so it gets preferentially picked Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-05 22:53:04 -08:00
Gilles Gouaillardet	a1a0e324b3	util/hostfile: plug a memory leak Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 15:38:45 +09:00
Gilles Gouaillardet	6b9343a966	plm/rsh: plug a memory leak Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 15:38:45 +09:00
Gilles Gouaillardet	8ba92d7516	iof/base: plug a memory leak in orte_iof_base_close() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 15:38:45 +09:00
Gilles Gouaillardet	e396b17a7f	orte/orted: plug a memory leak Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 15:38:45 +09:00
Gilles Gouaillardet	6b90b03c28	orted/pmix: plug a memory leak in pmix_server_fencenb_fn() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 15:38:45 +09:00
Gilles Gouaillardet	7fe6840232	state/hnp: plug a memory leak Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 15:38:45 +09:00
Gilles Gouaillardet	4d58b8dcae	ess/pmi: plug a memory leak Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 15:38:45 +09:00
Gilles Gouaillardet	c0c5dd8ccc	orte: plug a memory leak in orte_rml.recv_cancel do not invoke orte_rml.recv_cancel after the orte progress thread has gone Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 15:38:44 +09:00
Gilles Gouaillardet	17fac4bfd1	grpcomm/base: get rid of the seq_num field of the orte_grpcomm_signature_t struct Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 15:38:44 +09:00
Gilles Gouaillardet	fe25f50871	grpcomm/base: plug a memory leak on finalize manually allocate sequence numbers to be stored into the orte_grpcomm_base.sig_table hash table, and manually release them on orte_grpcomm_base_close() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 15:38:44 +09:00
Gilles Gouaillardet	2189c5bcc3	ompi/dpm: plug a memory leak in disconnect_waitall() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 15:38:44 +09:00
Gilles Gouaillardet	a988ad24eb	orte/runtime: plug a leak in orte_finalize() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 15:38:44 +09:00
Gilles Gouaillardet	c2ddb1e2fc	mca/base: plug a memory leak register mca_base_var_enum_value_flag_t so they can be free'd upon finalize Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 13:46:36 +09:00
Gilles Gouaillardet	cf534d0c95	ompi/proc: plug a memory leak in ompi_proc_finalize() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 13:46:35 +09:00
Gilles Gouaillardet	6d5cb9fe0d	event: plug a leak when closing the event framework Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 13:46:35 +09:00
Gilles Gouaillardet	b3a2bdda7b	opal/threads: manually invoke thread-specific key destructors on the main thread. There is no such thing as pthread_join(main_thread), so key destructors are never invoked on the main thread, which causes valgrind to report some memory leaks. Manually store and then invoke the key destructors to make valgrind happy. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 13:46:35 +09:00
Gilles Gouaillardet	6ef281e163	pmix/base: fix misc memory leaks Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 13:46:35 +09:00
Gilles Gouaillardet	0ee5d56ab1	grpcomm/direct: plug a memory leak in barrier_release() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 13:46:35 +09:00
Gilles Gouaillardet	a59dfd7b14	sec/munge: plug a memory leak Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 13:46:35 +09:00
Gilles Gouaillardet	f2d6584189	grpcomm/base: plug misc memory leaks - add a destructor to orte_grpcomm_caddy_t in order to plug a memory leak - plug a memory leak in barrier_release() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 13:46:21 +09:00
Gilles Gouaillardet	c4a47ae9a9	orte/orted: plug misc memory leaks Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 11:35:59 +09:00
Gilles Gouaillardet	88535b6200	orte/util: revamp orte_attr_unload() to keep valgrind happy reorder tests to avoid valgrind complaining about uninitialized variables Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 11:35:59 +09:00
Gilles Gouaillardet	c612499bc1	opal: mca/base: fix a memory leak in the mca_base_var_enum_flag_t destructor Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 11:35:59 +09:00
Gilles Gouaillardet	58f2a764f9	ess/hnp: plug memory leaks Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 11:35:59 +09:00
Gilles Gouaillardet	24c61b0625	oob/tcp: plug a memory leak in mca_oob_tcp_component_lost_connection() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 11:35:59 +09:00
Gilles Gouaillardet	7e5da7382e	btl/tcp: plug leaks when closing component remove tcp_local from the tcp_procs table, and release it Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 11:35:59 +09:00
Gilles Gouaillardet	c7d9e62d47	rml/base: plug a memory leak add a destructor to orte_rml_send_request_t in order to plug a memory leak Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 11:35:59 +09:00
Gilles Gouaillardet	507623d6b1	mpool/hugepage: plug a memory leak on finalize Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 11:35:58 +09:00
Gilles Gouaillardet	51021028d6	mpool/base: plug a memory leak on finalize Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 11:35:58 +09:00
Gilles Gouaillardet	1daa80d78f	mtl/psm2: plug a memory leak in ompi_mtl_psm2_component_open() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 09:28:32 +09:00
Ralph Castain	b343df43a1	Merge pull request #2669 from rhc54/topic/memprobe Complete the memprobe support.	2017-01-05 12:02:56 -08:00
Ralph Castain	6509f60929	Complete the memprobe support. This provides a new scaling tool called "mpi_memprobe" that samples the memory footprint of the local daemon and the client procs, and then reports the results. The output contains the footprint of the daemon on each node, plus the average footprint of the client procs on that node. Samples are taken after MPI_Init, and then again after MPI_Barrier. This allows the user to see memory consumption caused by add_procs, as well as any modex contribution from forming connections if pmix_base_async_modex is given. Using the probe simply involves executing it via mpirun, with however many copies you want per node. Example: $ mpirun -npernode 2 ./mpi_memprobe Sampling memory usage after MPI_Init Data for node rhc001 Daemon: 12.483398 Client: 6.514648 Data for node rhc002 Daemon: 11.865234 Client: 4.643555 Sampling memory usage after MPI_Barrier Data for node rhc001 Daemon: 12.520508 Client: 6.576660 Data for node rhc002 Daemon: 11.879883 Client: 4.703125 Note that the client value on node rhc001 is larger - this is where rank=0 is housed, and apparently it gets a larger footprint for some reason. Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-05 10:32:17 -08:00
Ralph Castain	b4088c331a	Merge pull request #2662 from rhc54/topic/stuff Variety of cleanups	2017-01-04 10:25:26 -08:00
Ralph Castain	91d714fe93	Add flags to direct PMIx to only use one listener, but without directing which one (tcp or usock) to use. This allows the user to set PMIX_MCA_ptl in their environment to select the transport method. Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-04 09:16:44 -08:00
Ralph Castain	f355fb926d	Continue cleanup of notifications. Resolve a race condition that can result in attempt to send a message on a closed socket Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-04 09:16:33 -08:00
Joshua Ladd	57c0c847d0	Merge pull request #2603 from xinzhao3/topic/revert-ucx-mt Revert "PML/SPML/UCX: add UCX MT support to PML and SPML."	2017-01-04 11:50:37 -05:00
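
The `mpi_memprobe` output quoted in commit 6509f60929 can be post-processed to compare the two sampling points. The following is a minimal sketch, not part of Open MPI: the function names `parse_memprobe` and `client_growth` are hypothetical, the line layout of `SAMPLE` is an assumption (the commit message above is flattened onto one line), and the numbers are copied from that message as-is.

```python
import re

# Sample output copied from the commit message for 6509f60929.
# The line breaks and indentation are an assumption; the parser below
# only relies on the keywords and tolerates any whitespace between them.
SAMPLE = """\
Sampling memory usage after MPI_Init
Data for node rhc001
    Daemon: 12.483398
    Client: 6.514648
Data for node rhc002
    Daemon: 11.865234
    Client: 4.643555
Sampling memory usage after MPI_Barrier
Data for node rhc001
    Daemon: 12.520508
    Client: 6.576660
Data for node rhc002
    Daemon: 11.879883
    Client: 4.703125
"""

PHASE_RE = re.compile(r"Sampling memory usage after (\S+)")
NODE_RE = re.compile(
    r"Data for node (\S+)\s+Daemon:\s*([\d.]+)\s+Client:\s*([\d.]+)"
)


def parse_memprobe(text):
    """Return {phase: {node: (daemon_footprint, client_footprint)}}."""
    results = {}
    # re.split with one capture group yields:
    # [prefix, phase1, body1, phase2, body2, ...]
    parts = PHASE_RE.split(text)
    for phase, body in zip(parts[1::2], parts[2::2]):
        results[phase] = {
            node: (float(daemon), float(client))
            for node, daemon, client in NODE_RE.findall(body)
        }
    return results


def client_growth(results, before="MPI_Init", after="MPI_Barrier"):
    """Per-node change in the average client footprint between two phases."""
    return {
        node: results[after][node][1] - results[before][node][1]
        for node in results[before]
    }


if __name__ == "__main__":
    parsed = parse_memprobe(SAMPLE)
    for node, delta in client_growth(parsed).items():
        print(f"{node}: client footprint changed by {delta:.6f}")
```

Keeping the phase name as the outer key mirrors the way the tool reports one block per sampling point, so additional sampling points would need no change to the parser.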


26409 Commits