openmpi

Author	SHA1	Message	Date
Jeff Squyres	100b112d3c	pmix: fix zlib protection macro usage It's possible that we can have zlib.h but still not have zlib support. Use the correct macro to protect the usage of calling zlib functions. This fixes 32-bit MTT builds at Cisco (e.g., https://mtt.open-mpi.org/index.php?do_redir=2389). Submitted upstream to PMIX: https://github.com/pmix/master/pull/290 Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-02-07 05:52:32 -08:00
KAWASHIMA Takahiro	750406f67b	pmix/pmix2x: Correct configure option description `--enable-pmix-dstore` option was enabled by default in `f4a5511`. Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>	2017-02-07 11:52:56 +09:00
Gilles Gouaillardet	c62498ab3d	btl/tcp: remove reference to just removed tcp_local Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-02-07 09:32:09 +09:00
Jeff Squyres	368ab4d9a5	Merge pull request #2684 from bosilca/topic/tcp_fixes Remove the tcp_local field from the TCP component.	2017-02-06 16:32:06 -05:00
Carlos Bederián	ccea3de44c	amd64 timers: use lfence instead of cpuid for serialization Signed-off-by: Carlos Bederián <bc@famaf.unc.edu.ar>	2017-02-04 18:50:29 -03:00
Carlos Bederián	4009ba6b94	opal_progress: use usec native timer only when a native cycle counter isn't available Signed-off-by: Carlos Bederián <bc@famaf.unc.edu.ar>	2017-02-04 18:31:14 -03:00
bosilca	c331e6794c	Allow all tuned MCA parameters to be modified programmatically. (#2829) Fix a comment in the MCA header. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2017-01-31 21:47:36 -05:00
Ralph Castain	6cb484a3cb	Merge pull request #2887 from rhc54/topic/update Update to latest PMIx master	2017-01-31 11:05:37 -08:00
Jeff Squyres	45b791542c	Merge pull request #2809 from jjhursey/fix/ibm/opal-verbose opal/output: Make sure verbose gets updated when id 0 gets updated.	2017-01-31 12:18:38 -05:00
Ralph Castain	edcfdf2365	Update to latest PMIx master Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-31 08:01:37 -08:00
Gilles Gouaillardet	b078e57e73	pmix/ext1x: fix misc memory leaks in namespace registration Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-30 10:52:42 +09:00
Gilles Gouaillardet	f51fc293a2	ext1x/pmix1x_client: plug misc memory leaks Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-30 10:52:42 +09:00
Gilles Gouaillardet	022cca79ea	pmix/ext1x: plug a memory leak in opal_lkupcbfunc() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-30 10:52:36 +09:00
Gilles Gouaillardet	f485d12a82	pmix: rename the ext11 component to ext1x; also use the same naming scheme as pmix/ext2x Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-30 10:52:35 +09:00
Gilles Gouaillardet	dccb1899e6	pmix/ext11: correctly use PMIx_server_register_nspace() PMIx_server_register_nspace() is an asynchronous operation, so the pmix glue waits for it to complete before returning. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-30 09:23:19 +09:00
Gilles Gouaillardet	6955e1e25c	pmix/ext11: fix compilation the argc field from the opal_pmix_app_t struct was removed, so adjust the pmix/ext11 glue accordingly. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-30 09:23:18 +09:00
Howard Pritchard	fca45a2742	mca help: fix typo found by user Fix typo found by @pozdneev Fixes #2821 bot:notest Signed-off-by: Howard Pritchard <howardp@lanl.gov>	2017-01-28 09:37:43 -07:00
Ralph Castain	3302864a7d	Cleanup a typo that can cause a segfault - use a local variable name different than the one passed into the function Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-27 16:49:25 -08:00
Josh Hursey	2e64bf42fb	Merge pull request #2810 from jjhursey/fix/ibm/stdiag-to-stdout Extend options for stddiag routing	2017-01-26 14:29:16 -06:00
Josh Hursey	770c41f493	Merge pull request #2807 from jjhursey/fix/ibm/event-external libevent/external: Add opal_event_include to this component	2017-01-26 14:26:50 -06:00
Jeff Squyres	2c277a66fd	Merge pull request #2772 from jjhursey/topic/stacktrace-improv master: opal/stacktrace improvements	2017-01-26 10:48:41 -08:00
Joshua Hursey	6d98559be9	stacktrace: Add flexibility in stacktrace output - New MCA option: opal_stacktrace_output - Specifies where the stack trace output stream goes. - Accepts: none, stdout, stderr, file[:filename] - Default filename 'stacktrace' - Filename will be `stacktrace.PID`, or if VPID is available, then the filename will be `stacktrace.VPID.PID` - Update util/stacktrace to allow for different output avenues including files. Previously this was hardcoded to 'stderr'. - Since opal_backtrace_print needs to be signal safe, passing it a FILE object that actually represents a file stream is difficult. This is because we cannot open the file in the signal handler using `fopen` (not safe), but have to use `open` (safe). Additionally, we cannot use `fdopen` to convert the `int fd` to a `FILE fh` since it is also not signal safe. - I did not want to break the backtrace.h API so I introduced a new rule (documented in `backtrace.c`) that if the `FILE file` argument is `NULL` then look for the `opal_stacktrace_output_fileno` variable to tell you which file descriptor to use for output. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2017-01-26 11:55:32 -06:00
Joshua Hursey	f8918e37a9	opal/stacktrace: Raise the signal after processing - This prevents us from accidentally masking a signal that was meant to terminate the application. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2017-01-26 11:55:28 -06:00
Nathan Hjelm	fe1c6bd881	Merge pull request #2840 from hjelmn/event_fix verbs: remove extra event user increment/decrement operation	2017-01-26 07:30:24 -08:00
Gilles Gouaillardet	896434b1bd	pmix/ext2x: plug a memory leak in opal_lkupcbfunc() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-26 14:07:15 +09:00
Gilles Gouaillardet	6b8e1c217c	pmix/ext2x: plug misc memory leaks Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-26 14:06:58 +09:00
Nathan Hjelm	9f28c0af39	verbs: remove extra event user increment/decrement operation Since the oob and connections systems do not work the same way they did in older versions of Open MPI these operations are no longer necessary. At best they do nothing and at worst they hurt performance by making us enter the event library more often in opal_progress(). Fixes #2839 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2017-01-25 18:37:06 -07:00
Gilles Gouaillardet	d3a5065288	Merge pull request #2815 from ggouaillardet/topic/opal_tsd_keys_destruct opal/threads: protect opal_tsd_keys_destruct() to fix Java bindings.	2017-01-26 09:24:14 +09:00
Gilles Gouaillardet	142b95df87	pmix/ext2x: plug misc memory leaks regarding opal_pmix2x_event_chain_t handling Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-25 16:17:10 +09:00
Gilles Gouaillardet	7a3d39f079	pmix/ext2x: plug a memory leak in _reg_nspace() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-25 16:17:01 +09:00
Gilles Gouaillardet	e1811cfe17	opal/threads: protect opal_tsd_keys_destruct() to fix Java bindings. When Java bindings are used, MPI_Init() is not invoked by the main thread, and this causes some keys to be destructed twice. Reset the per-thread values to NULL in order to handle this correctly. Fixes open-mpi/ompi#2811 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-25 10:58:55 +09:00
Joshua Hursey	dcd9801f7c	orte/iof: Add orte_map_stddiag_to_stdout option * Similar to `orte_map_stddiag_to_stderr` except it redirects `stddiag` to `stdout` instead of `stderr`. * Add protection so that the user cannot supply both: - `orte_map_stddiag_to_stderr` - `orte_map_stddiag_to_stdout` Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2017-01-24 16:22:59 -06:00
Joshua Hursey	2596983593	opal/output: Make sure verbose gets updated when id 0 gets updated. - This allows the following MCA option to have an impact on the framework verbose output as well. * `-mca mca_base_verbose stdout` Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2017-01-24 16:14:11 -06:00
Joshua Hursey	d6b306d716	libevent/external: Add opal_event_include to this component * Adds a parameter to adjust the method used by libevent. - Matches that of the libevent2022 component. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2017-01-24 16:03:09 -06:00
Ralph Castain	ef86707fbe	Deprecate the --slot-list parameter in favor of --cpu-list. Remove the --cpu-set param (mark it as deprecated) and use --cpu-list instead as it was confusing having the two params. The --cpu-list param defines the cpus to be used by procs of this job, and the binding policy will be overlaid on top of it. Note: since the discovered cpus are filtered against this list, #slots will be set to the #cpus in the list if no slot values are given in a -host or -hostname specification. Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-24 13:33:22 -08:00
Josh Hursey	c6595c2289	Merge pull request #2792 from jjhursey/topic/libevent-conf2 libevent2022: Fix broken configure AC_LANG_PROGRAM	2017-01-24 08:31:46 -06:00
Gilles Gouaillardet	682f5116aa	Merge pull request #2781 from ggouaillardet/topic/misc_fixes_and_plugs fix misc bugs and plug misc memory leaks	2017-01-24 14:41:45 +09:00
Joshua Hursey	72ac812039	libevent2022: Fix broken configure AC_LANG_PROGRAM * Similar to commit `029964a748` This removes an extra `int main` during configure. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2017-01-23 21:47:59 -06:00
Gilles Gouaillardet	189da7fdab	pmix2x: plug a memory leak in _event_hdlr() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-24 09:13:30 +09:00
Gilles Gouaillardet	acbc32d3b2	pmix2x: plug a memory leak in opal_lkupcbfunc() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-24 09:13:29 +09:00
Gilles Gouaillardet	b5b21043c4	pmix2x: plug a memory leak in _reg_nspace() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-24 09:13:29 +09:00
Gilles Gouaillardet	0f47310a75	pmix2x/pmix2x_client: plug misc memory leaks Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-24 09:13:29 +09:00
Gilles Gouaillardet	1a6c17ec7d	opal/util: plug a memory leak by using opal_setenv() instead of putenv() Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-24 09:12:47 +09:00
Joshua Hursey	029964a748	libevent2022: Fix broken configure AC_LANG_PROGRAM * The AC_LANG_PROGRAM macro adds the `main()` so it is erroneous to add it to the test program. * This was detected with the XL compilers which will fail to build the program in this situation. The GNU compiler does not error out or warn, but successfully compiles the program. Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>	2017-01-23 13:44:12 -06:00
Ralph Castain	8c960bae8d	Update to latest PMIx master Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-23 07:07:40 -08:00
Ralph Castain	e8e5f81abd	Something not quite right about the revised allocation algos, so revert them while retaining the larger initial and threshold sizes Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-21 14:37:45 -08:00
Ralph Castain	be3ef77739	Improve packing efficiency by raising the initial buffer size and modifying the extension code. Flag if a job map has had its nodes added so we don't have to loop repeatedly to check it. Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-21 14:03:19 -08:00
Gilles Gouaillardet	dffaad9de2	opal/util: fix a race condition in opal_os_dirpath_create() Always check the permissions of the created directory, in case someone else created the very same directory but with incompatible permissions. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-19 14:02:47 +09:00
Ralph Castain	6da4dbbb33	Quick fix: save the errno from the mkdir call as the call to stat will likely overwrite it Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-18 15:42:31 -08:00
Ralph Castain	d1880d8ba1	Merge pull request #2755 from rhc54/topic/session Update and cleanup os_dirpath	2017-01-18 13:57:44 -08:00
Ralph Castain	b257c32d2c	Cleanup the os_dirpath logic so it doesn't error out if the directory actually gets created (regardless of what mkdir returns), and pretty-prints the error if it does error out. Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-18 12:05:47 -08:00
Gilles Gouaillardet	a3f21fb2aa	opal_os_dirpath_create: fix TOCTOU as reported by Coverity with CID 70396 (cherry picked from commit `58d1b3f4d0`)	2017-01-18 11:48:30 -08:00
George Bosilca	999d4973a9	Fix an issue with extremely large data identified by tjb900. Due to the conversion from ssize_t to int we were losing bytes, and ended up writing outside the receiver buffer. Similarly on the send, due to the conversion to a lesser type, we could misinterpret the end of the fragment.	2017-01-18 10:33:12 -05:00
Nathan Hjelm	91c34c8df6	Merge pull request #2703 from hjelmn/rcache_fix rcache/base: do not release vma structures in vma_tree_delete	2017-01-12 09:53:34 -07:00
Nysal Jan K A	16ca8c18c6	Merge pull request #2706 from nysal/ppc_atomic_master asm/ppc: Fix a regression in powerpc atomics	2017-01-12 19:43:33 +05:30
Jeff Squyres	938ab01ad6	Merge pull request #2714 from hjelmn/timer_rollover timer/linux: prevent 64-bit overflow	2017-01-12 06:40:52 -05:00
Nathan Hjelm	45c05880aa	timer/linux: prevent 64-bit overflow The linux timer code was multiplying the result of the x86 time stamp counter by 1000000 before dividing by the cpu frequency. This can cause us to overflow 64 bits if the time stamp counter grows larger than ~ 1.8e13 (about 8400 seconds after boot). To fix the issue the units of opal_timer_linux_freq have been changed to MHz. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2017-01-11 20:03:10 -07:00
Gilles Gouaillardet	aeee48357a	btl/sm: correctly handle nodes with zero NUMA hwloc object the hwloc topology might not contain a NUMA object with hwloc < v2 if the node is not NUMA, so force the NUMA object count to one in order to correctly allocate mca_btl_sm_component.sm_mpools. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-12 11:45:29 +09:00
George Bosilca	c2cd717f82	Don't refcount the predefined datatypes. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2017-01-11 16:48:59 -05:00
Nysal Jan K.A	97801028ba	asm/ppc: Fix a regression in powerpc atomics Add a missing constraint to the input operand list. This fixes a regression caused by d4be138a7b. Thanks to Orion Poplawski for reporting the issue. Refs #2610 Signed-off-by: Nysal Jan K.A <jnysal@in.ibm.com>	2017-01-11 11:00:11 -05:00
Ralph Castain	31a8476223	Merge pull request #2702 from rhc54/topic/cov Silence Coverity CID 1398541	2017-01-10 17:50:23 -08:00
Nathan Hjelm	79cabc92fd	rcache/base: do not release vma structures in vma_tree_delete This commit fixes a deadlock that can occur when the libc version holds a lock when calling munmap. In this case we could end up calling free() from vma_tree_delete which would in turn try to obtain the lock in libc. To avoid the issue put any deleted vma's in a new list on the vma module and release them on the next call to vma_tree_insert. This should be safe as this function is not called from the memory hooks. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2017-01-10 16:58:07 -07:00
Ralph Castain	e568b211e4	Silence Coverity CID 1398541 Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-10 15:30:50 -08:00
Jeff Squyres	b980e334dc	usnic: add completion stats This should probably not go to the v2.x branch, since it changes the output format of the usnic stats. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-01-10 12:06:54 -08:00
Jeff Squyres	706f53bb01	usnic: ensure that stats string is always truncated Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-01-10 12:06:54 -08:00
Jeff Squyres	1fdd0fe228	usnic: add missing params to show_help() call Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-01-10 12:06:54 -08:00
Jeff Squyres	7048adec04	usnic: add some assert()s Add some run-time assert checks for debug builds. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-01-10 12:06:32 -08:00
Jeff Squyres	2d28ccb5fd	usnic: add verbose output of queue lengths Show the actual RX/TX and CQ length returned by libfabric in verbose output. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-01-10 12:06:32 -08:00
Jeff Squyres	bd5b8ed754	usnic: ensure that queues are long enough Double check the queue lengths that we get back from libfabric to ensure that they are at least as long as we need. They should never be shorter than we need, but let's just check to be sure. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-01-10 12:06:32 -08:00
Jeff Squyres	53dc75a89c	usnic: ensure to reset flags on returned frags Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-01-10 12:06:31 -08:00
Jeff Squyres	c4d7876ca0	usnic: check send credits on data channel for data frags Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-01-10 12:06:31 -08:00
Jeff Squyres	879d25e5df	usnic: ensure to check send credits for ACKs Don't just blindly send ACKs; ensure that we have send credits before doing so. If we don't have any send credits, just don't send the ACK (it'll come again soon enough; it's not a tragedy if we don't send it now). Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-01-10 12:06:31 -08:00
Jeff Squyres	7787dad4db	usnic: ensure CQs are long enough The libfabric usnic provider may give you back TX/RX queues that are longer than you asked for. So just use the TX/RQ/CQ lengths that we asked for, regardless of what length comes back. Additionally, keep the length of the priority channel CQ separate from the length of the data CQ. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-01-10 12:03:53 -08:00
Jeff Squyres	b02d8c48f5	usnic: make the releasing safer Since the usnic BTL is single-threaded in this area, there really is no danger, but don't use one of the pointers hanging off the frag after we return it to the freelist. Instead, save the endpoint pointer before returning the frag. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-01-10 12:03:53 -08:00
Jeff Squyres	e25b860627	usnic: clarify types The types are technically typedef equivalent, but it's less confusing to use the types that agree with the name of the constructor. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-01-10 12:03:53 -08:00
Jeff Squyres	40fe575132	usnic: trivial updates (no code/logic changes) - Add more explanatory comments - Trivial whitespace / style updates - Rename opal_btl_usnic_force_retrans() -> opal_btl_usnic_fast_retrans() Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-01-10 10:40:02 -08:00
Nathan Hjelm	3593fad4d2	Merge pull request #2679 from hjelmn/cpuid_fix master: amd64: save/restore all 64 bits of rbx around cpuid	2017-01-10 09:12:28 -07:00
Gilles Gouaillardet	6d59b476de	Merge pull request #2686 from ggouaillardet/topic/pmix2x_ptl_base_sendrecv pmix2x: ptl/base: send header and message data together via writev()	2017-01-10 16:26:10 +09:00
Gilles Gouaillardet	44c1ff60f1	Merge pull request #2672 from ggouaillardet/topic/misc_memory_leaks Plug misc memory leaks	2017-01-10 13:16:04 +09:00
Gilles Gouaillardet	a01960bee5	pmix2x: ptl/base: send header and message data together via writev() On Linux, sending the header and then the message data as two separate writes severely degrades the performance of ptl/tcp: on the receiver, reading the data can often result in PMIX_ERR_RESOURCE_BUSY or PMIX_ERR_WOULD_BLOCK. This commit sends both header and message data at the same time via writev(), which makes ptl/tcp virtually as efficient as ptl/usock. A short writev generally occurs when the kernel buffer is full, so there is no point in retrying in that case. FWIW, no such degradation was observed on OS X. Refs open-mpi/ompi#2657 Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-10 13:07:39 +09:00
Nathan Hjelm	d6bd69dc93	mca/base: account for NULL string_value in verbose set The MCA variable code calls the string from value function with a NULL string to verify values. The verbosity enumerator was not correctly checking for a non-NULL value before trying to set the string. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2017-01-09 11:52:31 -07:00
Ralph Castain	67fce2861b	Merge pull request #2685 from rhc54/topic/cov Resolve Coverity issues	2017-01-07 13:11:40 -08:00
Ralph Castain	e25e69dc2f	Resolve Coverity issues Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-07 10:45:52 -08:00
George Bosilca	cfeeecd381	Remove the tcp_local field from the TCP component. Instead use the OPAL process name to get the name of the local process. Signed-off-by: George Bosilca <bosilca@icl.utk.edu>	2017-01-07 13:24:18 -05:00
Ralph Castain	822e2680ba	Cleanup some configure stuff for static builds - still can't get wrapper extra libs to be recognized Signed-off-by: Ralph Castain <rhc@open-mpi.org> pmix2x: minor configure updates Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2017-01-07 08:37:36 -08:00
Nathan Hjelm	5b70ae3ec0	amd64: save/restore all 64 bits of rbx around cpuid This commit fixes a bug in the timer check. When -fPIC is used we need to save/restore ebx. The code copied from patcher was meant for 32-bit systems and did not work correctly on 64-bit systems. This commit updates the save/restore to use rbx instead of ebx. Fixes #2678 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2017-01-06 18:54:20 -07:00
Gilles Gouaillardet	189d7b9480	opal/dss: revamp opal_value_unload() to keep valgrind happy reorder tests to avoid valgrind complaining about uninitialized variables Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 17:10:39 +09:00
Ralph Castain	444f5fa35d	Raise the priority of the usock component so it gets preferentially picked Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-05 22:53:04 -08:00
Gilles Gouaillardet	c2ddb1e2fc	mca/base: plug a memory leak register mca_base_var_enum_value_flag_t so they can be free'd upon finalize Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 13:46:36 +09:00
Gilles Gouaillardet	6d5cb9fe0d	event: plug a leak when closing the event framework Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 13:46:35 +09:00
Gilles Gouaillardet	b3a2bdda7b	opal/threads: manually invoke thread-specific key destructors on the main thread. There is no such thing as pthread_join(main_thread), so key destructors are never invoked on the main thread, which causes valgrind to report some memory leaks. Manually store and then invoke the key destructors to make valgrind happy. Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 13:46:35 +09:00
Gilles Gouaillardet	6ef281e163	pmix/base: fix misc memory leaks Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 13:46:35 +09:00
Gilles Gouaillardet	a59dfd7b14	sec/munge: plug a memory leak Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 13:46:35 +09:00
Gilles Gouaillardet	c612499bc1	opal: mca/base: fix a memory leak in the mca_base_var_enum_flag_t destructor Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 11:35:59 +09:00
Gilles Gouaillardet	7e5da7382e	btl/tcp: plug leaks when closing component remove tcp_local from the tcp_procs table, and release it Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 11:35:59 +09:00
Gilles Gouaillardet	507623d6b1	mpool/hugepage: plug a memory leak on finalize Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 11:35:58 +09:00
Gilles Gouaillardet	51021028d6	mpool/base: plug a memory leak on finalize Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>	2017-01-06 11:35:58 +09:00
Ralph Castain	6509f60929	Complete the memprobe support. This provides a new scaling tool called "mpi_memprobe" that samples the memory footprint of the local daemon and the client procs, and then reports the results. The output contains the footprint of the daemon on each node, plus the average footprint of the client procs on that node. Samples are taken after MPI_Init, and then again after MPI_Barrier. This allows the user to see memory consumption caused by add_procs, as well as any modex contribution from forming connections if pmix_base_async_modex is given. Using the probe simply involves executing it via mpirun, with however many copies you want per node. Example: $ mpirun -npernode 2 ./mpi_memprobe Sampling memory usage after MPI_Init Data for node rhc001 Daemon: 12.483398 Client: 6.514648 Data for node rhc002 Daemon: 11.865234 Client: 4.643555 Sampling memory usage after MPI_Barrier Data for node rhc001 Daemon: 12.520508 Client: 6.576660 Data for node rhc002 Daemon: 11.879883 Client: 4.703125 Note that the client value on node rhc001 is larger - this is where rank=0 is housed, and apparently it gets a larger footprint for some reason. Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-05 10:32:17 -08:00
Ralph Castain	91d714fe93	Add flags to direct PMIx to only use one listener, but without directing which one (tcp or usock) to use. This allows the user to set PMIX_MCA_ptl in their environment to select the transport method. Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-04 09:16:44 -08:00
Ralph Castain	f355fb926d	Continue cleanup of notifications. Resolve a race condition that can result in attempt to send a message on a closed socket Signed-off-by: Ralph Castain <rhc@open-mpi.org>	2017-01-04 09:16:33 -08:00
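
The overflow described in commit 45c05880aa (timer/linux: prevent 64-bit overflow) can be illustrated with a short sketch. Python is used here only for brevity (the Open MPI code itself is C), and because Python integers do not overflow, the 64-bit wraparound is emulated with a mask. The function names are hypothetical and are not taken from the Open MPI sources, and a 2.2 GHz clock is assumed for the example values.

```python
U64_MASK = (1 << 64) - 1  # emulate unsigned 64-bit wraparound


def usec_scaled_first(tsc: int, freq_hz: int) -> int:
    """Old ordering per the commit message: multiply by 1000000, then divide."""
    return ((tsc * 1_000_000) & U64_MASK) // freq_hz


def usec_mhz(tsc: int, freq_mhz: int) -> int:
    """Ordering after the fix as described: frequency in MHz, no multiply."""
    return tsc // freq_mhz


# Largest counter value whose product with 1000000 still fits in 64 bits
# (about 1.8e13, as stated in the commit message).
OVERFLOW_THRESHOLD = (1 << 64) // 1_000_000

if __name__ == "__main__":
    one_second = 2_200_000_000          # ticks in one second at 2.2 GHz (assumed)
    late = 2 * 10**13                   # counter value past the threshold
    print(usec_scaled_first(one_second, 2_200_000_000), usec_mhz(one_second, 2_200))
    print(usec_scaled_first(late, 2_200_000_000), usec_mhz(late, 2_200))
```

At the assumed 2.2 GHz, the threshold is reached roughly 8400 seconds after boot, which is consistent with the figure quoted in the commit message.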
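
Commits dffaad9de2 and b257c32d2c describe a pattern for creating a directory when another process may create it at the same time: attempt the mkdir, do not treat "already exists" as fatal, and always verify the permissions of whatever directory ends up there. The following is a minimal Python sketch of that pattern, not the actual `opal_os_dirpath_create()` code; the function name `create_dir_checked` is hypothetical.

```python
import os
import stat


def create_dir_checked(path: str, mode: int) -> None:
    """Create `path`, tolerating a concurrent creator, then verify the result."""
    try:
        os.mkdir(path, mode)
    except FileExistsError:
        # Someone else created it first; fall through and check what exists.
        pass
    st = os.stat(path)
    if not stat.S_ISDIR(st.st_mode):
        raise NotADirectoryError(path)
    # Always check permissions, even when our own mkdir succeeded.
    if (st.st_mode & mode) != mode:
        raise PermissionError(f"{path} lacks required permission bits {oct(mode)}")
```

Note that the process umask can strip bits from the requested mode, so in this sketch a request such as `0o777` under a typical umask would be reported as a permission mismatch.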
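
Commit a01960bee5 replaces two separate sends (header, then message data) with a single `writev()` call. A minimal Python sketch of the idea follows, using `os.writev` on a Unix socket pair. The 4-byte length header and the names `send_framed` and `recv_exact` are assumptions made for illustration; they do not reflect the actual PMIx wire format or API.

```python
import os
import socket
import struct


def send_framed(sock: socket.socket, payload: bytes) -> int:
    """Send a length header and the payload with one gathered write.

    Returns the number of bytes written, which may be short if the
    kernel buffer is full; the caller decides how to handle that.
    """
    header = struct.pack("!I", len(payload))  # hypothetical 4-byte header
    return os.writev(sock.fileno(), [header, payload])


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, or fewer if the peer closes early."""
    chunks = []
    while n > 0:
        chunk = sock.recv(n)
        if not chunk:
            break
        chunks.append(chunk)
        n -= len(chunk)
    return b"".join(chunks)
```

A caller would pair the two helpers on either end of a connected stream socket, reading the 4-byte header first and then the number of payload bytes it announces.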

4639 Commits