openmpi

Author	SHA1	Message	Date
Gilles Gouaillardet	f1f1fb15eb	pmix3x: configury: output major, minor and release version after checking them and hence fix the configure output (back-ported from upstream commit pmix/master@7b7cdda2de)	2016-10-08 13:01:28 +09:00
Gilles Gouaillardet	f3af799608	pmix3x: misc fixes to get pmix build on Solaris - replace MAXHOSTNAMELEN with hardcoded 1024. unlike Linux, Solaris #define MAXHOSTNAMELEN in <netdb.h>, so use a hard coded value to keep the test simple - stdout cannot be assigned on Solaris, so use freopen instead (back-ported from upstream commit pmix/master@a63f6e53f4)	2016-10-08 13:01:28 +09:00
Gilles Gouaillardet	5cbfddb8f1	pmix3x: fix misc memory leaks (back-ported from upstream commit pmix/master@1eff526929)	2016-10-08 13:01:28 +09:00
Gilles Gouaillardet	b4e4e4a5f1	pmix3x: enhance pmix_nspace_t destructor PMIX_RELEASE all elements stored in the internal and modex hash tables (back-ported from upstream commit pmix/master@b90674fc52)	2016-10-08 13:01:27 +09:00
Gilles Gouaillardet	f1dc033767	pmix3x: add the PMIX_HASH_TABLE_FOREACH macro this is a convenience macro similar to the PMIX_LIST_FOREACH macro, that can be used to iterate on all the key/value pairs of a pmix_hash_table_t (back-ported from upstream commit pmix/master@349971c68c)	2016-10-08 13:01:27 +09:00
Jeff Squyres	67684be7c9	usnic: fix one last stray fabric_attr->name --> linux_device_name Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-10-04 18:17:38 -07:00
Jeff Squyres	8b77359cac	usnic: remove some legacy libfabric 1.0/1.1 code We only support running with libfabric v1.3 or greater. So it's safe to remove the legacy/adaptive cq_readerr() behavior. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-10-03 11:59:41 -07:00
Jeff Squyres	345c07a252	usnic: require libfabric >= v1.3 at run time There are critical usnic libfabric AV insert bugs before v1.3, so don't allow any version prior to v1.3 at run time (still allow compiling with earlier versions, though, since the ABI guarantees allow us to compile with an earlier libfabric and run with a later libfabric). Switch to using fi_version() to check the version (instead of calling fi_getinfo()) as a potentially lighter-weight / simpler solution. This allows us to only call fi_getinfo() once. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-10-03 11:59:41 -07:00
Jeff Squyres	b13813810f	usnic: print a helpful message invoke PML error callback The previous message was unhelpful / confusing. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-10-03 11:59:41 -07:00
Gilles Gouaillardet	7601e783cc	pmix3x: sec/munge: add a missing include file (cherry picked from upstream pmix/master@f7cfb11f6b)	2016-10-03 16:09:10 +09:00
Ralph Castain	e773c17cf3	Put show_help thru the PMIx "log" API. This pushes the show_help output from apps into the pmix thread, thus avoiding conflicts in the RML thread, which should help with thread lock situations.	2016-10-02 16:02:23 -07:00
Jeff Squyres	545d8f2e66	usnic cagent: correctly compute the "large" ping message size The (effective) "+42" computation was, in fact, the incorrect answer in this case (gasp!). We should just take the max_msg_size from the command (which came from the libfabric endpoint max_msg_size attribute in the client) and subtract off the max header size: 68 (which is explained in the comment). This will result in a "large" message size which is likely slightly smaller than the MTU, but still right up near the MTU, and therefore good enough. Note: the old computation (i.e., -(68-42)) worked fine when we asked for Libfabric API v1.1 because the usnic provider would return a max_msg_size that was already less than the MTU due to FI_PREFIX behavior shenanigans. Once we started asking for Libfabric API v1.4, the usnic Libfabric provider started returning (MTU + prefix_size), and the -(68-42) computation started giving a value that was over the MTU. This caused sendto() on the connectivity checker UDP socket to fail. This commit also removes an old/misleading comment. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-09-30 17:01:05 -07:00
Joshua Hursey	f6f24a4f67	build: Custom libmpi(_FOO) name option in configure * Add a configure time option to rename libmpi(_FOO).* - `--with-libmpi-name=STRING` * This commit only impacts the installed libraries. Internal, temporary libraries have not been renamed to limit the scope of the patch to only what is needed. For example: ```shell shell$ ./configure --with-libmpi-name=wookie ... shell$ find . -name "libmpi" shell$ find . -name "libwookie" ./lib/libwookie.so.0.0.0 ./lib/libwookie.so.0 ./lib/libwookie.so ./lib/libwookie.la ./lib/libwookie_mpifh.so.0.0.0 ./lib/libwookie_mpifh.so.0 ./lib/libwookie_mpifh.so ./lib/libwookie_mpifh.la ./lib/libwookie_usempi.so.0.0.0 ./lib/libwookie_usempi.so.0 ./lib/libwookie_usempi.so ./lib/libwookie_usempi.la shell$ ```	2016-09-29 21:47:24 -05:00
Gilles Gouaillardet	871ade9231	pmix/{cray,s1,s2}: make pmi_opcaddy_t class static these three pmix components use the same class name, declare it as static so Open MPI can be built with --disable-dlopen Thanks Limin Gu for the report	2016-09-28 09:18:36 +09:00
Jeff Squyres	1a5a5fb400	Merge pull request #1861 from bharatpotnuri/master btl/openib: Disqualify rdmacm CPC if MPI_THREAD_MULTIPLE	2016-09-27 13:03:35 -04:00
Potnuri Bharat Teja	740b636dbe	btl/openib: Disqualify rdmacm CPC if MPI_THREAD_MULTIPLE The rdmacm CPC in the openib BTL is not thread safe. The rdmacm CPC should disqualify itself (instead of failing in random ways) if MPI_THREAD_MULTIPLE is the thread level. Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>	2016-09-27 14:20:59 +05:30
Gilles Gouaillardet	1fbc9a5431	pmix3x: dstore/pmix: flock portability Using the fcntl-locking instead of the flock (back-ported from upstream pmix/master@3030a0cca1)	2016-09-27 13:21:03 +09:00
George Bosilca	066370202d	Support non-monotonic assembly timers. If monotonic support has been required by the runtime and the assembly timers are unable to provide it, fall back to clock_gettime.	2016-09-23 21:51:34 -04:00
George Bosilca	45dcf1f5d7	Always use the best timer available If we have a better timer than clock_gettime, use it, even if it is an assembly timer.	2016-09-23 19:32:58 -04:00
George Bosilca	93fa94f96f	Re-enable support for local addresses. This patch is based on the "RFC: Reenabling the TCP BTL over local interfaces (when specifically requested)". It removes the hardcoded exception for the local devices that has been enforced by the TCP BTL. Instead, we exclude the local interface only via the exclude MCA (both IPv4 and IPv6 local addresses are already in the default if_exclude), which is also the behavior currently described in our README file.	2016-09-23 13:04:33 -04:00
Gilles Gouaillardet	362a5886de	pmix3x: client: fix PMIx_Finalize() sequence pmix_progress_thread_finalize() invokes libevent event_base_free, so all libevent stuff cannot be used after. Hence, pmix_client_globals.myserver must be PMIX_DESTRUCT'ed before invoking pmix_progress_thread_finalize()	2016-09-24 00:01:23 +09:00
Gilles Gouaillardet	5479c6cca7	pmix3x: add missing #include and get Open MPI build on OpenBSD 6.0	2016-09-23 11:23:18 +09:00
Gilles Gouaillardet	eaee1332e1	opal/util/ethtool: add missing headers and get Open MPI build on OpenBSD 6.0	2016-09-23 11:22:19 +09:00
Ralph Castain	a14ec3bdbc	Mucho thanks to Gilles - his patch to reorder the CPPFLAGS solves the problem of inadvertently picking up hwloc and libevent headers from locations in CPPFLAGS while continuing to build the embedded versions. Also silence a minor warning about an uninitialized var.	2016-09-22 07:39:22 -07:00
George Bosilca	131fe42db8	Fix MT wait-sync. Prevent a race condition between a thread checking count and then going in cond_wait, and another thread setting the count to 0 and signaling the condition. Thanks to Pascal Deveze for catching the bug and for the initial patch.	2016-09-21 07:42:48 -04:00
Gilles Gouaillardet	fbf03299c3	Merge pull request #2079 from ggouaillardet/topic/pmix_configury_dlopen pmix3x: configury: correctly handle --disable-dlopen	2016-09-21 10:59:33 +09:00
Gilles Gouaillardet	6c1e25b76e	pmix/ext11: fix pmix1_value_unload() prototype and call an unused "key" argument was added to pmix1_value_unload(), and pmix1_value_unload() was sometimes invoked with two arguments instead of three. since the "key" argument is unused, simply remove it from the subroutine prototype and calls.	2016-09-20 14:34:41 +09:00
Gilles Gouaillardet	e6f7facd7d	opal/util: improve error message in opal_os_dirpath_create()	2016-09-18 17:10:47 +09:00
Gilles Gouaillardet	4b47daeeb0	opal/util: improve return status of opal_os_dirpath_create()	2016-09-18 12:32:42 +09:00
George Bosilca	295eec7059	Small fix for persistent receives. A minor optimization, a few typos and extra comments	2016-09-16 10:27:32 -04:00
Nathan Hjelm	2edc77b27b	asm/ppc: work around apparent PGI 16.9 bug The add_64, sub_64, and cmpset_64 atomics used "+m" (*addr) to indicate the asm also writes the memory location. This is better than using a memory clobber. PGI 16.9 introduced a bug that causes a compiler failure on the "+m" constraint (input/output). It seems to work with "=m" (output) which matches the 32-bit atomics. Fixes open-mpi/ompi#2086 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-09-15 12:43:31 -06:00
Gilles Gouaillardet	041a431966	pmix3x: configury: correctly handle --disable-dlopen the LT_* macros do overwrite the enable_dlopen variable, so it must be tested and saved before invoking LT_INIT. delay the invocation of the LT_* macros and use the PMIX_ENABLE_DLOPEN_SUPPORT variable to figure out whether --disable-dlopen was invoked	2016-09-15 13:26:20 +09:00
Nathan Hjelm	4c9e38e8e0	Merge pull request #2077 from hjelmn/tcp_fix btl/tcp: fix double list remove	2016-09-13 12:21:52 -06:00
Nathan Hjelm	a681837ba8	btl/tcp: fix double list remove This commit fixes an abort during finalize because pending events were removed from the list twice. References #2030 Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-09-13 09:23:12 -06:00
Gilles Gouaillardet	628c730196	pkgconfig: define the pkgincludedir variable in *.pc files this has been made necessary with open-mpi/ompi@12e796dcaf Refs open-mpi/ompi#2069	2016-09-13 09:50:14 +09:00
Artem Polyakov	9eba1b0b75	Merge pull request #2042 from artpol84/pmix_sdirs Several fixes related to session directories:	2016-09-07 14:15:47 +07:00
Gilles Gouaillardet	cd2b5a82ed	hwloc: plug memory leak as reported by Coverity with CID 1270441	2016-09-07 10:08:44 +09:00
Gilles Gouaillardet	44a66e208c	threads: fix WAIT_SYNC_INIT with a zero count WAIT_SYNC_INIT(sync,0); WAIT_SYNC_RELEASE(sync); hung because sync->signaled was initialised to true, and there is no reason to invoke WAIT_SYNC_SIGNALED(sync) before WAIT_SYNC_RELEASE(sync) this commit initializes sync->signaled to true unless the count is zero. Thanks George for the review and guidance.	2016-09-07 10:03:40 +09:00
Nathan Hjelm	27a2509fec	Merge pull request #2051 from hjelmn/ppc_asm opal/asm: updates to powerpc assembly	2016-09-06 15:13:28 -06:00
Jeff Squyres	527efec4fb	Merge pull request #2050 from jsquyres/pr/btl-tcp-help-messages Add a show_help message to TCP BTL when peer unexpectedly disconnects	2016-09-06 09:40:31 -04:00
Jeff Squyres	1953e3406f	btl/tcp: add show_help message when peer hangs up We commonly see messages on the users list where a peer has hung up because it has crashed. Instead of having just a BTL_ERROR message, make this a real opal_show_help() message that tells the user that the peer unexpectedly hung up, and they should look into why that peer hung up. Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-09-06 09:40:03 -04:00
Gilles Gouaillardet	894be7860a	gcc_builtin/atomic: Silence numerous warnings from Studio compilers This commit adds selective use of a compiler-specific pragma to silence the numerous warnings the Sun/Oracle/Studio compilers emit for the GNU-style inline asm used in atomic.h. Thanks Paul Hargrove for the initial patch and the guidance.	2016-09-06 09:07:16 +09:00
Gilles Gouaillardet	4b208e4463	btl/tcp: make mca_btl_tcp_proc_insert re-entrant otherwise bad things happen with --mca btl_tcp_progress_thread 1 (non default) and --mca mpi_add_procs_cutoff 0 (default)	2016-09-05 15:57:34 +09:00
Artem Polyakov	dc0ab674de	Add PMIx key to provide RM with ability to indicate that it will cleanup session directories provided through OPAL_PMIX_TMPDIR, OPAL_PMIX_NSDIR, OPAL_PMIX_PROCDIR	2016-09-05 07:48:44 +03:00
Nathan Hjelm	a36bdfe69f	opal/asm: updates to powerpc assembly This commit contains the following changes: - There is a bug in the PGI 16.x betas for ppc64 that causes them to emit the incorrect instruction for loading 64-bit operands. If not cast to void * the operands are loaded with lwz (load word and zero) instead of ld. This does not affect optimized mode. The work around is to cast to void * and was implemented similar to a work-around for a xlc bug. - Actually implement 64-bit add/sub. These functions were missing and fell back to the less efficient compare-and-swap implementations. Thanks to @PHHargrove for helping to track this down. With this update the GCC inline assembly works as expected with pgi and ppc64. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-09-02 23:47:47 -06:00
Jeff Squyres	95c6f6cfc0	btl/tcp: fix help message It looks like one help message was accidentally pasted in the middle of another. Disentangle the two messages from each other, and slightly tweak the one message to say that the job may also crash (in addition to hanging). Signed-off-by: Jeff Squyres <jsquyres@cisco.com>	2016-09-02 17:14:22 -04:00
Nathan Hjelm	f93c1f2106	btl/ugni: fix erroneous warning message This commit prevents the connection code from trying to connect an endpoint if the directed datagram has been posted but not received. Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>	2016-09-02 09:17:44 -06:00
Ralph Castain	34f04a7924	Remove spurious Makefile.am line	2016-09-01 15:31:09 -07:00
Ralph Castain	0ea1cff733	Implement notification of completion on comm_spawn'd child jobs. Add a configure flag to enable PMIx 3's shared memory datastore, and set it to disabled by default so that comm_spawn functions again. Will reverse the default once that feature is fully functional	2016-09-01 13:10:10 -07:00
rhc54	39d086e000	Merge pull request #2035 from rhc54/topic/memprofile Provide a mechanism for obtaining memory profiles of daemons and application profiles for use in studying our memory footprint	2016-08-31 14:06:48 -05:00
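
Commit 1fbc9a5431 in the list above switches the dstore from flock to fcntl-based locking for portability. As a rough illustration only, whole-file locking through fcntl() on a POSIX system might look like the sketch below. The helper names whole_file_lock and demo_lock_cycle are hypothetical and are not taken from the PMIx sources; the actual change in that commit may be structured differently.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not from PMIx): apply an fcntl() record lock that
 * covers the whole file. type is F_RDLCK, F_WRLCK, or F_UNLCK.
 * F_SETLKW blocks until the lock can be granted. Returns 0 on success. */
static int whole_file_lock(int fd, short type)
{
    struct flock fl = {0};

    fl.l_type = type;
    fl.l_whence = SEEK_SET;  /* offsets are relative to the file start */
    fl.l_start = 0;
    fl.l_len = 0;            /* a length of 0 means "through end of file" */

    return fcntl(fd, F_SETLKW, &fl) == -1 ? -1 : 0;
}

/* Hypothetical demo: open (or create) path, take a write lock, release it.
 * Returns 0 if every step succeeded, -1 otherwise. */
int demo_lock_cycle(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0) {
        return -1;
    }

    int rc = whole_file_lock(fd, F_WRLCK);
    if (rc == 0) {
        /* the critical section would go here */
        rc = whole_file_lock(fd, F_UNLCK);
    }

    close(fd);
    return rc;
}
```

Unlike flock, fcntl locks are owned by the process rather than by the open file description, so closing any descriptor for the file drops the process's locks on it. Code ported from one call to the other should be checked against that difference.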

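Commit 131fe42db8 in the list above describes a race between one thread checking a count before entering cond_wait and another thread setting the count to 0 and signaling. The sketch below shows the general pthreads pattern that avoids this kind of lost wakeup: the check and the wait happen under the same mutex that the completing thread holds when it changes the count. This is not the Open MPI wait-sync implementation; demo_sync_t and its functions are hypothetical names used for illustration.

```c
#include <pthread.h>

/* Hypothetical synchronization object (not Open MPI's wait-sync type). */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             count;   /* number of completions still outstanding */
} demo_sync_t;

void demo_sync_init(demo_sync_t *s, int count)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->cond, NULL);
    s->count = count;
}

/* Block until count reaches zero. The count is tested while holding the
 * mutex, and pthread_cond_wait() releases the mutex atomically as it starts
 * waiting, so a completion cannot slip in between the test and the wait. */
void demo_sync_wait(demo_sync_t *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->count > 0) {
        pthread_cond_wait(&s->cond, &s->lock);
    }
    pthread_mutex_unlock(&s->lock);
}

/* Record one completion and wake the waiter when none remain. The count is
 * changed under the same mutex the waiter uses for its test. */
void demo_sync_complete(demo_sync_t *s)
{
    pthread_mutex_lock(&s->lock);
    if (--s->count == 0) {
        pthread_cond_signal(&s->cond);
    }
    pthread_mutex_unlock(&s->lock);
}
```

With a zero initial count, demo_sync_wait returns immediately without ever sleeping, which is the case commit 44a66e208c addresses for WAIT_SYNC_INIT.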
4540 Commits