Jeff Squyres
4341639a66
Revert "configury: fix (again) XRC detection on OFED < 3.12"
...
@ggouaillardet is likely offline for the weekend, but master is broken
on RHEL 6.5 systems that do not have MOFED installed. So I'm taking
the liberty of reverting this commit; I'm guessing Gilles will fix it up
and re-commit next week.
This reverts commit 77f8282d51.
2015-07-10 06:45:33 -07:00
Gilles Gouaillardet
77f8282d51
configury: fix (again) XRC detection on OFED < 3.12
...
Since ibv_create_xrc_rcv_qp is now deprecated, and in order to
be "future-proof", we have to consider the case in which only XRC Domains are supported.
Thanks to Paul Hargrove for the detailed report.
2015-07-10 15:31:45 +09:00
Rolf vandeVaart
ae0f3cfee7
Make an explicit call to initialize MCA parameters in common CUDA code. This allows us to view them with ompi_info and possibly modify them with the tools interface
2015-07-09 12:51:55 -04:00
Rolf vandeVaart
cdffa4724d
Force smcuda BTL to use CUDA IPC path for all GPU buffers where possible
2015-07-08 17:11:25 -04:00
Ralph Castain
ed93154e43
Fix hetero operations. An error in the hwloc utilities only allocated memory for the first display of a binding map, and then assumed that all nodes had the same number of cores in them. This resulted in memory corruption whenever someone displayed a binding pattern for a hetero cluster, and a smaller node was first in line.
2015-07-07 12:52:16 -07:00
Gilles Gouaillardet
9f171de412
btl/openib: queue pending fragments once only when running out of credit
...
Fixes open-mpi/ompi#640
2015-07-06 09:45:01 +09:00
bosilca
77367ca02c
Merge pull request #687 from rolfv/pr/fix-smcuda-perfprob
...
Add the ability to use different size buffers for host and CUDA buffers
2015-07-02 18:42:41 -04:00
Jeff Squyres
4e7d979f8d
Merge pull request #686 from jsquyres/pr/autogen-no-ompi-bool-fixes
...
bool: use SIZEOF__BOOL, not SIZEOF_BOOL
2015-07-02 12:19:07 -04:00
Rolf vandeVaart
30a872b478
Add the ability to send host buffers through staging buffers of one size and CUDA buffers through staging buffers of a different size. Fixes performance issues
2015-07-02 11:11:15 -04:00
Jeff Squyres
f1353947ff
libfabric: fix wrappers for static builds
...
Need to set the WRAPPER_EXTRA flags so that the wrappers for static
builds pull in -lfabric.
Also update/fix some comments.
2015-07-02 07:58:16 -07:00
Jeff Squyres
cd5751c217
bool: use SIZEOF__BOOL, not SIZEOF_BOOL
...
When you "autogen.pl --no-ompi", the AC_SIZEOF(bool) test is not run.
But we *do* run AC_SIZEOF(_Bool), which is the equivalent. So switch
the uses of SIZEOF_BOOL in the code base to be SIZEOF__BOOL, and it's
all good.
2015-07-02 07:32:02 -07:00
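The equivalence that the SIZEOF__BOOL commit relies on can be checked at compile time. The following is a minimal sketch, not code from the commit: bool_size is a hypothetical helper, and SIZEOF__BOOL is defined by hand here as a stand-in for the value that configure would normally generate.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: stand-in for the configure-generated macro. In a real build
 * the value comes from the configure-time size check, not from this file. */
#ifndef SIZEOF__BOOL
#define SIZEOF__BOOL sizeof(_Bool)
#endif

/* With stdbool.h, bool and _Bool name the same type, so their sizes match. */
_Static_assert(sizeof(bool) == sizeof(_Bool), "bool and _Bool differ in size");

/* Hypothetical helper: reports the size of bool via the _Bool-based macro. */
size_t bool_size(void)
{
    return (size_t) SIZEOF__BOOL;
}
```

Because the check is a static assertion, a platform where the two sizes differed would fail at compile time rather than at run time.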
Ralph Castain
861fe1d9dd
This is the third time I am fixing this - I have no idea who is resetting it or why.
2015-07-02 08:39:48 -05:00
Alina Sklarevich
27797654db
openib btl: added a new vendor_part_id for Mellanox ConnectX4-LX.
2015-06-29 13:50:43 +03:00
Ralph Castain
75ceec663a
Now that it has been officially released, update the embedded HWLOC to 1.11.0
2015-06-28 14:07:45 -07:00
bureddy
c78b8e9b8e
Merge pull request #664 from bureddy/master
...
powerpc: update mem barrier instructions
2015-06-25 14:09:49 -07:00
Jeff Squyres
a172bd161e
usnic: switch to use the new libfabric common library
...
The usnic BTL configure.m4 no longer needs to OPAL_CHECK_LIBFABRIC; it
just uses the results from opal/mca/common/libfabric's configure.m4.
The usnic BTL also no longer needs to link against libfabric directly; it
just links against the opal_common_libfabric library.
2015-06-25 13:33:15 -07:00
Ralph Castain
8d128fe090
Remove the non-null attributes from the cmd_line parser as this isn't something we can guarantee, and the optimization isn't worth the potential for error
2015-06-25 13:26:20 -07:00
Ralph Castain
ea0e21bb06
Add a common/libfabric component to the opal layer where we can place common functions
2015-06-25 11:04:00 -07:00
Nathan Hjelm
ee36d813dc
Merge pull request #657 from hjelmn/c99
...
more c99 updates
2015-06-25 11:21:09 -06:00
Howard Pritchard
f45914db9b
Merge pull request #670 from hppritcha/topic/ownership_update
...
ownership: update ownership files
2015-06-25 11:02:45 -06:00
Nathan Hjelm
4d92c9989e
more c99 updates
...
This commit does two things. It removes checks for C99 required
headers (stdlib.h, string.h, signal.h, etc). Additionally it removes
definitions for required C99 types (intptr_t, int64_t, int32_t, etc).
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-06-25 10:14:13 -06:00
rhc54
1a767ed47c
Merge pull request #654 from rhc54/topic/config
...
Remove internal bool type definitions
2015-06-25 09:10:21 -07:00
Howard Pritchard
e49a37c034
ownership: update ownership files
...
per discussions at OMPI devel workshop
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-06-25 10:04:42 -06:00
Devendar Bureddy
ed406b05cb
powerpc: update mem barrier instructions
...
- added isync interface.
- define opal_atomic_wmb() to lwsync, as it is recommended over eieio
on cache enabled storage
(http://www.ibm.com/developerworks/systems/articles/powerpc.html).
2015-06-25 10:54:44 +03:00
Nathan Hjelm
4552afff06
Fix definition of MPI_T_pvar_get_index
...
The definition of MPI_T_pvar_get_index was incorrect. This commit
fixes the definition and adds a missing return code.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-06-24 17:31:26 -06:00
Ralph Castain
869041f770
Purge whitespace from the repo
2015-06-23 20:59:57 -07:00
Ralph Castain
a809902c0a
Now that we require C99, and stdbool.h is part of C99, we no longer need to define our own bool types. Since bool is commonly used in a lot of places, just include stdbool.h in opal_config_bottom.h
2015-06-23 11:31:48 -07:00
Ralph Castain
cc9b416ab3
Ensure we properly commit suicide if/when we lose connection to the daemon. There are multiple paths by which a lost daemon can be reported, and so a race condition exists in the pmix support. Our MPI layer wants the ability to determine the response to the failure, and so it will call down to the RTE with any abort request. This comes down to the pmix layer as a "pmix_abort" command, which involves communicating the request to the daemon - who is gone. Sadly, the pmix component may not know that just yet, and so we hang.
...
So add a brief timer event to kick us out of the communication. The precise amount of time we should wait is somewhat TBD, but set something short for now and we can adjust.
2015-06-18 09:45:52 -07:00
Jeff Squyres
8ab2b11f88
btl_openib.c: fix another compiler warning
...
Remove this unused variable
2015-06-17 09:00:12 -07:00
Jeff Squyres
f688289aaf
btl_openib.c: fix compiler warning
...
This return code is not used; tell the compiler we're not going to
use it.
2015-06-17 08:56:56 -07:00
Jeff Squyres
097b48d521
mca_base_component_repository.c: fix compiler warning
...
This function is only used in the DL case -- it can be #if'ed out if
we're not compiling with DL support to avoid a compiler warning about
defined-but-not-used.
2015-06-17 08:54:59 -07:00
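The defined-but-not-used fix described in that commit follows a common pattern, sketched here under stated assumptions. HAVE_DL_SUPPORT, lookup_dynamic, and find_component are hypothetical names chosen for illustration; they are not the identifiers used in the Open MPI source.

```c
#include <stddef.h>

/* Assumption: HAVE_DL_SUPPORT is a hypothetical macro standing in for
 * whatever symbol the build defines when DL support is compiled in. */
#ifndef HAVE_DL_SUPPORT
#define HAVE_DL_SUPPORT 0
#endif

#if HAVE_DL_SUPPORT
/* Compiled only when its sole caller is compiled too, so the compiler
 * never sees a static function that is defined but not used. */
static int lookup_dynamic(const char *name)
{
    return (NULL != name) ? 0 : -1;
}
#endif

/* Hypothetical entry point: returns 0 on success, -1 otherwise. */
int find_component(const char *name)
{
#if HAVE_DL_SUPPORT
    return lookup_dynamic(name);
#else
    (void) name;   /* no dynamic lookup in this configuration */
    return -1;
#endif
}
```

Guarding the helper with the same condition as its call site keeps the two from drifting apart when the configuration changes.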
Jeff Squyres
dfa36197ea
usnic/Makefile.am: ensure static builds include -lfabric
2015-06-17 08:15:29 -07:00
Gilles Gouaillardet
2cef2d0fe6
opal/memory: silence a warning
...
as reported by Coverity with CID 71663
2015-06-17 11:17:55 +09:00
Gilles Gouaillardet
58d1b3f4d0
opal_os_dirpath_create: fix TOCTOU
...
as reported by Coverity with CID 70396
2015-06-17 11:17:54 +09:00
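A time-of-check-to-time-of-use (TOCTOU) race in directory creation typically comes from calling stat() first and mkdir() second, since another process can create or remove the path between the two calls. The sketch below shows one common way to avoid that window; it is an illustration only and is not taken from the commit, and create_dir_no_race is a hypothetical name.

```c
#include <errno.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: create a directory without a check-then-act window.
 * Returns 0 if the directory exists when the call finishes, -1 otherwise. */
int create_dir_no_race(const char *path, mode_t mode)
{
    /* Attempt the creation first instead of testing for existence. */
    if (0 == mkdir(path, mode)) {
        return 0;
    }

    /* Something already exists at this path: accept it only if it is
     * a directory. */
    if (EEXIST == errno) {
        struct stat st;
        if (0 == stat(path, &st) && S_ISDIR(st.st_mode)) {
            return 0;
        }
    }
    return -1;
}
```

The key design choice is that mkdir() itself is the existence test, so there is no separate check whose result can go stale.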
Gilles Gouaillardet
de66447ebb
opal_cmd_line_get_usage_msg: silence warning
...
as reported by Coverity with CID 1269967
2015-06-17 11:17:54 +09:00
Gilles Gouaillardet
f2f66e6e63
opal_daemon_init: silence warning
...
as reported by Coverity with CID 710642
2015-06-17 11:17:53 +09:00
Gilles Gouaillardet
8427e87ee9
opal_argv_delete: silence warning
...
as reported by Coverity with CID 71914
2015-06-17 11:17:53 +09:00
Gilles Gouaillardet
d9c490cf9f
refactor opal_bitmap_get_string
...
make it more efficient and fix CID 71992 (dead code)
2015-06-17 11:17:53 +09:00
Jeff Squyres
44e7646de9
usnic/configure.m4: convert to use external libfabric
...
Use the new OPAL_CHECK_LIBFABRIC macro.
2015-06-15 15:17:06 -07:00
Jeff Squyres
3e1b85ceb3
libfabric: remove embedded libfabric
...
OMPI now only builds against external libfabric installations.
2015-06-15 15:17:05 -07:00
Jeff Squyres
c74ab51dd4
opal/mca/dl/dl.h: fix the #ifndef/#define name
...
Thanks to Scott Atchley for noticing the name mismatch.
2015-06-15 13:08:57 -07:00
rhc54
adbff46a13
Merge pull request #642 from rhc54/topic/hwloc
...
Update hwloc to 1.11.0
2015-06-13 12:09:58 -07:00
Ralph Castain
ff92781ec4
Replace hwloc191 with hwloc1110
...
Fix hwloc compile. Ignore LAMA mapper due to deprecated hwloc functions
2015-06-13 10:11:45 -07:00
Jeff Squyres
4384131e65
openib: minor style and defensive programming fixes
...
Minor comment/whitespace fixes. Also some minor logic changes that
are mainly for defensive programming purposes (i.e., ensure that
malloc_hook_set is always set to true or false, and then check it
before we try to actually invoke the hook).
2015-06-12 20:11:47 -07:00
Jeff Squyres
2f137ff151
openib: reset memalign threshold properly
...
Now that open-mpi/ompi#638 is fixed, reset the openib BTL memalign
threshold properly.
This effectively re-instates commit open-mpi/ompi@ce915b5757.
2015-06-12 20:11:47 -07:00
Jeff Squyres
88c13adc8c
openib: only set the memory hook if it is enabled
...
Instead of unconditionally setting the memory hook, only set it when
the memory hooks are both available and have been enabled (e.g.,
opal/mca/memory/linux has decided that it *can* be enabled, and when
the mpi_leave_pinned MCA param is set to 1, or is set to -1 and some
component requested the memory hooks be enabled).
If we set the memory hook when memory hooks are not enabled,
__malloc_hook will be NULL, which will cause problems when
btl_openib_malloc_hook() tries to invoke it.
Fixes open-mpi/ompi#638.
2015-06-12 20:11:47 -07:00
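The guard described in that commit (record whether a hook was installed, and check that flag before invoking it) can be sketched as follows. This is a simplified illustration under stated assumptions, not the openib BTL code: alloc_hook_fn, install_previous_hook, hook_is_set, and guarded_alloc are hypothetical names, and a plain function pointer stands in for the real memory hook.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical hook type standing in for the real memory hook. */
typedef void *(*alloc_hook_fn)(size_t);

static alloc_hook_fn previous_hook = NULL;
static bool hook_set = false;

/* Record the hook only when hooks are enabled and one actually exists;
 * the flag is always assigned, never left in a stale state. */
void install_previous_hook(alloc_hook_fn hook, bool hooks_enabled)
{
    if (hooks_enabled && NULL != hook) {
        previous_hook = hook;
        hook_set = true;
    } else {
        previous_hook = NULL;
        hook_set = false;
    }
}

bool hook_is_set(void)
{
    return hook_set;
}

/* Invoke the saved hook only if it was installed; otherwise fall back
 * to malloc() instead of calling through a NULL pointer. */
void *guarded_alloc(size_t size)
{
    if (hook_set) {
        return previous_hook(size);
    }
    return malloc(size);
}
```

For example, calling install_previous_hook(NULL, true) leaves the flag false, so guarded_alloc() falls back to malloc() rather than dereferencing a NULL hook.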
Ralph Castain
12d3c9ca22
Revert "Fix a typo that incorrectly set the alignment threshold in the openib BTL."
...
This reverts commit ce915b5757.
2015-06-10 14:02:49 -07:00
Gilles Gouaillardet
8885b34637
mca/base: fix a misc memory leak
...
as reported by Coverity with CID 1294415
2015-06-10 15:10:57 +09:00
Gilles Gouaillardet
9e278a21ce
opal/crs: fix a string overflow
...
and revamp out-of-resource handling
Fixes a resource leak as reported by Coverity with CID 1304752
2015-06-10 14:23:25 +09:00
Nathan Hjelm
6772d32b85
opal/crs: silence clang warnings introduced by coverity fixes
...
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-06-08 09:16:13 -06:00