1
1
Граф коммитов

4121 Коммитов

Автор SHA1 Сообщение Дата
rhc54
99d3c283f5 Merge pull request #1681 from rhc54/topic/pmixupdate
Update PMIx 114 to current release candidate
2016-05-19 13:50:16 -07:00
Ralph Castain
6f743f81b6 Update PMIx 114 to current release candidate 2016-05-19 12:55:05 -07:00
Jeff Squyres
87233aae49 ethtool: better handle portability
Be sure to handle the case where we don't have ethtool support at all.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-05-19 10:57:14 -07:00
Gilles Gouaillardet
fd93d236b1 opal/util/ethtool: fix compilation on older Linux when struct ethtool_cmd has no speed_hi field
Refs: open-mpi/ompi#1628
2016-05-19 11:58:04 +09:00
Jeff Squyres
66f53ec29a Merge pull request #1628 from kmroz/wip-btl-tcp-ethtool-speed
btl/tcp: autodetect bandwidth and latency if unset by the user
2016-05-18 12:12:55 -04:00
Nathan Hjelm
9371a6a52d Merge pull request #1673 from hjelmn/fix_rcache_deadlock
rcache: fix deadlock in multi-threaded environments
2016-05-18 08:32:21 -07:00
Karol Mroz
ca6ddf3270 btl/tcp: autodetect bandwidth and latency if unset
Fixes open-mpi/ompi#120

Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-05-18 16:25:52 +02:00
Karol Mroz
b9c6c43c6b btl/tcp: add default defines for bandwidth and latency
Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-05-18 16:25:52 +02:00
Karol Mroz
31e33a64f9 opal/util: add function to obtain interface speed
If kernel ethtool_cmd_speed() is not available, use copies if possible.

Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-05-18 16:25:51 +02:00
Nathan Hjelm
ab8ed177f5 rcache: fix deadlock in multi-threaded environments
This commit fixes several bugs in the registration cache code:

 - Fix a programming error in the grdma invalidation function that can
   cause an infinite loop if more than 100 registrations are
   associated with a munmapped region. This happens because the
   mca_rcache_base_vma_find_all function returns the same 100
   registrations on each call. This has been fixed by adding an
   iterate function to the vma tree interface.

 - Always obtain the vma lock when needed. This is required because
   there may be other threads in the system even if
   opal_using_threads() is false. Additionally, since it is safe to do
   so (the vma lock is recursive) the vma interface has been made
   thread safe.

 - Avoid calling free() while holding a lock. This avoids race
   conditions with locks held outside the Open MPI code.

Fixes open-mpi/ompi#1654.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-17 09:02:40 -06:00
Nathan Hjelm
f6938868bd Merge pull request #1659 from hjelmn/sync_64
sync_builtin: check for 64-bit atomic support
2016-05-17 05:40:04 -07:00
rhc54
8b534e9897 Merge pull request #1668 from rhc54/topic/slurm
When direct launching applications, we must allow the MPI layer to pr…
2016-05-16 12:23:19 -07:00
Howard Pritchard
1a676e5b35 pmix/cray: fix some breakage
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2016-05-16 12:45:05 -05:00
Gilles Gouaillardet
4e21933a74 memory/patcher: declare __curbrk as extern in order not to generate an (unitialized) common symbol 2016-05-16 09:30:11 +09:00
Ralph Castain
01ba861f2a When direct launching applications, we must allow the MPI layer to progress during RTE-level barriers. Neither SLURM nor Cray provide non-blocking fence functions, so push those calls into a separate event thread (use the OPAL async thread for this purpose so we don't create another one) and let the MPI thread sping in wait_for_completion. This also restores the "lazy" completion during MPI_Finalize to minimize cpu utilization.
Update external as well

Revise the change: we still need the MPI_Barrier in MPI_Finalize when we use a blocking fence, but do use the "lazy" wait for completion. Replace the direct logic in MPI_Init with a cleaner macro
2016-05-14 16:37:00 -07:00
Gilles Gouaillardet
456b73da69 btl/openib: fix error path in init_one_device()
do not explicitly release ib verbs components since they will
be released in the object destructor

Thanks Durga for the report
2016-05-13 09:03:48 +09:00
Jeff Squyres
30f913f217 Merge pull request #1652 from jsquyres/pr/remove-aix-timer
timer/aix: remove stale code
2016-05-10 15:47:02 -04:00
Jeff Squyres
eccf0ff4cd hwloc/external: set WRAPPER_EXTRA_* vars in proper location
WRAPPER_EXTRA flags are checked *before* the POST_CONFIG macro is
invoked.  So set them in the main CONFIG macro.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-05-10 07:34:56 -07:00
Josh Hursey
44d95cb610 Merge pull request #1657 from bgoglin/hwloc-for-2.0
configure: check the actual may_alias syntax that we use
2016-05-09 13:37:08 -05:00
Ralph Castain
7767882346 Per user request, add some missing data and definitions:
OPAL_PMIX_UNIV_RANK - synonym for OPAL_PMIX_GLOBAL_RANK
OPAL_PMIX_APP_SIZE - #ranks in the application of this proc
2016-05-09 08:39:01 -07:00
Nathan Hjelm
d99a9786b6 sync_builtin: check for 64-bit atomic support
This commit adds an additional check for 64-bit atomic support for __sync
builtins. If 64-bit support is not available the opal_atomic_*_64 atomics
are disabled.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-05-09 03:17:51 -06:00
Brice Goglin
6839d928c2 configure: check the actual may_alias syntax that we use
xlc 13.1.0 crashes because of our may_alias attributes in nolibxml.c
on Power7. libxml.c and nolibxml.c are the only may_alias users for now,
so change our configure check to match the actual code using it.

Thanks to Paul Hargrove for reporting and debugging the issue,
and providing the patch.

https://www.open-mpi.org/community/lists/devel/2016/05/18918.php

(cherry picked from open-mpi/hwloc@0ab7af5e90)
2016-05-08 22:22:30 +02:00
Ralph Castain
7594b95e4b Ensure the hwloc external header is include when --with-devel-headers is given 2016-05-08 10:18:14 -07:00
Jeff Squyres
acbd2c608d memory/patcher: check for <sys/syscall.h>
Thanks to Paul Hargrove for reporting.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-05-07 09:48:14 -07:00
Jeff Squyres
b4982d7725 timer/aix: remove stale code
Per discussion on the mailing list and with IBM, remove the AIX timer
code (since AIX is no longer supported).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-05-07 09:31:34 -07:00
Ralph Castain
7e5ef6a240 Fix the env_list support - the MCA param was being set way too early, so provide a "backdoor" way of providing the value 2016-05-06 15:38:39 -07:00
Ralph Castain
58dd41facf Repair the processing of cmd line options that mapped to MCA params. This was responsible for breaking things like map-by <foo>.
Remove debug, let orterun send terminate cmd to DVM

Recover the DVM support
2016-05-06 13:14:03 -07:00
Josh Hursey
35ae7e33d7 Merge pull request #1639 from jjhursey/topic/dl-open-null-fname
dl/dlopen/libltdl: Allow opal_dl_open to take a NULL filename.
2016-05-05 22:15:46 -05:00
Ralph Castain
8ec1891d11 Silence warning 2016-05-05 20:04:10 -07:00
Ralph Castain
08022d7af1 Some minor cleanups of warnings from gcc 6.0.0. Update s1/s2 pmix to get max_procs as required. 2016-05-05 15:28:13 -07:00
Joshua Hursey
677178f206 dl/dlopen/libltdl: Allow opal_dl_open to take a NULL filename. 2016-05-05 17:07:26 -04:00
Nathan Hjelm
80f45925bc Merge pull request #1629 from hjelmn/new_hooks_update
New hooks update
2016-05-04 18:53:25 -06:00
Joshua Hursey
788cf1a9fe asm/powerpc: Fix empty colon list in asm for XL compiler on power
Thanks to Paul Hargrove for reporting the problem, and submitting patch.
 * https://www.open-mpi.org/community/lists/devel/2016/05/18886.php
2016-05-04 14:14:33 -05:00
Nathan Hjelm
ff2a54bd37 patcher/linux: code cleanup
Update based on cleanup made to the upstream version on OpenUCX.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-04 12:53:45 -06:00
Nathan Hjelm
6c9a0e1c55 patcher/overwrite: disable ia64 support for now
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-04 12:53:24 -06:00
Nathan Hjelm
6ad68da407 patcher/linux: disable the linux patcher component
This commit disables the linux patcher component due to a limitation
in loader patching. While this component is effective in patching
calls made within Open MPI and by the application it fails to hook
calls made within glibc. This means the munmap call made by free is
not correctly hooked. Until this problem can be resolved this
component will remain disabled. If it can't be resolved this component
should probably be removed.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-04 12:50:51 -06:00
Nathan Hjelm
71be36d380 patcher: fix ppc32 support
The table of contents (TOC) code only appears to only apply to
ppc64. The code was incorrectly assuming the existence of the TOC on
ppc32. This commit updates the necessary code to only apply to ppc64.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-04 12:50:32 -06:00
Nathan Hjelm
41f00b7465 memory/patcher: initialize patcher framework when needed
This commit moves the patcher framework initialization to the
memory/patcher component.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-04 12:46:42 -06:00
Nathan Hjelm
0f54a95408 Merge pull request #1626 from hjelmn/vader_32
btl/vader: fix compilation on 32-bit systems
2016-05-03 16:39:46 -06:00
Nathan Hjelm
4a740e9f27 Merge pull request #1619 from hjelmn/ext_verbs_fix
btl/openib: fix check for exp verbs struct members
2016-05-03 14:16:17 -06:00
Nathan Hjelm
e7ccbdee27 btl/vader: fix compilation on 32-bit systems
This commit fixes a compile/link issue caused by vader. The vader btl
was using OPAL_THREAD_ADD64 to increment a counter which may not be
available on 32-bit systems. Changed to use OPAL_THREAD_ADD_SIZE_T
which will be 64-bit or 32-bit depending on the system.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-03 10:14:44 -06:00
Nathan Hjelm
2d0e2b6233 patcher: do not clobber ebx
ebx can not be clobbered when using -fPIC so save and restore the
register instead of allowing it to be clobbered.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-05-03 08:24:33 -06:00
Brice Goglin
a2a721f961 linux: actually enable libudev based on the result of AC_CHECK_LIB
instead of doing AC_CHECK_HEADERS+AC_CHECK_LIB and only using the result of the former.

Thanks to Paul Hargrove for reporting the issue (OMPI build with -m32).

(cherry picked from open-mpi/hwloc@9549fd59af)
2016-05-03 10:00:40 +02:00
Nathan Hjelm
da695a6ce6 Merge pull request #1618 from hjelmn/new_hooks_update
More hook updates
2016-05-02 18:12:50 -06:00
Nathan Hjelm
a65af6d079 btl/openib: fix check for exp verbs struct members
This commit fixes a compilation issue with some versions of exp
verbs. In some cases struct ibv_exp_device_attr does not have either
the exp_atom or exp_atomic_cap fields. It is fine to drop one check
and fall back to the non-exp attribute check on the other.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-02 17:13:33 -06:00
Nathan Hjelm
1ff79656dd patcher: remove debug fprintf
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-02 17:11:00 -06:00
Nathan Hjelm
581e47c271 patcher: check for clflush
Add a feature check for clflush before trying to use the clflush
instruction. As far as I can tell there is no equivalent before the
SSE2 instruction set.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-02 17:10:42 -06:00
Nathan Hjelm
67fd6fa6eb Merge pull request #1615 from hjelmn/new_hooks_update
memory/patcher: add #if check for MREMAP_FIXED
2016-05-02 16:26:58 -06:00
Nathan Hjelm
eb14b34f04 memory/patcher: fix compilation on BSDs
The function signature of mremap on BSD (NetBSD, FreeBSD) differs from
the linux version. Added support for the BSD style of mremap.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-02 14:54:08 -06:00
Nathan Hjelm
52edb43bdc memory/patcher: check for linux/mman.h
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-02 14:29:46 -06:00