George Bosilca
a6d515ba9e
Fixes opal_atomic_ll_64. Thanks to Paul Hardgrove for
...
the report and his patch.
This is an addition to #1140 and should go in 2.x
2016-08-27 12:43:48 -04:00
Nathan Hjelm
d33204b0dc
Merge pull request #2021 from hjelmn/xlc_fix
...
opal/patcher: fix xlc support
2016-08-26 18:15:41 -06:00
rhc54
b90a64e734
Merge pull request #2022 from rhc54/topic/nnodes
...
Provide the number of nodes in the job
2016-08-26 18:15:24 -05:00
Ralph Castain
2f6e0fec90
Provide the number of nodes in the job
2016-08-26 14:50:41 -07:00
Jeff Squyres
09ad7e81eb
Merge pull request #2007 from jsquyres/pr/usnic-show-local-udp-ports
...
usnic: show the local UDP ports
2016-08-26 17:03:16 -04:00
Nathan Hjelm
a9bc692d99
opal/patcher: fix xlc support
...
The xlc compiler seems to behave in a different way that gcc when it
comes the inline asm. There were two problems with the code with xlc:
- The TOC read in mca_patcher_base_patch_hook used the syntax
register unsigned long toc asm("r2") to read $r2 (the TOC
pointer). With gcc this seems to behave as expected but with xlc
the result in toc is not the same as $r2. I updated the code to use
asm volatile ("std 2, %0" : "=m" (toc)) to load the TOC pointer.
- The OPAL_PATCHER_BEGIN macro is meant to be the first thing in a
hook. On PPC64 it loads the correct TOC pointer (thanks to
mca_patcher_base_patch_hook) and saves the old one. The
OPAL_PATCHER_END macro restores the TOC pointer. Because we *need*
the TOC to be correct before it is accessed in the hook the
OPAL_PATCHER_BEGIN macro MUST come first. We did this and all was
well with gcc. With xlc on the other hand there was a TOC access
before the assembly inserted by OPAL_PATCHER_BEGIN. To fix this
quickly I broke each hook into a pair of function with the
OPAL_PATCHER_* macros on the top level functions. This works around
the issue but is not a clean way to fix this. In the future we
should 1) either update overwrite to not need this, or 2) figure
out why xlc is not inserting the asm before the first TOC read.
This fixes open-mpi/ompi#1854
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-08-26 14:43:03 -06:00
Jeff Squyres
87a5ccc060
usnic: show the local UDP ports
...
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-26 12:25:18 -07:00
Jeff Squyres
e03a40a0e9
pmix3x: remove generated file
...
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-26 10:30:47 -07:00
Jeff Squyres
9ae51a09f2
Merge pull request #1989 from jsquyres/pr/update-usnic-to-libfabric-v1.4
...
Update usnic BTL to libfabric v1.4
2016-08-26 09:53:07 -04:00
Gilles Gouaillardet
e4bf915e75
pmix3x: remove auto-generated file
...
remove opal/mca/pmix/pmix3x/pmix/src/include/pmix_config.h.in
.gitignore is correct, so it seems this file was added before .gitignore was updated
2016-08-26 15:00:18 +09:00
Ralph Castain
af67f16422
Update configury to support multiple PMIx versions, rename pmix2x component to pmix3x for support of PMIx master
...
Update support for external v1.1.x and v2.x libraries. Minor corrections to the v3.x component
2016-08-25 18:19:05 -07:00
Gilles Gouaillardet
277c319389
opal/util: fix (again and again) incorrect type casting in opal_path_df
...
and silence CID 1371767
this fixes previous commits :
- open-mpi/ompi@2eec8970ff
- open-mpi/ompi@a439afce5b
2016-08-26 09:42:45 +09:00
Nathan Hjelm
de32c779e2
opal/wait_sync: add #if protection on header
...
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-08-25 14:31:52 -06:00
Jeff Squyres
f56b16f079
usnic: remove unused variable
...
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-25 03:53:18 -07:00
Jeff Squyres
9717bcb7e6
btl/usnic: remove stale comment
...
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-25 03:53:18 -07:00
Jeff Squyres
6f5e377fe0
btl/usnic: update for libfabric v1.4
...
With libfabric v1.4, the usnic provider changed the values of its
fabric and domain name strings (compared to libfabric <v1.4). Update
the Open MPI usNIC BTL to handle both pre-v1.4 and v1.4 fabric/domain
names.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-25 03:53:17 -07:00
George Bosilca
3adff9d323
Fixes #1793 .
...
Reshape the tearing down process (connection close) to prevent race
conditions between the main thread and the progress thread.
Minor cleanups.
2016-08-24 22:45:19 -04:00
Nathan Hjelm
83062db7cb
btl/ugni: actually make the endpoint lock recursive
...
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-08-24 10:36:08 -06:00
Gilles Gouaillardet
2eec8970ff
opal/util: fix (again) incorrect type casting in opal_path_df
...
this fixes previous commit open-mpi/ompi@a439afce5b
2016-08-24 12:50:15 +09:00
Gilles Gouaillardet
02847d9e7b
pmix2x: dstore: add missing <fcntl.h> include file in pmix_esh.c
...
(back-ported from upstream pmix/master@5c66ffe0f0 )
2016-08-24 11:18:46 +09:00
Gilles Gouaillardet
c11e8163f8
pmix2x: sec/native: fix the pmix_native module under solaris by using getpeerucred()
...
and fail with a user friendly message if no method is available:
"sec: native cannot validate_cred on this system"
(back-ported from upstream pmix/master@c474a1fc60 )
2016-08-24 11:18:40 +09:00
Gilles Gouaillardet
e91292aa41
pmix2x: configury: add missing check for <netdb.h> header file
...
(back-ported from upstream pmix/master@e54ce6d423 )
2016-08-24 11:18:32 +09:00
Gilles Gouaillardet
a439afce5b
opal/util: fix incorrect type casting in opal_path_df
2016-08-24 10:26:13 +09:00
Potnuri Bharat Teja
9b7f9ece20
Add Chelsio T6 adapter device parameters.
...
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
2016-08-23 10:38:13 +05:30
Ralph Castain
639dbdb7ea
For maintainability, fold the external PMIx 2.x integration into the internal PMIx 2.x library component. This ensures that we always stay in sync with the two as that is becoming a problem.
2016-08-22 13:28:55 -07:00
George Bosilca
fd57f5bccd
Remove some of the clang warnings.
2016-08-20 14:21:42 -04:00
Ralph Castain
61ffba668b
Roll in the latest PMIx version - includes shared memory datastore and reduced memory footprint
2016-08-20 07:53:06 -07:00
Artem Polyakov
6ea8cccdab
Merge pull request #1969 from artpol84/pmix_jobid_fix
...
Pmix jobid fix
2016-08-18 17:24:58 +07:00
Ralph Castain
7da9793fef
Support the PMIX_TIMEOUT key at the PMIx server when timeout=0 - this indicates that the user doesn't want a lookup of any data from the host RM.
2016-08-17 16:26:58 -05:00
Gilles Gouaillardet
3126ff77e2
pmix2x: common syms: whitelist bison-generated common symbols
...
Bison generates some common symbols that we can't do anything about,
so whitelist them.
2016-08-16 11:29:06 +09:00
Artem Polyakov
c5a91c5c9d
opal/pmix: fix pmix jobid calculation if external PMIx server is used.
2016-08-15 21:13:51 +03:00
Ralph Castain
ecbedee8bb
Fix typo
2016-08-15 07:32:00 -07:00
Artem Polyakov
f3c816b52e
opal/pmix: fix indentation in some files.
2016-08-15 18:21:50 +07:00
Gilles Gouaillardet
483685eb6a
update .gitignore
...
remove autogenerated opal/mca/pmix/pmix2x/pmix/src/include/pmix_config.h.in
2016-08-15 17:00:20 +09:00
Ralph Castain
be8424b691
Provide backward compatible keys so that the non-PMIx components in the opal/pmix framework don't have to adjust as we continue to work on finalizing the PMIx reference scheme. Activate and utilize the new PMIx show_help capability to provide more meaningful error output when the server cannot start.
...
Add a contrib script to cleanup permissions incorrectly modified due to things like smb mounts
dd
2016-08-13 12:13:04 -07:00
rhc54
ddde154d28
Merge pull request #1962 from rhc54/topic/notify
...
Ensure we properly convert pmix status to ORTE state before activatin…
2016-08-13 06:59:50 -07:00
Ralph Castain
48d35a9627
Ensure we properly convert pmix status to ORTE state before activating an error state upon notification. Cleanup some conversion issues on notification info. Add a new orte_notify.c test program
2016-08-12 21:14:29 -07:00
Ralph Castain
4a4c9703a9
Setup the job list in the PMIx integration so that static ports can run
2016-08-12 13:27:10 -07:00
Ralph Castain
0e58609327
Fix a bug where we were requiring that all paths in $PATH be absolute. Some users provide relative paths in their environment, and we should respect those.
2016-08-12 11:28:57 -07:00
Ralph Castain
1d44f0c0e2
Silence Coverity warnings
2016-08-11 21:22:01 -07:00
Ralph Castain
73544d2e00
Rename symbol
2016-08-11 13:06:46 -07:00
Ralph Castain
b0cc9b0bc8
Update to latest PMIx toolext branch
...
Fix indentations
Update the ext20 component to match latest PMIx master.
Cleanup name conflicts and uninit vars
2016-08-11 12:29:48 -07:00
Gilles Gouaillardet
dfbf2b7be4
opal/threads: add OPAL_THREAD_SUB_SIZE_T macro
...
-1 is not a valid size_t, so instead of OPAL_THREAD_ADD_SIZE_T(..., -1),
simply OPAL_THREAD_SUB_SIZE_T(..., 1) and keep picky compilers happy
2016-08-10 13:37:36 +09:00
rhc54
60f789dca1
Merge pull request #1948 from rhc54/topic/pmixtool
...
Update to include extended tool support, new datatypes
2016-08-09 16:17:28 -07:00
Nathan Hjelm
19be439998
Merge pull request #1949 from hjelmn/ugni_fix
...
btl/ugni: fix another connection race
2016-08-09 08:32:40 -06:00
Nathan Hjelm
38f18eed22
Merge pull request #1941 from ggouaillardet/topic/memory_patcher_configury
...
configury: make memory/patcher symbol detection more robust
2016-08-09 07:06:38 -06:00
Gilles Gouaillardet
13009aa290
opal/alfg: have opal_random() wrapper always return a positive int
2016-08-09 17:12:30 +09:00
Gilles Gouaillardet
6f6b3ac68a
configury: standardize memory/patcher symbol detection and make it more robust
...
by default, Sun compilers optimize out the original test, and hence fail detecting a symbol is missing.
2016-08-09 09:35:52 +09:00
Nathan Hjelm
adb668209b
btl/ugni: fix another connection race
...
This commit fixes a race that can occur when two threads are in the
ugni progress function at the same time. This race occurs when one
thread calls GNI_PostDataProbeById then goes to sleep then another
thread calls GNI_PostDataProbeById then GNI_EpPostDataWaitById before
the other thread wakes up. If this happens the first thread will print
a warning on GNI_EpPostDataWaitById about no matching post.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-08-08 15:38:11 -06:00
Ralph Castain
527b5c692a
Update to include extended tool support, new datatypes
2016-08-08 13:39:46 -07:00