Ralph Castain
e3213386ec
Fix the internal PMIx installation - matching changes have been upstreamed
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-22 13:49:07 -07:00
Ralph Castain
a1b15c5666
Roll in update to PMIx master. Transfer updates from pmix2x component to ext2x
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-22 13:06:47 -07:00
Jeff Squyres
b991135634
Merge pull request #4128 from jsquyres/pr/fix-info-delete-return-value
...
mpi/info_delete: fix return code
2017-08-22 14:33:29 -04:00
Jeff Squyres
ea5093fc14
mpi/info_delete: fix return code
...
Per MPI-3.1, ensure that we raise an MPI exception with value
MPI_ERR_INFO_NOKEY if we try to MPI_INFO_DELETE a key that does not
exist. Thanks to @dalcinl (Lisandro Dalcin) for raising the issue.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-08-22 08:56:40 -07:00
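The behaviour described in this commit can be illustrated with a toy model. This is only a sketch of the semantics, not Open MPI's implementation: the `Info` and `MPIError` classes are hypothetical, and the error class is a symbolic stand-in because its numeric value is implementation-defined.

```python
# Toy model of MPI_INFO_DELETE semantics; NOT Open MPI's code.
MPI_SUCCESS = 0
# Symbolic stand-in: the real numeric value is implementation-defined.
MPI_ERR_INFO_NOKEY = "MPI_ERR_INFO_NOKEY"


class MPIError(Exception):
    """Hypothetical exception carrying an MPI error class."""

    def __init__(self, error_class):
        super().__init__(error_class)
        self.error_class = error_class


class Info:
    """Hypothetical stand-in for an MPI info object (a key/value store)."""

    def __init__(self):
        self._entries = {}

    def set(self, key, value):
        self._entries[key] = value

    def delete(self, key):
        # Deleting a key that is not present is an error, not a no-op.
        if key not in self._entries:
            raise MPIError(MPI_ERR_INFO_NOKEY)
        del self._entries[key]
        return MPI_SUCCESS
```

Deleting the same key twice succeeds the first time and raises on the second attempt.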
Ralph Castain
f5fb43e9c7
Merge pull request #4120 from bgoglin/master
...
fixes and debug messages to the hwloc/shmem use
2017-08-22 07:59:45 -07:00
Brice Goglin
046d870124
rtc/hwloc/shmem: add Inria copyrights
...
The code for finding the hole for the shmem region actually came from me.
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2017-08-21 23:09:57 +02:00
Brice Goglin
2d242ab9f0
hwloc/shmem: don't abort on failure to load from shmem
...
Adopting can fail if the server-side hole isn't available on the client.
We can fall back to other ways to load the topology.
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2017-08-21 19:57:38 +02:00
Brice Goglin
ffd209fc2e
hwloc/shmem: dump /proc/self/maps if failed to find a hole and verbosity > 4
...
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2017-08-21 19:57:38 +02:00
Brice Goglin
baf762d99d
rtc/hwloc/shmem: dump /proc/self/maps if failed to find a hole and verbosity > 4
...
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2017-08-21 19:57:38 +02:00
Brice Goglin
8f6afbb641
rtc/hwloc/shmem: fix "heap" hole search kind
...
There can be multiple consecutive [heap] entries in /proc/<pid>/maps,
and there's no room between them.
Don't use a hole after the first [heap] if there's another [heap]
immediately after it.
This code would fail to find the last [heap] if there were multiple
[heap] entries interleaved with non-heap VMAs, but our kind "after heap"
wouldn't be meaningful anymore anyway.
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2017-08-21 15:42:38 +02:00
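The "after heap" rule described in this commit can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual rtc/hwloc code: `parse_maps`, `hole_after_heap`, and the sample text are hypothetical, and the parser assumes the usual six-column /proc/<pid>/maps layout.

```python
def parse_maps(text):
    """Return (start, end, name) tuples from /proc/<pid>/maps-style text."""
    vmas = []
    for line in text.strip().splitlines():
        fields = line.split()
        start, end = (int(x, 16) for x in fields[0].split("-"))
        # The sixth column (pathname or [heap]/[stack]) may be absent.
        name = fields[5] if len(fields) > 5 else ""
        vmas.append((start, end, name))
    return vmas


def hole_after_heap(vmas):
    """Return the (start, end) gap after the last [heap] of a consecutive run."""
    for i, (start, end, name) in enumerate(vmas):
        if name != "[heap]":
            continue
        nxt = vmas[i + 1] if i + 1 < len(vmas) else None
        if nxt is not None and nxt[2] == "[heap]":
            continue  # another [heap] follows immediately: no room here
        if nxt is not None and nxt[0] > end:
            return (end, nxt[0])
        return None  # no usable gap after the heap in this sketch
    return None


SAMPLE = """\
00400000-00452000 r-xp 00000000 08:02 173521 /usr/bin/app
01000000-01021000 rw-p 00000000 00:00 0 [heap]
01021000-01042000 rw-p 00000000 00:00 0 [heap]
7f0000000000-7f0000021000 rw-p 00000000 00:00 0
7ffd00000000-7ffd00021000 rw-p 00000000 00:00 0 [stack]
"""
```

With the sample above, the hole starts at the end of the second [heap] entry rather than the first. As the commit notes, [heap] entries interleaved with other VMAs are not handled.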
Brice Goglin
b8b46b253b
rtc/hwloc/shmem: fix "libs" hole search kind
...
We want the biggest hole *between* heap and stack, not outside.
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2017-08-21 15:40:36 +02:00
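A minimal sketch of the "biggest hole between heap and stack" idea, assuming the VMAs have already been parsed into (start, end, name) tuples. The function name and sample addresses are hypothetical; this is not the actual rtc/hwloc code.

```python
def biggest_hole_between_heap_and_stack(vmas):
    """Return the largest (start, end) gap lying between [heap] and [stack]."""
    heap_end = max((e for _, e, n in vmas if n == "[heap]"), default=None)
    stack_start = min((s for s, _, n in vmas if n == "[stack]"), default=None)
    if heap_end is None or stack_start is None:
        return None
    # Only consider mappings that sit between the heap and the stack,
    # then add the stack start as a zero-length sentinel boundary.
    inside = sorted(v for v in vmas if v[0] >= heap_end and v[1] <= stack_start)
    inside.append((stack_start, stack_start, ""))
    best = None
    cursor = heap_end
    for start, end, _ in inside:
        if start > cursor and (best is None or start - cursor > best[1] - best[0]):
            best = (cursor, start)
        cursor = max(cursor, end)
    return best


SAMPLE_VMAS = [
    (0x00400000, 0x00452000, "/usr/bin/app"),
    (0x01000000, 0x01021000, "[heap]"),
    (0x7F0000000000, 0x7F0000200000, "/lib/libc.so"),
    (0x7F0000300000, 0x7F0000400000, "/lib/libm.so"),
    (0x7FFD00000000, 0x7FFD00021000, "[stack]"),
    (0xFFFFFFFFFF600000, 0xFFFFFFFFFF601000, "[vsyscall]"),
]
```

In the sample, the gap between [stack] and [vsyscall] is the largest overall, but it lies outside the heap-to-stack range and is therefore never considered.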
Gilles Gouaillardet
a3e31fa8d0
ompi/communicator: plug a memory leak in ompi_comm_init()
...
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-08-21 11:47:11 +09:00
Ralph Castain
9d3f4516e6
Merge pull request #4116 from rhc54/topic/notify
...
Don't restrict broadcast notifications
2017-08-18 18:13:47 -07:00
Ralph Castain
d515f48885
The local PMIx server is notifying its clients of all events, but for some reason I don't recall, the broadcast notification was marked for delivery only to non-default event handlers. This creates a discrepancy between the two behaviors, so don't restrict the broadcast notifications.
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-18 17:26:11 -07:00
Brian Barrett
c667719a3f
Merge pull request #3955 from mohanasudhan/master
...
Btl tcp: Improved diagnostic output and failure mode
2017-08-18 11:42:27 -07:00
Mohan
fc32ae401e
Btl Tcp: Updated tcp handshake methods
...
This commit makes two changes:
1. Adding the magic string to the handshake can cause
issues when connecting to an older version of MPI, so set
the RCVTIMEO parameter to 2 seconds.
2. Use a single call during the handshake instead of
two calls.
Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-18 10:06:52 -07:00
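The two changes above can be sketched as follows. This is an illustration only, not the btl/tcp code: the magic value, the message layout, and both function names are hypothetical, and Python's `settimeout` stands in for the RCVTIMEO socket option.

```python
import socket
import struct

MAGIC = b"HYPOTHETICAL-MAGIC"  # placeholder, not the string Open MPI uses
# Magic string plus a 64-bit peer id, packed so one send carries both.
HANDSHAKE = struct.Struct(f"!{len(MAGIC)}sQ")


def send_handshake(sock, peer_id):
    """Send the magic string and the peer id in a single call."""
    sock.sendall(HANDSHAKE.pack(MAGIC, peer_id))


def recv_handshake(sock, timeout=2.0):
    """Return the peer id, or None on silence, early close, or bad magic."""
    sock.settimeout(timeout)  # applies to each recv call below
    buf = b""
    try:
        while len(buf) < HANDSHAKE.size:
            chunk = sock.recv(HANDSHAKE.size - len(buf))
            if not chunk:
                return None  # peer closed before completing the handshake
            buf += chunk
    except socket.timeout:
        return None  # e.g. an older peer that never sends the magic string
    magic, peer_id = HANDSHAKE.unpack(buf)
    return peer_id if magic == MAGIC else None
```

The timeout matters because a peer that predates the magic string never sends it, and without a bound the receiver would wait indefinitely.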
Mohan
e3dfe11da9
Btl tcp: Improving verbose around tcp
...
As part of improving the TCP BTL, we
are improving verbose output in general.
Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 17:22:16 -07:00
Mohan
4bc7b214dc
Btl tcp: Improving verbose around IPV6
...
As part of improving TCP BTL debugging and verbose
output, we are improving verbose output around IPv6.
Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 16:45:14 -07:00
Mohan
0741fad479
Btl tcp: BTL_ERROR to show_help & update func behaviour
...
As part of improving TCP debugging,
we are moving a few BTL_ERROR calls to show_help and also
updating the behaviour of
mca_btl_tcp_endpoint_complete_connect to return
SUCCESS or ERROR as appropriate.
Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 16:45:14 -07:00
Mohan
368f9f0dfc
Btl tcp: Using magic string to verify mpi connection
...
As part of improving failure handling in the TCP BTL,
we use a magic string to verify an MPI
connection. If the magic string is missing or
mismatched, we can tell that we are trying to
connect to some other process.
Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 16:45:13 -07:00
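A minimal sketch of the verification step, assuming the first bytes received on a new connection are handed to a checker. The magic value and the `check_magic` helper are hypothetical and are not taken from the btl/tcp source.

```python
MAGIC = b"HYPOTHETICAL-MAGIC"  # placeholder, not the string Open MPI uses


def check_magic(received):
    """Classify the first bytes received on a freshly accepted connection."""
    if received.startswith(MAGIC):
        return "ok"
    # Nothing, or only a prefix of the magic string, arrived.
    if len(received) < len(MAGIC) and MAGIC.startswith(received):
        return "missing"
    # Something else is talking to us, e.g. a non-MPI client.
    return "mismatch"
```

For example, an HTTP request arriving on the port is classified as a mismatch, while an empty read is classified as missing.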
Mohan
c30a42917c
Btl tcp: Refactoring non-blocking send/receive function
...
Moving the non-blocking send/receive functions to btl_tcp
makes them reusable wherever needed.
In this case we plan to reuse the receive function to
retrieve the magic string and validate that an established
connection is from an MPI process.
Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 16:45:13 -07:00
Ralph Castain
b67b1e88a5
Merge pull request #4111 from rhc54/topic/multiconnect
...
Cleanup some issues in connect/accept support across jobs started by …
2017-08-17 12:49:01 -07:00
Ralph Castain
d85239e052
Cleanup some issues in connect/accept support across jobs started by different mpirun commands. Still not fully operational, but someone else will have to finish debugging it
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-17 11:58:48 -07:00
Ralph Castain
a855ebd86b
Merge pull request #4110 from rhc54/topic/cov
...
Silence coverity warnings
2017-08-17 10:57:31 -07:00
Ralph Castain
088b6cdeee
Silence coverity warnings
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-17 09:49:35 -07:00
Jeff Squyres
4e763796b1
Merge pull request #4100 from jsquyres/pr/fix-nmcheck-prefix
...
nmcheck_prefix: more updates for more compilers
2017-08-16 20:39:34 -04:00
Ralph Castain
1f799afa30
Merge pull request #4106 from rhc54/topic/hwloc
...
Add diagnostics for hwloc get_topology
2017-08-16 15:47:47 -07:00
Ralph Castain
41df973359
Add diagnostics for hwloc get_topology
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-16 14:21:27 -07:00
Jeff Squyres
cd8db5313e
Merge pull request #4101 from jsquyres/pr/usnic-restore-configure-summary-line
...
btl/usnic: restore configure usNIC summary line
2017-08-16 16:36:19 -04:00
Josh Hursey
e0931714ea
Merge pull request #4090 from jjhursey/config/old-xl-ppc-support
...
config: Remove support for big endian PPC, XL compiler older than 13.1
2017-08-16 15:31:35 -05:00
Ralph Castain
f21dfd3189
Merge pull request #4097 from rhc54/topic/dlopepn
...
Change test per recommendation of @jsquyres
2017-08-16 13:22:18 -07:00
Jeff Squyres
a591159fb4
btl/usnic: restore configure usNIC summary line
...
Not sure how/when this got deleted, but put back the "Cisco usNIC"
line in the transport summary at the end of configure.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-08-16 12:37:59 -07:00
Jeff Squyres
9d09fe0151
nmcheck_prefix: more updates for more compilers
...
Ignore a few more symbols to pass Absoft and modern gcc.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-08-16 12:28:49 -07:00
Ralph Castain
c4d5dbfcdc
Change test per recommendation of @jsquyres
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-16 11:19:15 -07:00
Jeff Squyres
1f0b6f783c
Merge pull request #4095 from jsquyres/pr/fix-compiler-warning
...
rcash_base_frame: fix compiler warning
2017-08-16 14:02:21 -04:00
Jeff Squyres
ce3a032b5e
rcash_base_frame: fix compiler warning
...
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-08-16 09:48:31 -07:00
Ralph Castain
4cacd222d6
Merge pull request #4094 from rhc54/topic/pmix210rc1
...
Update to PMIx v2.1.0a1
2017-08-15 21:20:39 -07:00
Ralph Castain
eb69df02ae
Update to PMIx v2.1.0rc1
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 19:59:15 -07:00
Ralph Castain
23ffbeb8f8
Merge pull request #4093 from rhc54/topic/toolsupport
...
Update tool support by adding MCA params to direct orted's to drop
2017-08-15 19:41:45 -07:00
Ralph Castain
65fb6070d9
Update tool support by adding MCA params to direct orted's to drop
...
session and/or system-level tool rendezvous files. Ensure PMIx is
enabled for tools
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 17:49:47 -07:00
Geoff Paulsen
f7137ecf98
Merge pull request #4086 from markalle/nm_test_fix
...
updating nmcheck_prefix.pl to accept some more compiler-generated names
2017-08-15 18:58:34 -05:00
Ralph Castain
d4c594fa72
Merge pull request #4091 from rhc54/topic/hostfile
...
Fix hostfile filtering in allocated environments to preserve slot assignments
2017-08-15 16:14:24 -07:00
Ralph Castain
37ec6d45c5
Merge pull request #4089 from rhc54/topic/errors
...
Fix some build errors on master - fix typos in update-my-copyright
2017-08-15 14:42:38 -07:00
Ralph Castain
2fbce9d93c
Fix hostfile filtering in allocated environments to preserve slot assignments
...
Refs #3984
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 14:41:12 -07:00
Joshua Hursey
12a015d90f
config: Remove support for big endian PPC, XL compiler older than 13.1
...
* Removes support for big endian PPC
* Removes support for XL compiler older than 13.1
* Fixes Issue #4053
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-08-15 17:01:36 -04:00
Ralph Castain
98f36711e3
Update hwloc to latest shmem branch. Correct typos in update-my-copyright.pl.
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 13:32:12 -07:00
Mark Allen
245006a23d
updating nmcheck_prefix.pl to accept some more compiler-generated names
...
Someone posted an MTT test where libmpi_usempi_ignore_tkr.so ended
up with symbols like these being identified as errors:
[error] MPI
[error] _Cmpi_fortran_status_ignore
[error] _Cmpi_fortran_statuses_ignore
Those must be compiler-generated names, so we shouldn't identify them
as problematic.
Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-08-15 15:48:22 -04:00
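The kind of check described in this commit can be sketched as a prefix filter with an ignore list. This is not a port of nmcheck_prefix.pl: the allowed prefixes and the `bad_symbols` helper are assumptions, and the ignore patterns are chosen only to cover the names quoted in the commit message.

```python
import re

# Hypothetical set of acceptable prefixes; the real script's list may differ.
ALLOWED_PREFIXES = ("ompi_", "mpi_", "MPI_", "PMPI_", "pmpi_")

# Patterns for compiler-generated names that should not be reported.
IGNORED = [re.compile(p) for p in (r"^MPI$", r"^_Cmpi_fortran_")]


def bad_symbols(symbols):
    """Return the symbols that are neither properly prefixed nor ignored."""
    bad = []
    for sym in symbols:
        if sym.startswith(ALLOWED_PREFIXES):
            continue
        if any(p.search(sym) for p in IGNORED):
            continue
        bad.append(sym)
    return bad
```

Given the three names from the commit message plus an unprefixed `helper_func`, only `helper_func` is reported.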
Ralph Castain
033a0eb373
Fix the --disable-dlopen --with-devel-headers case by not having libpmix link back to libopen-pal as the latter won't exist in time during this build case
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 10:51:35 -07:00
Edgar Gabriel
99c7482dd8
Merge pull request #3739 from cniethammer/sharedfp_sm_file_dir
...
Create file for file backed shared memory in process job session dir.
2017-08-15 11:53:30 -05:00
Edgar Gabriel
ec1a9a8218
Merge pull request #4057 from edgargabriel/pr/performance-fixes-2
...
io/ompio: new aggregator selection algorithm
2017-08-15 11:38:53 -05:00