Commit Graph

27690 Commits

Author SHA1 Message Date
Mohan
fc32ae401e Btl Tcp: Updated tcp handshake methods
This commit has two changes

1. Adding magic string during handshake can cause
issue when used with older version of MPI. Hence set
RCVTIMEO parameter to 2 seconds
2. Using single call during handshake instead of
two calls

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-18 10:06:52 -07:00
Mohan
e3dfe11da9 Btl tcp: Improving verbose around tcp
As part of improvement towards tcp btl we
are improving verbose in general

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 17:22:16 -07:00
Mohan
4bc7b214dc Btl tcp: Improving verbose around IPV6
As part of improvement around tcp btl debugging
& verbose, we are improving verbose around IPV6

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 16:45:14 -07:00
Mohan
0741fad479 Btl tcp: BTL_ERROR to show_help & update func behaviour
As part of improvement towards tcp debugging
we are moving few BTL_ERROR to show_help and also
update the function behaviour of
mca_btl_tcp_endpoint_complete_connect to return
SUCCESS and ERROR cases.

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 16:45:14 -07:00
Mohan
368f9f0dfc Btl tcp: Using magic string to verify mpi connection
As part of improvement towards handling failure case
in btl tcp we are using magic string to verify mpi
connection. In case if there is mismatch or missing
magic string we can identify that we are trying to
connect with some other process.

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 16:45:13 -07:00
Mohan
c30a42917c Btl tcp: Refactoring non-blocking send/receive function
Moving non-blocking send/receive function to btl_tcp
will help reusing these function where ever needed.
In this case we plan to reuse receive function to
retrieve magic string to validate established connection
is from mpi process.

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 16:45:13 -07:00
Ralph Castain
b67b1e88a5 Merge pull request #4111 from rhc54/topic/multiconnect
Cleanup some issues in connect/accept support across jobs started by …
2017-08-17 12:49:01 -07:00
Ralph Castain
d85239e052 Cleanup some issues in connect/accept support across jobs started by different mpirun commands. Still not fully operational, but someone else will have to finish debugging it
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-17 11:58:48 -07:00
Ralph Castain
a855ebd86b Merge pull request #4110 from rhc54/topic/cov
Silence coverity warnings
2017-08-17 10:57:31 -07:00
Ralph Castain
088b6cdeee Silence coverity warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-17 09:49:35 -07:00
Jeff Squyres
4e763796b1 Merge pull request #4100 from jsquyres/pr/fix-nmcheck-prefix
nmcheck_prefix: more updates for more compilers
2017-08-16 20:39:34 -04:00
Ralph Castain
1f799afa30 Merge pull request #4106 from rhc54/topic/hwloc
Add diagnostics for hwloc get_topology
2017-08-16 15:47:47 -07:00
Ralph Castain
41df973359 Add diagnostics for hwloc get_topology
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-16 14:21:27 -07:00
Jeff Squyres
cd8db5313e Merge pull request #4101 from jsquyres/pr/usnic-restore-configure-summary-line
btl/usnic: restore configure usNIC summary line
2017-08-16 16:36:19 -04:00
Josh Hursey
e0931714ea Merge pull request #4090 from jjhursey/config/old-xl-ppc-support
config: Remove support for big endian PPC, XL compiler older than 13.1
2017-08-16 15:31:35 -05:00
Ralph Castain
f21dfd3189 Merge pull request #4097 from rhc54/topic/dlopepn
Change test per recommendation of @jsquyres
2017-08-16 13:22:18 -07:00
Jeff Squyres
a591159fb4 btl/usnic: restore configure usNIC summary line
Not sure how/when this got deleted, but put back the "Cisco usNIC"
line in the transport summary at the end of configure.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-08-16 12:37:59 -07:00
Jeff Squyres
9d09fe0151 nmcheck_prefix: more updates for more compilers
Ignore a few more symbols to pass Absoft and modern gcc.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-08-16 12:28:49 -07:00
Ralph Castain
c4d5dbfcdc Change test per recommendation of @jsquyres
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-16 11:19:15 -07:00
Jeff Squyres
1f0b6f783c Merge pull request #4095 from jsquyres/pr/fix-compiler-warning
rcash_base_frame: fix compiler warning
2017-08-16 14:02:21 -04:00
Jeff Squyres
ce3a032b5e rcash_base_frame: fix compiler warning
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-08-16 09:48:31 -07:00
Ralph Castain
4cacd222d6 Merge pull request #4094 from rhc54/topic/pmix210rc1
Update to PMIx v2.1.0a1
2017-08-15 21:20:39 -07:00
Ralph Castain
eb69df02ae Update to PMIx v2.1.0rc1
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 19:59:15 -07:00
Ralph Castain
23ffbeb8f8 Merge pull request #4093 from rhc54/topic/toolsupport
Update tool support by adding MCA params to direct orted's to drop
2017-08-15 19:41:45 -07:00
Ralph Castain
65fb6070d9 Update tool support by adding MCA params to direct orted's to drop
session and/or system-level tool rendezvous files. Ensure PMIx is
enabled for tools

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 17:49:47 -07:00
Geoff Paulsen
f7137ecf98 Merge pull request #4086 from markalle/nm_test_fix
updating nmcheck_prefix.pl to accept some more compiler-generated names
2017-08-15 18:58:34 -05:00
Ralph Castain
d4c594fa72 Merge pull request #4091 from rhc54/topic/hostfile
Fix hostfile filtering in allocated environments to preserve slot assignments
2017-08-15 16:14:24 -07:00
Ralph Castain
37ec6d45c5 Merge pull request #4089 from rhc54/topic/errors
Fix some build errors on master - fix typos in update-my-copyright
2017-08-15 14:42:38 -07:00
Ralph Castain
2fbce9d93c Fix hostfile filtering in allocated environments to preserve slot assignments
Refs #3984

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 14:41:12 -07:00
Joshua Hursey
12a015d90f config: Remove support for big endian PPC, XL compiler older than 13.1
* Removes support for big endian PPC
* Removes support for XL compiler older than 13.1
* Fixes Issue #4053

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-08-15 17:01:36 -04:00
Ralph Castain
98f36711e3 Update hwloc to latest shmem branch. Correct typos in update-my-copyright.pl.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 13:32:12 -07:00
Mark Allen
245006a23d updating nmcheck_prefix.pl to accept some more compiler-generated names
Someone posted an MTT test where libmpi_usempi_ignore_tkr.so ended
up with symbols like these being identified as errors:
    [error]   MPI
    [error]   _Cmpi_fortran_status_ignore
    [error]   _Cmpi_fortran_statuses_ignore
those must be compiler-generated names so we shouldn't identify them
as problematic.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-08-15 15:48:22 -04:00
Ralph Castain
033a0eb373 Fix the --disable-dlopen --with-devel-headers case by not having libpmix link back to libopen-pal as the latter won't exist in time during this build case
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 10:51:35 -07:00
Edgar Gabriel
99c7482dd8 Merge pull request #3739 from cniethammer/sharedfp_sm_file_dir
Create file for file backed shared memory in process job session dir.
2017-08-15 11:53:30 -05:00
Edgar Gabriel
ec1a9a8218 Merge pull request #4057 from edgargabriel/pr/performance-fixes-2
io/ompio: new aggregator selection algorithm
2017-08-15 11:38:53 -05:00
Edgar Gabriel
8fe1c63e25 io/ompio: change the increment for cost based aggr. selection
- change the increment used to test various no. of aggregators
  to avoid using only power of two numbers
- convert some parameters in the cost function from integers to
  floats for providing smoother and more consistent results
- set the FVIEW_IS_SET flag on the file *only* if the user
  has set anything else than the default file view.

Signed-off-by: Edgar Gabriel <gabriel@cs.uh.edu>
2017-08-15 09:50:41 -05:00
Edgar Gabriel
f258036e06 fcoll/two_phase: adjust aggregator selection to new mapby flag on MPI_COMM_WORLD
adjust how the aggregator nodes are selected depending on whether processes
have been mapped by node or anything else.

Signed-off-by: Edgar Gabriel <gabriel@cs.uh.edu>
2017-08-15 09:50:41 -05:00
Edgar Gabriel
92eff9050c communicator/comm_init.c: add a new flag indicating binding policy
Check for the binding policy used. We are only interested in
whether mapby-node has been set right now (could be extended later)
and only on MPI_COMM_WORLD, since for all other sub-communicators
it is virtually impossible to identify their layout across nodes
in the most generic sense. This is used by OMPIO for deciding which
ranks to use for aggregators

Signed-off-by: Edgar Gabriel <gabriel@cs.uh.edu>
2017-08-15 09:50:41 -05:00
Edgar Gabriel
b3f59c76e1 io/ompio: new simple aggr. selection algorithm
add a new aggregator selection algorithm based on the performance
model described in:

Shweta Jha, Edgar Gabriel,
'Performance Models for Communication in Collective I/O Operations'
Proceedings of the 17th IEEE/ACM Symposium
on Cluster, Cloud and Grid Computing, Workshop on Theoretical
Approaches to Performance Evaluation, Modeling and Simulation, 2017.

Signed-off-by: Edgar Gabriel <gabriel@cs.uh.edu>
2017-08-15 09:50:41 -05:00
Jeff Squyres
0414c0c9d7 Merge pull request #3757 from ggouaillardet/topic/enable_builtin_atomics
configury: abort when builtin atomics cannot be built and configure'd…
2017-08-14 15:22:18 -04:00
Ralph Castain
0118e32165 Merge pull request #4076 from artpol84/revert_71da0f/master
Revert "plm/rsh: Propagate PMIx prefix to orted's"
2017-08-14 09:23:47 -07:00
Artem Polyakov
10d6e90bf5 Revert "plm/rsh: Propagate PMIx prefix to orted's"
This reverts commit 71da0fcbef.
(per https://github.com/open-mpi/ompi/pull/4052).
Refs: https://github.com/open-mpi/ompi/issues/3980

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-08-14 21:37:57 +07:00
Ralph Castain
84810adc24 Merge pull request #4075 from rhc54/topic/hwfix
Apply patch from @bgoglin
2017-08-11 09:39:12 -07:00
Ralph Castain
daf548b328 Apply patch from @bgoglin
Fixes #4027

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-11 07:16:14 -07:00
Ralph Castain
0e9623faf8 Merge pull request #4073 from rhc54/topic/pmixup
Update to latest PMIx v2.1.0a
2017-08-11 01:38:14 -07:00
Ralph Castain
4290247d64 Update to latest PMIx v2.1.0a
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-10 18:48:07 -07:00
Jeff Squyres
03544d7cfa Merge pull request #4068 from jsquyres/pr/remove-f08-desc
ompi/fortran: remove proof-of-concept mpi_f08 module
2017-08-10 10:43:46 -04:00
Jeff Squyres
791bcee6c0 ompi/fortran: remove proof-of-concept mpi_f08 module
This module was always intended to be a proof of concept, and was far
from complete.  If/when someone implements F08 descriptor support for
the mpi_f08 module, this commit can either be restored or used as
reference material.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-08-10 06:19:17 -07:00
Jeff Squyres
1a70e5bd16 Merge pull request #3617 from ggouaillardet/topic/f08_mpiext
fortran2008: fix mpiext example
2017-08-10 09:16:30 -04:00
Ralph Castain
9324193d92 Merge pull request #4066 from rhc54/topic/patterns
Provide the mapping, ranking, binding patterns
2017-08-09 13:12:31 -07:00