This commit has two changes
1. Adding magic string during handshake can cause
issue when used with older version of MPI. Hence set
RCVTIMEO paramter to 2 second
2. Using single call during handshake instead of
two calls
Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
As part of improvement towards tcp debugging
we are moving few BTL_ERROR to show_help and also
update the function behaviour of
mca_btl_tcp_endpoint_complete_connect to return
SUCCESS and ERROR cases.
Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
As part of improvement towards handling failure case
in btl tcp we are using magic string to verify mpi
connection. In case if there is mismatch or missing
magic string we can identify that we are trying to
connect with someother process.
Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
Moving non-blocking send/receive function to btl_tcp
will help reusing these function where ever needed.
In this case we plan to reuse receive function to
retrive magic string to validate established connection
is from mpi process.
Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
Not sure how/when this got deleted, but put back the "Cisco usNIC"
line in the transport summary at the end of configure.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
* Removes support for big endian PPC
* Removes support for XL compiler older than 13.1
* Fixes Issue #4053
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
Someone posted an MTT test where libmpi_usempi_ignore_tkr.so ended
up with symbols like these being identifed as errors:
[error] MPI
[error] _Cmpi_fortran_status_ignore
[error] _Cmpi_fortran_statuses_ignore
those must be compiler-generated names so we shouldn't identify them
as problematic.
Signed-off-by: Mark Allen <markalle@us.ibm.com>
- change the increment used to test various no. of aggregators
to avoid using only power of two numbers
- convert some paratemers in the cost function from integers to
to floats for providing smoother and more consistent results
- set the FVIEW_IS_SET flag on the file *only* if the user
has set anything else than the default file view.
Signed-off-by: Edgar Gabriel <gabriel@cs.uh.edu>
adjust how the aggregator nodes are selected depending on whether processes
have been mapped by node or anything else.
Signed-off-by: Edgar Gabriel <gabriel@cs.uh.edu>
Check for the binding policy used. We are only interested in
whether mapby-node has been set right now (could be extended later)
and only on MPI_COMM_WORLD, since for all other sub-communicators
it is virtually impossible to identify their layout across nodes
in the most generic sense. This is used by OMPIO for deciding which
ranks to use for aggregators
Signed-off-by: Edgar Gabriel <gabriel@cs.uh.edu>
add a new aggregator selection algorithm based on the performance
model described in:
Shweta Jha, Edgar Gabriel,
'Performance Models for Communication in Collective I/O Operations'
Proceedings of the 17th IEEE/ACM Symposium
on Cluster, Cloud and Grid Computing, Workshop on Theoretical
Approaches to Performance Evaluation, Modeling and Simulation, 2017.
Signed-off-by: Edgar Gabriel <gabriel@cs.uh.edu>
This module was always intended to be a proof of concept, and was far
from complete. If/when someone implemented F08 descriptor support for
the mpi_f08 module, this commit can either be restored or used as
reference material.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>