Zero-size derived datatypes are now flagged as OPAL_DATATYPE_FLAG_CONTIGUOUS,
so update mca_pml_ucx_init_datatype() to correctly handle them.
Since 'size' is a 'size_t', the assertion can simply be removed.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
A bad side effect occurs when CPPFLAGS is set in the environment,
so set it (and LDFLAGS too) on the configure command line instead.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Individual read/write operations exceeding 2GB fail in ompio
due to improper conversions from size_t to int in two different
locations. This commit fixes an issue reported by Richard Warren
from the HDF5 group.
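A minimal sketch of the general technique for avoiding this class of
bug, not the actual ompio change: when a size_t byte count has to pass
through an int-typed interface, clamp each piece to INT_MAX and loop.
The helper names next_chunk_len() and chunks_needed() are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: the largest piece of `remaining` that can be
 * passed safely to an interface taking an int length. */
int next_chunk_len(size_t remaining)
{
    return remaining > (size_t) INT_MAX ? INT_MAX : (int) remaining;
}

/* Hypothetical helper: how many int-sized calls are needed to cover
 * `total` bytes without ever truncating the length. */
size_t chunks_needed(size_t total)
{
    size_t n = 0;
    while (total > 0) {
        int len = next_chunk_len(total);
        assert(len > 0);            /* the loop always makes progress */
        total -= (size_t) len;
        ++n;
    }
    return n;
}
```

A plain (int) cast of a count above INT_MAX would instead produce an
implementation-defined, typically negative or wrapped, length.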
Fixes Issue #397
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
Move the prefix area from the head to the body in relevant size
computations. This fixes a problem in high traffic situations where
usNIC may have sent from unregistered memory.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
New MCA param: btl_usnic_max_resends_per_iteration. This is the max
number of resends we'll do in a single pass through usNIC component
progress. This prevents progress from getting stuck in an endless
loop of retransmissions (i.e., if more retransmissions are triggered
during the sending of retransmissions). Specifically: we need to
leave the resend loop to allow receives to happen (which may ACK
messages we have sent previously, and therefore cause pending resends
to be moot).
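The bounded resend pass can be sketched roughly as follows. This is an
illustration only, not the usNIC code: struct resend_queue and
resend_some() are hypothetical, and decrementing a counter stands in
for actually retransmitting a frame.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the list of frames awaiting retransmission. */
struct resend_queue {
    size_t pending;
};

/* Resend at most max_resends frames, then return so that the caller can
 * go check for receives (which may ACK some of the remaining frames).
 * Returns the number of frames resent in this pass. */
size_t resend_some(struct resend_queue *q, size_t max_resends)
{
    size_t sent = 0;
    assert(q != NULL);
    while (q->pending > 0 && sent < max_resends) {
        --q->pending;   /* stand-in for retransmitting one frame */
        ++sent;
    }
    return sent;
}
```

Because the loop is capped by max_resends, new retransmissions queued
while the pass is running cannot keep it from returning.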
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Significantly increase the default retrans timeout. If the
retrans timeout is too soon, we can end up in a retransmission storm
where the logic will continually re-transmit the same frames during a
single run through the usNIC progress function (because the timer for
a single frame expires before we have run through re-transmitting all
the frames pending re-transmission).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
New MCA parameter: btl_usnic_ack_iteration_delay. Set this to the
number of times through the usNIC component progress function before
sending a standalone ACK (vs. piggy-backing the ACK on any other send
going to the target peer).
Use "ticks" language to clarify that we're really counting the number
of times through the usNIC component DATA_CHANNEL completion check (to
check for incoming messages) -- it has no relation to wall clock time
whatsoever.
Also slightly change the channel-checking scheme in usNIC component
progress: only check the PRIORITY channel once (vs. checking it once,
not finding anything, and then falling through to progress_2() where we
check PRIORITY again and then check the DATA channel).
As before, if our "progress" libevent fires, increment the tick
counter enough to guarantee that all endpoints that need an ACK will
get triggered to send standalone ACKs the next time through progress,
if necessary.
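A rough sketch of tick-based ACK scheduling, under the assumption that
each endpoint records the tick at which a standalone ACK becomes due.
The struct and function names here are hypothetical and do not come
from the usNIC BTL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-endpoint ACK state. */
struct endpoint_sketch {
    bool     ack_needed;    /* a received message has not been ACKed yet */
    uint64_t ack_due_tick;  /* tick at which a standalone ACK is sent */
};

/* Record that an ACK is owed; it becomes due `delay` ticks from now
 * unless it is piggy-backed on another send first. */
void note_ack_needed(struct endpoint_sketch *ep, uint64_t now, uint64_t delay)
{
    assert(ep != NULL);
    if (!ep->ack_needed) {
        ep->ack_needed = true;
        ep->ack_due_tick = now + delay;
    }
}

/* True once enough passes through progress have gone by. */
bool standalone_ack_due(const struct endpoint_sketch *ep, uint64_t now)
{
    assert(ep != NULL);
    return ep->ack_needed && now >= ep->ack_due_tick;
}
```

Here a "tick" is one pass through the progress function, so advancing
the counter by at least the delay makes every pending ACK due at once.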
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Rename "get_nsec()" to "get_ticks()" to more accurately reflect that
this function has no correlation to wall clock time at all.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Might as well save a few bytes when sending this struct across the
network via the __opal_attribute_packed__ attribute.
That being said, also re-order the elements in this struct so that
there are no holes to begin with. Do this so that the compiler/runtime
won't perform (slow) unaligned reads/writes because of the
__opal_attribute_packed__ attribute.
The "packed" attribute is really more about defensive programming
(e.g., if we make a mistake and have a hole, "packed" will remove it
for us).
*** Do not bring this commit back to existing/already-released release
branches: it will cause incompatibility, since it effectively changes
the usNIC BTL wire protocol.
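To illustrate the reordering idea with a made-up struct (not the real
usNIC wire header): placing the widest members first leaves no interior
holes, so "packed" only trims trailing padding and every member stays
naturally aligned relative to the start of the struct. The sketch
assumes GCC/Clang, where __attribute__((packed)) is taken to be what
__opal_attribute_packed__ expands to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with holes: most ABIs pad after `kind` and `len`. */
struct hdr_with_holes {
    uint8_t  kind;
    uint64_t seq;
    uint16_t len;
};

/* Same members, widest first, so there is no hole for "packed" to
 * squeeze out between them. */
struct hdr_reordered {
    uint64_t seq;
    uint16_t len;
    uint8_t  kind;
} __attribute__((packed));

/* 8 + 2 + 1 bytes on the wire, with `len` still on a 2-byte boundary. */
static_assert(sizeof(struct hdr_reordered) == 11, "unexpected wire size");
static_assert(offsetof(struct hdr_reordered, len) == 8, "len moved");
```

Packing hdr_with_holes directly would also give 11 bytes, but `seq`
would then sit at offset 1 and need unaligned accesses.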
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
* The user can set `-mca odls_base_sigkill_timeout 30` to have ORTE wait
30 seconds before sending SIGTERM, then another 30 seconds before sending
SIGKILL to remaining processes. This usually happens on an abnormal
termination. Sometimes the user wants to delay the cleanup to give the
system time to write out a corefile or run other diagnostics.
* The problem is that child processes may be completing while ORTE is
in this loop. The SIGCHLD will interrupt the `sleep` system call.
Without the loop, the sleep could effectively be ignored in this case.
- `sleep` returns the amount of time remaining to sleep. If it was
interrupted by a signal, it returns a positive number less than or
equal to the parameter passed to it. If it slept the whole time,
it returns 0.
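The loop described above can be sketched as follows. The function name
sleep_fully() is hypothetical, and the return value (number of
interruptions) is added here only for illustration; the actual ORTE
code may be structured differently.

```c
#include <unistd.h>

/* Sleep for the full `seconds`, restarting sleep() with the remaining
 * time whenever a signal such as SIGCHLD cuts it short.
 * Returns how many times the sleep was interrupted. */
unsigned int sleep_fully(unsigned int seconds)
{
    unsigned int interruptions = 0;
    while (seconds > 0) {
        seconds = sleep(seconds);   /* 0 if it slept the whole time */
        if (seconds > 0) {
            ++interruptions;
        }
    }
    return interruptions;
}
```

sleep() only has whole-second resolution, so the total wait can run
slightly long after an interruption, but it is never cut short.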
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
One-off patch for v4.0.x. For some reason, the commit on master
didn't have this problem.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
(cherry picked from commit 5f3dbdb5c8)
Note that this commit is actually a cherry-pick from the v4.0.x
branch. This is the opposite direction from what we normally do: we
usually commit to master first and then cherry-pick to the release
branches (vs. the other way around).
As is probably evident from the original commit message above, through
a comedy of errors, this commit was actually applied to the v4.0.x
branch first and then cherry-picked back to master (i.e., the problem
*did* exist in the original master commit 3aca4af548, but it was not
recognized at the time).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Remove the code for multiple OOB progress threads, as it is an optimization
nobody uses. It also turns out to have a race condition that can cause a
segfault on finalize, so it is probably good that nobody is using it.
Signed-off-by: Ralph Castain <rhc@pmix.org>
INTERNAL: STL-59403
The OFI (libfabric) MTL does not respect the maximum message size
parameter that OFI provides in the fi_info data.
This patch adds this missing max_msg_size field to the mca_ofi_module_t
structure and adds a length check to the low-level send routines.
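The shape of such a check might look like the sketch below. Everything
here is hypothetical (struct ofi_module_sketch, send_length_ok()); it
only assumes that the limit reported by the provider is cached in the
module at init time and compared against each send length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the module structure: only the cached
 * provider limit is modeled. */
struct ofi_module_sketch {
    size_t max_msg_size;
};

/* True if a message of `length` bytes fits within the provider limit;
 * a send routine would return an error instead of posting otherwise. */
bool send_length_ok(const struct ofi_module_sketch *m, size_t length)
{
    assert(m != NULL);
    return length <= m->max_msg_size;
}
```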
Change-Id: I05aa71d332f2df897133b30c28bf37d98f061996
Signed-off-by: Michael Heinz <michael.william.heinz@intel.com>
Reviewed-by: Adam Goldman <adam.goldman@intel.com>
Reviewed-by: Brendan Cunningham <brendan.cunningham@intel.com>
- due to some refactoring and newly added functionality, compilation
of the ikrit module was broken
- this commit restores compilation
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
Currently, there is no function that allows the user to retrieve the
data they have stored in a vertex easily. Using the internal macros and
knowledge of the structures, the new function will return a pointer to
the user-provided vertex data.
Signed-off-by: William Zhang <wilzhang@amazon.com>
If both types of interfaces are enabled, don't error out if one of them
isn't able to open listener sockets. Only one interface family may be
available on some machines, but someone might want to build the code to
run more generally.
Refs https://github.com/pmix/prrte/pull/249
Signed-off-by: Ralph Castain <rhc@pmix.org>
This commit changes how the single-copy emulation in the vader btl
operates. Before this change, the BTL set its put and get limits
based on the max send size. After this change, the limits are unset
and the put or get operation is fragmented internally.
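Internal fragmentation of this kind can be sketched as below. This is
not the vader code: fragmented_copy() is a hypothetical helper, and a
plain memcpy() stands in for moving one fragment through the emulated
single-copy path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Move `size` bytes in pieces of at most `frag_size` bytes, so the
 * caller needs no put/get size limit. Returns the fragment count. */
size_t fragmented_copy(void *dst, const void *src, size_t size,
                       size_t frag_size)
{
    size_t offset = 0;
    size_t frags = 0;
    assert(frag_size > 0);
    while (offset < size) {
        size_t len = size - offset;
        if (len > frag_size) {
            len = frag_size;        /* clamp to one fragment */
        }
        memcpy((char *) dst + offset, (const char *) src + offset, len);
        offset += len;
        ++frags;
    }
    return frags;
}
```

From the caller's point of view the operation is a single put or get
of arbitrary size; the fragment size stays an internal detail.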
References #6568
Signed-off-by: Nathan Hjelm <hjelmn@google.com>