The data endpoint was not being set correctly for local peers in some
cases. This commit fixes the bug and cleans the associated code to
simplify the logic.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This now passes the loop test, and so we believe it resolves the random hangs in finalize.
Changes in PMIx master that are included here:
* Fixed a bug in the PMIx_Get logic
* Fixed self-notification procedure
* Made pmix_output functions thread safe
* Fixed a number of thread safety issues
* Updated configury to use 'uname -n' when hostname is unavailable
Work on cleaning up the event handler thread safety problem
Rarely used functions, but protect them anyway
Fix the last part of the intercomm problem
Ensure we don't cover any PMIx calls with the framework-level lock.
Protect against NULL argv comm_spawn
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
update the cartesian communicator based grouping strategy to match the other
algorithms used in the aggregator selection process.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
the cart_based_grouping aggregator strategy was not correctly updated
during the last major rewrite of the aggregator selection algorithm.
It is also not supposed to be called from file_open (but from
file_set_view).
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
the 'hostname' command might not be available on some platforms
such as Fedora Core 26, so mimick config/libtool.m4 and fallback
to 'uname -n' if needed
Refs. #3680
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
* Add work around support for the following missing ops in ROMIO 3.1.4
- `MPI_File_iread_at_all`
- `MPI_File_iwrite_at_all`
- `MPI_File_iread_all`
- `MPI_File_iwrite_all`
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
* Protects us from segv when ROMIO 314 is selected and one of the
following operations is called:
- MPI_File_iread_at_all
- MPI_File_iwrite_at_all
- MPI_File_iread_all
- MPI_File_iwrite_all
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
This commit adds code to allow support for the info assertions added
by mpi-forum/mpi-issues#11. The assertions added are:
mpi_assert_no_any_tag, mpi_assert_no_any_source,
mpi_assert_exact_length, and mpi_assert_allow_overtaking.
This commit also adds support for the mpi_assert_no_any_source and
mpi_assert_allow_overtaking info keys to the ob1 pml.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Hostname and PID are output as a message prefix in many places in
our code. Their printf-formats were either `[%s:%d]` or `[%s:%05d]`.
This commit changes `[%s:%d]` to `[%s:%05d]`. The latter was more
widely used in our code (including OPAL output system and the signal
handler).
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
This commit expands the effect of the MCA parameter `opal_abort_delay`
to the OPAL signal handler. This allows attaching of a debugger on
segmentation fault etc. before quitting the job.
The sleep code is moved to the `opal_delay_abort` function from the
`ompi_mpi_abort` and `oshmem_shmem_abort` functions for code cleanup.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
Example (using MPI_ORDER_C so the below has 6 rows of 4 ints to parcel out)
size = 4;
rank = 0;
ndims=2;
gsizes[0] = 6;
gsizes[1] = 4;
distribs[0] = MPI_DISTRIBUTE_CYCLIC;
distribs[1] = MPI_DISTRIBUTE_BLOCK;
dargs[0] = 2;
dargs[1] = 2;
psizes[0] = 2;
psizes[1] = 2;
MPI_Type_create_darray(size, rank, ndims,
gsizes, distribs, dargs, psizes,
MPI_ORDER_C, MPI_INT, &mydt);
Expectation for the layout:
inner dimension (1) is
4 items (ints) distributed block over 2 ranks with 2 items each
eg for rank 0: [ x x . . ]
outer dimension (0) is:
6 items (the above [ x x . .]) cyclic over 2 ranks with 2 items each
eg for rank 0:
[ x x . . ] : offset=0 bytes=8
[ x x . . ] : ofset=16 bytes=8
[ . . . . ]
[ . . . . ]
[ x x . . ] : offset=64 bytes=8
[ x x . . ] : offset=80 bytes=8
Or more specifically a stream of ints 0,1,2,3,4,5,6,7 sent into that
type should be
[ 0 1 . . ]
[ 2 3 . . ]
[ . . . . ]
[ . . . . ]
[ 4 5 . . ]
[ 6 7 . . ]
The data was laying out though as
[ 0 1 2 3 ]
[ . . . . ]
[ . . . . ]
[ . . . . ]
[ 4 5 6 7 ]
[ . . . . ]
because the recursive construction inside the block() function (which
creates the smaller row datatype [ x x . . ]) wasn't setting the extent
of that type.
Signed-off-by: Mark Allen <markalle@us.ibm.com>
rename ompi/mpi/fortran/base/strings.h so it does not get pulled
when /usr/include/strings.h is expected.
Refs open-mpi/ompi#3639
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Convert the predefined MPI object padding to a fixed number of bytes
(vs. a multiple of sizeof(void*)) so that the padding is the same size
between 32 and 64 bit builds. I.e., we won't have a situation where
we've run out of padding in 32 bit builds but still have more space
available in 64 bit builds.
Fixes#3610
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
This commit moves the info subscribe for the blocking_fence to after
the global_state is allocated and moves setting win->w_osc_module to
before the info subscribe for alloc_shared_contig. This fixes a SEGV
caught by MTT.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Without this change, the directory of `javadoc` command must be
included in the `PATH` environment variable at `make`-time.
Paths of `javac`, `javah`, and `jar` commands are detected in
`configure`. So the path of `javadoc` also should be detected.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
a buffer defined by (buf, count, dt)
will have data starting at buf+offset and ending len bytes later with
len = opal_datatype_span(&dt.super, count, &offset);
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
This commit removes a nonexistent function that was causing build
problems under certain environments.
Reference #3442
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
A component with implementation of R. Rabenseifner's algorithm for Reduce and Allreduce.
This algorithm is a combination of a reduce-scatter implemented with recursive vector halving
and recursive distance doubling, followed either by a gather or an allgather.
Current limitations:
-- count >= 2^{\floor{\log_2 p}}
-- commutative operations only
-- intra-communicators onl
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
coll/spacc: Modify implementation to use `ompi_coll_base_sendrecv()`
Replace irecv() + isend() + ompi_request_wait() to ompi_coll_base_sendrecv().
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
origin_datatype and target_datatype might be different and hence have different extent,
so use either origin_extent or target_extent when appropriate.
Refs open-mpi/ompi#3569
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
All other man pages don't have an empty line after
the "! or the older form: INCLUDE 'mpif.h'" line
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
The osc_rdma_get_remote_segment() has the 3rd and 4th args as
* target_disp
* length
which it uses to determine if the rdma falls within the bounds of
the window or not (actually it only checks the upper bound, but I'm
okay with that).
Anyway the caller previously was passing in the length argument as
target_datatype->super.size * target_count
which which doesn't really represent the number of bytes after target_disp
for which data exists. In particular I could create a datatype as
{ disp -4, len 4 } and use target_disp 4
and that would be bytes 0-3 of the window where the original code
would think it was bytes 4-7 and could abort at the range check.
Ive changed it to use the opal_datatype_span() function.
Signed-off-by: Mark Allen <markalle@us.ibm.com>
Yalla has a macro PML_YALLA_INIT_MXM_REQ_DATA that checks if a datatype
is contiguous via opal_datatype_is_contiguous_memory_layout(dt,count)
and if so it selects a size and lb that presumably is what will rdma, as
ompi_datatype_type_size(_dtype, &size); \
ompi_datatype_type_lb(_dtype, &lb); \
This failed when I gave it a datatype constructed as [ ...] with extent 4.
What I mean by that datatype is
lens[0] = 3;
disps[0] = 1;
types[0] = MPI_CHAR;
MPI_Type_struct(1, lens, disps, types, &tmpdt);
MPI_Type_create_resized(tmpdt, 0, 4, &mydt);
So there are 3 chars at offset 1, and the LB is 0 and the UB is 4.
So that macro decides that size=4 and lb=0 and later I suppose size is getting
updated to 3 for the final rdma, and so a send of a buffer
[ 0 1 2 3 ] gets recved as [ 0 1 2 _ ]. I think it should use the true lb
and the true extent.
For "regular" contig datatypes it would be the same, and for the irregular
ones that are still deemed contiguous by that utility function it should
still be the right thing to use.
Signed-off-by: Mark Allen <markalle@us.ibm.com>
See bug report
https://github.com/open-mpi/ompi/issues/3548
If a 1sided test is launched -host hostA:2,hostB:1 some of the ranks
call allocate_state_single() and others call allocate_state_shared().
These functions were producing different values for module->state_size
but that's used when they lookup peer info from each other in
ompi_osc_rdma_peer_setup() so they need to all have matching
module->state_offset values.
This change adds a few unused bytes in the memory allocate_state_single()
creates so it matches.
Signed-off-by: Mark Allen <markalle@us.ibm.com>