previously, the definition was
struct oshmem_proc_data_t {
int num_transports;
char * transport_ids;
};
so in 64 bits arch, the compiler would very likely insert a 4 bytes
padding before the two fields in order to have transport_ids aligned
store oshmem related per proc data in an oshmem_proc_data_t struct,
that is stored in the padding section of an ompi_proc_t
this data can be accessed via the OSHMEM_PROC_DATA(proc) macro
Fixesopen-mpi/ompi#2023
if sendbuf is equal to recvbuf, that should not be interpreted
as equivalent to MPI_IN_PLACE on the non root rank(s)
Thanks Valentin Petrov for the report
predefined datatypes such as MPI_LONG_DOUBLE_INT are not really contiguous,
so use span as returned by opal_datatype_span() instead of type extent,
otherwise data might be written above allocated memory.
Thanks Valentin Petrov for the report
It is possible that one or more procs could get thru PMIx_Init, and thus be marked as in state "registered", before all local procs have been started. If that happens, then we would report some of the procs in state "running", and the others in state "registered" - which means that the HNP would miss the "running" stage of the state machine.
Thanks to Jingchao Zhang for his patience in tracking this down on the 2.0 branch
* Expand the use of the `orte_keep_fqdn_hostnames` MCA parameter when
it is set to false.
* If that parameter is set to false (default) then short hostnames
(e.g., `node01`) will match with the long hostnames (e.g.,
`node01.mycluster.org`). This allows a user (or resource manager)
to mix the use of short and long hostnames.
- Note that this mechanism does _not_ perform a DNS lookup, but
instead strips off the FQDN by truncating the hostname string at
the first `.` character (when not an IP address).
- By default (`false`) the following is true:
`node01 == node01.mycluster.org == node01.bogus.com`
since we use `node01` as the hostname.
The xlc compiler seems to behave in a different way that gcc when it
comes the inline asm. There were two problems with the code with xlc:
- The TOC read in mca_patcher_base_patch_hook used the syntax
register unsigned long toc asm("r2") to read $r2 (the TOC
pointer). With gcc this seems to behave as expected but with xlc
the result in toc is not the same as $r2. I updated the code to use
asm volatile ("std 2, %0" : "=m" (toc)) to load the TOC pointer.
- The OPAL_PATCHER_BEGIN macro is meant to be the first thing in a
hook. On PPC64 it loads the correct TOC pointer (thanks to
mca_patcher_base_patch_hook) and saves the old one. The
OPAL_PATCHER_END macro restores the TOC pointer. Because we *need*
the TOC to be correct before it is accessed in the hook the
OPAL_PATCHER_BEGIN macro MUST come first. We did this and all was
well with gcc. With xlc on the other hand there was a TOC access
before the assembly inserted by OPAL_PATCHER_BEGIN. To fix this
quickly I broke each hook into a pair of function with the
OPAL_PATCHER_* macros on the top level functions. This works around
the issue but is not a clean way to fix this. In the future we
should 1) either update overwrite to not need this, or 2) figure
out why xlc is not inserting the asm before the first TOC read.
This fixesopen-mpi/ompi#1854
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
protect the remaining functions where necessary by a mutex lock
to avoid problems in multi-threaded executions. Some functions
do not require that in my opinion, and I provided an explanation
in those cases.
This commit fixes an ordering bug in the code that keeps track of all
attached memory windows. The code is intended to keep the memory
regions sorted but was often inserting at the wrong index. Thanks to
Christoph Niethammer for reporting the issue. The reproducer will be
added to nightly MTT testing.
Fixesopen-mpi/ompi#2012
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
It is possible for another thread to process a lock ack before the
peer is set as locked. In this case either setting the locked or the
eager active flag might clobber the other thread. To address this the
flags have been made volatile and are set atomically. Since there is
no a opal_atomic_or or opal_atomic_and function just use cmpset for
now.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>