Portals4 supports atomic ops on datatypes less than or equal to
max_fetch_atomic_size bytes. This commit fixes a bug that required
the datatype to be less than max_fetch_atomic_size bytes.
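
A minimal sketch of the corrected bound check. The helper name fits_fetch_atomic is hypothetical and is not the function used in the MTL; it only illustrates that the limit is inclusive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true when a datatype of dt_size bytes can use a
 * fetch-atomic operation given the limit reported by the interface. */
bool fits_fetch_atomic(size_t dt_size, size_t max_fetch_atomic_size)
{
    /* The limit is inclusive: a datatype exactly as large as the limit
     * is still supported, so the comparison is <= rather than <. */
    return dt_size <= max_fetch_atomic_size;
}
```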
- make the internal structure follow the Open MPI naming convention
- provide a single flag/macro which controls the compilation/utilization of this
feature, so that users of the feature do not have to modify every single
fcoll component. A configure option could be added later if desired.
configury: fix hcoll, fca and mxm detection and revamp yalla Makefile.am
Thanks to David Shrader and Ake Sandgren for bringing this issue to our attention
* do not add -I/.../include/fca -I /.../include/fca_core to CPPFLAGS
* allow configure --with-fca
* search fca libs in both DIR/lib and DIR/lib64
* fix the description of the --with-fca option
* do not add -I/.../include/hcoll -I /.../include/hcoll/api to CPPFLAGS
* allow configure --with-hcoll
* search hcoll libs in both DIR/lib and DIR/lib64
* fix the description of the --with-hcoll option
mtl_ofi_provider_include (resp. mtl_ofi_provider_exclude) can be used
to specify which provider(s) the OFI MTL can select (resp. ignore).
e.g. --mca mtl_ofi_provider_include "psm,sockets"
By default, mtl_ofi_provider_exclude is set to "sockets,mxm".
This deprecates the old MCA var named "mtl_ofi_provider".
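
The include/exclude filtering can be sketched roughly as follows. This is an illustration only: the function names are hypothetical, and the rule that a non-NULL include list takes precedence over the exclude list is an assumption, not something stated above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if name appears as a whole entry in the comma-separated list. */
bool provider_in_list(const char *name, const char *list)
{
    size_t len = strlen(name);
    const char *p = list;

    while (p != NULL && *p != '\0') {
        const char *comma = strchr(p, ',');
        size_t entry_len = comma ? (size_t)(comma - p) : strlen(p);
        if (entry_len == len && strncmp(p, name, len) == 0) {
            return true;
        }
        p = comma ? comma + 1 : NULL;
    }
    return false;
}

/* Assumed selection rule: a non-NULL include list takes precedence;
 * otherwise a provider is usable unless it is in the exclude list. */
bool provider_selectable(const char *name, const char *include,
                         const char *exclude)
{
    if (include != NULL) {
        return provider_in_list(name, include);
    }
    return exclude == NULL || !provider_in_list(name, exclude);
}
```

With the default exclude list "sockets,mxm" and no include list, "psm" would be selectable and "sockets" would not.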
This commit does the following:
* s/ompi_check_treematch/ompi_topo_treematch/ (i.e., abide by the
prefix rule)
* change the value of ompi_topo_treematch_happy from yes/no to 0/1, so
that we can use -eq for numerical comparisons (vs. string
comparisons). It's the little things in life, no?
* Check the value of $OPAL_HAVE_HWLOC to ensure that hwloc support is
enabled. If not, disqualify treematch from building.
* Fixes a few places that were underquoted
* Convert from "test ... -a ..." to "test ... && test ..."
Fixes open-mpi/ompi#797
This commit rewrites parts of libnbc to fix issues identified by
coverity and myself. The changes are as follows:
- libnbc functions would return invalid error codes (internal to
libnbc) to the mpi layer. The names of these codes are of the form
NBC_. They do not match up with the error codes expected by the mpi
layer. I purged the use of all these error codes with the exception
of NBC_OK and NBC_CONTINUE in progress. These codes are used to
identify when a request handle is complete.
- Handles and schedules were leaked by all collective routines on
error. A new routine was added to return a collective handle
(NBC_Return_handle).
- Temporary buffers containing in/out neighbors for neighborhood
collectives were always leaked.
- Neighborhood collectives contained code to handle MPI_IN_PLACE which
is never a valid input for the send or receive buffer. Stripped this
code out.
- Files were inconsistently named. Most are nbc_isomething.c but one
was named coll_libnbc_ireduce_scatter_block.c.
- Made the NBC_Schedule "structure" an object so it can be
retained/released. This may enable the use of schedule caching at a
later time. More testing will be needed to ensure the caching code
works. If it doesn't the code should be stripped out completely.
- Added code to simplify the common case of scheduling send/recv +
barrier.
- Code cleanup for readability.
The code now passes the clang static analyzer.
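
The leak fix for handles and schedules follows a common pattern: one routine releases everything a handle owns, and every error path calls it. A rough sketch follows; handle_t, handle_get, handle_return and start_collective are hypothetical stand-ins and not the libnbc API (the real routine is NBC_Return_handle).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical handle that owns a schedule buffer. */
typedef struct {
    void *schedule;
} handle_t;

int live_handles;   /* counts handles not yet returned */

handle_t *handle_get(void)
{
    handle_t *h = calloc(1, sizeof(*h));
    if (h != NULL) {
        live_handles++;
    }
    return h;
}

/* Single place that releases everything a handle owns. */
void handle_return(handle_t *h)
{
    if (h == NULL) {
        return;
    }
    free(h->schedule);
    free(h);
    live_handles--;
}

/* Start a collective; simulate_error forces the failure path. */
int start_collective(int simulate_error, handle_t **out)
{
    handle_t *h = handle_get();
    if (h == NULL) {
        return -1;
    }
    h->schedule = malloc(64);
    if (h->schedule == NULL || simulate_error) {
        handle_return(h);   /* nothing is leaked on the error path */
        return -1;
    }
    *out = h;
    return 0;
}
```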
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Some OFI providers such as "sockets" are used for debugging
purposes mostly. For these providers, other components usually
offer better performance -- e.g. for sockets, the BTL/TCP would
be a better choice.
Thus, we chose to ignore some providers unless explicitly requested
by the user on the command line:
e.g. --mca mtl_ofi_provider sockets
When configured with --enable-picky
topo_base_lazy_init.c compiles with a warning:
CC base/topo_base_lazy_init.lo
base/topo_base_lazy_init.c:46:67: warning: implicit conversion from enumeration type 'enum mca_base_register_flag_t' to different enumeration type 'mca_base_open_flag_t' (aka 'enum mca_base_open_flag_t') [-Wenum-conversion]
err = mca_base_framework_open (&ompi_topo_base_framework, MCA_BASE_REGISTER_DEFAULT);
This commit fixes this implicit conversion problem.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
our optimized default file view. Otherwise, performance will suffer. file_get_view should still return the correct filetype, not our optimized default file view
- some applications use MPI_File_delete as a collective function (e.g. IOR), which I think is not really covered by the standard. Right now, one process succeeds and the other ones return an error code. Fix that by not returning an error if the file that we try to delete does not exist anymore, to make these applications work.
Retain inline progress function for ofi
mtl, but have a non-inlined progress function
which is registered with the opal progress
mechanism.
@jithinjosepkl
I've bad news about the psm provider. I still notice
segfaults - not always - but frequently at finalize
when using the psm provider. I don't notice this
when using the sockets provider.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
at Inria Bordeaux. This allows us to take advantage of the remap
capability of MPI to rearrange the ranks based on the weights
provided by the application.
Fix the indentation and protect with __DEBUG__ one fprintf.
Add the CeCILL-B license to the imported library.
Fix a compiler warning.
Restrict the TreeMatch dependencies.
The TreeMatch software is released under BSD3 (as indicated by their
copyright information @
https://gforge.inria.fr/scm/viewvc.php/COPYING?view=markup&root=treematch).
Update the README.
Even if the mutex is actually located in
sm_data->sm_offset_ptr->mutex, have sm_data->mutex point to it. This
avoids a few #if blocks that are otherwise identical.
This commit does two things. It removes checks for C99 required
headers (stdlib.h, string.h, signal.h, etc). Additionally it removes
definitions for required C99 types (intptr_t, int64_t, int32_t, etc).
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
This commit fixes several bugs in the static request objects used by
ob1 for blocking send/receive operations.
- Fix memory leak when using MPI_THREAD_MULTIPLE. Requests were
allocated off the free list but were destructed and NOT returned.
- Fix double-destruct of static objects. There is no reason to
CONSTRUCT/DESTRUCT the static object for each send/receive
operation. This adds overhead and no benefit. To keep the code
clean helper functions have been added to finalize ob1 send/receive
requests.
- Remove now unnecessary include of alloca.h.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
This new MTL runs over PSM2 for Omni Path. PSM2 is a descendant of PSM
with changes to support more ranks and some MPI-3 features like mprobe.
PSM2 will only support Omni Path networks; PSM only supports True Scale.
Likewise, the existing PSM MTL will continue to be maintained for True
Scale, while the PSM2 MTL is developed and maintained for Omni Path.
from the message queues (a debugging feature). With this approach
all blocking (single threaded) requests are allocated from the main
freelist, so they will be accounted for during the message queues
investigation).
The Portals4 MTL allocates two Portals IDs requesting specific
well-known IDs and assumes that those IDs are allocated. If those IDs
are in use, PtlPTAlloc() will allocate a different ID. This commit
verifies that the requested IDs were allocated.
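
The verification can be sketched as below. Since the example has to run without a Portals4 library, fake_pt_alloc is a hypothetical stand-in that only mimics the fallback behavior described above; it is not PtlPTAlloc() and does not share its signature.

```c
#include <assert.h>
#include <stdbool.h>

#define TABLE_SIZE 8
static bool in_use[TABLE_SIZE];

/* Hypothetical stand-in for an allocator that falls back to another free
 * index when the requested one is taken.
 * Returns 0 on success and -1 when the table is full. */
int fake_pt_alloc(unsigned requested, unsigned *actual)
{
    if (requested < TABLE_SIZE && !in_use[requested]) {
        in_use[requested] = true;
        *actual = requested;
        return 0;
    }
    for (unsigned i = 0; i < TABLE_SIZE; i++) {
        if (!in_use[i]) {
            in_use[i] = true;
            *actual = i;
            return 0;
        }
    }
    return -1;
}

/* Allocate a well-known index and fail unless exactly that index
 * was granted. */
int alloc_well_known(unsigned wanted)
{
    unsigned got;

    if (fake_pt_alloc(wanted, &got) != 0) {
        return -1;
    }
    if (got != wanted) {
        in_use[got] = false;   /* release the unwanted index */
        return -1;
    }
    return 0;
}
```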
CID 1295340 Unchecked return value (CHECKED_RETURN)
Check the return code of mca_base_framework_open. If the call fails for some reason
the component array will not be properly defined. This will cause issues in
mca_topo_base_find_available.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
CID 70630 Dereference before null check
Cleaned up useless goto statements and deleted NULL check. If
mca_base_select returns success then best_module and best_component
will always be non-NULL.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
When activating short receive blocks on the overflow list, remove
the PTL_ME_EVENT_LINK_DISABLE flag so the event gets generated.
Without PTL_EVENT_LINK, the block status can't reach the activated
state.
Replace #ifdef with #if for Open MPI configure booleans, because
Open MPI configure booleans are always defined and the value must
be checked.
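
To illustrate the difference, here is a small sketch with a made-up macro, EXAMPLE_HAVE_FEATURE, standing in for a configure boolean that is defined to 0.

```c
#include <assert.h>

/* Hypothetical configure-style boolean: always defined, here to 0
 * (feature disabled). */
#define EXAMPLE_HAVE_FEATURE 0

int feature_seen_by_ifdef(void)
{
#ifdef EXAMPLE_HAVE_FEATURE
    return 1;   /* taken even though the feature is disabled */
#else
    return 0;
#endif
}

int feature_seen_by_if(void)
{
#if EXAMPLE_HAVE_FEATURE
    return 1;
#else
    return 0;   /* taken, because the value is 0 */
#endif
}
```

Only the #if variant reflects the configured value.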
In days past, some implementations of Portals4 could not cover all
of memory with a single Memory Descriptor so multiple large
overlapping Memory Descriptors were created. Because none of the
current implementations have this limitation (and no future
implementations should either), this commit removes the overlapping
Memory Descriptors code.
If OMPI is initialized as thread multiple, then it is possible for
Portals events to be processed out of order by different threads.
Out of order events could lead to reactivation of the block
(PTL_EVENT_AUTO_FREE) before the block is removed from the active
list (PTL_EVENT_AUTO_UNLINK). This commit adds a status field to
ompi_mtl_portals4_recv_short_block_t that coordinates these events.
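
One way such a status field could coordinate the two events is sketched below. The state and event names are hypothetical and need not match the values used in ompi_mtl_portals4_recv_short_block_t; in a threaded build the update would also have to be made atomic or done under a lock, which is omitted here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical status values; the names in the real structure may differ. */
typedef enum {
    BLOCK_ACTIVE,          /* linked, neither event seen yet */
    BLOCK_WAITING_FREE,    /* unlink seen first, free still pending */
    BLOCK_WAITING_UNLINK,  /* free seen first, unlink still pending */
    BLOCK_READY            /* both seen: safe to reactivate */
} block_status_t;

typedef enum { EV_AUTO_UNLINK, EV_AUTO_FREE } block_event_t;

/* Advance the status for one event. Returns true when the block may be
 * reactivated, which only happens once both events have arrived,
 * in either order. */
bool block_on_event(block_status_t *status, block_event_t ev)
{
    switch (*status) {
    case BLOCK_ACTIVE:
        *status = (ev == EV_AUTO_UNLINK) ? BLOCK_WAITING_FREE
                                         : BLOCK_WAITING_UNLINK;
        return false;
    case BLOCK_WAITING_FREE:
        if (ev == EV_AUTO_FREE) {
            *status = BLOCK_READY;
            return true;
        }
        return false;
    case BLOCK_WAITING_UNLINK:
        if (ev == EV_AUTO_UNLINK) {
            *status = BLOCK_READY;
            return true;
        }
        return false;
    default:
        return false;
    }
}
```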
Rename all files and symbols from "io_romio" to "io_romio314". This
fixes --disable-dlopen builds (because they were missing
the mca_io_romio314_component symbol).
Per the MPI 3.0 standard (chapter 7, page 310):
"If maxindegree or maxoutdegree is smaller than the numbers returned by
MPI_DIST_GRAPH_NEIGHBOR_COUNT, then only the first part of the full list is returned."
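
The truncation rule quoted above amounts to copying at most the caller-supplied maximum. A minimal sketch with a hypothetical helper, copy_neighbors, which is not the Open MPI implementation:

```c
#include <assert.h>

/* Copy at most max_out entries of a neighbor list and return how many
 * entries were written. */
int copy_neighbors(const int *all, int count, int max_out, int *out)
{
    int n = (max_out < count) ? max_out : count;

    if (n < 0) {
        n = 0;
    }
    for (int i = 0; i < n; i++) {
        out[i] = all[i];
    }
    return n;
}
```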
The length parameter of ompi_mtl_portals4_long_isend() was declared
as "int", which may not be big enough depending on the platform and
compiler options used. This commit changes the type to size_t to
prevent overflow.
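
As a small illustration of why "int" can be too narrow for a byte count, the hypothetical helper below reports whether a size_t length would survive a conversion to int.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* True when a byte count can be passed through an "int" parameter
 * without loss. */
bool length_fits_in_int(size_t length)
{
    return length <= (size_t)INT_MAX;
}
```

Any message longer than INT_MAX bytes fails this check, which is the case the size_t parameter handles.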
The source field was 16 bits which is not sufficient for many
current and future machines. This commit expands the source field
to 24 bits and reduces the tag field from 32 bits to 24 bits.
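
A sketch of packing a 24-bit source and a 24-bit tag into one 64-bit value. The position of the fields and the use of the upper 16 bits are assumptions made for illustration; the actual match-bits layout of the MTL is not given here.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_BITS     24
#define SOURCE_BITS  24
#define TAG_MASK     ((UINT64_C(1) << TAG_BITS) - 1)
#define SOURCE_MASK  ((UINT64_C(1) << SOURCE_BITS) - 1)

/* Assumed layout: [ upper 16 bits | 24-bit source | 24-bit tag ].
 * Values wider than their field are truncated by the masks. */
uint64_t pack_match_bits(uint16_t upper, uint32_t source, uint32_t tag)
{
    return ((uint64_t)upper << (SOURCE_BITS + TAG_BITS))
         | (((uint64_t)source & SOURCE_MASK) << TAG_BITS)
         | ((uint64_t)tag & TAG_MASK);
}

uint32_t match_bits_source(uint64_t bits)
{
    return (uint32_t)((bits >> TAG_BITS) & SOURCE_MASK);
}

uint32_t match_bits_tag(uint64_t bits)
{
    return (uint32_t)(bits & TAG_MASK);
}
```

A source rank such as 100000 does not fit in 16 bits but round-trips through the 24-bit field.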
This commit fixes a bug identified by MTT that occurred when mixing
passive and active target synchronization. The bugs fixed in this
commit are:
- Do not update incoming fragment counts for any type of unbuffered
control message. These messages are out-of-band and should not be
considered towards the signal counts.
- Complete a change from using received counts to expected counts for
lock, unlock, and flush acks. Part of the change made it into
master before the rest was ready. This was preventing wakeups in
some cases.
- Turn the passive_target_access_epoch module member into a
counter. As long as at least one peer is locked we are in a
passive-target epoch and not an active target one. This fix will
ensure that fragment flags are set appropriately.
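
The flag-to-counter change in the last item can be sketched as follows. The struct and function names are hypothetical; only the member name passive_target_access_epoch comes from the text.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical module state: number of peers currently locked. */
typedef struct {
    int passive_target_access_epoch;
} osc_module_t;

void osc_lock_peer(osc_module_t *m)
{
    m->passive_target_access_epoch++;
}

void osc_unlock_peer(osc_module_t *m)
{
    m->passive_target_access_epoch--;
}

/* With a boolean, the first unlock would end the epoch even if other
 * peers were still locked; a counter ends it only at zero. */
bool osc_in_passive_epoch(const osc_module_t *m)
{
    return m->passive_target_access_epoch > 0;
}
```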
fixes #538
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit also fixes a problem with the lazy opening of topo
components. The topo framework incorrectly: 1) checked if the topo
framework was open by checking the length of the components list, and
2) called the framework open directly instead of using
mca_base_framework_open.
fixes #544
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
PtlMDRelease() was called if read_msg() returned a failure code.
This commit moves the PtlMDRelease() inside read_msg() so that it
doesn't get called in cases where the failure happens before or at
the PtlMDBind().
This commit adds an MCA variable to select Portals4 logical
addressing, populates the logical-to-physical mapping table and
initializes the NI in this mode.
Based on some on-list and IM discussion with @hjelmn about
open-mpi/ompi@40b7643119, change the testing to a switch/case. If we
fall into the default case, assert() error (because it's an OMPI
developer programming error).
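
The switch/case shape described here looks roughly like the sketch below. The enumeration and its values are hypothetical, since the values tested in the referenced commit are not given in this message.

```c
#include <assert.h>

/* Hypothetical enumeration used only for this illustration. */
typedef enum { MODE_A, MODE_B } example_mode_t;

const char *mode_name(example_mode_t mode)
{
    switch (mode) {
    case MODE_A:
        return "A";
    case MODE_B:
        return "B";
    default:
        /* Reaching this means a developer passed a value the code does
         * not know about; fail loudly in debug builds. */
        assert(0 && "unhandled mode");
        return "unknown";
    }
}
```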
The fragment flush code tries to send the active fragment before
sending any queued fragments. This could cause osc messages to arrive
out-of-order at the target (bad). Ensure ordering by always sending
the active fragment after sending queued fragments.
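
The corrected flush order can be sketched with a small model. wire_t, send_frag and flush_frags are hypothetical names that record the send order instead of touching the network; this is not the osc code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SENT 16

/* Hypothetical wire log so the send order can be inspected. */
typedef struct {
    int    sent[MAX_SENT];
    size_t nsent;
} wire_t;

static void send_frag(wire_t *w, int frag_id)
{
    if (w->nsent < MAX_SENT) {
        w->sent[w->nsent++] = frag_id;
    }
}

/* Flush in fill order: queued fragments were filled before the active
 * one, so they go out first and the active fragment goes last.
 * has_active is nonzero when there is an active fragment to send. */
void flush_frags(wire_t *w, const int *queued, size_t nqueued,
                 int has_active, int active_id)
{
    for (size_t i = 0; i < nqueued; i++) {
        send_frag(w, queued[i]);
    }
    if (has_active) {
        send_frag(w, active_id);
    }
}
```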
This commit also fixes a bug when a synchronization message (unlock,
flush, complete) cannot be packed at the end of an existing active
fragment. In this case the source process will end up sending 1 more
fragment than claimed in the synchronization message. To fix the issue
a check has been added that fixes the fragment count if this situation
is detected.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>