We should invoke OBJ_CONSTRUCT/OBJ_DESTRUCT only on regular requests
(which are embedded inside UCX requests) and for the completed request.
Persistent requests are already constructed/destructed by the free list.
This fixes an assertion in ompi_request_destruct.
Writing to the pml_monitoring_flush variable will set the filename of
the output file.
Stopping a session for the pml_monitoring_flush will force the
generation of the monitoring output file (as long as the filename
is not NULL).
To reset the monitoring, one has to bind the pml_monitoring_flush to a
session.
using performance variables "pml_monitoring_messages_count" and
"pml_monitoring_messages_size"
Per Brice's suggestion, make all data counts and message lengths
uint64_t.
counting or not the collective traffic as a separate entity. The need
for such a PML is simply because the PMPI interface doesn't allow us to
identify the collective generated traffic.
This commit adds protection to the group, ob1, and bml endpoint lookup
code. For ob1 and the bml a lock has been added. For performance
reasons the lock is only held if a bml or ob1 endpoint does not
exist. ompi_group_dense_lookup now uses opal_atomic_cmpset to ensure
the proc is only retained by the thread that actually updates the
group.
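The compare-and-swap idea can be illustrated with a minimal sketch. It
uses C11 atomics instead of opal_atomic_cmpset, and every demo_* name
below is hypothetical, not the actual group code. The point it shows:
only the thread whose swap succeeds takes the extra reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define DEMO_GROUP_SIZE 4

/* Hypothetical stand-in for a proc object with a reference count. */
typedef struct {
    atomic_int refcount;
} demo_proc_t;

/* Hypothetical group: each slot starts out NULL and is filled lazily. */
typedef struct {
    _Atomic(demo_proc_t *) slots[DEMO_GROUP_SIZE];
} demo_group_t;

static void demo_group_init(demo_group_t *group)
{
    for (int i = 0; i < DEMO_GROUP_SIZE; ++i) {
        atomic_init(&group->slots[i], NULL);
    }
}

/* Return the proc cached in slot `rank`, installing `resolved` if the
 * slot is still empty. Only the thread whose compare-and-swap succeeds
 * takes the extra reference, so the proc is retained exactly once. */
static demo_proc_t *demo_group_lookup(demo_group_t *group, int rank,
                                      demo_proc_t *resolved)
{
    assert(rank >= 0 && rank < DEMO_GROUP_SIZE);

    demo_proc_t *expected = atomic_load(&group->slots[rank]);
    if (expected != NULL) {
        return expected;                 /* fast path: already populated */
    }
    if (atomic_compare_exchange_strong(&group->slots[rank], &expected,
                                       resolved)) {
        atomic_fetch_add(&resolved->refcount, 1);   /* winner retains */
        return resolved;
    }
    return expected;   /* another thread won; `expected` holds its proc */
}
```

A thread that loses the race simply uses the proc installed by the
winner and does not touch the reference count.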
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit changes the priority of mtl components to be relative to
pml/ob1 and updates the mtl interface to expose this priority. cm now
sets its own priority based on the priority of the selected mtl
component.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit removes code that checks the ob1 priority vs the previous
priority. The previous priority is meaningless here and may only cause
ob1 to disable itself when it shouldn't.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This patch removes a priority check that disables cm if the previous
pml had higher priority. The check was incorrect as coded and is
unnecessary as we finalize all but one pml anyway.
Fixes open-mpi/ompi#1035
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Once a FIN control message is appended to the pending list, the ob1 PML
attempts to send the FIN again in the
`mca_pml_ob1_process_pending_packets` function. But if the PML fails to
send the FIN again, the `mca_pml_ob1_send_fin` function creates a new
`mca_pml_ob1_pckt_pending_t` object and the old object is not returned
to the free list.
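The leak pattern can be reproduced in a small model. This is only a
sketch under assumptions: the demo_* names are hypothetical, malloc/free
stand in for the free list, and the remedy shown (always returning the
dequeued object) is one plausible approach, not necessarily the change
made in ob1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical pending-packet object. */
typedef struct demo_pckt {
    struct demo_pckt *next;
    int fin_id;
} demo_pckt_t;

static int demo_outstanding = 0;          /* objects not yet returned */
static demo_pckt_t *demo_pending = NULL;  /* head of the pending list */

static demo_pckt_t *demo_pckt_get(void)
{
    demo_pckt_t *p = malloc(sizeof(*p));
    if (p != NULL) {
        ++demo_outstanding;
    }
    return p;
}

static void demo_pckt_return(demo_pckt_t *p)
{
    assert(p != NULL);
    free(p);
    --demo_outstanding;
}

/* Stand-in for the send path: on failure it queues a NEW pending
 * object, mirroring the behaviour described in the message above. */
static bool demo_send_fin(int fin_id, bool link_up)
{
    if (link_up) {
        return true;
    }
    demo_pckt_t *p = demo_pckt_get();
    if (p != NULL) {
        p->fin_id = fin_id;
        p->next = demo_pending;
        demo_pending = p;
    }
    return false;
}

/* Retry everything that was pending. Each dequeued object is returned
 * whether or not the resend succeeded, because a failed resend has
 * already queued its own replacement. */
static void demo_process_pending(bool link_up)
{
    demo_pckt_t *list = demo_pending;
    demo_pending = NULL;
    while (list != NULL) {
        demo_pckt_t *p = list;
        list = p->next;
        (void) demo_send_fin(p->fin_id, link_up);
        demo_pckt_return(p);
    }
}
```

Without the unconditional demo_pckt_return call, every failed retry
would leave one more object outstanding.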
Fix CID 1315271: Constant expression result
The intent of this conditional is to not produce a peruse event for
probe or mprobe requests. Coverity is correct that the expression is
always true. Changed the || to && to fix. Also moved the conditional
within an OMPI_WANT_PERUSE to ensure the conditional is not evaluated
if peruse is disabled.
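Why the original expression is constant can be seen in a small
self-contained sketch. The enum and function names are hypothetical and
do not match the ob1 sources.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical request kinds; the real ob1 names differ. */
typedef enum {
    DEMO_REQ_RECV,
    DEMO_REQ_PROBE,
    DEMO_REQ_MPROBE
} demo_req_type_t;

/* Buggy form: no value can equal both constants, so at least one
 * inequality always holds and the result is always true. */
static bool demo_should_emit_buggy(demo_req_type_t t)
{
    return t != DEMO_REQ_PROBE || t != DEMO_REQ_MPROBE;
}

/* Fixed form: true only for requests that are neither probe nor mprobe. */
static bool demo_should_emit_fixed(demo_req_type_t t)
{
    return t != DEMO_REQ_PROBE && t != DEMO_REQ_MPROBE;
}

/* Self-check contrasting the two forms. */
static void demo_check_conditions(void)
{
    assert(demo_should_emit_buggy(DEMO_REQ_PROBE));    /* bug: still emits */
    assert(!demo_should_emit_fixed(DEMO_REQ_PROBE));
    assert(!demo_should_emit_fixed(DEMO_REQ_MPROBE));
    assert(demo_should_emit_fixed(DEMO_REQ_RECV));
}
```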
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit adds support to the pml, mtl, and btl frameworks for
components to indicate at runtime that they do not support the new
dynamic add_procs behavior. At the high end the lack of dynamic
add_procs support is signalled by the pml using the new pml_flags
member to the pml module structure. If the
MCA_PML_BASE_FLAG_REQUIRE_WORLD flag is set MPI_Init will generate the
ompi_proc_t array passed to add_proc from ompi_proc_world () instead
of ompi_proc_get_allocated ().
Both cm and ob1 have been updated to detect if the underlying mtl and
btl components support dynamic add_procs.
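The selection described above might look roughly like the sketch below.
The flag value, types, and function are hypothetical stand-ins; the
real flag is MCA_PML_BASE_FLAG_REQUIRE_WORLD and the real arrays come
from ompi_proc_world () and ompi_proc_get_allocated ().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flag bit; the value of the real flag is not stated in
 * this message. */
#define DEMO_PML_FLAG_REQUIRE_WORLD 0x1u

typedef struct { unsigned int flags; } demo_pml_module_t;
typedef struct { int rank; } demo_peer_t;

/* Pick the array handed to add_procs: the whole world if the PML asks
 * for it, otherwise only the procs allocated so far. */
static const demo_peer_t *demo_select_procs(const demo_pml_module_t *pml,
                                            const demo_peer_t *world,
                                            size_t nworld,
                                            const demo_peer_t *allocated,
                                            size_t nallocated,
                                            size_t *count)
{
    assert(pml != NULL && count != NULL);

    if (pml->flags & DEMO_PML_FLAG_REQUIRE_WORLD) {
        *count = nworld;
        return world;
    }
    *count = nallocated;
    return allocated;
}
```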
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit contains the following changes:
- pml/ob1: use the bml accessor function when requesting a bml
endpoint. this will ensure that bml endpoints are only created when
needed. for example, a bml endpoint is not requested and not
allocated when receiving an eager message from a peer.
- pml/ob1: change the pml_procs array in the ob1 communicator to a
proc pointer array. at the cost of a single level of extra
redirection this will allow us to allocate pml procs on demand.
- pml/ob1: add an accessor function to access the pml proc structure
for a given peer. this function will allocate the proc if it
doesn't already exist.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
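The on-demand allocation behind the proc pointer array and its accessor
can be sketched as follows. All demo_* names are hypothetical, and the
sketch is single-threaded: it leaves out any locking the real accessor
may need.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-peer state. */
typedef struct {
    int rank;
    unsigned int expected_sequence;
} demo_pml_proc_t;

/* Hypothetical communicator: an array of pointers, not of structures. */
typedef struct {
    int size;
    demo_pml_proc_t **procs;
} demo_pml_comm_t;

static int demo_comm_init(demo_pml_comm_t *comm, int size)
{
    assert(size > 0);
    comm->procs = malloc((size_t) size * sizeof(*comm->procs));
    if (comm->procs == NULL) {
        comm->size = 0;
        return -1;
    }
    for (int i = 0; i < size; ++i) {
        comm->procs[i] = NULL;        /* nothing allocated up front */
    }
    comm->size = size;
    return 0;
}

/* Accessor: allocate the peer's structure the first time it is needed. */
static demo_pml_proc_t *demo_peer_lookup(demo_pml_comm_t *comm, int rank)
{
    assert(rank >= 0 && rank < comm->size);

    if (comm->procs[rank] == NULL) {
        demo_pml_proc_t *proc = calloc(1, sizeof(*proc));
        if (proc == NULL) {
            return NULL;
        }
        proc->rank = rank;
        comm->procs[rank] = proc;
    }
    return comm->procs[rank];
}

static void demo_comm_fini(demo_pml_comm_t *comm)
{
    for (int i = 0; i < comm->size; ++i) {
        free(comm->procs[i]);
    }
    free(comm->procs);
    comm->procs = NULL;
    comm->size = 0;
}
```

The cost is the extra pointer dereference mentioned above; the gain is
that peers that never communicate never get a structure.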
Bring Slurm PMI-1 component online
Bring the s2 component online
Little cleanup - let the various PMIx modules set the process name during init, and then just raise it up to the ORTE level. Required as the different PMI environments all pass the jobid in different ways.
Bring the OMPI pubsub/pmi component online
Get comm_spawn working again
Ensure we always provide a cpuset, even if it is NULL
pmix/cray: adjust cray pmix component for pmix
Make changes so cray pmix can work within the integrated
ompi/pmix framework.
Bring singletons back online. Implement the comm_spawn operation using pmix - not tested yet
Cleanup comm_spawn - procs now starting, error in connect_accept
Complete integration
This commit does two things. It removes checks for C99 required
headers (stdlib.h, string.h, signal.h, etc). Additionally it removes
definitions for required C99 types (intptr_t, int64_t, int32_t, etc).
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
This commit fixes several bugs in the static request objects used by
ob1 for blocking send/receive operations.
- Fix memory leak when using MPI_THREAD_MULTIPLE. Requests were
allocated off the free list but were destructed and NOT returned.
- Fix double-destruct of static objects. There is no reason to
CONSTRUCT/DESTRUCT the static object for each send/receive
operation. This adds overhead and no benefit. To keep the code
clean helper functions have been added to finalize ob1 send/receive
requests.
- Remove now unnecessary include of alloca.h.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
This new MTL runs over PSM2 for Omni Path. PSM2 is a descendant of PSM
with changes to support more ranks and some MPI-3 features like mprobe.
PSM2 will only support Omni Path networks; PSM only supports True Scale.
Likewise, the existing PSM MTL will continue to be maintained for True
Scale, while the PSM2 MTL is developed and maintained for Omni Path.
from the message queues (a debugging feature). With this approach
all blocking (single threaded) requests are allocated from the main
freelist, so they will be accounted for during the message queues
investigation).
This patch tries to do as little as possible in the PML CM blocking
send/receive routines. Basically, avoid creating and filling in an
entire request object. An OMPI-level request is still needed, but we
can create that on the stack instead of going to a free list.
Signed-off-by: Andrew Friedley <andrew.friedley@intel.com>
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
This commit adds support for project_framework_component_* parameter
matching. This is the first step in allowing the same framework name
in multiple projects. This change also bumps the MCA component version
to 2.1.0.
All master frameworks have been updated to use the new component
versioning macro. An mca.h has been added to each project to add a
project specific versioning macro of the form
PROJECT_MCA_VERSION_2_1_0.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
Enabling the FT code breaks compilation (again). This series
tries to fix the compiler errors. This is again only fixing
the compiler errors without any warranty that the result
might actually support FT again.
With the changes introduced in the previous patches in this series
some goto constructs for cleanup are no longer necessary and removed.
Enabling the FT code breaks compilation (again). This series
tries to fix the compiler errors. This is again only fixing
the compiler errors without any warranty that the result
might actually support FT again.
The FT code used barrier mechanisms which have been removed
with aec5cd08bd. This patch replaces
all those different barriers with opal_pmix.fence(NULL, 0);
I am not sure this is completely correct but at least a starting
point for a review.
Enabling the FT code breaks compilation (again). This series
tries to fix the compiler errors. This is again only fixing
the compiler errors without any warranty that the result
might actually support FT again.
This first patch moves orte_cr_continue_like_restart from ORTE
to opal_cr_continue_like_restart in OPAL. This only leaves three
calls from OPAL to ORTE in the FT code. As it is not yet 100%
clear how to handle these calls the code orte_sstore.set_attr()
has been #ifdef'd out for now.
This commit should resolve an issue seen with CUDA-aware support. The
problem came in with BTL 3.0. Before 3.0 the size of the copy was
stored in the incoming segment's des_remote_count field. This field
does not exist in BTL 3.0 so I stored the value in the
des_segment_count field. This caused problems with the cuda support
code. To fix the issue the endpoint pointer is now stored in the
fragment's endpoint pointer, which frees up the segment's des_cbdata
pointer for storing the transfer size.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
pml_yalla_del_comm may be called after the yalla module is finalized, which
leads to invalid memory access if the mxm context is already destroyed at
this point.
Use of the old ompi_free_list_t and ompi_free_list_item_t is
deprecated. These classes will be removed in a future commit.
This commit updates the entire code base to use opal_free_list_t and
opal_free_list_item_t.
Notes:
OMPI_FREE_LIST_*_MT -> opal_free_list_* (uses opal_using_threads ())
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Please verify your components have been updated correctly. Keep in
mind that in terms of threading:
OPAL_FREE_LIST_GET -> opal_free_list_get_st
OPAL_FREE_LIST_RETURN -> opal_free_list_return_st
I used the opal_using_threads() variant anytime it appeared multiple
threads could be operating on the free list. If this is not the case,
update to _st. If multiple threads are always in use, change to _mt.