This commit removes the service and async event threads from the
openib btl. Both threads are replaced by opal progress thread
support. The run_in_main function is now supported by allocating an
event and adding it to the sync event base. This ensures that the
requested function is called as part of opal_progress.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
since ibv_create_xrc_rcv_qp is now deprecated, and in order to
be "future-proof", we have to consider the case in which only XRC Domains are supported.
also, correctly handle distro that ship broken ibverbs devel headers
Thanks Paul Hargrove for the detailled report.
@ggouaillardet is likely offline for the weekend, but master is broken
on RHEL 6.5 systems that do not have MOFED installed. So I'm taking
the liberty of revering this commit; I'm guessing Gilles will fixup
and re-commit next week.
This reverts commit 77f8282d51.
since ibv_create_xrc_rcv_qp is now deprecated, and in order to
be "future-proof", we have to consider the case in which only XRC Domains are supported.
Thanks Paul Hargrove for the detailled report.
CID 1269674 Ignoring number of bytes read (CHECKED_RETURN)
Check that we read enough bytes to get a complete async command.
CID 1269793 Missing break in switch (MISSING_BREAK)
Added comment to indicate fall through was intentional.
CID 1269702: Constant variable guards dead code (DEADCODE)
Remove an unused argument to opal_show_help. This will quiet the
coverity issue.
CID 1269675 Ignoring number of bytes read (CHECKED_RETURN)
Check that at least sizeof(int) bytes are read. If this is not the
case then it is an error.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
Properly setup the opal_process_info structure early in the initialization procedure. Define the local hostname right at the beginning of opal_init so all parts of opal can use it. Overlay that during orte_init as the user may choose to remove fqdn and strip prefixes during that time. Setup the job_session_dir and other such info immediately when it becomes available during orte_init.
WHAT: Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL
All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have express interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose. UTK with support from Sandia, developped a version of Open MPI where the entire communication infrastucture has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to completing this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic.
This commit was SVN r32317.