btl_openib_connect_udcm when notifying not to listen to an fd, to ensure
that the main thread does not continue until the service thread has
processed the message.
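For illustration only, a synchronous "stop listening to this fd" notification can be sketched with a pipe and a condition variable. This is a minimal sketch and not the udcm code: every demo_* name, the pipe-based command channel, and the single-caller assumption are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical stand-in for a service thread's state (not OMPI code). */
typedef struct {
    int pipe_fd[2];          /* command channel: main -> service */
    pthread_mutex_t lock;
    pthread_cond_t done;
    int processed;           /* number of commands handled so far */
    int ignored_fd;          /* last fd the service was told to drop */
} demo_service_t;

/* Service thread: handle each command, then acknowledge it. */
static void *demo_service_main(void *arg)
{
    demo_service_t *svc = arg;
    int fd;

    /* read() returns 0 once the write end is closed, ending the loop. */
    while (read(svc->pipe_fd[0], &fd, sizeof(fd)) == (ssize_t) sizeof(fd)) {
        pthread_mutex_lock(&svc->lock);
        svc->ignored_fd = fd;            /* "stop listening to this fd" */
        svc->processed++;
        pthread_cond_signal(&svc->done); /* acknowledge to the sender */
        pthread_mutex_unlock(&svc->lock);
    }
    return NULL;
}

/* Main thread: send the command and block until it has been processed.
 * Assumes a single caller at a time. */
int demo_stop_listening_sync(demo_service_t *svc, int fd)
{
    int rc = 0;
    int target;

    assert(svc != NULL);
    pthread_mutex_lock(&svc->lock);
    target = svc->processed + 1;
    if (write(svc->pipe_fd[1], &fd, sizeof(fd)) != (ssize_t) sizeof(fd)) {
        rc = -1;
    } else {
        while (svc->processed < target) {
            pthread_cond_wait(&svc->done, &svc->lock);
        }
    }
    pthread_mutex_unlock(&svc->lock);
    return rc;
}

/* Start a service thread, send one command, shut down. Returns the fd
 * the service recorded, or -1 on error. */
int demo_run(int fd)
{
    demo_service_t svc;
    pthread_t tid;
    int seen = -1;

    if (pipe(svc.pipe_fd) != 0) {
        return -1;
    }
    pthread_mutex_init(&svc.lock, NULL);
    pthread_cond_init(&svc.done, NULL);
    svc.processed = 0;
    svc.ignored_fd = -1;

    if (pthread_create(&tid, NULL, demo_service_main, &svc) == 0) {
        if (demo_stop_listening_sync(&svc, fd) == 0) {
            seen = svc.ignored_fd;   /* safe: command was acknowledged */
        }
        close(svc.pipe_fd[1]);       /* lets the service thread exit */
        pthread_join(tid, NULL);
    } else {
        close(svc.pipe_fd[1]);
    }
    close(svc.pipe_fd[0]);
    pthread_cond_destroy(&svc.done);
    pthread_mutex_destroy(&svc.lock);
    return seen;
}
```

The sender takes the lock before writing, so the acknowledgement cannot be signalled before the sender is waiting for it.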
Adds the ability to send a message to the openib async thread telling
it to ignore the ERR state on a specific QP. Adds this call to
udcm_module_finalize so that setting the error state on the QP does not
cause the openib async thread to abort the MPI program prematurely.
Fixes trac:3161
This commit was SVN r27064.
The following Trac tickets were found above:
Ticket 3161 --> https://svn.open-mpi.org/trac/ompi/ticket/3161
ibv_get_device_list_compat() and not finding it, I finally realized
that it was a function in OMPI. So let's name it with a proper ompi_
prefix, not an ibv_ prefix.
This commit was SVN r26867.
hook early in the setup, but ''not'' during the component register
function. And then properly unset it if it was set.
This commit was SVN r26697.
The following Trac tickets were found above:
Ticket 3130 --> https://svn.open-mpi.org/trac/ompi/ticket/3130
helper file, even if we find that the system has <infiniband/verbs.h>.
The reason is that some inline functions in that verbs helper file
invoke ibv_* functions. Some linkers (e.g., those used by the Solaris
Studio compilers) will instantiate those static inline functions --
even if we don't use them -- and therefore we need to be able to
resolve the ibv_* symbols at link time.
But since -libverbs is only specified in places where we use other
ibv_* functions (e.g., the OpenFabrics-based BTLs), linking other
executables (e.g., orterun) can fail.
So instead, introduce a new #define: OPAL_HWLOC_WANT_VERBS_HELPER. If
this macro is set to 1 before including opal/mca/hwloc/hwloc.h, then
you'll also get the hwloc OpenFabrics verbs helper header file (*if*
hwloc found <infiniband/verbs.h> -- otherwise, it'll #error).
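Per the text above, a consumer opts in by writing "#define OPAL_HWLOC_WANT_VERBS_HELPER 1" before "#include opal/mca/hwloc/hwloc.h". The opt-in pattern itself can be sketched in one self-contained file. This is only an illustration: the DEMO_* macros and demo_* functions are hypothetical stand-ins, and DEMO_HAVE_VERBS_H plays the role of whatever configure detected.

```c
#include <stdio.h>

/* Consumer side: request the helper before the header is seen. */
#define DEMO_WANT_VERBS_HELPER 1

/* ---- begin simulated header (hypothetical, not the OPAL header) ---- */
#ifndef DEMO_WANT_VERBS_HELPER
#define DEMO_WANT_VERBS_HELPER 0    /* default: do not pull in the helper */
#endif

#define DEMO_HAVE_VERBS_H 1         /* stand-in for the configure result */

#if DEMO_WANT_VERBS_HELPER
#  if DEMO_HAVE_VERBS_H
/* Only files that opted in ever see this inline function, so only they
 * need the symbols it references at link time. */
static inline int demo_verbs_helper_available(void)
{
    return 1;
}
#  else
#    error "verbs helper requested, but <infiniband/verbs.h> was not found"
#  endif
#endif
/* ---- end simulated header ---- */

/* Reports whether this translation unit got the helper. */
int demo_helper_enabled(void)
{
#if DEMO_WANT_VERBS_HELPER
    return demo_verbs_helper_available();
#else
    return 0;
#endif
}
```

Files that never define the macro get no inline helper at all, so they have no ibv_* references to resolve.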
This commit was SVN r26417.
* Remove paffinity, maffinity, and carto frameworks -- they've been
wholly replaced by hwloc.
* Move ompi_mpi_init() affinity-setting/checking code down to ORTE.
* Update sm, smcuda, wv, and openib components to no longer use carto.
Instead, use hwloc data. There are still optimizations possible in
the sm/smcuda BTLs (i.e., making multiple mpools). Also, the old
carto-based code found out how many NUMA nodes were ''available''
-- not how many were used ''in this job''. The new hwloc-using
code computes the same value -- it was not updated to calculate how
many NUMA nodes are used ''by this job.''
* Note that I cannot compile the smcuda and wv BTLs -- I ''think''
they're right, but they need to be verified by their owners.
* The openib component now does a bunch of stuff to figure out where
"near" OpenFabrics devices are. '''THIS IS A CHANGE IN DEFAULT
BEHAVIOR!!''' and still needs to be verified by OpenFabrics vendors
(I do not have a NUMA machine with an OpenFabrics device that is a
non-uniform distance from multiple different NUMA nodes).
* Completely rewrite the OMPI_Affinity_str() routine from the
"affinity" mpiext extension. This extension now understands
hyperthreads; the output format of it has changed a bit to reflect
this new information.
* Bunches of minor changes around the code base to update names/types
from maffinity/paffinity-based names to hwloc-based names.
* Add some helper functions into the hwloc base, mainly having to do
with the fact that we have the hwloc data reporting ''all''
topology information, but sometimes you really only want the
(online | available) data.
This commit was SVN r26391.
Roll in the ORTE state machine. Remove last traces of opal_sos. Remove UTK epoch code.
Please see the various emails about the state machine change for details. I'll send something out later with more info on the new architecture.
This commit was SVN r26242.
specify btl_tcp_if_include because btl_tcp_if_exclude is defaulted to
the loopback devices.
This commit does a few things:
* Introduce a new OPAL MCA base function:
mca_base_param_check_exclusive_string(). It checks to see that the
''user'' does not set two MCA parameters that are mutually
exclusive by checking the source of those MCA param values.
* Use the above function in many BTLs (and the OOB TCP) to ensure
that <foo>_if_include and <foo>_if_exclude are not both specified
''by the user''.
* Re-arrange many of these BTLs to move their MCA registration code
into a separate component_register() function (vs. the
component_open() function).
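The idea behind the exclusivity check in the first bullet can be sketched as follows. This is a minimal sketch and not the real mca_base_param_check_exclusive_string() implementation or signature: the demo_* types and names are hypothetical, and the "lo" default is only an illustrative value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record of where a parameter value came from. */
typedef enum {
    DEMO_SOURCE_DEFAULT,   /* built-in default, not set by the user */
    DEMO_SOURCE_ENV,       /* set by the user via the environment */
    DEMO_SOURCE_FILE       /* set by the user via a param file */
} demo_source_t;

typedef struct {
    const char *name;
    const char *value;
    demo_source_t source;
} demo_param_t;

/* Returns 0 if the pair is acceptable, -1 if the user set both.
 * A default value on either side never counts as a conflict. */
int demo_check_exclusive(const demo_param_t *a, const demo_param_t *b)
{
    assert(a != NULL && b != NULL);

    if (a->source != DEMO_SOURCE_DEFAULT &&
        b->source != DEMO_SOURCE_DEFAULT) {
        fprintf(stderr, "%s and %s are mutually exclusive\n",
                a->name, b->name);
        return -1;
    }
    return 0;
}
```

Checking the source rather than the value is what lets a user set btl_tcp_if_include while btl_tcp_if_exclude still holds its default.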
This code has been nominally reviewed and checked by Ralph, George,
Terry, and Shiqing.
This commit was SVN r25043.
The following SVN revision numbers were found above:
r24976 --> open-mpi/ompi@8f4ac54336
Update the CMake script for checking mca subdirs.
Add Windows support for __attribute__ packed structures.
Define usleep and posix_memalign with equivalent Windows functions.
Plus a few minor fixes and type casts.
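One way such shims could look is sketched below. The commit does not say which Windows functions were used, so the choice of Sleep() and _aligned_malloc() is an assumption, and the demo_* wrapper names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
#include <malloc.h>

/* Sleep() takes milliseconds, so round the microsecond count up. */
int demo_usleep(unsigned int usec)
{
    Sleep(usec / 1000u + ((usec % 1000u) != 0u ? 1u : 0u));
    return 0;
}

/* Same contract as posix_memalign(): 0 on success, errno value on error. */
int demo_posix_memalign(void **memptr, size_t alignment, size_t size)
{
    void *p;

    assert(memptr != NULL);
    p = _aligned_malloc(size, alignment);  /* note the argument order */
    if (p == NULL) {
        return ENOMEM;
    }
    *memptr = p;
    return 0;
}

/* Memory from _aligned_malloc() must be released with _aligned_free(). */
void demo_aligned_free(void *p)
{
    _aligned_free(p);
}
#else
#include <unistd.h>

int demo_usleep(unsigned int usec)
{
    return usleep(usec);
}

int demo_posix_memalign(void **memptr, size_t alignment, size_t size)
{
    assert(memptr != NULL);
    return posix_memalign(memptr, alignment, size);
}

void demo_aligned_free(void *p)
{
    free(p);
}
#endif

/* Helper for checking the alignment of a returned pointer. */
int demo_is_aligned(const void *p, size_t alignment)
{
    return ((uintptr_t) p % alignment) == 0;
}
```

The free wrapper is needed because plain free() is not valid for _aligned_malloc() memory on Windows, so a drop-in posix_memalign replacement also changes how callers must release buffers.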
This commit was SVN r24429.
code from upper level into btl configure.m4. Changed
prefix from "OMPI" to "BTL" in preprocessor macro. Add
an MCA param that shows it has been configured in.
This commit was SVN r24270.
on some systems caused by the definition of malloc in
opal_config_bottom.h getting expanded in the system malloc.h when
OPAL_ENABLE_MEM_DEBUG is set to 1.
This commit was SVN r24210.
verbose statement that shows up when you --mca btl_base_verbose 100.
It clearly states that the openib BTL disqualifies itself when
MPI_THREAD_MULTIPLE is used.
This commit was SVN r24209.