Use $(AM_CPPFLAGS) in $(usnic_btl_run_tests_CPPFLAGS) so that we don't
have to replicate hard-coded values.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
do define the OMPI_LIBMPI_NAME macro via the CPPFLAGS.
The issue occurs when Open MPI is configured with
--enable-opal-btl-usnic-unit-tests
Thanks George Marselis for reporting this issue
Refs. open-mpi/ompi#6441
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Update the OPAL_CHECK_OFI configury macro:
- Make it safe to call the macro multiple times:
- The checks only execute the first time it is invoked
- Subsequent invocations, it just emits a friendly "checking..."
message so that configure output is sensible/logical
- With the goal of ultimately removing opal/mca/common/ofi, rename the
output variables from OPAL_CHECK_OFI to be
opal_ofi_{happy|CPPFLAGS|LDFLAGS|LIBS}.
- Update btl/ofi, btl/usnic, and mtl/ofi for these new conventions.
- Also, don't use AC_REQUIRE to invoke OPAL_CHECK_OFI because that
causes the macro to be invoked at a fairly random time, which makes
configure stdout confusing / hard to grok.
- Remove a little left-over kruft in OPAL_CHECK_OFI, too (which
resulted in an indenting change, making the change to
opal_check_ofi.m4 look larger than it really is).
Thanks Alastair McKinstry for the report and initial fix.
Thanks Rashika Kheria for the reminder.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Now Open MPI requires a C99 compiler. Checking availability of
the following types is no more needed.
- `long long` (`signed` and `unsigned`)
- `long double`
- `float _Complex`
- `double _Complex`
- `long double _Complex`
Furthermore, the `#if HAVE_[TYPE]` style checking is not correct.
Availability of C types is checked by `AC_CHECK_TYPES` in `configure.ac`.
`AC_CHECK_TYPES` defines macro `HAVE_[TYPE]` as `1` in `opal_config.h`
if the `[TYPE]` is available. But it does not define `HAVE_[TYPE]`
(instead of defining as `0`) if it is not available. So even if we
need `HAVE_[TYPE]` checking, it should be `#if defined(HAVE_[TYPE])`.
I didn't remove `AC_CHECK_TYPES` for these types in `configure.ac`
since someone may use `HAVE_[TYPE]` macros somewhere.
Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
The Open MPI code base assumed that asprintf always behaved like
the FreeBSD variant, where ptr is set to NULL on error. However,
the C standard (and Linux) only guarantee that the return code will
be -1 on error and leave ptr undefined. Rather than fix all the
usage in the code, we use opal_asprintf() wrapper instead, which
guarantees the BSD-like behavior of ptr always being set to NULL.
In addition to being correct, this will fix many, many warnings
in the Open MPI code base.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
In many cases, this was a simple string replace. In a few places, it
entailed:
1. Updating some comments and removing now-redundant foo[size-1]='\0'
statements.
2. Updating passing (size-1) to (size) (because opal_string_copy()
wants the entire destination buffer length).
This commit actually fixes a bunch of potential (yet quite unlikely)
bugs where we could have ended up with non-null-terminated strings.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Follow-on to 8097d09858: now that BTL_VERSION is defined in btl.h, be
a little smarter about whether we define it or not.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Cisco wrote a bipartite graph solver to properly solve
interface pair selection for usNIC. Using the reachable
framework, the TCP BTL (and possibly the runtime network
code) can use the graph solver to make more optimal pair
selection. Jeff was happy to have the code more broadly
used, but didn't have time to do the move, hence this
commit.
There are a couple of minor changes to the code compared
to the usNIC version. Obviously, the functions have
been renamed to match naming convention for their new
home. Since it's easier to write unit tests for
util/ code, the unit tests have been made first class
tests run at "make check" time. This last bit required
moving some of the definitions into a new header,
bipartite_graph_internal.h, so that they could be
included in both the library code and the test code.
Signed-off-by: Brian Barrett <bbarrett@amazon.com>
* Resolves#3705
* Components should link against the project level library to better
support `dlopen` with `RTLD_LOCAL`.
* Extend the `mca_FRAMEWORK_COMPONENT_la_LIBADD` in the `Makefile.am`
with the appropriate project level library:
```
MCA components in ompi/
$(top_builddir)/ompi/lib@OMPI_LIBMPI_NAME@.la
MCA components in orte/
$(top_builddir)/orte/lib@ORTE_LIB_PREFIX@open-rte.la
MCA components in opal/
$(top_builddir)/opal/lib@OPAL_LIB_PREFIX@open-pal.la
MCA components in oshmem/
$(top_builddir)/oshmem/liboshmem.la"
```
Note: The changes in this commit were automated by the script in
the commit that proceeds it with the `libadd_mca_comp_update.py`
script. Some components were not included in this change because
they are statically built only.
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
Not sure how/when this got deleted, but put back the "Cisco usNIC"
line in the transport summary at the end of configure.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
usnic endpoints was always created with default send credit value of 8. This
commit assign the correct number from the hardware instead.
Signed-off-by: Thananon Patinyasakdikul <apatinya@cisco.com>
Update to support passing of HWLOC shmem topology to client procs
Update use of distance API per @bgoglin
Have the openib component lookup its object in the distance matrix
Bring usnic up-to-date
Restore binding for hwloc2
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Follow on to 7bd2de9960419422a4591f4b5d286f1f911a0a47: move setting
the iov_limit to 1 earlier in the startup sequence.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
The usNIC BTL does not use more than 1 iov, so be sure to set it to 1
so that we don't allocate cq/rq/sq entries based on a default (i.e.,
>1) number of iovs per entry.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
This PR renames the common library for OFI libfabric from
libfabric to ofi. There are a number of reasons this
is good to do:
1) its shorter and replaces 9 characters with three for
function names for what may eventually be a fairly extensive interface
2) OFI is the term used for MTL and RML components that use
the OFI libfabric interface
3) A planned OSC component will also use the OFI term.
4) Other HPC libraries that can use OFI libfabric tend to use
the term "ofi" internally and also in their configure options
relevant to OFI libfabric (i.e. MPICH/CH4, Intel MPI, Sandia SHMEM)
There seem to be comments in places in the Open MPI source
code that indicate that this common library will be going away.
Far from it as we will want to be able to share things like
AV objects between OMPI and possibly OSHMEM components that
use the OFI libfabric interface.
This PR also adds a synonym to the --with-libfabric(-libdir)
configury options: --with-ofi and with-ofi-libdir.
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
This should probably not go to the v2.x branch, since it changes the
output format of the usnic stats.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Double check the queue lengths that we get back from libfabric to
ensure that they are at least as long as we need. They *should* never
be shorter than we need, but let's just check to be sure.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Don't just blindly send ACKs; ensure that we have send credits before
doing so. If we don't have any send credits, just don't send the ACK
(it'll come again soon enough; it's not a tragedy if we don't send it
now).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
The libfabric usnic provider may give you back TX/RX queues that are
longer than you asked for. So just use the TX/RQ/CQ lengths that we
asked for, regardless of what length comes back.
Additionally, keep the length of the priority channel CQ separate from
the length of the data CQ.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Since the usnic BTL is single-threaded in this area, there really is
no danger, but don't use one of the pointers hanging off the frag
after we return it to the freelist. Instead, save the endpoint
pointer before returning the frag.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
The types are technically typedef equivalent, but it's less confusing
to use the types that agree with the name of the constructor.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
There are only five places in the non-daemon code paths where opal_hwloc_topology is currently referenced:
* shared memory BTLs (sm, smcuda). I have added a code path to those components that uses the location string
instead of the topology itself, if available, thus avoiding instantiating the topology
* openib BTL. This uses the distance matrix. At present, I haven't developed a method
for replacing that reference. Thus, this component will instantiate the topology
* usnic BTL. Uses the distance matrix.
* treematch TOPO component. Does some complex tree-based algorithm, so it will instantiate
the topology
* ess base functions. If a process is direct launched and not bound at launch, this
code attempts to bind it. Thus, procs in this scenario will instantiate the
topology
Note that instantiating the topology on complex chips such as KNL can consume
megabytes of memory.
Fix pernode binding policy
Properly handle the unbound case
Correct pointer usage
Do not free static error messages!
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
We only support running with libfabric v1.3 or greater. So it's safe
to remove the legacy/adaptive cq_readerr() behavior.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
There are critical usnic libfabric AV insert bugs before v1.3, so
don't allow any version prior to v1.3 at run time (still allow
*compiling* with earlier versions, though, since the ABI guarantees
allow us to compile with an earlier libfabric and run with a later
libfabric).
Switch to using fi_version() to check the version (instead of calling
fi_getinfo()) as a potentially lighter-weight / simpler solution.
This allows us to only call fi_getinfo() once.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
The (effective) "+42" computation was, in fact, the incorrect answer
in this case (gasp!).
We should just take the max_msg_size from the command (which came from
the libfabric endpoint max_msg_size attribute in the client) and
subtract off the max header size: 68 (which is explained in the
comment). This will result in a "large" message size which is likely
slightly smaller than the MTU, but still right up near the MTU, and
therefore good enough.
Note: the old computation (i.e., -(68-42)) worked fine when we asked
for Libfabric API v1.1 because the usnic provider would return a
max_msg_size that was already less than the MTU due to FI_PREFIX
behavior shenanigans. Once we started asking for Libfabric API v1.4,
the usnic Libfabric provider started returning (MTU + prefix_size),
and the -(68-42) computation started giving a value that was over the
MTU. This caused sendto() on the connectivity checker UDP socket
to fail.
This commit also removes an old/misleading comment.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
* Add a configure time option to rename libmpi(_FOO).*
- `--with-libmpi-name=STRING`
* This commit only impacts the installed libraries.
Internal, temporary libraries have not been renamed to limit the
scope of the patch to only what is needed.
For example:
```shell
shell$ ./configure --with-libmpi-name=wookie
...
shell$ find . -name "libmpi*"
shell$ find . -name "libwookie*"
./lib/libwookie.so.0.0.0
./lib/libwookie.so.0
./lib/libwookie.so
./lib/libwookie.la
./lib/libwookie_mpifh.so.0.0.0
./lib/libwookie_mpifh.so.0
./lib/libwookie_mpifh.so
./lib/libwookie_mpifh.la
./lib/libwookie_usempi.so.0.0.0
./lib/libwookie_usempi.so.0
./lib/libwookie_usempi.so
./lib/libwookie_usempi.la
shell$
```
With libfabric v1.4, the usnic provider changed the values of its
fabric and domain name strings (compared to libfabric <v1.4). Update
the Open MPI usNIC BTL to handle both pre-v1.4 and v1.4 fabric/domain
names.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>