1
1
openmpi/opal/mca/btl/openib/configure.m4

163 строки
6.8 KiB
Plaintext
Исходник Обычный вид История

# -*- shell-script -*-
#
# Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
# University Research and Technology
# Corporation. All rights reserved.
# Copyright (c) 2004-2005 The University of Tennessee and The University
# of Tennessee Research Foundation. All rights
# reserved.
# Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
# University of Stuttgart. All rights reserved.
# Copyright (c) 2004-2005 The Regents of the University of California.
# All rights reserved.
# Copyright (c) 2007-2013 Cisco Systems, Inc. All rights reserved.
# Copyright (c) 2008-2011 Mellanox Technologies. All rights reserved.
# Copyright (c) 2011 Oracle and/or its affiliates. All rights reserved.
# Copyright (c) 2013 NVIDIA Corporation. All rights reserved.
# $COPYRIGHT$
#
# Additional copyrights may follow
#
# $HEADER$
#
# MCA_btl_openib_POST_CONFIG([should_build])
# ------------------------------------------
George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-) WHAT: Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have express interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose. UTK with support from Sandia, developped a version of Open MPI where the entire communication infrastucture has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to completing this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic. This commit was SVN r32317.
2014-07-26 04:47:28 +04:00
AC_DEFUN([MCA_opal_btl_openib_POST_CONFIG], [
AM_CONDITIONAL([MCA_btl_openib_have_xrc], [test $1 -eq 1 -a "x$btl_openib_have_xrc" = "x1"])
AM_CONDITIONAL([MCA_btl_openib_have_rdmacm], [test $1 -eq 1 -a "x$btl_openib_have_rdmacm" = "x1"])
AM_CONDITIONAL([MCA_btl_openib_have_dynamic_sl], [test $1 -eq 1 -a "x$btl_openib_have_opensm_devel" = "x1"])
AM_CONDITIONAL([MCA_btl_openib_have_udcm], [test $1 -eq 1 -a "x$btl_openib_have_udcm" = "x1"])
])
George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-) WHAT: Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have express interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose. UTK with support from Sandia, developped a version of Open MPI where the entire communication infrastucture has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to completing this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic. This commit was SVN r32317.
2014-07-26 04:47:28 +04:00
# MCA_btl_openib_CONFIG([action-if-can-copalle],
# [action-if-cant-copalle])
# ------------------------------------------------
George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-) WHAT: Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have express interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose. UTK with support from Sandia, developped a version of Open MPI where the entire communication infrastucture has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to completing this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic. This commit was SVN r32317.
2014-07-26 04:47:28 +04:00
AC_DEFUN([MCA_opal_btl_openib_CONFIG],[
AC_CONFIG_FILES([opal/mca/btl/openib/Makefile])
OPAL_VAR_SCOPE_PUSH([cpcs have_threads LDFLAGS_save LIBS_save])
cpcs="oob"
George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-) WHAT: Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have express interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose. UTK with support from Sandia, developped a version of Open MPI where the entire communication infrastucture has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to completing this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic. This commit was SVN r32317.
2014-07-26 04:47:28 +04:00
OPAL_CHECK_OPENFABRICS([btl_openib],
[btl_openib_happy="yes"
George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-) WHAT: Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have express interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose. UTK with support from Sandia, developped a version of Open MPI where the entire communication infrastucture has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to completing this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic. This commit was SVN r32317.
2014-07-26 04:47:28 +04:00
OPAL_CHECK_OPENFABRICS_CM([btl_openib])],
[btl_openib_happy="no"])
AS_IF([test "$btl_openib_happy" = "yes"],
[# With the new openib flags, look for ibv_fork_init
LDFLAGS_save="$LDFLAGS"
LIBS_save="$LIBS"
LDFLAGS="$LDFLAGS $btl_openib_LDFLAGS"
LIBS="$LIBS $btl_openib_LIBS"
AC_CHECK_FUNCS([ibv_fork_init])
LDFLAGS="$LDFLAGS_save"
LIBS="$LIBS_save"
$1],
[$2])
AC_MSG_CHECKING([for thread support (needed for rdmacm/udcm)])
Fixes trac:1210, #1319 Commit from a long-standing Mercurial tree that ended up incorporating a lot of things: * A few fixes for CPC interface changes in all the CPCs * Attempts (but not yet finished) to fix shutdown problems in the IB CM CPC * #1319: add CTS support (i.e., initiator guarantees to send first message; automatically activated for iWARP over the RDMA CM CPC) * Some variable and function renamings to make this be generic (e.g., alloc_credit_frag became alloc_control_frag) * CPCs no longer post receive buffers; they only post a single receive buffer for the CTS if they use CTS. Instead, the main BTL now posts the main sets of receive buffers. * CPCs allocate a CTS buffer only if they're about to make a connection * RDMA CM improvements: * Use threaded mode openib fd monitoring to wait for for RDMA CM events * Synchronize endpoint finalization and disconnection between main thread and service thread to avoid/fix some race conditions * Converted several structs to be OBJs so that we can use reference counting to know when to invoke destructors * Make some new OBJ's have opal_list_item_t's as their base, thereby eliminating the need for the local list_item_t type * Renamed many variables to be internally consistent * Centralize the decision in an inline function as to whether this process or the remote process is supposed to be the initiator * Add oodles of OPAL_OUTPUT statements for debugging (hard-wired to output stream -1; to be activated by developers if they want/need them) * Use rdma_create_qp() instead of ibv_create_qp() * openib fd monitoring improvements: * Renamed a bunch of functions and variables to be a little more obvious as to their true function * Use pipes to communicate between main thread and service thread * Add ability for main thread to invoke a function back on the service thread * Ensure to set initiator_depth and responder_resources properly, but putting max_qp_rd_ataom and ma_qp_init_rd_atom in the modex (see rdma_connect(3)) * Ensure to set the source IP address in rdma_resolve() to ensure that we select the correct OpenFabrics source port * Make new MCA param: openib_btl_connect_rdmacm_resolve_timeout * Other improvements: * btl_openib_device_type MCA param: can be "iw" or "ib" or "all" (or "infiniband" or "iwarp") * Somewhat improved error handling * Bunches of spelling fixes in comments, VERBOSE, and OUTPUT statements * Oodles of little coding style fixes * Changed shutdown ordering of btl; the device is now an OBJ with ref counting for destruction * Added some more show_help error messages * Change configury to only build IBCM / RDMACM if we have threads (because we need a progress thread) This commit was SVN r19686. The following Trac tickets were found above: Ticket 1210 --> https://svn.open-mpi.org/trac/ompi/ticket/1210
2008-10-06 04:46:02 +04:00
have_threads=`echo $THREAD_TYPE | awk '{ print [$]1 }'`
if test "x$have_threads" = "x"; then
have_threads=none
fi
AC_MSG_RESULT([$have_threads])
AS_IF([test "$btl_openib_happy" = "yes"],
[if test "x$btl_openib_have_xrc" = "x1"; then
cpcs="$cpcs xoob"
fi
Fixes trac:1210, #1319 Commit from a long-standing Mercurial tree that ended up incorporating a lot of things: * A few fixes for CPC interface changes in all the CPCs * Attempts (but not yet finished) to fix shutdown problems in the IB CM CPC * #1319: add CTS support (i.e., initiator guarantees to send first message; automatically activated for iWARP over the RDMA CM CPC) * Some variable and function renamings to make this be generic (e.g., alloc_credit_frag became alloc_control_frag) * CPCs no longer post receive buffers; they only post a single receive buffer for the CTS if they use CTS. Instead, the main BTL now posts the main sets of receive buffers. * CPCs allocate a CTS buffer only if they're about to make a connection * RDMA CM improvements: * Use threaded mode openib fd monitoring to wait for for RDMA CM events * Synchronize endpoint finalization and disconnection between main thread and service thread to avoid/fix some race conditions * Converted several structs to be OBJs so that we can use reference counting to know when to invoke destructors * Make some new OBJ's have opal_list_item_t's as their base, thereby eliminating the need for the local list_item_t type * Renamed many variables to be internally consistent * Centralize the decision in an inline function as to whether this process or the remote process is supposed to be the initiator * Add oodles of OPAL_OUTPUT statements for debugging (hard-wired to output stream -1; to be activated by developers if they want/need them) * Use rdma_create_qp() instead of ibv_create_qp() * openib fd monitoring improvements: * Renamed a bunch of functions and variables to be a little more obvious as to their true function * Use pipes to communicate between main thread and service thread * Add ability for main thread to invoke a function back on the service thread * Ensure to set initiator_depth and responder_resources properly, but putting max_qp_rd_ataom and ma_qp_init_rd_atom in the modex (see rdma_connect(3)) * Ensure to set the source IP address in rdma_resolve() to ensure that we select the correct OpenFabrics source port * Make new MCA param: openib_btl_connect_rdmacm_resolve_timeout * Other improvements: * btl_openib_device_type MCA param: can be "iw" or "ib" or "all" (or "infiniband" or "iwarp") * Somewhat improved error handling * Bunches of spelling fixes in comments, VERBOSE, and OUTPUT statements * Oodles of little coding style fixes * Changed shutdown ordering of btl; the device is now an OBJ with ref counting for destruction * Added some more show_help error messages * Change configury to only build IBCM / RDMACM if we have threads (because we need a progress thread) This commit was SVN r19686. The following Trac tickets were found above: Ticket 1210 --> https://svn.open-mpi.org/trac/ompi/ticket/1210
2008-10-06 04:46:02 +04:00
if test "x$btl_openib_have_rdmacm" = "x1" -a \
"$have_threads" != "none"; then
cpcs="$cpcs rdmacm"
if test "$enable_openib_rdmacm_ibaddr" = "yes"; then
AC_MSG_CHECKING([IB addressing])
AC_EGREP_CPP(
yes,
[
#include <infiniband/ib.h>
#ifdef AF_IB
yes
#endif
],
[
AC_CHECK_HEADERS(
[rdma/rsocket.h],
[
AC_MSG_RESULT([yes])
AC_DEFINE(BTL_OPENIB_RDMACM_IB_ADDR, 1, rdmacm IB_AF addressing support)
],
[
AC_MSG_RESULT([no])
AC_DEFINE(BTL_OPENIB_RDMACM_IB_ADDR, 0, rdmacm without IB_AF addressing support)
AC_MSG_WARN([There is no IB_AF addressing support by lib rdmacm.])
]
)],
[
AC_MSG_RESULT([no])
AC_DEFINE(BTL_OPENIB_RDMACM_IB_ADDR, 0, rdmacm without IB_AF addressing support)
AC_MSG_WARN([There is no IB_AF addressing support by lib rdmacm.])
])
else
AC_DEFINE(BTL_OPENIB_RDMACM_IB_ADDR, 0, rdmacm without IB_AF addressing support)
fi
fi
if test "x$btl_openib_have_udcm" = "x1" -a \
"$have_threads" != "none"; then
cpcs="$cpcs udcm"
fi
AC_MSG_CHECKING([which openib btl cpcs will be built])
AC_MSG_RESULT([$cpcs])])
# Enable openib device failover. It is disabled by default.
AC_MSG_CHECKING([whether openib failover is enabled])
AC_ARG_ENABLE([btl-openib-failover],
[AC_HELP_STRING([--enable-btl-openib-failover],
[enable openib BTL failover (default: disabled)])])
if test "$enable_btl_openib_failover" = "yes"; then
AC_MSG_RESULT([yes])
btl_openib_failover_enabled=1
else
AC_MSG_RESULT([no])
btl_openib_failover_enabled=0
fi
AC_DEFINE_UNQUOTED([BTL_OPENIB_FAILOVER_ENABLED], [$btl_openib_failover_enabled],
[enable openib BTL failover])
AM_CONDITIONAL([MCA_btl_openib_enable_failover], [test "x$btl_openib_failover_enabled" = "x1"])
# Check for __malloc_hook availability
AC_ARG_ENABLE(btl-openib-malloc-alignment,
AC_HELP_STRING([--enable-btl-openib-malloc-alignment], [Enable support for allocated memory alignment. Default: enabled if supported, disabled otherwise.]))
btl_openib_malloc_hooks_enabled=0
AS_IF([test "$enable_btl_openib_malloc_alignment" != "no"],
[AC_CHECK_HEADER([malloc.h],
[AC_CHECK_FUNC([__malloc_hook],
[AC_CHECK_FUNC([__realloc_hook],
[AC_CHECK_FUNC([__free_hook],
[btl_openib_malloc_hooks_enabled=1])])])])])
AS_IF([test "$enable_btl_openib_malloc_alignment" = "yes" -a "$btl_openib_malloc_hooks_enabled" = "0"],
[AC_MSG_ERROR([openib malloc alignment is requested but __malloc_hook is not available])])
AC_MSG_CHECKING([whether the openib BTL will use malloc hooks])
AS_IF([test "$btl_openib_malloc_hooks_enabled" = "0"],
[AC_MSG_RESULT([no])],
[AC_MSG_RESULT([yes])])
AC_DEFINE_UNQUOTED(BTL_OPENIB_MALLOC_HOOKS_ENABLED, [$btl_openib_malloc_hooks_enabled],
[Whether the openib BTL malloc hooks are enabled])
# make sure that CUDA-aware checks have been done
AC_REQUIRE([OPAL_CHECK_CUDA])
# substitute in the things needed to build openib
AC_SUBST([btl_openib_CFLAGS])
AC_SUBST([btl_openib_CPPFLAGS])
AC_SUBST([btl_openib_LDFLAGS])
AC_SUBST([btl_openib_LIBS])
OPAL_VAR_SCOPE_POP
])dnl