/*
 * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
 *                         University Research and Technology
 *                         Corporation. All rights reserved.
 * Copyright (c) 2004-2005 The University of Tennessee and The University
 *                         of Tennessee Research Foundation. All rights
 *                         reserved.
 * Copyright (c) 2004-2010 High Performance Computing Center Stuttgart,
 *                         University of Stuttgart. All rights reserved.
 * Copyright (c) 2004-2005 The Regents of the University of California.
 *                         All rights reserved.
 * Copyright (c) 2009      Sun Microsystems, Inc. All rights reserved.
 * Copyright (c) 2009-2013 Cisco Systems, Inc. All rights reserved.
 * Copyright (c) 2013      Mellanox Technologies, Inc.
 *                         All rights reserved.
 * Copyright (c) 2015      Research Organization for Information Science
 *                         and Technology (RIST). All rights reserved.
 * Copyright (c) 2015      Intel, Inc. All rights reserved.
 * $COPYRIGHT$
 *
 * Additional copyrights may follow
 *
 * $HEADER$
 *
 * This file is included at the bottom of opal_config.h, and is
 * therefore a) after all the #define's that were output from
 * configure, and b) included in most/all files in Open MPI.
 *
 * Since this file is *only* ever included by opal_config.h, and
 * opal_config.h already has #ifndef/#endif protection, there is no
 * need for #ifndef/#endif protection here.
 */

#ifndef OPAL_CONFIG_H
#error "opal_config_bottom.h should only be included from opal_config.h"
#endif

/*
 * If we build a static library, Visual C defines the _LIB symbol. In the
 * case of a shared library, _USERDLL gets defined.
 *
 * OMPI_BUILDING and _LIB define how opal_config.h
 * handles configuring all of Open MPI's "compatibility" code. Both
 * constants will always be defined by the end of opal_config.h.
 *
 * OMPI_BUILDING affects how much compatibility code is included by
 * opal_config.h. It will always be 1 or 0. The user can set the
 * value before including either mpi.h or opal_config.h and it will be
 * respected. If opal_config.h is included before mpi.h, it will
 * default to 1. If mpi.h is included before opal_config.h, it will
 * default to 0.
 */
#ifndef OMPI_BUILDING
#define OMPI_BUILDING 1
#endif
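
/*
 * Usage sketch (illustrative only, based on the description above): a
 * source file that wants just the configuration data, without the
 * internal compatibility code, can set the value before the include:
 *
 *     #define OMPI_BUILDING 0
 *     #include "opal_config.h"
 *
 * Files that include opal_config.h first without defining the symbol
 * get the default of 1.
 */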

/*
 * Flex is trying to include the unistd.h file. As there is no configure
 * option for this, the flex generated files will try to include the file
 * even on platforms without unistd.h. Therefore, if we know this file is
 * not available, we can prevent flex from including it.
 */
#ifndef HAVE_UNISTD_H
#define YY_NO_UNISTD_H
#endif

/***********************************************************************
 *
 * Code that should be in opal_config_bottom.h regardless of build
 * status
 *
 **********************************************************************/

/*
 * BEGIN_C_DECLS should be used at the beginning of your declarations,
 * so that C++ compilers don't mangle their names. Use END_C_DECLS at
 * the end of C declarations.
 */
#undef BEGIN_C_DECLS
#undef END_C_DECLS
#if defined(c_plusplus) || defined(__cplusplus)
# define BEGIN_C_DECLS extern "C" {
# define END_C_DECLS }
#else
# define BEGIN_C_DECLS /* empty */
# define END_C_DECLS /* empty */
#endif
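
/*
 * Usage sketch (illustrative only; opal_example_init is a hypothetical
 * name, not an existing OPAL function). A header that may be included
 * from both C and C++ code would typically wrap its prototypes like so:
 *
 *     BEGIN_C_DECLS
 *     int opal_example_init(void);
 *     END_C_DECLS
 *
 * Under a C++ compiler the pair expands to extern "C" { ... }; under a
 * C compiler both macros expand to nothing.
 */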

/**
 * The attribute definition should be included before any potential
 * usage.
 */
#if OPAL_HAVE_ATTRIBUTE_ALIGNED
# define __opal_attribute_aligned__(a) __attribute__((__aligned__(a)))
# define __opal_attribute_aligned_max__ __attribute__((__aligned__))
#else
# define __opal_attribute_aligned__(a)
# define __opal_attribute_aligned_max__
#endif

#if OPAL_HAVE_ATTRIBUTE_ALWAYS_INLINE
# define __opal_attribute_always_inline__ __attribute__((__always_inline__))
#else
# define __opal_attribute_always_inline__
#endif

#if OPAL_HAVE_ATTRIBUTE_COLD
# define __opal_attribute_cold__ __attribute__((__cold__))
#else
# define __opal_attribute_cold__
#endif

#if OPAL_HAVE_ATTRIBUTE_CONST
# define __opal_attribute_const__ __attribute__((__const__))
#else
# define __opal_attribute_const__
#endif

#if OPAL_HAVE_ATTRIBUTE_DEPRECATED
# define __opal_attribute_deprecated__ __attribute__((__deprecated__))
#else
# define __opal_attribute_deprecated__
#endif

#if OPAL_HAVE_ATTRIBUTE_FORMAT
# define __opal_attribute_format__(a,b,c) __attribute__((__format__(a, b, c)))
#else
# define __opal_attribute_format__(a,b,c)
#endif
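
/*
 * Usage sketch (illustrative only; opal_example_log is a hypothetical
 * name). The attribute macros are meant to follow a declaration, e.g.
 * to let the compiler check printf-style arguments:
 *
 *     int opal_example_log(const char *fmt, ...)
 *         __opal_attribute_format__(__printf__, 1, 2);
 *
 * On compilers without the attribute, the macro expands to nothing and
 * the declaration is left unchanged.
 */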

/* Use this __attribute__ only on function-pointer declarations */
#if OPAL_HAVE_ATTRIBUTE_FORMAT_FUNCPTR
# define __opal_attribute_format_funcptr__(a,b,c) __attribute__((__format__(a, b, c)))
#else
# define __opal_attribute_format_funcptr__(a,b,c)
#endif

#if OPAL_HAVE_ATTRIBUTE_HOT
# define __opal_attribute_hot__ __attribute__((__hot__))
#else
# define __opal_attribute_hot__
#endif

#if OPAL_HAVE_ATTRIBUTE_MALLOC
# define __opal_attribute_malloc__ __attribute__((__malloc__))
#else
# define __opal_attribute_malloc__
#endif

#if OPAL_HAVE_ATTRIBUTE_MAY_ALIAS
# define __opal_attribute_may_alias__ __attribute__((__may_alias__))
#else
# define __opal_attribute_may_alias__
#endif

#if OPAL_HAVE_ATTRIBUTE_NO_INSTRUMENT_FUNCTION
# define __opal_attribute_no_instrument_function__ __attribute__((__no_instrument_function__))
#else
# define __opal_attribute_no_instrument_function__
#endif

#if OPAL_HAVE_ATTRIBUTE_NOINLINE
# define __opal_attribute_noinline__ __attribute__((__noinline__))
#else
# define __opal_attribute_noinline__
#endif

#if OPAL_HAVE_ATTRIBUTE_NONNULL
# define __opal_attribute_nonnull__(a) __attribute__((__nonnull__(a)))
# define __opal_attribute_nonnull_all__ __attribute__((__nonnull__))
#else
# define __opal_attribute_nonnull__(a)
# define __opal_attribute_nonnull_all__
#endif

#if OPAL_HAVE_ATTRIBUTE_NORETURN
# define __opal_attribute_noreturn__ __attribute__((__noreturn__))
#else
# define __opal_attribute_noreturn__
#endif

/* Use this __attribute__ only on function-pointer declarations */
#if OPAL_HAVE_ATTRIBUTE_NORETURN_FUNCPTR
# define __opal_attribute_noreturn_funcptr__ __attribute__((__noreturn__))
#else
# define __opal_attribute_noreturn_funcptr__
#endif

#if OPAL_HAVE_ATTRIBUTE_PACKED
# define __opal_attribute_packed__ __attribute__((__packed__))
#else
# define __opal_attribute_packed__
#endif

#if OPAL_HAVE_ATTRIBUTE_PURE
# define __opal_attribute_pure__ __attribute__((__pure__))
#else
# define __opal_attribute_pure__
#endif

#if OPAL_HAVE_ATTRIBUTE_SENTINEL
# define __opal_attribute_sentinel__ __attribute__((__sentinel__))
#else
# define __opal_attribute_sentinel__
#endif

#if OPAL_HAVE_ATTRIBUTE_UNUSED
# define __opal_attribute_unused__ __attribute__((__unused__))
#else
# define __opal_attribute_unused__
#endif

#if OPAL_HAVE_ATTRIBUTE_VISIBILITY
# define __opal_attribute_visibility__(a) __attribute__((__visibility__(a)))
#else
# define __opal_attribute_visibility__(a)
#endif

#if OPAL_HAVE_ATTRIBUTE_WARN_UNUSED_RESULT
# define __opal_attribute_warn_unused_result__ __attribute__((__warn_unused_result__))
#else
# define __opal_attribute_warn_unused_result__
#endif

#if OPAL_HAVE_ATTRIBUTE_WEAK_ALIAS
# define __opal_attribute_weak_alias__(a) __attribute__((__weak__, __alias__(a)))
#else
# define __opal_attribute_weak_alias__(a)
#endif

#if OPAL_HAVE_ATTRIBUTE_DESTRUCTOR
# define __opal_attribute_destructor__ __attribute__((__destructor__))
#else
# define __opal_attribute_destructor__
#endif

#if OPAL_C_HAVE_VISIBILITY
# define OPAL_DECLSPEC __opal_attribute_visibility__("default")
# define OPAL_MODULE_DECLSPEC __opal_attribute_visibility__("default")
#else
# define OPAL_DECLSPEC
# define OPAL_MODULE_DECLSPEC
#endif

#if !defined(__STDC_LIMIT_MACROS) && (defined(c_plusplus) || defined (__cplusplus))
/* When using a C++ compiler, the max / min value #defines for std
   types are only included if __STDC_LIMIT_MACROS is set before
   including stdint.h */
#define __STDC_LIMIT_MACROS
#endif
#include "opal_config.h"
#include "opal_stdint.h"

/***********************************************************************
 *
 * Code that is only for when building Open MPI or utilities that are
 * using the internals of Open MPI. It should not be included when
 * building MPI applications
 *
 **********************************************************************/
#if OMPI_BUILDING

#ifndef HAVE_PTRDIFF_T
typedef OPAL_PTRDIFF_TYPE ptrdiff_t;
#endif

/*
 * Maximum size of a filename path.
 */
#include <limits.h>
#ifdef HAVE_SYS_PARAM_H
#include <sys/param.h>
#endif
#if defined(PATH_MAX)
#define OPAL_PATH_MAX (PATH_MAX + 1)
#elif defined(_POSIX_PATH_MAX)
#define OPAL_PATH_MAX (_POSIX_PATH_MAX + 1)
#else
#define OPAL_PATH_MAX 256
#endif
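
/*
 * Usage sketch (illustrative only): size filename buffers with
 * OPAL_PATH_MAX rather than a hard-coded constant, e.g.
 *
 *     char filename[OPAL_PATH_MAX];
 *     snprintf(filename, sizeof(filename), "%s/%s", dir, base);
 *
 * where dir and base are hypothetical strings supplied by the caller.
 * Note that the fallback value is only 256 bytes, so callers should
 * still check for truncation.
 */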

/*
 * Set the compile-time path-separator on this system and variable separator
 */
#define OPAL_PATH_SEP "/"
#define OPAL_ENV_SEP ':'


/*
 * Do we want memory debugging?
 *
 * A few scenarios:
 *
 * 1. In the OMPI C library: we want these defines in all cases
 * 2. In the OMPI C++ bindings: we do not want them
 * 3. In the OMPI C++ executables: we do want them
 *
 * So for 1, everyone must include <opal_config.h> first. For 2, the
 * C++ bindings will never include <opal_config.h> directly -- they will
 * only include <mpi.h>, which includes <opal_config.h>, but after
 * setting OMPI_BUILDING to 0. For 3, it's the same as 1 -- just include
 * <opal_config.h> first.
 *
 * Give code that needs to include opal_config.h but really can't have
 * this stuff enabled (like the memory manager code) a way to turn us
 * off: define OPAL_DISABLE_ENABLE_MEM_DEBUG before the include.
 */
#if OPAL_ENABLE_MEM_DEBUG && !defined(OPAL_DISABLE_ENABLE_MEM_DEBUG)
|
2004-02-10 00:07:09 +00:00
|
|
|
|
2005-07-04 01:36:20 +00:00
|
|
|
/* It is safe to include opal/util/malloc.h here because a) it will only
|
2004-06-07 15:33:53 +00:00
|
|
|
happen when we are building OMPI and therefore have a full OMPI
|
2004-02-10 00:07:09 +00:00
|
|
|
source tree [including headers] available, and b) we guaranteed to
|
2005-07-04 01:36:20 +00:00
|
|
|
*not* to include anything else via opal/util/malloc.h, so we won't
|
2004-02-10 00:07:09 +00:00
|
|
|
have Cascading Includes Of Death. */
|
2005-07-04 01:36:20 +00:00
|
|
|
# include "opal/util/malloc.h"
|
2004-11-27 12:43:19 +00:00
|
|
|
# if defined(malloc)
|
|
|
|
# undef malloc
|
|
|
|
# endif
|
2005-07-04 01:36:20 +00:00
|
|
|
# define malloc(size) opal_malloc((size), __FILE__, __LINE__)
|
2004-11-27 12:43:19 +00:00
|
|
|
# if defined(calloc)
|
|
|
|
# undef calloc
|
|
|
|
# endif
|
2005-07-04 01:36:20 +00:00
|
|
|
# define calloc(nmembers, size) opal_calloc((nmembers), (size), __FILE__, __LINE__)
|
2004-11-27 12:43:19 +00:00
|
|
|
# if defined(realloc)
|
|
|
|
# undef realloc
|
|
|
|
# endif
|
2005-07-04 01:36:20 +00:00
|
|
|
# define realloc(ptr, size) opal_realloc((ptr), (size), __FILE__, __LINE__)
|
2004-11-27 12:43:19 +00:00
|
|
|
# if defined(free)
|
|
|
|
# undef free
|
|
|
|
# endif
|
2005-07-04 01:36:20 +00:00
|
|
|
# define free(ptr) opal_free((ptr), __FILE__, __LINE__)
|
2004-11-27 12:43:19 +00:00
|
|
|
|
|
|
|
/*
|
2009-05-06 20:11:28 +00:00
|
|
|
* If we're mem debugging, make the OPAL_DEBUG_ZERO resolve to memset
|
2004-11-27 12:43:19 +00:00
|
|
|
*/
|
|
|
|
# include <string.h>
|
2009-05-06 20:11:28 +00:00
|
|
|
# define OPAL_DEBUG_ZERO(obj) memset(&(obj), 0, sizeof(obj))
|
2004-11-27 12:43:19 +00:00
|
|
|
#else
|
2009-05-06 20:11:28 +00:00
|
|
|
# define OPAL_DEBUG_ZERO(obj)
|
2004-02-10 00:07:09 +00:00
|
|
|
#endif
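The undef-then-define pattern in the memory-debug block above can be tried in isolation. The following is a minimal sketch, assuming hypothetical stand-ins `demo_malloc()` and `demo_free()` in place of `opal_malloc()` and `opal_free()`; it only illustrates how `__FILE__` and `__LINE__` reach the wrapper and is not the OPAL implementation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for opal_malloc()/opal_free(): they remember the
   most recent call site, then forward to the C library. They are defined
   before the macros below, so the calls inside them are the real ones. */
static const char *last_file = NULL;
static int last_line = 0;

static void *demo_malloc(size_t size, const char *file, int line)
{
    last_file = file;
    last_line = line;
    return malloc(size);
}

static void demo_free(void *ptr, const char *file, int line)
{
    last_file = file;
    last_line = line;
    free(ptr);
}

/* Same undef-then-define pattern as the header uses. */
#if defined(malloc)
#  undef malloc
#endif
#define malloc(size) demo_malloc((size), __FILE__, __LINE__)

#if defined(free)
#  undef free
#endif
#define free(ptr) demo_free((ptr), __FILE__, __LINE__)

/* Returns 1 if both calls were routed through the wrappers. */
static int demo_tracking(void)
{
    int ok;
    int expected_line = __LINE__ + 1;
    char *buf = malloc(16);            /* expands to demo_malloc(...) */
    if (NULL == buf) {
        return 0;
    }
    ok = (last_line == expected_line) && (0 == strcmp(last_file, __FILE__));
    free(buf);                         /* expands to demo_free(...) */
    return ok && (last_line > expected_line);
}
```

Because the macros are function-like, only call expressions such as `malloc(16)` are rewritten; taking the address of `malloc` would still refer to the C library function.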
|
2004-02-13 05:39:46 +00:00
|
|
|
|
2004-08-10 22:41:17 +00:00
|
|
|
/*
|
2005-03-29 02:48:50 +00:00
|
|
|
* printf functions for portability (only when building Open MPI)
|
2004-08-10 22:41:17 +00:00
|
|
|
*/
|
2004-08-19 19:27:15 +00:00
|
|
|
#if !defined(HAVE_VASPRINTF) || !defined(HAVE_VSNPRINTF)
|
|
|
|
#include <stdarg.h>
|
|
|
|
#include <stdlib.h>
|
|
|
|
#endif
|
|
|
|
|
2005-02-16 15:38:37 +00:00
|
|
|
#if !defined(HAVE_ASPRINTF) || !defined(HAVE_SNPRINTF) || !defined(HAVE_VASPRINTF) || !defined(HAVE_VSNPRINTF)
|
2005-07-04 02:16:57 +00:00
|
|
|
#include "opal/util/printf.h"
|
2005-02-16 15:38:37 +00:00
|
|
|
#endif
|
|
|
|
|
2004-08-10 22:41:17 +00:00
|
|
|
#ifndef HAVE_ASPRINTF
|
2005-07-04 02:16:57 +00:00
|
|
|
# define asprintf opal_asprintf
|
2004-08-10 22:41:17 +00:00
|
|
|
#endif
|
2004-08-19 19:27:15 +00:00
|
|
|
|
2004-08-10 22:41:17 +00:00
|
|
|
#ifndef HAVE_SNPRINTF
|
2005-07-04 02:16:57 +00:00
|
|
|
# define snprintf opal_snprintf
|
2004-08-10 22:41:17 +00:00
|
|
|
#endif
|
2004-08-19 19:27:15 +00:00
|
|
|
|
2004-08-10 22:41:17 +00:00
|
|
|
#ifndef HAVE_VASPRINTF
|
2005-07-04 02:16:57 +00:00
|
|
|
# define vasprintf opal_vasprintf
|
2004-08-10 22:41:17 +00:00
|
|
|
#endif
|
2004-08-19 19:27:15 +00:00
|
|
|
|
2004-08-10 22:41:17 +00:00
|
|
|
#ifndef HAVE_VSNPRINTF
|
2005-07-04 02:16:57 +00:00
|
|
|
# define vsnprintf opal_vsnprintf
|
2004-08-10 22:41:17 +00:00
|
|
|
#endif
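For readers wondering what a replacement such as `opal_asprintf` has to do, here is a rough sketch under the assumption that a C99-conforming `vsnprintf` is available. `demo_asprintf()` is a hypothetical name and this is not the code in opal/util/printf.h.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical fallback in the spirit of an asprintf replacement.
   Formats into a freshly allocated string and returns its length,
   or -1 on failure. */
static int demo_asprintf(char **out, const char *fmt, ...)
{
    va_list ap;
    int len;

    /* First pass: C99 vsnprintf with size 0 only reports the length. */
    va_start(ap, fmt);
    len = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (len < 0) {
        *out = NULL;
        return -1;
    }

    *out = malloc((size_t) len + 1);
    if (NULL == *out) {
        return -1;
    }

    /* Second pass: the argument list must be restarted before reuse. */
    va_start(ap, fmt);
    len = vsnprintf(*out, (size_t) len + 1, fmt, ap);
    va_end(ap);
    return len;
}

/* Builds "rank 3 of 8" and returns 1 if the text and length match. */
static int demo_asprintf_check(void)
{
    char *msg = NULL;
    int len = demo_asprintf(&msg, "rank %d of %d", 3, 8);
    int ok = (11 == len) && (NULL != msg) && (0 == strcmp(msg, "rank 3 of 8"));
    free(msg);
    return ok;
}
```

The caller owns the returned string and must release it with `free()`, as with the real `asprintf`.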
|
2005-03-29 02:48:50 +00:00
|
|
|
|
2005-07-14 04:11:59 +00:00
|
|
|
/*
|
|
|
|
* Some platforms (Solaris) have a broken qsort implementation. Work
|
|
|
|
* around by using our own.
|
|
|
|
*/
|
2009-05-06 20:11:28 +00:00
|
|
|
#if OPAL_HAVE_BROKEN_QSORT
|
2005-07-04 02:16:57 +00:00
|
|
|
#ifdef qsort
|
|
|
|
#undef qsort
|
|
|
|
#endif
|
|
|
|
|
|
|
|
#include "opal/util/qsort.h"
|
|
|
|
#define qsort opal_qsort
|
2005-06-26 23:11:37 +00:00
|
|
|
#endif
|
|
|
|
|
2005-07-14 04:11:59 +00:00
|
|
|
/*
|
|
|
|
* On some homogeneous big-iron machines (Sandia's Red Storm), there
|
|
|
|
* are no htonl and friends. If that's the case, provide stubs. I
|
|
|
|
* would hope we never find a platform that doesn't have these macros
|
2005-12-13 06:13:25 +00:00
|
|
|
* and would want to talk to the outside world... On other platforms
|
2013-02-28 17:31:47 +00:00
|
|
|
* we fail to detect them correctly.
|
2005-07-14 04:11:59 +00:00
|
|
|
*/
|
2013-02-28 17:31:47 +00:00
|
|
|
#if !defined(HAVE_UNIX_BYTESWAP)
|
2005-07-14 04:11:59 +00:00
|
|
|
static inline uint32_t htonl(uint32_t hostvar) { return hostvar; }
|
|
|
|
static inline uint32_t ntohl(uint32_t netvar) { return netvar; }
|
|
|
|
static inline uint16_t htons(uint16_t hostvar) { return hostvar; }
|
|
|
|
static inline uint16_t ntohs(uint16_t netvar) { return netvar; }
|
|
|
|
#endif
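The identity stubs above are only safe when every peer shares one byte order. As a sketch of the difference, the block below contrasts identity stubs with an endianness-independent conversion; all `demo_` names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Identity stubs, renamed so they cannot clash with a real htonl/ntohl. */
static inline uint32_t demo_htonl_stub(uint32_t hostvar) { return hostvar; }
static inline uint32_t demo_ntohl_stub(uint32_t netvar) { return netvar; }

/* Endianness-independent conversion: lays the value out most significant
   byte first, which is what network byte order means. */
static inline uint32_t demo_htonl_portable(uint32_t hostvar)
{
    unsigned char bytes[4];
    uint32_t netvar;

    bytes[0] = (unsigned char) (hostvar >> 24);
    bytes[1] = (unsigned char) (hostvar >> 16);
    bytes[2] = (unsigned char) (hostvar >> 8);
    bytes[3] = (unsigned char) (hostvar);
    memcpy(&netvar, bytes, sizeof(netvar));
    return netvar;
}

/* Returns the first byte of the value as it sits in memory. */
static inline int demo_first_wire_byte(uint32_t netvar)
{
    unsigned char first;
    memcpy(&first, &netvar, 1);
    return first;
}
```

A round trip through the stubs returns the input unchanged on any host, while the portable version always puts the most significant byte first in memory.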
|
|
|
|
|
2005-03-28 20:25:39 +00:00
|
|
|
/*
|
|
|
|
* Define __func__-preprocessor directive if the compiler does not
|
|
|
|
* already define it. Define it to __FILE__ so that we at least have
|
|
|
|
* a clue where the developer is trying to indicate where the error is
|
|
|
|
* coming from (assuming that __func__ is typically used for
|
|
|
|
* printf-style debugging).
|
|
|
|
*/
|
clean up the OMPI_BUILDING #define. Rather than being defined to 1 if
we are part of the source tree and not defined otherwise, we are going
with an always defined if ompi_config.h is included policy. If
ompi_config.h is included before mpi.h or before OMPI_BUILDING is set,
it will set OMPI_BUILDING to 1 and enable all the internal code that
is in ompi_config_bottom.h. Otherwise, it will only include the
system configuration data (enough for defining the C and C++ interfaces
to MPI, but not perturbing the user environment).
This should fix the problems with bool and the like that the Eclipse
folks were seeing. It also cleans up some build system hacks that
we had along the way.
Also, don't use int64_t as the default size of MPI_Offset, because it
requires us including stdint.h in mpi.h, which is something we really
shouldn't be doing.
And finally, fix a ROMIO Makefile that didn't set -DOMPI_BUILDING=1,
as ROMIO includes mpi.h, but not ompi_config.h
This commit was SVN r5430.
2005-04-19 03:51:20 +00:00
|
|
|
#if defined(HAVE_DECL___FUNC__) && !HAVE_DECL___FUNC__
|
2004-10-29 19:14:11 +00:00
|
|
|
#define __func__ __FILE__
|
|
|
|
#endif
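The comment above assumes `__func__` is mostly used for printf-style debugging. A small sketch of that usage follows, assuming a C99 compiler; `DEMO_WHERE` is a hypothetical helper and not an Open MPI macro.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical debug helper: writes "function:line" into buf. If __func__
   were remapped to __FILE__, as the fallback above does, the first field
   would hold the file name instead of the function name. */
#define DEMO_WHERE(buf, len) \
    snprintf((buf), (len), "%s:%d", __func__, __LINE__)

/* Returns 1 if the formatted location names this function. */
static int demo_locate(void)
{
    char where[128];
    DEMO_WHERE(where, sizeof(where));
    return 0 == strncmp(where, "demo_locate:", strlen("demo_locate:"));
}
```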
|
|
|
|
|
2006-08-22 19:28:47 +00:00
|
|
|
#define IOVBASE_TYPE void
|
|
|
|
|
2015-06-23 11:31:48 -07:00
|
|
|
/* ensure the bool type is defined as it is used everywhere */
|
|
|
|
#include <stdbool.h>
|
|
|
|
|
2006-08-28 04:19:42 +00:00
|
|
|
/**
|
|
|
|
* If we generate our own bool type, we need a special way to cast the result
|
2013-02-28 17:31:47 +00:00
|
|
|
* in such a way to keep the compilers silent.
|
2006-08-28 04:19:42 +00:00
|
|
|
*/
|
|
|
|
# define OPAL_INT_TO_BOOL(VALUE) (bool)(VALUE)
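With the C99 `bool` from `<stdbool.h>`, the cast normalizes any nonzero value to `true`, which a narrower integer type would not do. The sketch below uses a hypothetical `DEMO_INT_TO_BOOL` copy of the macro so it can be compiled on its own.

```c
#include <assert.h>
#include <stdbool.h>

/* Local copy of the cast macro, renamed for this illustration. */
#define DEMO_INT_TO_BOOL(VALUE) (bool)(VALUE)

/* Hypothetical helper: reports whether any bit of mask is set in flags. */
static bool demo_has_flag(int flags, int mask)
{
    /* flags & mask may be 0x100; converting to bool yields true, whereas
       truncating to unsigned char would yield 0. */
    return DEMO_INT_TO_BOOL(flags & mask);
}
```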
|
|
|
|
|
2006-09-08 00:10:40 +00:00
|
|
|
/**
|
|
|
|
* Top level define to check 2 things: a) if we want ipv6 support, and
|
|
|
|
* b) the underlying system supports ipv6. Having one #define for
|
|
|
|
* this makes it simpler to check throughout the code base.
|
|
|
|
*/
|
2006-09-08 23:42:32 +00:00
|
|
|
#if OPAL_ENABLE_IPV6 && defined(HAVE_STRUCT_SOCKADDR_IN6)
|
As per the RFC, bring in the ORTE async progress code and the rewrite of OOB:
*** THIS RFC INCLUDES A MINOR CHANGE TO THE MPI-RTE INTERFACE ***
Note: during the course of this work, it was necessary to completely separate the MPI and RTE progress engines. There were multiple places in the MPI layer where ORTE_WAIT_FOR_COMPLETION was being used. A new OMPI_WAIT_FOR_COMPLETION macro was created (defined in ompi/mca/rte/rte.h) that simply cycles across opal_progress until the provided flag becomes false. Places where the MPI layer blocked waiting for RTE to complete an event have been modified to use this macro.
***************************************************************************************
I am reissuing this RFC because of the time that has passed since its original release. Since its initial release and review, I have debugged it further to ensure it fully supports tests like loop_spawn. It therefore seems ready for merge back to the trunk. Given its prior review, I have set the timeout for one week.
The code is in https://bitbucket.org/rhc/ompi-oob2
WHAT: Rewrite of ORTE OOB
WHY: Support asynchronous progress and a host of other features
WHEN: Wed, August 21
SYNOPSIS:
The current OOB has served us well, but a number of limitations have been identified over the years. Specifically:
* it is only progressed when called via opal_progress, which can lead to hangs or recursive calls into libevent (which is not supported by that code)
* we've had issues when multiple NICs are available as the code doesn't "shift" messages between transports - thus, all nodes had to be available via the same TCP interface.
* the OOB "unloads" incoming opal_buffer_t objects during the transmission, thus preventing use of OBJ_RETAIN in the code when repeatedly sending the same message to multiple recipients
* there is no failover mechanism across NICs - if the selected NIC (or its attached switch) fails, we are forced to abort
* only one transport (i.e., component) can be "active"
The revised OOB resolves these problems:
* async progress is used for all application processes, with the progress thread blocking in the event library
* each available TCP NIC is supported by its own TCP module. The ability to asynchronously progress each module independently is provided, but not enabled by default (a runtime MCA parameter turns it "on")
* multi-address TCP NICs (e.g., a NIC with both an IPv4 and IPv6 address, or with virtual interfaces) are supported - reachability is determined by comparing the contact info for a peer against all addresses within the range covered by the address/mask pairs for the NIC.
* a message that arrives on one TCP NIC is automatically shifted to whichever NIC is connected to the next "hop" if that peer cannot be reached by the incoming NIC. If no TCP module can reach the peer, then the OOB attempts to send the message via all other available components - if none can reach the peer, then an "error" is reported back to the RML, which then calls the errmgr for instructions.
* opal_buffer_t now conforms to standard object rules re OBJ_RETAIN as we no longer "unload" the incoming object
* NIC failure is reported to the TCP component, which then tries to resend the message across any other available TCP NIC. If that doesn't work, then the message is given back to the OOB base to try using other components. If all that fails, then the error is reported to the RML, which reports to the errmgr for instructions
* obviously from the above, multiple OOB components (e.g., TCP and UD) can be active in parallel
* the matching code has been moved to the RML (and out of the OOB/TCP component) so it is independent of transport
* routing is done by the individual OOB modules (as opposed to the RML). Thus, both routed and non-routed transports can simultaneously be active
* all blocking send/recv APIs have been removed. Everything operates asynchronously.
KNOWN LIMITATIONS:
* although provision is made for component failover as described above, the code for doing so has not been fully implemented yet. At the moment, if all connections for a given peer fail, the errmgr is notified of a "lost connection", which by default results in termination of the job if it was a lifeline
* the IPv6 code is present and compiles, but is not complete. Since the current IPv6 support in the OOB doesn't work anyway, I don't consider this a blocker
* routing is performed at the individual module level, yet the active routed component is selected on a global basis. We probably should update that to reflect that different transports may need/choose to route in different ways
* obviously, not every error path has been tested nor necessarily covered
* determining abnormal termination is more challenging than in the old code as we now potentially have multiple ways of connecting to a process. Ideally, we would declare "connection failed" when *all* transports can no longer reach the process, but that requires some additional (possibly complex) code. For now, the code replicates the old behavior only somewhat modified - i.e., if a module sees its connection fail, it checks to see if it is a lifeline. If so, it notifies the errmgr that the lifeline is lost - otherwise, it notifies the errmgr that a non-lifeline connection was lost.
* reachability is determined solely on the basis of a shared subnet address/mask - more sophisticated algorithms (e.g., the one used in the tcp btl) are required to handle routing via gateways
* the RML needs to assign sequence numbers to each message on a per-peer basis. The receiving RML will then deliver messages in order, thus preventing out-of-order messaging in the case where messages travel across different transports or a message needs to be redirected/resent due to failure of a NIC
This commit was SVN r29058.
2013-08-22 16:37:40 +00:00
|
|
|
#define OPAL_ENABLE_IPV6 1
|
2006-09-08 00:10:40 +00:00
|
|
|
#else
|
2013-08-22 16:37:40 +00:00
|
|
|
#define OPAL_ENABLE_IPV6 0
|
2006-09-08 00:10:40 +00:00
|
|
|
#endif
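Having a single 0/1 macro lets call sites use a plain `#if` rather than combining two checks. A minimal sketch of such a call site, assuming a POSIX system and a hypothetical `DEMO_ENABLE_IPV6` macro standing in for the one defined above:

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>

/* Hypothetical stand-in for the single top-level switch. */
#ifndef DEMO_ENABLE_IPV6
#define DEMO_ENABLE_IPV6 1
#endif

/* Returns 1 if this build would handle the given address family. */
static int demo_family_supported(int family)
{
    if (AF_INET == family) {
        return 1;
    }
#if DEMO_ENABLE_IPV6
    if (AF_INET6 == family) {
        return 1;
    }
#endif
    return 0;
}
```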
|
|
|
|
|
2007-05-30 15:25:22 +00:00
|
|
|
#if !defined(HAVE_STRUCT_SOCKADDR_STORAGE) && defined(HAVE_STRUCT_SOCKADDR_IN)
|
2007-05-17 01:17:59 +00:00
|
|
|
#define sockaddr_storage sockaddr
|
|
|
|
#define ss_family sa_family
|
|
|
|
#endif
|
2007-07-20 01:34:02 +00:00
|
|
|
|
|
|
|
/* Compatibility structure so that we don't have to have as many
|
|
|
|
#if checks in the code base */
|
|
|
|
#if !defined(HAVE_STRUCT_SOCKADDR_IN6) && defined(HAVE_STRUCT_SOCKADDR_IN)
|
|
|
|
#define sockaddr_in6 sockaddr_in
|
|
|
|
#define sin6_len sin_len
|
|
|
|
#define sin6_family sin_family
|
|
|
|
#define sin6_port sin_port
|
2009-07-02 18:00:26 +00:00
|
|
|
#define sin6_addr sin_addr
|
2007-07-20 01:34:02 +00:00
|
|
|
#endif
|
|
|
|
|
2007-05-17 01:17:59 +00:00
|
|
|
#if !HAVE_DECL_AF_UNSPEC
|
2007-07-20 01:34:02 +00:00
|
|
|
#define AF_UNSPEC 0
|
2007-05-17 01:17:59 +00:00
|
|
|
#endif
|
|
|
|
#if !HAVE_DECL_PF_UNSPEC
|
2007-07-20 01:34:02 +00:00
|
|
|
#define PF_UNSPEC 0
|
|
|
|
#endif
|
|
|
|
#if !HAVE_DECL_AF_INET6
|
|
|
|
#define AF_INET6 AF_UNSPEC
|
|
|
|
#endif
|
|
|
|
#if !HAVE_DECL_PF_INET6
|
|
|
|
#define PF_INET6 PF_UNSPEC
|
2007-05-17 01:17:59 +00:00
|
|
|
#endif
|
|
|
|
|
2007-04-12 16:34:01 +00:00
|
|
|
#if defined(__APPLE__) && defined(HAVE_INTTYPES_H)
|
|
|
|
/* Prior to Mac OS X 10.3, the length modifier "ll" wasn't
|
|
|
|
supported, but "q" was for long long. This isn't ANSI
|
|
|
|
C and causes a warning when using PRI?64 macros. We
|
|
|
|
don't support versions prior to OS X 10.3, so we don't
|
|
|
|
need such backward compatibility. Instead, redefine
|
|
|
|
the macros to be "ll", which is ANSI C and doesn't
|
|
|
|
cause a compiler warning. */
|
|
|
|
#include <inttypes.h>
|
|
|
|
#if defined(__PRI_64_LENGTH_MODIFIER__)
|
|
|
|
#undef __PRI_64_LENGTH_MODIFIER__
|
|
|
|
#define __PRI_64_LENGTH_MODIFIER__ "ll"
|
|
|
|
#endif
|
|
|
|
#if defined(__SCN_64_LENGTH_MODIFIER__)
|
|
|
|
#undef __SCN_64_LENGTH_MODIFIER__
|
|
|
|
#define __SCN_64_LENGTH_MODIFIER__ "ll"
|
|
|
|
#endif
|
|
|
|
#endif
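The length-modifier macros matter because the `PRI?64` format macros are assembled from them on that platform. As a sketch of the kind of call site this protects, using only standard `<inttypes.h>` macros (the `demo_` functions are hypothetical):

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/* Formats a 64-bit offset portably; PRId64 expands to the right
   conversion specifier for int64_t on the current platform. */
static int demo_format_offset(char *buf, size_t len, int64_t offset)
{
    return snprintf(buf, len, "offset=%" PRId64, offset);
}

/* Returns 1 if a value wider than 32 bits is printed in full. */
static int demo_format_check(void)
{
    char buf[64];
    int64_t big = INT64_C(1) << 40;   /* does not fit in 32 bits */

    demo_format_offset(buf, sizeof(buf), big);
    return 0 == strcmp(buf, "offset=1099511627776");
}
```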
|
|
|
|
|
2007-07-10 03:46:57 +00:00
|
|
|
#ifdef MCS_VXWORKS
|
|
|
|
/* VXWorks puts some common functions in oddly named headers. Rather
|
|
|
|
than update all the places the functions are used, which would be a
|
|
|
|
maintenance disaster, just update here... */
|
|
|
|
#ifdef HAVE_IOLIB_H
|
|
|
|
/* pipe(), ioctl() */
|
|
|
|
#include <ioLib.h>
|
|
|
|
#endif
|
|
|
|
#ifdef HAVE_SOCKLIB_H
|
|
|
|
/* socket() */
|
|
|
|
#include <sockLib.h>
|
|
|
|
#endif
|
|
|
|
#ifdef HAVE_HOSTLIB_H
|
|
|
|
/* gethostname() */
|
|
|
|
#include <hostLib.h>
|
|
|
|
|
|
|
|
#ifndef MAXHOSTNAMELEN
|
|
|
|
#define MAXHOSTNAMELEN 64
|
|
|
|
#endif
|
|
|
|
#endif
|
|
|
|
|
|
|
|
#endif
|
|
|
|
|
2009-03-16 21:09:54 +00:00
|
|
|
/* If we're in C++, then just undefine restrict and then define it to
|
|
|
|
nothing. "restrict" is not part of the C++ language, and we don't
|
|
|
|
have a corresponding AC_CXX_RESTRICT to figure out what the C++
|
|
|
|
compiler supports. */
|
|
|
|
#if defined(c_plusplus) || defined(__cplusplus)
|
|
|
|
#undef restrict
|
|
|
|
#define restrict
|
2009-01-26 20:13:44 +00:00
|
|
|
#endif
|
|
|
|
|
2011-03-24 22:39:56 +00:00
|
|
|
#else
|
|
|
|
|
|
|
|
/* For a similar reason to what is listed in opal_config_top.h, we
|
|
|
|
want to protect others from the autoconf/automake-generated
|
|
|
|
PACKAGE_<foo> macros in opal_config.h. We can't put these undef's
|
|
|
|
directly in opal_config.h because they'll be turned into #defines
|
2015-06-23 20:59:57 -07:00
|
|
|
via autoconf.
|
2011-03-24 22:39:56 +00:00
|
|
|
|
|
|
|
So put them here in case anyone else includes OMPI/ORTE/OPAL's
|
|
|
|
config.h files. */
|
|
|
|
|
|
|
|
#undef PACKAGE_BUGREPORT
|
|
|
|
#undef PACKAGE_NAME
|
|
|
|
#undef PACKAGE_STRING
|
|
|
|
#undef PACKAGE_TARNAME
|
|
|
|
#undef PACKAGE_VERSION
|
|
|
|
#undef PACKAGE_URL
|
|
|
|
#undef HAVE_CONFIG_H
|
|
|
|
|
2005-04-19 03:51:20 +00:00
|
|
|
#endif /* OMPI_BUILDING */
|