Resolve merge conflicts

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
Commit 9557fa087f

LICENSE (2 changed lines)
@@ -53,7 +53,7 @@ Copyright (c) 2014-2015 Hewlett-Packard Development Company, LP. All
 rights reserved.
 Copyright (c) 2013-2017 Research Organization for Information Science (RIST).
 All rights reserved.
-Copyright (c) 2017 Amazon.com, Inc. or its affiliates. All Rights
+Copyright (c) 2017-2018 Amazon.com, Inc. or its affiliates. All Rights
 reserved.
 Copyright (c) 2018 DataDirect Networks. All rights reserved.
 
NEWS (47 changed lines)

@@ -80,6 +80,53 @@ Master (not on release branches yet)
   Currently, this means the Open SHMEM layer will only build if
   a MXM or UCX library is found.
 
+3.1.2 -- August, 2018
+------------------------
+
+- A subtle race condition bug was discovered in the "vader" BTL
+  (shared memory communications) that, in rare instances, can cause
+  MPI processes to crash or incorrectly classify (or effectively drop)
+  an MPI message sent via shared memory.  If you are using the "ob1"
+  PML with "vader" for shared memory communication (note that vader is
+  the default for shared memory communication with ob1), you need to
+  upgrade to v3.1.2 or later to fix this issue.  You may also upgrade
+  to the following versions to fix this issue:
+  - Open MPI v2.1.5 (expected end of August, 2018) or later in the
+    v2.1.x series
+  - Open MPI v3.0.1 (released March, 2018) or later in the v3.0.x
+    series
+- Assorted Portals 4.0 bug fixes.
+- Fix for possible data corruption in MPI_BSEND.
+- Move shared memory file for vader btl into /dev/shm on Linux.
+- Fix for MPI_ISCATTER/MPI_ISCATTERV Fortran interfaces with MPI_IN_PLACE.
+- Upgrade PMIx to v2.1.3.
+- Numerous One-sided bug fixes.
+- Fix for race condition in uGNI BTL.
+- Improve handling of large number of interfaces with TCP BTL.
+- Numerous UCX bug fixes.
+
+3.1.1 -- June, 2018
+-------------------
+
+- Fix potential hang in UCX PML during MPI_FINALIZE
+- Update internal PMIx to v2.1.2rc2 to fix forward version compatibility.
+- Add new MCA parameter osc_sm_backing_store to allow users to specify
+  where in the filesystem the backing file for the shared memory
+  one-sided component should live.  Defaults to /dev/shm on Linux.
+- Fix potential hang on non-x86 platforms when using builds with
+  optimization flags turned off.
+- Disable osc/pt2pt when using MPI_THREAD_MULTIPLE due to numerous
+  race conditions in the component.
+- Fix dummy variable names for the mpi and mpi_f08 Fortran bindings to
+  match the MPI standard.  This may break applications which use
+  name-based parameters in Fortran which used our internal names
+  rather than those documented in the MPI standard.
+- Revamp Java detection to properly handle new Java versions which do
+  not provide a javah wrapper.
+- Fix RMA function signatures for use-mpi-f08 bindings to have the
+  asynchronous property on all buffers.
+- Improved configure logic for finding the UCX library.
+
 3.1.0 -- May, 2018
 ------------------
 
README (26 changed lines)

@@ -8,7 +8,7 @@ Copyright (c) 2004-2008 High Performance Computing Center Stuttgart,
                         University of Stuttgart. All rights reserved.
 Copyright (c) 2004-2007 The Regents of the University of California.
                         All rights reserved.
-Copyright (c) 2006-2017 Cisco Systems, Inc. All rights reserved.
+Copyright (c) 2006-2018 Cisco Systems, Inc. All rights reserved.
 Copyright (c) 2006-2011 Mellanox Technologies. All rights reserved.
 Copyright (c) 2006-2012 Oracle and/or its affiliates. All rights reserved.
 Copyright (c) 2007      Myricom, Inc. All rights reserved.
@@ -605,7 +605,6 @@ Network Support
 - Loopback (send-to-self)
 - Shared memory
 - TCP
-- Intel Phi SCIF
 - SMCUDA
 - Cisco usNIC
 - uGNI (Cray Gemini, Aries)
@@ -768,6 +767,26 @@ Open MPI is unable to find relevant support for <foo>, configure will
 assume that it was unable to provide a feature that was specifically
 requested and will abort so that a human can resolve out the issue.
+
+Additionally, if a search directory is specified in the form
+--with-<foo>=<dir>, Open MPI will:
+
+1. Search for <foo>'s header files in <dir>/include.
+2. Search for <foo>'s library files:
+   2a. If --with-<foo>-libdir=<libdir> was specified, search in
+       <libdir>.
+   2b. Otherwise, search in <dir>/lib, and if they are not found
+       there, search again in <dir>/lib64.
+3. If both the relevant header files and libraries are found:
+   3a. Open MPI will build support for <foo>.
+   3b. If the root path where the <foo> libraries are found is neither
+       "/usr" nor "/usr/local", Open MPI will compile itself with
+       RPATH flags pointing to the directory where <foo>'s libraries
+       are located.  Open MPI does not RPATH /usr/lib[64] and
+       /usr/local/lib[64] because many systems already search these
+       directories for run-time libraries by default; adding RPATH for
+       them could have unintended consequences for the search path
+       ordering.
 
 INSTALLATION OPTIONS
 
 --prefix=<directory>
@@ -1000,9 +1019,6 @@ NETWORKING SUPPORT / OPTIONS
   covers most cases.  This option is only needed for special
   configurations.
 
---with-scif=<dir>
-  Look in directory for Intel SCIF support libraries
-
 --with-verbs=<directory>
   Specify the directory where the verbs (also known as OpenFabrics
   verbs, or Linux verbs, and previously known as OpenIB) libraries and
@@ -61,7 +61,7 @@ my $include_list;
 my $exclude_list;
 
 # Minimum versions
-my $ompi_automake_version = "1.12.2";
+my $ompi_automake_version = "1.13.4";
 my $ompi_autoconf_version = "2.69";
 my $ompi_libtool_version = "2.4.2";
 
@@ -1,7 +1,7 @@
 # -*- shell-script -*-
 #
 # Copyright (c) 2009-2017 Cisco Systems, Inc. All rights reserved
-# Copyright (c) 2017      Research Organization for Information Science
+# Copyright (c) 2017-2018 Research Organization for Information Science
 #                         and Technology (RIST). All rights reserved.
 # Copyright (c) 2018      Los Alamos National Security, LLC. All rights
 #                         reserved.
@@ -38,6 +38,7 @@ AC_DEFUN([OMPI_CONFIG_FILES],[
     ompi/mpi/fortran/use-mpi-ignore-tkr/mpi-ignore-tkr-file-interfaces.h
     ompi/mpi/fortran/use-mpi-ignore-tkr/mpi-ignore-tkr-removed-interfaces.h
     ompi/mpi/fortran/use-mpi-f08/Makefile
+    ompi/mpi/fortran/use-mpi-f08/bindings/Makefile
     ompi/mpi/fortran/use-mpi-f08/mod/Makefile
     ompi/mpi/fortran/mpiext-use-mpi/Makefile
     ompi/mpi/fortran/mpiext-use-mpi-f08/Makefile
@@ -347,7 +347,8 @@ AC_DEFUN([OPAL_CHECK_PMIX],[
                      ], [])],
                     [AC_MSG_RESULT([found])
                      opal_external_pmix_version=4x
-                     opal_external_pmix_version_found=1],
+                     opal_external_pmix_version_found=1
+                     opal_external_pmix_happy=yes],
                     [AC_MSG_RESULT([not found])])])
 
     AS_IF([test "$opal_external_pmix_version_found" = "0"],
@@ -437,9 +438,11 @@
               [Whether the external PMIx library is v1])
     AM_CONDITIONAL([OPAL_WANT_PRUN], [test "$opal_prun_happy" = "yes"])
 
-    AS_IF([test "$opal_external_pmix_version" = "1x"],
-          [OPAL_SUMMARY_ADD([[Miscellaneous]],[[PMIx support]], [opal_pmix], [1.2.x: WARNING - DYNAMIC OPS NOT SUPPORTED])],
-          [OPAL_SUMMARY_ADD([[Miscellaneous]],[[PMIx support]], [opal_pmix], [$opal_external_pmix_version])])
+    AS_IF([test "$opal_external_pmix_happy" = "yes"],
+          [AS_IF([test "$opal_external_pmix_version" = "1x"],
+                 [OPAL_SUMMARY_ADD([[Miscellaneous]],[[PMIx support]], [opal_pmix], [External (1.2.5) WARNING - DYNAMIC OPS NOT SUPPORTED])],
+                 [OPAL_SUMMARY_ADD([[Miscellaneous]],[[PMIx support]], [opal_pmix], [External ($opal_external_pmix_version)])])],
+          [OPAL_SUMMARY_ADD([[Miscellaneous]], [[PMIx support]], [opal_pmix], [Internal])])
 
     OPAL_VAR_SCOPE_POP
 ])
@@ -13,7 +13,7 @@ dnl Copyright (c) 2008-2018 Cisco Systems, Inc. All rights reserved.
 dnl Copyright (c) 2010      Oracle and/or its affiliates. All rights reserved.
 dnl Copyright (c) 2015-2017 Research Organization for Information Science
 dnl                         and Technology (RIST). All rights reserved.
-dnl Copyright (c) 2014-2017 Los Alamos National Security, LLC. All rights
+dnl Copyright (c) 2014-2018 Los Alamos National Security, LLC. All rights
 dnl                         reserved.
 dnl Copyright (c) 2017      Amazon.com, Inc. or its affiliates. All Rights
 dnl                         reserved.
@@ -122,6 +122,57 @@ int main(int argc, char** argv)
 }
 ]])
 
+dnl This is a C test to see if 128-bit __atomic_compare_exchange_n()
+dnl actually works (e.g., it compiles and links successfully on
+dnl ARM64+clang, but returns incorrect answers as of August 2018).
+AC_DEFUN([OPAL_ATOMIC_COMPARE_EXCHANGE_STRONG_TEST_SOURCE],[[
+#include <stdint.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <stdatomic.h>
+
+typedef union {
+    uint64_t fake@<:@2@:>@;
+    _Atomic __int128 real;
+} ompi128;
+
+static void test1(void)
+{
+    // As of Aug 2018, we could not figure out a way to assign 128-bit
+    // constants -- the compilers would not accept it.  So use a fake
+    // union to assign 2 uin64_t's to make a single __int128.
+    ompi128 ptr      = { .fake = { 0xFFEEDDCCBBAA0099, 0x8877665544332211 }};
+    ompi128 expected = { .fake = { 0x11EEDDCCBBAA0099, 0x88776655443322FF }};
+    ompi128 desired  = { .fake = { 0x1122DDCCBBAA0099, 0x887766554433EEFF }};
+    bool r = atomic_compare_exchange_strong (&ptr.real, &expected.real,
+                                             desired.real, true,
+                                             atomic_relaxed, atomic_relaxed);
+    if ( !(r == false && ptr.real == expected.real)) {
+        exit(1);
+    }
+}
+
+static void test2(void)
+{
+    ompi128 ptr      = { .fake = { 0xFFEEDDCCBBAA0099, 0x8877665544332211 }};
+    ompi128 expected = ptr;
+    ompi128 desired  = { .fake = { 0x1122DDCCBBAA0099, 0x887766554433EEFF }};
+    bool r = atomic_compare_exchange_strong (&ptr.real, &expected.real,
+                                             desired.real, true,
+                                             atomic_relaxed, atomic_relaxed);
+    if (!(r == true && ptr.real == desired.real)) {
+        exit(2);
+    }
+}
+
+int main(int argc, char** argv)
+{
+    test1();
+    test2();
+    return 0;
+}
+]])
+
 dnl ------------------------------------------------------------------
 
 dnl
@@ -329,6 +380,71 @@ __atomic_add_fetch(&tmp64, 1, __ATOMIC_RELAXED);],
     OPAL_CHECK_GCC_BUILTIN_CSWAP_INT128
 ])
 
+AC_DEFUN([OPAL_CHECK_C11_CSWAP_INT128], [
+    OPAL_VAR_SCOPE_PUSH([atomic_compare_exchange_result atomic_compare_exchange_CFLAGS_save atomic_compare_exchange_LIBS_save])
+
+    atomic_compare_exchange_CFLAGS_save=$CFLAGS
+    atomic_compare_exchange_LIBS_save=$LIBS
+
+    # Do we have C11 atomics on 128-bit integers?
+    # Use a special macro because we need to check with a few different
+    # CFLAGS/LIBS.
+    OPAL_ASM_CHECK_ATOMIC_FUNC([atomic_compare_exchange_strong_16],
+                               [AC_LANG_SOURCE(OPAL_ATOMIC_COMPARE_EXCHANGE_STRONG_TEST_SOURCE)],
+                               [atomic_compare_exchange_result=1],
+                               [atomic_compare_exchange_result=0])
+
+    # If we have it and it works, check to make sure it is always lock
+    # free.
+    AS_IF([test $atomic_compare_exchange_result -eq 1],
+          [AC_MSG_CHECKING([if C11 __int128 atomic compare-and-swap is always lock-free])
+           AC_RUN_IFELSE([AC_LANG_PROGRAM([#include <stdatomic.h>], [_Atomic __int128_t x; if (!atomic_is_lock_free(&x)) { return 1; }])],
+                         [AC_MSG_RESULT([yes])],
+                         [atomic_compare_exchange_result=0
+                          # If this test fails, need to reset CFLAGS/LIBS (the
+                          # above tests atomically set CFLAGS/LIBS or not; this
+                          # test is running after the fact, so we have to undo
+                          # the side-effects of setting CFLAGS/LIBS if the above
+                          # tests passed).
+                          CFLAGS=$atomic_compare_exchange_CFLAGS_save
+                          LIBS=$atomic_compare_exchange_LIBS_save
+                          AC_MSG_RESULT([no])],
+                         [AC_MSG_RESULT([cannot test -- assume yes (cross compiling)])])
+          ])
+
+    AC_DEFINE_UNQUOTED([OPAL_HAVE_C11_CSWAP_INT128],
+                       [$atomic_compare_exchange_result],
+                       [Whether C11 atomic compare swap is both supported and lock-free on 128-bit values])
+
+    dnl If we could not find decent support for 128-bits atomic let's
+    dnl try the GCC _sync
+    AS_IF([test $atomic_compare_exchange_result -eq 0],
+          [OPAL_CHECK_SYNC_BUILTIN_CSWAP_INT128])
+
+    OPAL_VAR_SCOPE_POP
+])
+
+AC_DEFUN([OPAL_CHECK_GCC_ATOMIC_BUILTINS], [
+    AC_MSG_CHECKING([for __atomic builtin atomics])
+
+    AC_TRY_LINK([
+#include <stdint.h>
+uint32_t tmp, old = 0;
+uint64_t tmp64, old64 = 0;], [
+__atomic_thread_fence(__ATOMIC_SEQ_CST);
+__atomic_compare_exchange_n(&tmp, &old, 1, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
+__atomic_add_fetch(&tmp, 1, __ATOMIC_RELAXED);
+__atomic_compare_exchange_n(&tmp64, &old64, 1, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
+__atomic_add_fetch(&tmp64, 1, __ATOMIC_RELAXED);],
+        [AC_MSG_RESULT([yes])
+         $1],
+        [AC_MSG_RESULT([no])
+         $2])
+
+    # Check for 128-bit support
+    OPAL_CHECK_GCC_BUILTIN_CSWAP_INT128
+])
+
 
 dnl #################################################################
 dnl
@ -1020,17 +1136,27 @@ AC_DEFUN([OPAL_CONFIG_ASM],[
|
|||||||
AC_REQUIRE([OPAL_SETUP_CC])
|
AC_REQUIRE([OPAL_SETUP_CC])
|
||||||
AC_REQUIRE([AM_PROG_AS])
|
AC_REQUIRE([AM_PROG_AS])
|
||||||
|
|
||||||
|
AC_ARG_ENABLE([c11-atomics],[AC_HELP_STRING([--enable-c11-atomics],
|
||||||
|
[Enable use of C11 atomics if available (default: enabled)])])
|
||||||
|
|
||||||
AC_ARG_ENABLE([builtin-atomics],
|
AC_ARG_ENABLE([builtin-atomics],
|
||||||
[AC_HELP_STRING([--enable-builtin-atomics],
|
[AC_HELP_STRING([--enable-builtin-atomics],
|
||||||
[Enable use of __sync builtin atomics (default: enabled)])])
|
[Enable use of __sync builtin atomics (default: disabled)])])
|
||||||
|
|
||||||
opal_cv_asm_builtin="BUILTIN_NO"
|
OPAL_CHECK_C11_CSWAP_INT128
|
||||||
AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" != "no"],
|
|
||||||
[OPAL_CHECK_GCC_ATOMIC_BUILTINS([opal_cv_asm_builtin="BUILTIN_GCC"], [])])
|
if test "x$enable_c11_atomics" != "xno" && test "$opal_cv_c11_supported" = "yes" ; then
|
||||||
AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" != "no"],
|
opal_cv_asm_builtin="BUILTIN_C11"
|
||||||
[OPAL_CHECK_SYNC_BUILTINS([opal_cv_asm_builtin="BUILTIN_SYNC"], [])])
|
OPAL_CHECK_C11_CSWAP_INT128
|
||||||
AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" = "yes"],
|
else
|
||||||
[AC_MSG_ERROR([__sync builtin atomics requested but not found.])])
|
opal_cv_asm_builtin="BUILTIN_NO"
|
||||||
|
AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" = "yes"],
|
||||||
|
[OPAL_CHECK_GCC_ATOMIC_BUILTINS([opal_cv_asm_builtin="BUILTIN_GCC"], [])])
|
||||||
|
AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" = "yes"],
|
||||||
|
[OPAL_CHECK_SYNC_BUILTINS([opal_cv_asm_builtin="BUILTIN_SYNC"], [])])
|
||||||
|
AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" = "yes"],
|
||||||
|
[AC_MSG_ERROR([__sync builtin atomics requested but not found.])])
|
||||||
|
fi
|
||||||
|
|
||||||
OPAL_CHECK_ASM_PROC
|
OPAL_CHECK_ASM_PROC
|
||||||
OPAL_CHECK_ASM_TEXT
|
OPAL_CHECK_ASM_TEXT
|
||||||
@@ -10,7 +10,7 @@ dnl Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
 dnl                         University of Stuttgart. All rights reserved.
 dnl Copyright (c) 2004-2005 The Regents of the University of California.
 dnl                         All rights reserved.
-dnl Copyright (c) 2014-2015 Intel, Inc. All rights reserved.
+dnl Copyright (c) 2014-2018 Intel, Inc. All rights reserved.
 dnl Copyright (c) 2015      Cisco Systems, Inc. All rights reserved.
 dnl $COPYRIGHT$
 dnl
@@ -60,6 +60,8 @@ do
         ;;
     -with-platform=* | --with-platform=*)
         ;;
+    -with*=internal)
+        ;;
     *)
         case $subdir_arg in
         *\'*) subdir_arg=`echo "$subdir_arg" | sed "s/'/'\\\\\\\\''/g"` ;;
@@ -100,7 +100,7 @@ OPAL_VAR_SCOPE_POP
 #
 # Init automake
 #
-AM_INIT_AUTOMAKE([foreign dist-bzip2 subdir-objects no-define 1.12.2 tar-ustar])
+AM_INIT_AUTOMAKE([foreign dist-bzip2 subdir-objects no-define 1.13.4 tar-ustar])
 
 # SILENT_RULES is new in AM 1.11, but we require 1.11 or higher via
 # autogen.  Limited testing shows that calling SILENT_RULES directly
@@ -858,7 +858,7 @@ OPAL_SEARCH_LIBS_CORE([ceil], [m])
 # -lrt might be needed for clock_gettime
 OPAL_SEARCH_LIBS_CORE([clock_gettime], [rt])
 
-AC_CHECK_FUNCS([asprintf snprintf vasprintf vsnprintf openpty isatty getpwuid fork waitpid execve pipe ptsname setsid mmap tcgetpgrp posix_memalign strsignal sysconf syslog vsyslog regcmp regexec regfree _NSGetEnviron socketpair strncpy_s usleep mkfifo dbopen dbm_open statfs statvfs setpgid setenv __malloc_initialize_hook __clear_cache])
+AC_CHECK_FUNCS([asprintf snprintf vasprintf vsnprintf openpty isatty getpwuid fork waitpid execve pipe ptsname setsid mmap tcgetpgrp posix_memalign strsignal sysconf syslog vsyslog regcmp regexec regfree _NSGetEnviron socketpair usleep mkfifo dbopen dbm_open statfs statvfs setpgid setenv __malloc_initialize_hook __clear_cache])
 
 # Sanity check: ensure that we got at least one of statfs or statvfs.
 if test $ac_cv_func_statfs = no && test $ac_cv_func_statvfs = no; then
@@ -88,12 +88,8 @@ EXTRA_DIST = \
 	platform/lanl/darwin/mic-common \
 	platform/lanl/darwin/debug \
 	platform/lanl/darwin/debug.conf \
-	platform/lanl/darwin/debug-mic \
-	platform/lanl/darwin/debug-mic.conf \
 	platform/lanl/darwin/optimized \
 	platform/lanl/darwin/optimized.conf \
-	platform/lanl/darwin/optimized-mic \
-	platform/lanl/darwin/optimized-mic.conf \
 	platform/snl/portals4-m5 \
 	platform/snl/portals4-orte \
 	platform/ibm/debug-ppc32-gcc \
@@ -10,7 +10,7 @@
 
 m4=1.4.16
 ac=2.69
-am=1.12.2
+am=1.13.4
 lt=2.4.2
 flex=2.5.35
 
contrib/dist/linux/buildrpm.sh (1 changed line, vendored)

@@ -267,7 +267,6 @@ fi
 # Find where the top RPM-building directory is
 #
 
-rpmtopdir=
 file=~/.rpmmacros
 if test -r $file; then
     rpmtopdir=${rpmtopdir:-"`grep %_topdir $file | awk '{ print $2 }'`"}
@@ -1,100 +0,0 @@
-#
-# Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
-#                         University Research and Technology
-#                         Corporation.  All rights reserved.
-# Copyright (c) 2004-2005 The University of Tennessee and The University
-#                         of Tennessee Research Foundation.  All rights
-#                         reserved.
-# Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
-#                         University of Stuttgart.  All rights reserved.
-# Copyright (c) 2004-2005 The Regents of the University of California.
-#                         All rights reserved.
-# Copyright (c) 2006      Cisco Systems, Inc.  All rights reserved.
-# Copyright (c) 2011-2013 Los Alamos National Security, LLC.
-#                         All rights reserved.
-# $COPYRIGHT$
-#
-# Additional copyrights may follow
-#
-# $HEADER$
-#
-
-# This is the default system-wide MCA parameters defaults file.
-# Specifically, the MCA parameter "mca_param_files" defaults to a
-# value of
-# "$HOME/.openmpi/mca-params.conf:$sysconf/openmpi-mca-params.conf"
-# (this file is the latter of the two).  So if the default value of
-# mca_param_files is not changed, this file is used to set system-wide
-# MCA parameters.  This file can therefore be used to set system-wide
-# default MCA parameters for all users.  Of course, users can override
-# these values if they want, but this file is an excellent location
-# for setting system-specific MCA parameters for those users who don't
-# know / care enough to investigate the proper values for them.
-
-# Note that this file is only applicable where it is visible (in a
-# filesystem sense).  Specifically, MPI processes each read this file
-# during their startup to determine what default values for MCA
-# parameters should be used.  mpirun does not bundle up the values in
-# this file from the node where it was run and send them to all nodes;
-# the default value decisions are effectively distributed.  Hence,
-# these values are only applicable on nodes that "see" this file.  If
-# $sysconf is a directory on a local disk, it is likely that changes
-# to this file will need to be propagated to other nodes.  If $sysconf
-# is a directory that is shared via a networked filesystem, changes to
-# this file will be visible to all nodes that share this $sysconf.
-
-# The format is straightforward: one per line, mca_param_name =
-# rvalue.  Quoting is ignored (so if you use quotes or escape
-# characters, they'll be included as part of the value).  For example:
-
-# Disable run-time MPI parameter checking
-#   mpi_param_check = 0
-
-# Note that the value "~/" will be expanded to the current user's home
-# directory.  For example:
-
-# Change component loading path
-#   component_path = /usr/local/lib/openmpi:~/my_openmpi_components
-
-# See "ompi_info --param all all" for a full listing of Open MPI MCA
-# parameters available and their default values.
-#
-
-# Basic behavior to smooth startup
-mca_base_component_show_load_errors = 0
-opal_set_max_sys_limits = 1
-orte_report_launch_progress = 1
-
-# Define timeout for daemons to report back during launch
-orte_startup_timeout = 10000
-
-## Protect the shared file systems
-orte_no_session_dirs = /panfs,/scratch,/users,/usr/projects
-orte_tmpdir_base = /tmp
-
-## Require an allocation to run - protects the frontend
-## from inadvertent job executions
-orte_allocation_required = 1
-
-## Add the interface for out-of-band communication
-## and set it up
-oob_tcp_if_include=mic0
-oob_tcp_peer_retries = 1000
-oob_tcp_sndbuf = 32768
-oob_tcp_rcvbuf = 32768
-
-## Define the MPI interconnects
-btl = sm,scif,openib,self
-
-## Setup OpenIB - just in case
-btl_openib_want_fork_support = 0
-btl_openib_receive_queues = S,4096,1024:S,12288,512:S,65536,512
-
-## Enable cpu affinity
-hwloc_base_binding_policy = core
-
-## Setup MPI options
-mpi_show_handle_leaks = 1
-mpi_warn_on_fork = 1
-#mpi_abort_print_stack = 1
-
@@ -10,7 +10,7 @@
 # Copyright (c) 2004-2005 The Regents of the University of California.
 #                         All rights reserved.
 # Copyright (c) 2006      Cisco Systems, Inc.  All rights reserved.
-# Copyright (c) 2011-2013 Los Alamos National Security, LLC.
+# Copyright (c) 2011-2018 Los Alamos National Security, LLC.
 #                         All rights reserved.
 # $COPYRIGHT$
 #
@@ -84,7 +84,7 @@ oob_tcp_sndbuf = 32768
 oob_tcp_rcvbuf = 32768
 
 ## Define the MPI interconnects
-btl = sm,scif,openib,self
+btl = sm,openib,self
 
 ## Setup OpenIB - just in case
 btl_openib_want_fork_support = 0
|
@@ -1,100 +0,0 @@
-#
-# Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
-# University Research and Technology
-# Corporation. All rights reserved.
-# Copyright (c) 2004-2005 The University of Tennessee and The University
-# of Tennessee Research Foundation. All rights
-# reserved.
-# Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
-# University of Stuttgart. All rights reserved.
-# Copyright (c) 2004-2005 The Regents of the University of California.
-# All rights reserved.
-# Copyright (c) 2006 Cisco Systems, Inc. All rights reserved.
-# Copyright (c) 2011-2013 Los Alamos National Security, LLC. All rights
-# reserved.
-# $COPYRIGHT$
-#
-# Additional copyrights may follow
-#
-# $HEADER$
-#
-
-# This is the default system-wide MCA parameters defaults file.
-# Specifically, the MCA parameter "mca_param_files" defaults to a
-# value of
-# "$HOME/.openmpi/mca-params.conf:$sysconf/openmpi-mca-params.conf"
-# (this file is the latter of the two). So if the default value of
-# mca_param_files is not changed, this file is used to set system-wide
-# MCA parameters. This file can therefore be used to set system-wide
-# default MCA parameters for all users. Of course, users can override
-# these values if they want, but this file is an excellent location
-# for setting system-specific MCA parameters for those users who don't
-# know / care enough to investigate the proper values for them.
-
-# Note that this file is only applicable where it is visible (in a
-# filesystem sense). Specifically, MPI processes each read this file
-# during their startup to determine what default values for MCA
-# parameters should be used. mpirun does not bundle up the values in
-# this file from the node where it was run and send them to all nodes;
-# the default value decisions are effectively distributed. Hence,
-# these values are only applicable on nodes that "see" this file. If
-# $sysconf is a directory on a local disk, it is likely that changes
-# to this file will need to be propagated to other nodes. If $sysconf
-# is a directory that is shared via a networked filesystem, changes to
-# this file will be visible to all nodes that share this $sysconf.
-
-# The format is straightforward: one per line, mca_param_name =
-# rvalue. Quoting is ignored (so if you use quotes or escape
-# characters, they'll be included as part of the value). For example:
-
-# Disable run-time MPI parameter checking
-# mpi_param_check = 0
-
-# Note that the value "~/" will be expanded to the current user's home
-# directory. For example:
-
-# Change component loading path
-# component_path = /usr/local/lib/openmpi:~/my_openmpi_components
-
-# See "ompi_info --param all all" for a full listing of Open MPI MCA
-# parameters available and their default values.
-#
-
-# Basic behavior to smooth startup
-mca_base_component_show_load_errors = 0
-opal_set_max_sys_limits = 1
-orte_report_launch_progress = 1
-
-# Define timeout for daemons to report back during launch
-orte_startup_timeout = 10000
-
-## Protect the shared file systems
-orte_no_session_dirs = /panfs,/scratch,/users,/usr/projects
-orte_tmpdir_base = /tmp
-
-## Require an allocation to run - protects the frontend
-## from inadvertent job executions
-orte_allocation_required = 1
-
-## Add the interface for out-of-band communication
-## and set it up
-oob_tcp_if_include = mic0
-oob_tcp_peer_retries = 1000
-oob_tcp_sndbuf = 32768
-oob_tcp_rcvbuf = 32768
-
-## Define the MPI interconnects
-btl = sm,scif,openib,self
-
-## Setup OpenIB - just in case
-btl_openib_want_fork_support = 0
-btl_openib_receive_queues = S,4096,1024:S,12288,512:S,65536,512
-
-## Enable cpu affinity
-hwloc_base_binding_policy = core
-
-## Setup MPI options
-mpi_show_handle_leaks = 0
-mpi_warn_on_fork = 1
-#mpi_abort_print_stack = 0
-
@@ -10,7 +10,7 @@
 # Copyright (c) 2004-2005 The Regents of the University of California.
 # All rights reserved.
 # Copyright (c) 2006 Cisco Systems, Inc. All rights reserved.
-# Copyright (c) 2011-2013 Los Alamos National Security, LLC. All rights
+# Copyright (c) 2011-2018 Los Alamos National Security, LLC. All rights
 # reserved.
 # $COPYRIGHT$
 #
@@ -84,7 +84,7 @@ oob_tcp_sndbuf = 32768
 oob_tcp_rcvbuf = 32768
 
 ## Define the MPI interconnects
-btl = sm,scif,openib,self
+btl = sm,openib,self
 
 ## Setup OpenIB - just in case
 btl_openib_want_fork_support = 0
@@ -23,26 +23,11 @@ if [ "$mellanox_autodetect" == "yes" ]; then
 with_ucx=$ucx_dir
 fi
 
-mxm_dir=${mxm_dir:="$(pkg-config --variable=prefix mxm)"}
-if [ -d $mxm_dir ]; then
-with_mxm=$mxm_dir
-fi
-
-fca_dir=${fca_dir:="$(pkg-config --variable=prefix fca)"}
-if [ -d $fca_dir ]; then
-with_fca=$fca_dir
-fi
-
 hcoll_dir=${hcoll_dir:="$(pkg-config --variable=prefix hcoll)"}
 if [ -d $hcoll_dir ]; then
 with_hcoll=$hcoll_dir
 fi
 
-knem_dir=${knem_dir:="$(pkg-config --variable=prefix knem)"}
-if [ -d $knem_dir ]; then
-with_knem=$knem_dir
-fi
-
 slurm_dir=${slurm_dir:="/usr"}
 if [ -f $slurm_dir/include/slurm/slurm.h ]; then
 with_slurm=$slurm_dir
@@ -56,12 +56,10 @@
 
 # See "ompi_info --param all all" for a full listing of Open MPI MCA
 # parameters available and their default values.
-coll_fca_enable = 0
-scoll_fca_enable = 0
 #rmaps_base_mapping_policy = dist:auto
 coll = ^ml
 hwloc_base_binding_policy = core
-btl = vader,openib,self
+btl = self
 # Basic behavior to smooth startup
 mca_base_component_show_load_errors = 0
 orte_abort_timeout = 10
@@ -77,3 +75,6 @@ oob_tcp_sndbuf = 32768
 oob_tcp_rcvbuf = 32768
 
 opal_event_include=epoll
 
+bml_r2_show_unreach_errors = 0
 
@@ -15,7 +15,7 @@
 # Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
 # reserved.
 # Copyright (c) 2015-2017 Intel, Inc. All rights reserved.
-# Copyright (c) 2015-2017 Research Organization for Information Science
+# Copyright (c) 2015-2018 Research Organization for Information Science
 # and Technology (RIST). All rights reserved.
 # Copyright (c) 2016 IBM Corporation. All rights reserved.
 # Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
@@ -93,6 +93,7 @@ SUBDIRS = \
 $(OMPI_FORTRAN_USEMPI_DIR) \
 mpi/fortran/mpiext-use-mpi \
 mpi/fortran/use-mpi-f08/mod \
+mpi/fortran/use-mpi-f08/bindings \
 $(OMPI_MPIEXT_USEMPIF08_DIRS) \
 mpi/fortran/use-mpi-f08 \
 mpi/fortran/mpiext-use-mpi-f08 \
@@ -124,6 +125,7 @@ DIST_SUBDIRS = \
 mpi/fortran/mpiext-use-mpi \
 mpi/fortran/use-mpi-f08 \
 mpi/fortran/use-mpi-f08/mod \
+mpi/fortran/use-mpi-f08/bindings \
 mpi/fortran/mpiext-use-mpi-f08 \
 mpi/java \
 $(OMPI_MPIEXT_ALL_SUBDIRS) \
@@ -1,6 +1,6 @@
 /* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
 /*
- * Copyright (c) 2013-2016 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2013-2018 Los Alamos National Security, LLC. All rights
 * reseved.
 * Copyright (c) 2015 Research Organization for Information Science
 * and Technology (RIST). All rights reserved.
@@ -99,7 +99,7 @@ int ompi_comm_request_schedule_append (ompi_comm_request_t *request, ompi_comm_r
 static int ompi_comm_request_progress (void)
 {
 ompi_comm_request_t *request, *next;
-static int32_t progressing = 0;
+static opal_atomic_int32_t progressing = 0;
 
 /* don't allow re-entry */
 if (opal_atomic_swap_32 (&progressing, 1)) {
@@ -75,7 +75,7 @@ struct ompi_datatype_t {
 struct opal_hash_table_t *d_keyhash; /**< Attribute fields */
 
 void* args; /**< Data description for the user */
-void* packed_description; /**< Packed description of the datatype */
+opal_atomic_intptr_t packed_description; /**< Packed description of the datatype */
 uint64_t pml_data; /**< PML-specific information */
 /* --- cacheline 6 boundary (384 bytes) --- */
 char name[MPI_MAX_OBJECT_NAME];/**< Externally visible name */
@@ -45,7 +45,7 @@ __ompi_datatype_create_from_args( int32_t* i, ptrdiff_t * a,
 ompi_datatype_t** d, int32_t type );
 
 typedef struct __dt_args {
-int32_t ref_count;
+opal_atomic_int32_t ref_count;
 int32_t create_type;
 size_t total_pack_size;
 int32_t ci;
@@ -104,7 +104,7 @@ typedef struct __dt_args {
 pArgs->total_pack_size = (4 + (IC) + (DC)) * sizeof(int) + \
 (AC) * sizeof(ptrdiff_t); \
 (PDATA)->args = (void*)pArgs; \
-(PDATA)->packed_description = NULL; \
+(PDATA)->packed_description = 0; \
 } while(0)
 
 
@@ -483,12 +483,12 @@ int ompi_datatype_get_pack_description( ompi_datatype_t* datatype,
 {
 ompi_datatype_args_t* args = (ompi_datatype_args_t*)datatype->args;
 int next_index = OMPI_DATATYPE_MAX_PREDEFINED;
-void *packed_description = datatype->packed_description;
+void *packed_description = (void *) datatype->packed_description;
 void* recursive_buffer;
 
 if (NULL == packed_description) {
 void *_tmp_ptr = NULL;
-if (opal_atomic_compare_exchange_strong_ptr (&datatype->packed_description, (void *) &_tmp_ptr, (void *) 1)) {
+if (opal_atomic_compare_exchange_strong_ptr (&datatype->packed_description, (intptr_t *) &_tmp_ptr, 1)) {
 if( ompi_datatype_is_predefined(datatype) ) {
 packed_description = malloc(2 * sizeof(int));
 } else if( NULL == args ) {
@@ -510,10 +510,10 @@ int ompi_datatype_get_pack_description( ompi_datatype_t* datatype,
 }
 
 opal_atomic_wmb ();
-datatype->packed_description = packed_description;
+datatype->packed_description = (intptr_t) packed_description;
 } else {
 /* another thread beat us to it */
-packed_description = datatype->packed_description;
+packed_description = (void *) datatype->packed_description;
 }
 }
 
@@ -521,11 +521,11 @@ int ompi_datatype_get_pack_description( ompi_datatype_t* datatype,
 struct timespec interval = {.tv_sec = 0, .tv_nsec = 1000};
 
 /* wait until the packed description is updated */
-while ((void *) 1 == datatype->packed_description) {
+while (1 == datatype->packed_description) {
 nanosleep (&interval, NULL);
 }
 
-packed_description = datatype->packed_description;
+packed_description = (void *) datatype->packed_description;
 }
 
 *packed_buffer = (const void *) packed_description;
@@ -534,7 +534,7 @@ int ompi_datatype_get_pack_description( ompi_datatype_t* datatype,
 
 size_t ompi_datatype_pack_description_length( ompi_datatype_t* datatype )
 {
-void *packed_description = datatype->packed_description;
+void *packed_description = (void *) datatype->packed_description;
 
 if( ompi_datatype_is_predefined(datatype) ) {
 return 2 * sizeof(int);
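The hunks above turn `packed_description` from a plain pointer into `opal_atomic_intptr_t` so the value `1` can serve as an "initialization in progress" sentinel: one thread wins a compare-and-swap, builds the buffer, and publishes it, while losers spin until the sentinel is replaced. A standalone sketch of that double-checked lazy-init pattern, using C11 atomics in place of the OPAL wrappers (all names here are illustrative, not Open MPI's):

```c
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

/* 0 = not built, 1 = another thread is building it,
 * any other value = pointer to the finished buffer. */
static atomic_intptr_t packed = 0;
static int build_count = 0;   /* how many times the buffer was built */

static void *get_packed(void) {
    intptr_t expected = 0;
    if (atomic_compare_exchange_strong(&packed, &expected, (intptr_t) 1)) {
        /* We won the race: build the buffer, then publish it. */
        int *buf = malloc(2 * sizeof(int));
        buf[0] = 42;
        build_count++;
        atomic_store(&packed, (intptr_t) buf);
        return buf;
    }
    /* Lost the race: wait until the winner replaces the sentinel. */
    while (atomic_load(&packed) == 1)
        ;  /* spin; the real code sleeps via nanosleep() instead */
    return (void *) atomic_load(&packed);
}
```

The pointer-sized atomic is what makes the sentinel trick legal; storing `1` into a plain `void *` (as the old code did) relies on implementation-defined behavior and defeats atomic-analysis tooling.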
@@ -36,7 +36,7 @@ static void __ompi_datatype_allocate( ompi_datatype_t* datatype )
 datatype->id = -1;
 datatype->d_keyhash = NULL;
 datatype->name[0] = '\0';
-datatype->packed_description = NULL;
+datatype->packed_description = 0;
 datatype->pml_data = 0;
 }
 
@@ -46,10 +46,10 @@ static void __ompi_datatype_release(ompi_datatype_t * datatype)
 ompi_datatype_release_args( datatype );
 datatype->args = NULL;
 }
-if( NULL != datatype->packed_description ) {
-free( datatype->packed_description );
-datatype->packed_description = NULL;
-}
+free ((void *) datatype->packed_description );
+datatype->packed_description = 0;
 if( datatype->d_f_to_c_index >= 0 ) {
 opal_pointer_array_set_item( &ompi_datatype_f_to_c_table, datatype->d_f_to_c_index, NULL );
 datatype->d_f_to_c_index = -1;
@@ -406,7 +406,7 @@ extern const ompi_datatype_t* ompi_datatype_basicDatatypes[OMPI_DATATYPE_MPI_MAX
 .d_f_to_c_index = -1, \
 .d_keyhash = NULL, \
 .args = NULL, \
-.packed_description = NULL, \
+.packed_description = 0, \
 .name = "MPI_" # NAME
 
 #define OMPI_DATATYPE_INITIALIZER_UNAVAILABLE(FLAGS) \
@@ -383,7 +383,7 @@ opal_pointer_array_t ompi_datatype_f_to_c_table = {{0}};
 (PDST)->super.desc = (PSRC)->super.desc; \
 (PDST)->super.opt_desc = (PSRC)->super.opt_desc; \
 (PDST)->packed_description = (PSRC)->packed_description; \
-(PSRC)->packed_description = NULL; \
+(PSRC)->packed_description = 0; \
 /* transfer the ptypes */ \
 (PDST)->super.ptypes = (PSRC)->super.ptypes; \
 (PSRC)->super.ptypes = NULL; \
@@ -1,6 +1,6 @@
 /* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
 /*
- * Copyright (c) 2007-2016 Cisco Systems, Inc. All rights reserved.
+ * Copyright (c) 2007-2018 Cisco Systems, Inc. All rights reserved.
 * Copyright (c) 2004-2010 The University of Tennessee and The University
 * of Tennessee Research Foundation. All rights
 * reserved.
@@ -1157,8 +1157,18 @@ static int fetch_request( mqs_process *proc, mpi_process_info *p_info,
 mqs_fetch_data( proc, ompi_datatype + i_info->ompi_datatype_t.offset.name,
 64, data_name );
 if( '\0' != data_name[0] ) {
-snprintf( (char*)res->extra_text[1], 64, "Data: %d * %s",
-(int)res->desired_length, data_name );
+// res->extra_text[x] is only 64 chars long -- same as
+// data_name. If you try to snprintf it into
+// res->extra_text with additional text, some compilers
+// will warn that we might truncate the string (because it
+// can see the static char array lengths). So just put
+// data_name in res->extra_text[2] (vs. extra_text[1]),
+// where it is guaranteed to fit.
+data_name[4] = '\0';
+snprintf( (char*)res->extra_text[1], 64, "Data: %d",
+(int)res->desired_length);
+snprintf( (char*)res->extra_text[2], 64, "%s",
+data_name );
 }
 /* And now compute the real length as specified by the user */
 res->desired_length *=
@@ -202,7 +202,7 @@ ompi_errhandler_t *ompi_errhandler_create(ompi_errhandler_type_t object_type,
 new_errhandler->eh_comm_fn = (MPI_Comm_errhandler_function *)func;
 break;
 case (OMPI_ERRHANDLER_TYPE_FILE):
-new_errhandler->eh_file_fn = (ompi_file_errhandler_fn *)func;
+new_errhandler->eh_file_fn = (ompi_file_errhandler_function *)func;
 break;
 case (OMPI_ERRHANDLER_TYPE_WIN):
 new_errhandler->eh_win_fn = (MPI_Win_errhandler_function *)func;
@@ -117,7 +117,7 @@ struct ompi_errhandler_t {
 can be invoked on any MPI object type, so we need callbacks for
 all of three. */
 MPI_Comm_errhandler_function *eh_comm_fn;
-ompi_file_errhandler_fn *eh_file_fn;
+ompi_file_errhandler_function *eh_file_fn;
 MPI_Win_errhandler_function *eh_win_fn;
 ompi_errhandler_fortran_handler_fn_t *eh_fort_fn;
 
@@ -356,7 +356,8 @@ static inline struct ompi_proc_t *ompi_group_dense_lookup (ompi_group_t *group,
 ompi_proc_t *real_proc =
 (ompi_proc_t *) ompi_proc_for_name (ompi_proc_sentinel_to_name ((uintptr_t) proc));
 
-if (opal_atomic_compare_exchange_strong_ptr (group->grp_proc_pointers + peer_id, &proc, real_proc)) {
+if (opal_atomic_compare_exchange_strong_ptr ((opal_atomic_intptr_t *)(group->grp_proc_pointers + peer_id),
+(intptr_t *) &proc, (intptr_t) real_proc)) {
 OBJ_RETAIN(real_proc);
 }
 
@@ -385,11 +385,11 @@ typedef int (MPI_Datarep_conversion_function)(void *, MPI_Datatype,
 typedef void (MPI_Comm_errhandler_function)(MPI_Comm *, int *, ...);
 
 /* This is a little hackish, but errhandler.h needs space for a
-MPI_File_errhandler_fn. While it could just be removed, this
+MPI_File_errhandler_function. While it could just be removed, this
 allows us to maintain a stable ABI within OMPI, at least for
 apps that don't use MPI I/O. */
-typedef void (ompi_file_errhandler_fn)(MPI_File *, int *, ...);
-typedef ompi_file_errhandler_fn MPI_File_errhandler_function;
+typedef void (ompi_file_errhandler_function)(MPI_File *, int *, ...);
+typedef ompi_file_errhandler_function MPI_File_errhandler_function;
 typedef void (MPI_Win_errhandler_function)(MPI_Win *, int *, ...);
 typedef void (MPI_User_function)(void *, void *, int *, MPI_Datatype *);
 typedef int (MPI_Comm_copy_attr_function)(MPI_Comm, int, void *,
@@ -412,7 +412,7 @@ typedef int (MPI_Grequest_cancel_function)(void *, int);
 */
 typedef MPI_Comm_errhandler_function MPI_Comm_errhandler_fn
 __mpi_interface_removed__("MPI_Comm_errhandler_fn was removed in MPI-3.0; use MPI_Comm_errhandler_function instead");
-typedef ompi_file_errhandler_fn MPI_File_errhandler_fn
+typedef ompi_file_errhandler_function MPI_File_errhandler_fn
 __mpi_interface_removed__("MPI_File_errhandler_fn was removed in MPI-3.0; use MPI_File_errhandler_function instead");
 typedef MPI_Win_errhandler_function MPI_Win_errhandler_fn
 __mpi_interface_removed__("MPI_Win_errhandler_fn was removed in MPI-3.0; use MPI_Win_errhandler_function instead");
@@ -1088,8 +1088,13 @@ OMPI_DECLSPEC extern struct ompi_predefined_datatype_t ompi_mpi_ub __mpi_interfa
 #define MPI_LONG_INT OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_long_int)
 #define MPI_SHORT_INT OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_short_int)
 #define MPI_2INT OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_2int)
+#if !OMPI_OMIT_MPI1_COMPAT_DECLS
+/*
+ * Removed datatypes
+ */
 #define MPI_UB OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_ub)
 #define MPI_LB OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_lb)
+#endif
 #define MPI_WCHAR OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_wchar)
 #if OPAL_HAVE_LONG_LONG
 #define MPI_LONG_LONG_INT OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_long_long_int)
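The typedef hunks above rename `ompi_file_errhandler_fn` to `ompi_file_errhandler_function` while keeping the MPI-1 era names alive as deprecation-marked aliases, so old applications still compile but get a diagnostic. The general shape of that compatibility trick can be sketched with the GCC/Clang `deprecated` attribute (every name below is made up for illustration; `__mpi_interface_removed__` in `mpi.h` is Open MPI's own wrapper around a similar attribute):

```c
#include <stdio.h>

/* New, preferred callback type name. */
typedef void (file_errhandler_function)(FILE *, int *);

/* Old name kept as an alias; using it compiles but warns on
 * GCC/Clang. Other compilers just get a silent alias. */
#if defined(__GNUC__)
typedef file_errhandler_function file_errhandler_fn
    __attribute__((deprecated("use file_errhandler_function instead")));
#else
typedef file_errhandler_function file_errhandler_fn;
#endif

/* A handler matching the callback signature: clears the error code. */
static void my_handler(FILE *f, int *err) {
    (void) f;
    *err = 0;
}
```

Because the alias refers to the same function type, casts between the two names (as in the `errhandler_create` hunk) are layout-safe; only the spelling is deprecated.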
@ -90,7 +90,7 @@ int ompi_coll_base_allgather_intra_bruck(const void *sbuf, int scount,
|
|||||||
mca_coll_base_module_t *module)
|
mca_coll_base_module_t *module)
|
||||||
{
|
{
|
||||||
int line = -1, rank, size, sendto, recvfrom, distance, blockcount, err = 0;
|
int line = -1, rank, size, sendto, recvfrom, distance, blockcount, err = 0;
|
||||||
ptrdiff_t slb, rlb, sext, rext;
|
ptrdiff_t rlb, rext;
|
||||||
char *tmpsend = NULL, *tmprecv = NULL;
|
char *tmpsend = NULL, *tmprecv = NULL;
|
||||||
|
|
||||||
size = ompi_comm_size(comm);
|
size = ompi_comm_size(comm);
|
||||||
@ -99,9 +99,6 @@ int ompi_coll_base_allgather_intra_bruck(const void *sbuf, int scount,
|
|||||||
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
|
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
|
||||||
"coll:base:allgather_intra_bruck rank %d", rank));
|
"coll:base:allgather_intra_bruck rank %d", rank));
|
||||||
|
|
||||||
err = ompi_datatype_get_extent (sdtype, &slb, &sext);
|
|
||||||
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
|
|
||||||
|
|
||||||
err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
|
err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
|
||||||
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
|
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
|
||||||
|
|
||||||
@ -262,7 +259,7 @@ ompi_coll_base_allgather_intra_recursivedoubling(const void *sbuf, int scount,
|
|||||||
{
|
{
|
||||||
int line = -1, rank, size, pow2size, err;
|
int line = -1, rank, size, pow2size, err;
|
||||||
int remote, distance, sendblocklocation;
|
int remote, distance, sendblocklocation;
|
||||||
ptrdiff_t slb, rlb, sext, rext;
|
ptrdiff_t rlb, rext;
|
||||||
char *tmpsend = NULL, *tmprecv = NULL;
|
char *tmpsend = NULL, *tmprecv = NULL;
|
||||||
|
|
||||||
size = ompi_comm_size(comm);
|
size = ompi_comm_size(comm);
|
||||||
@ -289,9 +286,6 @@ ompi_coll_base_allgather_intra_recursivedoubling(const void *sbuf, int scount,
|
|||||||
"coll:base:allgather_intra_recursivedoubling rank %d, size %d",
|
"coll:base:allgather_intra_recursivedoubling rank %d, size %d",
|
||||||
rank, size));
|
rank, size));
|
||||||
|
|
||||||
err = ompi_datatype_get_extent (sdtype, &slb, &sext);
|
|
||||||
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
|
|
||||||
|
|
||||||
err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
|
err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
|
||||||
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
|
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
|
||||||
|
|
||||||
@ -369,7 +363,7 @@ int ompi_coll_base_allgather_intra_ring(const void *sbuf, int scount,
|
|||||||
mca_coll_base_module_t *module)
|
mca_coll_base_module_t *module)
|
||||||
{
|
{
|
||||||
int line = -1, rank, size, err, sendto, recvfrom, i, recvdatafrom, senddatafrom;
|
int line = -1, rank, size, err, sendto, recvfrom, i, recvdatafrom, senddatafrom;
|
||||||
ptrdiff_t slb, rlb, sext, rext;
|
ptrdiff_t rlb, rext;
|
||||||
char *tmpsend = NULL, *tmprecv = NULL;
|
char *tmpsend = NULL, *tmprecv = NULL;
|
||||||
|
|
||||||
size = ompi_comm_size(comm);
|
size = ompi_comm_size(comm);
|
||||||
@ -378,9 +372,6 @@ int ompi_coll_base_allgather_intra_ring(const void *sbuf, int scount,
|
|||||||
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
|
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
|
||||||
"coll:base:allgather_intra_ring rank %d", rank));
|
"coll:base:allgather_intra_ring rank %d", rank));
|
||||||
|
|
||||||
err = ompi_datatype_get_extent (sdtype, &slb, &sext);
|
|
||||||
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
|
|
||||||
|
|
||||||
err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
|
err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
|
||||||
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
|
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
|
||||||
|
|
||||||
@ -499,7 +490,7 @@ ompi_coll_base_allgather_intra_neighborexchange(const void *sbuf, int scount,
|
|||||||
{
|
{
|
||||||
int line = -1, rank, size, i, even_rank, err;
|
int line = -1, rank, size, i, even_rank, err;
|
||||||
int neighbor[2], offset_at_step[2], recv_data_from[2], send_data_from;
|
int neighbor[2], offset_at_step[2], recv_data_from[2], send_data_from;
|
||||||
ptrdiff_t slb, rlb, sext, rext;
|
ptrdiff_t rlb, rext;
|
||||||
char *tmpsend = NULL, *tmprecv = NULL;
|
char *tmpsend = NULL, *tmprecv = NULL;
|
||||||
|
|
||||||
size = ompi_comm_size(comm);
|
size = ompi_comm_size(comm);
|
||||||
@ -517,9 +508,6 @@ ompi_coll_base_allgather_intra_neighborexchange(const void *sbuf, int scount,
|
|||||||
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
|
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
|
||||||
"coll:base:allgather_intra_neighborexchange rank %d", rank));
|
"coll:base:allgather_intra_neighborexchange rank %d", rank));
|
||||||
|
|
||||||
err = ompi_datatype_get_extent (sdtype, &slb, &sext);
|
|
||||||
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
|
|
||||||
|
|
||||||
err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
|
err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
|
||||||
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
|
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
|
||||||
|
|
||||||
@ -616,7 +604,7 @@ int ompi_coll_base_allgather_intra_two_procs(const void *sbuf, int scount,
|
|||||||
{
|
{
|
||||||
int line = -1, err, rank, remote;
|
int line = -1, err, rank, remote;
|
||||||
char *tmpsend = NULL, *tmprecv = NULL;
|
char *tmpsend = NULL, *tmprecv = NULL;
|
||||||
ptrdiff_t sext, rext, lb;
|
ptrdiff_t rext, lb;
|
||||||
|
|
||||||
rank = ompi_comm_rank(comm);
|
rank = ompi_comm_rank(comm);
|
||||||
|
|
||||||
@ -627,9 +615,6 @@ int ompi_coll_base_allgather_intra_two_procs(const void *sbuf, int scount,
|
|||||||
return MPI_ERR_UNSUPPORTED_OPERATION;
|
return MPI_ERR_UNSUPPORTED_OPERATION;
|
||||||
}
|
}
|
||||||
|
|
||||||
err = ompi_datatype_get_extent (sdtype, &lb, &sext);
|
|
||||||
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
|
|
||||||
|
|
||||||
err = ompi_datatype_get_extent (rdtype, &lb, &rext);
|
err = ompi_datatype_get_extent (rdtype, &lb, &rext);
|
||||||
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
|
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
|
||||||
|
|
||||||
|
@@ -100,7 +100,7 @@ int ompi_coll_base_allgatherv_intra_bruck(const void *sbuf, int scount,
 {
     int line = -1, err = 0, rank, size, sendto, recvfrom, distance, blockcount, i;
     int *new_rcounts = NULL, *new_rdispls = NULL, *new_scounts = NULL, *new_sdispls = NULL;
-    ptrdiff_t slb, rlb, sext, rext;
+    ptrdiff_t rlb, rext;
     char *tmpsend = NULL, *tmprecv = NULL;
     struct ompi_datatype_t *new_rdtype, *new_sdtype;
 

@@ -110,9 +110,6 @@ int ompi_coll_base_allgatherv_intra_bruck(const void *sbuf, int scount,
     OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
                  "coll:base:allgather_intra_bruck rank %d", rank));
 
-    err = ompi_datatype_get_extent (sdtype, &slb, &sext);
-    if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
-
     err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
     if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
 

@@ -229,7 +226,7 @@ int ompi_coll_base_allgatherv_intra_ring(const void *sbuf, int scount,
                                         mca_coll_base_module_t *module)
 {
     int line = -1, rank, size, sendto, recvfrom, i, recvdatafrom, senddatafrom, err = 0;
-    ptrdiff_t slb, rlb, sext, rext;
+    ptrdiff_t rlb, rext;
     char *tmpsend = NULL, *tmprecv = NULL;
 
     size = ompi_comm_size(comm);

@@ -238,9 +235,6 @@ int ompi_coll_base_allgatherv_intra_ring(const void *sbuf, int scount,
     OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
                  "coll:base:allgatherv_intra_ring rank %d", rank));
 
-    err = ompi_datatype_get_extent (sdtype, &slb, &sext);
-    if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
-
     err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
     if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
 

@@ -361,7 +355,7 @@ ompi_coll_base_allgatherv_intra_neighborexchange(const void *sbuf, int scount,
     int line = -1, rank, size, i, even_rank, err = 0;
     int neighbor[2], offset_at_step[2], recv_data_from[2], send_data_from;
     int new_scounts[2], new_sdispls[2], new_rcounts[2], new_rdispls[2];
-    ptrdiff_t slb, rlb, sext, rext;
+    ptrdiff_t rlb, rext;
     char *tmpsend = NULL, *tmprecv = NULL;
     struct ompi_datatype_t *new_rdtype, *new_sdtype;
 

@@ -381,9 +375,6 @@ ompi_coll_base_allgatherv_intra_neighborexchange(const void *sbuf, int scount,
     OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
                  "coll:base:allgatherv_intra_neighborexchange rank %d", rank));
 
-    err = ompi_datatype_get_extent (sdtype, &slb, &sext);
-    if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
-
     err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
     if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
 

@@ -509,7 +500,7 @@ int ompi_coll_base_allgatherv_intra_two_procs(const void *sbuf, int scount,
 {
     int line = -1, err = 0, rank, remote;
     char *tmpsend = NULL, *tmprecv = NULL;
-    ptrdiff_t sext, rext, lb;
+    ptrdiff_t rext, lb;
 
     rank = ompi_comm_rank(comm);
 

@@ -520,9 +511,6 @@ int ompi_coll_base_allgatherv_intra_two_procs(const void *sbuf, int scount,
         return MPI_ERR_UNSUPPORTED_OPERATION;
     }
 
-    err = ompi_datatype_get_extent (sdtype, &lb, &sext);
-    if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
-
     err = ompi_datatype_get_extent (rdtype, &lb, &rext);
     if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
 
@@ -350,7 +350,7 @@ ompi_coll_base_allreduce_intra_ring(const void *sbuf, void *rbuf, int count,
     char *tmpsend = NULL, *tmprecv = NULL, *inbuf[2] = {NULL, NULL};
     ptrdiff_t true_lb, true_extent, lb, extent;
     ptrdiff_t block_offset, max_real_segsize;
-    ompi_request_t *reqs[2] = {NULL, NULL};
+    ompi_request_t *reqs[2] = {MPI_REQUEST_NULL, MPI_REQUEST_NULL};
 
     size = ompi_comm_size(comm);
     rank = ompi_comm_rank(comm);

@@ -528,6 +528,7 @@ ompi_coll_base_allreduce_intra_ring(const void *sbuf, void *rbuf, int count,
 error_hndl:
     OPAL_OUTPUT((ompi_coll_base_framework.framework_output, "%s:%4d\tRank %d Error occurred %d\n",
                  __FILE__, line, rank, ret));
+    ompi_coll_base_free_reqs(reqs, 2);
     (void)line;  // silence compiler warning
     if (NULL != inbuf[0]) free(inbuf[0]);
     if (NULL != inbuf[1]) free(inbuf[1]);

@@ -627,7 +628,7 @@ ompi_coll_base_allreduce_intra_ring_segmented(const void *sbuf, void *rbuf, int
     size_t typelng;
     char *tmpsend = NULL, *tmprecv = NULL, *inbuf[2] = {NULL, NULL};
     ptrdiff_t block_offset, max_real_segsize;
-    ompi_request_t *reqs[2] = {NULL, NULL};
+    ompi_request_t *reqs[2] = {MPI_REQUEST_NULL, MPI_REQUEST_NULL};
     ptrdiff_t lb, extent, gap;
 
     size = ompi_comm_size(comm);

@@ -847,6 +848,7 @@ ompi_coll_base_allreduce_intra_ring_segmented(const void *sbuf, void *rbuf, int
 error_hndl:
     OPAL_OUTPUT((ompi_coll_base_framework.framework_output, "%s:%4d\tRank %d Error occurred %d\n",
                  __FILE__, line, rank, ret));
+    ompi_coll_base_free_reqs(reqs, 2);
    (void)line;  // silence compiler warning
     if (NULL != inbuf[0]) free(inbuf[0]);
     if (NULL != inbuf[1]) free(inbuf[1]);
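The hunks above switch the request arrays from `{NULL, NULL}` to `{MPI_REQUEST_NULL, MPI_REQUEST_NULL}` so that the error path can free both slots unconditionally, even when the failure happens before any request was started. A minimal standalone sketch of that idea, using a hypothetical mock request type (not the real Open MPI API):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical mock: a null sentinel in every slot lets cleanup walk
 * the whole array without tracking which requests were started. */
typedef struct { int started; } req_t;

static req_t REQ_NULL_OBJ;
#define REQ_NULL (&REQ_NULL_OBJ)

/* Releases every started request; sentinel slots are skipped, so the
 * caller never touches an uninitialized or already-freed entry. */
static int free_reqs(req_t **reqs, int count)
{
    int freed = 0;
    for (int i = 0; i < count; i++) {
        if (REQ_NULL == reqs[i]) continue;  /* never started: skip */
        reqs[i]->started = 0;               /* release it */
        reqs[i] = REQ_NULL;
        freed++;
    }
    return freed;
}
```

With the sentinel initialization, the cleanup call is safe on every exit path; with `{NULL, NULL}` it would dereference unset slots.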
@@ -393,6 +393,7 @@ int ompi_coll_base_alltoall_intra_linear_sync(const void *sbuf, int scount,
     if (0 < total_reqs) {
         reqs = ompi_coll_base_comm_get_reqs(module->base_data, 2 * total_reqs);
         if (NULL == reqs) { error = -1; line = __LINE__; goto error_hndl; }
+        reqs[0] = reqs[1] = MPI_REQUEST_NULL;
     }
 
     prcv = (char *) rbuf;

@@ -468,6 +469,15 @@ int ompi_coll_base_alltoall_intra_linear_sync(const void *sbuf, int scount,
     return MPI_SUCCESS;
 
 error_hndl:
+    /* find a real error code */
+    if (MPI_ERR_IN_STATUS == error) {
+        for( ri = 0; ri < nreqs; ri++ ) {
+            if (MPI_REQUEST_NULL == reqs[ri]) continue;
+            if (MPI_ERR_PENDING == reqs[ri]->req_status.MPI_ERROR) continue;
+            error = reqs[ri]->req_status.MPI_ERROR;
+            break;
+        }
+    }
     OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
                 "%s:%4d\tError occurred %d, rank %2d", __FILE__, line, error,
                 rank));

@@ -661,7 +671,16 @@ int ompi_coll_base_alltoall_intra_basic_linear(const void *sbuf, int scount,
     if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
 
 err_hndl:
-    if( MPI_SUCCESS != err ) {
+    if (MPI_SUCCESS != err) {
+        /* find a real error code */
+        if (MPI_ERR_IN_STATUS == err) {
+            for( i = 0; i < nreqs; i++ ) {
+                if (MPI_REQUEST_NULL == req[i]) continue;
+                if (MPI_ERR_PENDING == req[i]->req_status.MPI_ERROR) continue;
+                err = req[i]->req_status.MPI_ERROR;
+                break;
+            }
+        }
         OPAL_OUTPUT( (ompi_coll_base_framework.framework_output,"%s:%4d\tError occurred %d, rank %2d",
                      __FILE__, line, err, rank) );
         (void)line;  // silence compiler warning
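The "find a real error code" loops added above all follow one pattern: when a wait-all fails with the aggregate `MPI_ERR_IN_STATUS`, scan the requests and promote the first concrete per-request error, skipping null and still-pending entries, because a collective cannot hand the aggregate code back to its caller. A self-contained sketch of that scan, with illustrative stand-in error values rather than the real MPI constants:

```c
#include <assert.h>

/* Stand-in error codes for the sketch (not the real MPI values). */
enum { ERR_OK = 0, ERR_IN_STATUS = -1, ERR_PENDING = -2 };

/* Refine an aggregate "error in status" result to the first concrete
 * per-request error; entries that completed fine or never failed are
 * skipped, mirroring the MPI_REQUEST_NULL / MPI_ERR_PENDING checks. */
static int find_real_error(int aggregate, const int *req_err, int nreqs)
{
    if (ERR_IN_STATUS != aggregate)
        return aggregate;                        /* already concrete */
    for (int i = 0; i < nreqs; i++) {
        if (ERR_OK == req_err[i]) continue;      /* completed fine */
        if (ERR_PENDING == req_err[i]) continue; /* never failed */
        return req_err[i];                       /* first real failure */
    }
    return aggregate;                            /* nothing better found */
}
```

If no request carries a concrete error, the aggregate code is returned unchanged, which matches the diff's behavior of leaving `err` alone when the loop finds nothing.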
@@ -3,7 +3,7 @@
  * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
  *                         University Research and Technology
  *                         Corporation. All rights reserved.
- * Copyright (c) 2004-2016 The University of Tennessee and The University
+ * Copyright (c) 2004-2017 The University of Tennessee and The University
  *                         of Tennessee Research Foundation. All rights
  *                         reserved.
  * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,

@@ -276,6 +276,15 @@ ompi_coll_base_alltoallv_intra_basic_linear(const void *sbuf, const int *scounts
     err = ompi_request_wait_all(nreqs, reqs, MPI_STATUSES_IGNORE);
 
 err_hndl:
+    /* find a real error code */
+    if (MPI_ERR_IN_STATUS == err) {
+        for( i = 0; i < nreqs; i++ ) {
+            if (MPI_REQUEST_NULL == reqs[i]) continue;
+            if (MPI_ERR_PENDING == reqs[i]->req_status.MPI_ERROR) continue;
+            err = reqs[i]->req_status.MPI_ERROR;
+            break;
+        }
+    }
     /* Free the requests in all cases as they are persistent */
     ompi_coll_base_free_reqs(reqs, nreqs);
 
@@ -3,7 +3,7 @@
  * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
  *                         University Research and Technology
  *                         Corporation. All rights reserved.
- * Copyright (c) 2004-2016 The University of Tennessee and The University
+ * Copyright (c) 2004-2017 The University of Tennessee and The University
  *                         of Tennessee Research Foundation. All rights
  *                         reserved.
  * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,

@@ -102,8 +102,10 @@ int ompi_coll_base_barrier_intra_doublering(struct ompi_communicator_t *comm,
 {
     int rank, size, err = 0, line = 0, left, right;
 
-    rank = ompi_comm_rank(comm);
     size = ompi_comm_size(comm);
+    if( 1 == size )
+        return OMPI_SUCCESS;
+    rank = ompi_comm_rank(comm);
 
     OPAL_OUTPUT((ompi_coll_base_framework.framework_output,"ompi_coll_base_barrier_intra_doublering rank %d", rank));
 

@@ -172,8 +174,10 @@ int ompi_coll_base_barrier_intra_recursivedoubling(struct ompi_communicator_t *c
 {
     int rank, size, adjsize, err, line, mask, remote;
 
-    rank = ompi_comm_rank(comm);
     size = ompi_comm_size(comm);
+    if( 1 == size )
+        return OMPI_SUCCESS;
+    rank = ompi_comm_rank(comm);
     OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
                  "ompi_coll_base_barrier_intra_recursivedoubling rank %d",
                  rank));

@@ -251,8 +255,10 @@ int ompi_coll_base_barrier_intra_bruck(struct ompi_communicator_t *comm,
 {
     int rank, size, distance, to, from, err, line = 0;
 
-    rank = ompi_comm_rank(comm);
     size = ompi_comm_size(comm);
+    if( 1 == size )
+        return MPI_SUCCESS;
+    rank = ompi_comm_rank(comm);
     OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
                  "ompi_coll_base_barrier_intra_bruck rank %d", rank));
 

@@ -285,16 +291,19 @@ int ompi_coll_base_barrier_intra_two_procs(struct ompi_communicator_t *comm,
 int ompi_coll_base_barrier_intra_two_procs(struct ompi_communicator_t *comm,
                                            mca_coll_base_module_t *module)
 {
-    int remote, err;
+    int remote, size, err;
 
+    size = ompi_comm_size(comm);
+    if( 1 == size )
+        return MPI_SUCCESS;
+    if( 2 != ompi_comm_size(comm) ) {
+        return MPI_ERR_UNSUPPORTED_OPERATION;
+    }
+
     remote = ompi_comm_rank(comm);
     OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
                  "ompi_coll_base_barrier_intra_two_procs rank %d", remote));
 
-    if (2 != ompi_comm_size(comm)) {
-        return MPI_ERR_UNSUPPORTED_OPERATION;
-    }
-
     remote = (remote + 1) & 0x1;
 
     err = ompi_coll_base_sendrecv_zero(remote, MCA_COLL_BASE_TAG_BARRIER,

@@ -324,8 +333,10 @@ int ompi_coll_base_barrier_intra_basic_linear(struct ompi_communicator_t *comm,
     int i, err, rank, size, line;
     ompi_request_t** requests = NULL;
 
-    rank = ompi_comm_rank(comm);
     size = ompi_comm_size(comm);
+    if( 1 == size )
+        return MPI_SUCCESS;
+    rank = ompi_comm_rank(comm);
 
     /* All non-root send & receive zero-length message. */
     if (rank > 0) {

@@ -367,11 +378,21 @@ int ompi_coll_base_barrier_intra_basic_linear(struct ompi_communicator_t *comm,
     /* All done */
     return MPI_SUCCESS;
 err_hndl:
+    if( NULL != requests ) {
+        /* find a real error code */
+        if (MPI_ERR_IN_STATUS == err) {
+            for( i = 0; i < size; i++ ) {
+                if (MPI_REQUEST_NULL == requests[i]) continue;
+                if (MPI_ERR_PENDING == requests[i]->req_status.MPI_ERROR) continue;
+                err = requests[i]->req_status.MPI_ERROR;
+                break;
+            }
+        }
+        ompi_coll_base_free_reqs(requests, size);
+    }
     OPAL_OUTPUT( (ompi_coll_base_framework.framework_output,"%s:%4d\tError occurred %d, rank %2d",
                  __FILE__, line, err, rank) );
     (void)line;  // silence compiler warning
-    if( NULL != requests )
-        ompi_coll_base_free_reqs(requests, size);
     return err;
 }
 /* copied function (with appropriate renaming) ends here */

@@ -385,8 +406,10 @@ int ompi_coll_base_barrier_intra_tree(struct ompi_communicator_t *comm,
 {
     int rank, size, depth, err, jump, partner;
 
-    rank = ompi_comm_rank(comm);
     size = ompi_comm_size(comm);
+    if( 1 == size )
+        return MPI_SUCCESS;
+    rank = ompi_comm_rank(comm);
     OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
                  "ompi_coll_base_barrier_intra_tree %d",
                  rank));
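Beyond the size-1 early returns, the two-process barrier above pairs each rank with its peer via `remote = (remote + 1) & 0x1`. A quick standalone check of that arithmetic (plain C, independent of MPI):

```c
#include <assert.h>

/* Peer computation used by the two-process barrier: flips the low bit,
 * so rank 0 pairs with rank 1 and rank 1 pairs with rank 0. */
static int partner_of(int rank)
{
    return (rank + 1) & 0x1;  /* 0 -> 1, 1 -> 0 */
}
```

Since the function rejects communicators whose size is not exactly 2, only ranks 0 and 1 ever reach this expression.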
@@ -3,7 +3,7 @@
  * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
  *                         University Research and Technology
  *                         Corporation. All rights reserved.
- * Copyright (c) 2004-2016 The University of Tennessee and The University
+ * Copyright (c) 2004-2017 The University of Tennessee and The University
  *                         of Tennessee Research Foundation. All rights
  *                         reserved.
  * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,

@@ -214,13 +214,29 @@ ompi_coll_base_bcast_intra_generic( void* buffer,
     return (MPI_SUCCESS);
 
 error_hndl:
+    if (MPI_ERR_IN_STATUS == err) {
+        for( req_index = 0; req_index < 2; req_index++ ) {
+            if (MPI_REQUEST_NULL == recv_reqs[req_index]) continue;
+            if (MPI_ERR_PENDING == recv_reqs[req_index]->req_status.MPI_ERROR) continue;
+            err = recv_reqs[req_index]->req_status.MPI_ERROR;
+            break;
+        }
+    }
+    ompi_coll_base_free_reqs( recv_reqs, 2);
+    if( NULL != send_reqs ) {
+        if (MPI_ERR_IN_STATUS == err) {
+            for( req_index = 0; req_index < tree->tree_nextsize; req_index++ ) {
+                if (MPI_REQUEST_NULL == send_reqs[req_index]) continue;
+                if (MPI_ERR_PENDING == send_reqs[req_index]->req_status.MPI_ERROR) continue;
+                err = send_reqs[req_index]->req_status.MPI_ERROR;
+                break;
+            }
+        }
+        ompi_coll_base_free_reqs(send_reqs, tree->tree_nextsize);
+    }
     OPAL_OUTPUT( (ompi_coll_base_framework.framework_output,"%s:%4d\tError occurred %d, rank %2d",
                  __FILE__, line, err, rank) );
     (void)line;  // silence compiler warnings
-    ompi_coll_base_free_reqs( recv_reqs, 2);
-    if( NULL != send_reqs ) {
-        ompi_coll_base_free_reqs(send_reqs, tree->tree_nextsize);
-    }
 
     return err;
 }

@@ -649,12 +665,21 @@ ompi_coll_base_bcast_intra_basic_linear(void *buff, int count,
      * care what the error was -- just that there *was* an error.  The
      * PML will finish all requests, even if one or more of them fail.
      * i.e., by the end of this call, all the requests are free-able.
-     * So free them anyway -- even if there was an error, and return
-     * the error after we free everything. */
+     * So free them anyway -- even if there was an error.
+     * Note we still need to get the actual error, as collective
+     * operations cannot return MPI_ERR_IN_STATUS.
+     */
 
     err = ompi_request_wait_all(i, reqs, MPI_STATUSES_IGNORE);
 err_hndl:
     if( MPI_SUCCESS != err ) {  /* Free the reqs */
+        /* first find the real error code */
+        for( preq = reqs; preq < reqs+i; preq++ ) {
+            if (MPI_REQUEST_NULL == *preq) continue;
+            if (MPI_ERR_PENDING == (*preq)->req_status.MPI_ERROR) continue;
+            err = (*preq)->req_status.MPI_ERROR;
+            break;
+        }
         ompi_coll_base_free_reqs(reqs, i);
     }
 
@@ -326,6 +326,15 @@ ompi_coll_base_gather_intra_linear_sync(const void *sbuf, int scount,
     return MPI_SUCCESS;
 error_hndl:
     if (NULL != reqs) {
+        /* find a real error code */
+        if (MPI_ERR_IN_STATUS == ret) {
+            for( i = 0; i < size; i++ ) {
+                if (MPI_REQUEST_NULL == reqs[i]) continue;
+                if (MPI_ERR_PENDING == reqs[i]->req_status.MPI_ERROR) continue;
+                ret = reqs[i]->req_status.MPI_ERROR;
+                break;
+            }
+        }
         ompi_coll_base_free_reqs(reqs, size);
     }
     OPAL_OUTPUT (( ompi_coll_base_framework.framework_output,
@@ -338,16 +338,34 @@ int ompi_coll_base_reduce_generic( const void* sendbuf, void* recvbuf, int origi
     return OMPI_SUCCESS;
 
 error_hndl:  /* error handler */
+    /* find a real error code */
+    if (MPI_ERR_IN_STATUS == ret) {
+        for( i = 0; i < 2; i++ ) {
+            if (MPI_REQUEST_NULL == reqs[i]) continue;
+            if (MPI_ERR_PENDING == reqs[i]->req_status.MPI_ERROR) continue;
+            ret = reqs[i]->req_status.MPI_ERROR;
+            break;
+        }
+    }
+    ompi_coll_base_free_reqs(reqs, 2);
+    if( NULL != sreq ) {
+        if (MPI_ERR_IN_STATUS == ret) {
+            for( i = 0; i < max_outstanding_reqs; i++ ) {
+                if (MPI_REQUEST_NULL == sreq[i]) continue;
+                if (MPI_ERR_PENDING == sreq[i]->req_status.MPI_ERROR) continue;
+                ret = sreq[i]->req_status.MPI_ERROR;
+                break;
+            }
+        }
+        ompi_coll_base_free_reqs(sreq, max_outstanding_reqs);
+    }
+    if( inbuf_free[0] != NULL ) free(inbuf_free[0]);
+    if( inbuf_free[1] != NULL ) free(inbuf_free[1]);
+    if( accumbuf_free != NULL ) free(accumbuf);
     OPAL_OUTPUT (( ompi_coll_base_framework.framework_output,
                    "ERROR_HNDL: node %d file %s line %d error %d\n",
                    rank, __FILE__, line, ret ));
     (void)line;  // silence compiler warning
-    if( inbuf_free[0] != NULL ) free(inbuf_free[0]);
-    if( inbuf_free[1] != NULL ) free(inbuf_free[1]);
-    if( accumbuf_free != NULL ) free(accumbuf);
-    if( NULL != sreq ) {
-        ompi_coll_base_free_reqs(sreq, max_outstanding_reqs);
-    }
     return ret;
 }
 
@@ -464,7 +464,7 @@ ompi_coll_base_reduce_scatter_intra_ring( const void *sbuf, void *rbuf, const in
     char *tmpsend = NULL, *tmprecv = NULL, *accumbuf = NULL, *accumbuf_free = NULL;
     char *inbuf_free[2] = {NULL, NULL}, *inbuf[2] = {NULL, NULL};
     ptrdiff_t extent, max_real_segsize, dsize, gap = 0;
-    ompi_request_t *reqs[2] = {NULL, NULL};
+    ompi_request_t *reqs[2] = {MPI_REQUEST_NULL, MPI_REQUEST_NULL};
 
     size = ompi_comm_size(comm);
     rank = ompi_comm_rank(comm);
@@ -41,7 +41,7 @@ int ompi_coll_base_sendrecv_actual( const void* sendbuf, size_t scount,
 { /* post receive first, then send, then wait... should be fast (I hope) */
     int err, line = 0;
     size_t rtypesize, stypesize;
-    ompi_request_t *req;
+    ompi_request_t *req = MPI_REQUEST_NULL;
     ompi_status_public_t rstatus;
 
     /* post new irecv */
@@ -71,12 +71,13 @@ BEGIN_C_DECLS
 
 extern bool libnbc_ibcast_skip_dt_decision;
 extern int libnbc_iexscan_algorithm;
+extern int libnbc_iscan_algorithm;
 
 struct ompi_coll_libnbc_component_t {
     mca_coll_base_component_2_0_0_t super;
     opal_free_list_t requests;
     opal_list_t active_requests;
-    int32_t active_comms;
+    opal_atomic_int32_t active_comms;
     opal_mutex_t lock;      /* protect access to the active_requests list */
 };
 typedef struct ompi_coll_libnbc_component_t ompi_coll_libnbc_component_t;
@@ -54,6 +54,14 @@ static mca_base_var_enum_value_t iexscan_algorithms[] = {
     {0, NULL}
 };
 
+int libnbc_iscan_algorithm = 0;             /* iscan user forced algorithm */
+static mca_base_var_enum_value_t iscan_algorithms[] = {
+    {0, "ignore"},
+    {1, "linear"},
+    {2, "recursive_doubling"},
+    {0, NULL}
+};
+
 static int libnbc_open(void);
 static int libnbc_close(void);
 static int libnbc_register(void);

@@ -177,6 +185,16 @@ libnbc_register(void)
                                            &libnbc_iexscan_algorithm);
     OBJ_RELEASE(new_enum);
 
+    libnbc_iscan_algorithm = 0;
+    (void) mca_base_var_enum_create("coll_libnbc_iscan_algorithms", iscan_algorithms, &new_enum);
+    mca_base_component_var_register(&mca_coll_libnbc_component.super.collm_version,
+                                    "iscan_algorithm",
+                                    "Which iscan algorithm is used: 0 ignore, 1 linear, 2 recursive_doubling",
+                                    MCA_BASE_VAR_TYPE_INT, new_enum, 0, MCA_BASE_VAR_FLAG_SETTABLE,
+                                    OPAL_INFO_LVL_5, MCA_BASE_VAR_SCOPE_ALL,
+                                    &libnbc_iscan_algorithm);
+    OBJ_RELEASE(new_enum);
+
     return OMPI_SUCCESS;
 }
 
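Since the variable above is registered on the `coll`/`libnbc` component as `iscan_algorithm`, the standard MCA naming scheme (framework_component_parameter) suggests it should be reachable at launch time under the full name `coll_libnbc_iscan_algorithm`. A hedged illustration (`./app` stands in for any MPI program):

```shell
# Force the recursive-doubling iscan algorithm in libnbc
# (parameter name assumed from the usual MCA naming convention)
mpirun --mca coll_libnbc_iscan_algorithm recursive_doubling -n 4 ./app
```

The enum values registered above ("ignore", "linear", "recursive_doubling") map to 0, 1, and 2, so the numeric forms should be accepted as well.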
@@ -62,7 +62,6 @@ struct dict {
     int (*_insert) __P((void *obj, void *k, void *d, int ow));
     int (*_probe) __P((void *obj, void *key, void **dat));
     void *(*_search) __P((void *obj, const void *k));
-    const void *(*_csearch) __P((const void *obj, const void *k));
     int (*_remove) __P((void *obj, const void *key, int del));
     void (*_walk) __P((void *obj, dict_vis_func func));
     unsigned (*_count) __P((const void *obj));

@@ -75,7 +74,6 @@ struct dict {
 #define dict_insert(dct,k,d,o) (dct)->_insert((dct)->_object, (k), (d), (o))
 #define dict_probe(dct,k,d) (dct)->_probe((dct)->_object, (k), (d))
 #define dict_search(dct,k) (dct)->_search((dct)->_object, (k))
-#define dict_csearch(dct,k) (dct)->_csearch((dct)->_object, (k))
 #define dict_remove(dct,k,del) (dct)->_remove((dct)->_object, (k), (del))
 #define dict_walk(dct,f) (dct)->_walk((dct)->_object, (f))
 #define dict_count(dct) (dct)->_count((dct)->_object)

@@ -15,7 +15,6 @@
 typedef int (*insert_func) __P((void *, void *k, void *d, int o));
 typedef int (*probe_func) __P((void *, void *k, void **d));
 typedef void *(*search_func) __P((void *, const void *k));
-typedef const void *(*csearch_func) __P((const void *, const void *k));
 typedef int (*remove_func) __P((void *, const void *k, int d));
 typedef void (*walk_func) __P((void *, dict_vis_func visit));
 typedef unsigned (*count_func) __P((const void *));

@@ -90,7 +90,6 @@ hb_dict_new(dict_cmp_func key_cmp, dict_del_func key_del,
     dct->_insert = (insert_func)hb_tree_insert;
     dct->_probe = (probe_func)hb_tree_probe;
     dct->_search = (search_func)hb_tree_search;
-    dct->_csearch = (csearch_func)hb_tree_csearch;
     dct->_remove = (remove_func)hb_tree_remove;
     dct->_empty = (empty_func)hb_tree_empty;
     dct->_walk = (walk_func)hb_tree_walk;

@@ -170,12 +169,6 @@ hb_tree_search(hb_tree *tree, const void *key)
     return NULL;
 }
 
-const void *
-hb_tree_csearch(const hb_tree *tree, const void *key)
-{
-    return hb_tree_csearch((hb_tree *)tree, key);
-}
-
 int
 hb_tree_insert(hb_tree *tree, void *key, void *dat, int overwrite)
 {

@@ -26,7 +26,6 @@ void hb_tree_destroy __P((hb_tree *tree, int del));
 int hb_tree_insert __P((hb_tree *tree, void *key, void *dat, int overwrite));
 int hb_tree_probe __P((hb_tree *tree, void *key, void **dat));
 void *hb_tree_search __P((hb_tree *tree, const void *key));
|
void *hb_tree_search __P((hb_tree *tree, const void *key));
|
||||||
const void *hb_tree_csearch __P((const hb_tree *tree, const void *key));
|
|
||||||
int hb_tree_remove __P((hb_tree *tree, const void *key, int del));
|
int hb_tree_remove __P((hb_tree *tree, const void *key, int del));
|
||||||
void hb_tree_empty __P((hb_tree *tree, int del));
|
void hb_tree_empty __P((hb_tree *tree, int del));
|
||||||
void hb_tree_walk __P((hb_tree *tree, dict_vis_func visit));
|
void hb_tree_walk __P((hb_tree *tree, dict_vis_func visit));
|
||||||
|
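Worth noting about the hunk above: the removed `hb_tree_csearch` wrapper returned `hb_tree_csearch((hb_tree *)tree, key)`, i.e. it called itself rather than `hb_tree_search`, so any lookup through the const wrapper would recurse until the stack overflowed. A minimal sketch of what a correct const wrapper looks like (all names here are hypothetical stand-ins, not the real hb_tree API):

```c
#include <string.h>

/* Stand-in for hb_tree_search(): a linear lookup in a fixed table.  The
 * point is the wrapper below, which delegates to the non-const search
 * instead of calling itself. */
static const char *table[] = { "alpha", "beta", "gamma" };

void *tree_search(void *tree, const void *key)
{
    (void) tree;  /* unused in this sketch */
    for (int i = 0; i < 3; ++i) {
        if (0 == strcmp(table[i], (const char *) key)) {
            return (void *) table[i];
        }
    }
    return NULL;
}

/* Correct const wrapper: cast away const once and call the real
 * implementation.  The deleted hb_tree_csearch called itself here. */
const void *tree_csearch(const void *tree, const void *key)
{
    return tree_search((void *) tree, key);
}
```

The commit resolves the problem more bluntly by deleting the csearch entry points entirely.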
@@ -11,8 +11,8 @@
  * Copyright (c) 2012 Oracle and/or its affiliates. All rights reserved.
  * Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
  * reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
  * Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
  * $COPYRIGHT$
@@ -130,7 +130,7 @@ int ompi_coll_libnbc_iallgatherv(const void* sendbuf, int sendcount, MPI_Datatyp
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -209,7 +209,7 @@ int ompi_coll_libnbc_iallgatherv_inter(const void* sendbuf, int sendcount, MPI_D
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
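The `NBC_Return_handle` change repeated throughout this commit fixes a pointer-cast bug: `request` is a pointer to the caller's request pointer, and the old code cast that pointer-to-pointer itself to the handle type instead of dereferencing it first. A minimal sketch of the difference (the struct types are hypothetical stand-ins for `ompi_request_t` and `ompi_coll_libnbc_request_t`, whose first member embeds the generic request):

```c
/* Stand-ins: the libnbc handle embeds the generic request as its first
 * member, so a request_t* into 'super' is also a pointer to the handle. */
typedef struct { int id; } request_t;               /* ompi_request_t        */
typedef struct { request_t super; } nbc_request_t;  /* libnbc request handle */

/* Returns 1 iff dereferencing first (the new code) recovers the real
 * handle while the plain cast (the old code) does not. */
int cast_fix_recovers_handle(void)
{
    nbc_request_t handle = { { 42 } };
    request_t *req = &handle.super;  /* what the caller's pointer holds   */
    request_t **request = &req;      /* what the collective receives      */

    /* Old code: casts the address of the caller's pointer variable. */
    nbc_request_t *buggy = (nbc_request_t *) request;
    /* Fixed code: dereference first, then reinterpret the pointee. */
    nbc_request_t *fixed = *(nbc_request_t **) request;

    return fixed == &handle && (void *) buggy != (void *) &handle;
}
```

With the old cast, error cleanup operated on the caller's local pointer slot rather than the allocated handle, so the handle was never returned and the memory touched was garbage.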
@@ -7,8 +7,8 @@
  * rights reserved.
  * Copyright (c) 2013-2017 Los Alamos National Security, LLC. All rights
  * reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
  * Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
  * $COPYRIGHT$
@@ -206,7 +206,7 @@ int ompi_coll_libnbc_iallreduce(const void* sendbuf, void* recvbuf, int count, M
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -289,7 +289,7 @@ int ompi_coll_libnbc_iallreduce_inter(const void* sendbuf, void* recvbuf, int co
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -292,7 +292,7 @@ int ompi_coll_libnbc_ialltoall(const void* sendbuf, int sendcount, MPI_Datatype
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -376,7 +376,7 @@ int ompi_coll_libnbc_ialltoall_inter (const void* sendbuf, int sendcount, MPI_Da
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -5,8 +5,8 @@
  * Corporation. All rights reserved.
  * Copyright (c) 2006 The Technical University of Chemnitz. All
  * rights reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2015-2017 Los Alamos National Security, LLC. All rights
  * reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
@@ -153,7 +153,7 @@ int ompi_coll_libnbc_ialltoallv(const void* sendbuf, const int *sendcounts, cons
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -241,7 +241,7 @@ int ompi_coll_libnbc_ialltoallv_inter (const void* sendbuf, const int *sendcount
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -5,8 +5,8 @@
  * Corporation. All rights reserved.
  * Copyright (c) 2006 The Technical University of Chemnitz. All
  * rights reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2015-2017 Los Alamos National Security, LLC. All rights
  * reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
@@ -139,7 +139,7 @@ int ompi_coll_libnbc_ialltoallw(const void* sendbuf, const int *sendcounts, cons
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -214,7 +214,7 @@ int ompi_coll_libnbc_ialltoallw_inter(const void* sendbuf, const int *sendcounts
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
    if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -7,8 +7,8 @@
  * rights reserved.
  * Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
  * reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2015 Mellanox Technologies. All rights reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
  * Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
@@ -108,7 +108,7 @@ int ompi_coll_libnbc_ibarrier(struct ompi_communicator_t *comm, ompi_request_t *
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -195,7 +195,7 @@ int ompi_coll_libnbc_ibarrier_inter(struct ompi_communicator_t *comm, ompi_reque
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -5,8 +5,8 @@
  * Corporation. All rights reserved.
  * Copyright (c) 2006 The Technical University of Chemnitz. All
  * rights reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2015 Los Alamos National Security, LLC. All rights
  * reserved.
  * Copyright (c) 2016-2017 IBM Corporation. All rights reserved.
@@ -182,7 +182,7 @@ int ompi_coll_libnbc_ibcast(void *buffer, int count, MPI_Datatype datatype, int
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -405,7 +405,7 @@ int ompi_coll_libnbc_ibcast_inter(void *buffer, int count, MPI_Datatype datatype
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -7,8 +7,8 @@
  * rights reserved.
  * Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
  * reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
  * Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
  * $COPYRIGHT$
@@ -176,7 +176,7 @@ int ompi_coll_libnbc_iexscan(const void* sendbuf, void* recvbuf, int count, MPI_
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -8,8 +8,8 @@
  * Copyright (c) 2013 The University of Tennessee and The University
  * of Tennessee Research Foundation. All rights
  * reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2015 Los Alamos National Security, LLC. All rights
  * reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
@@ -185,7 +185,7 @@ int ompi_coll_libnbc_igather(const void* sendbuf, int sendcount, MPI_Datatype se
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -265,7 +265,7 @@ int ompi_coll_libnbc_igather_inter(const void* sendbuf, int sendcount, MPI_Datat
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -8,8 +8,8 @@
  * Copyright (c) 2013 The University of Tennessee and The University
  * of Tennessee Research Foundation. All rights
  * reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2015 Los Alamos National Security, LLC. All rights
  * reserved.
  * Copyright (c) 2015 Mellanox Technologies. All rights reserved.
@@ -117,7 +117,7 @@ int ompi_coll_libnbc_igatherv(const void* sendbuf, int sendcount, MPI_Datatype s
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -197,7 +197,7 @@ int ompi_coll_libnbc_igatherv_inter(const void* sendbuf, int sendcount, MPI_Data
 
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -5,8 +5,8 @@
  * Corporation. All rights reserved.
  * Copyright (c) 2006 The Technical University of Chemnitz. All
  * rights reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2015 Los Alamos National Security, LLC. All rights
  * reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
@@ -173,7 +173,7 @@ int ompi_coll_libnbc_ineighbor_allgather(const void *sbuf, int scount, MPI_Datat
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -181,157 +181,6 @@ int ompi_coll_libnbc_ineighbor_allgather(const void *sbuf, int scount, MPI_Datat
     return OMPI_SUCCESS;
 }
 
-/* better binomial bcast
- * working principle:
- * - each node gets a virtual rank vrank
- * - the 'root' node get vrank 0
- * - node 0 gets the vrank of the 'root'
- * - all other ranks stay identical (they do not matter)
- *
- * Algorithm:
- * - each node with vrank > 2^r and vrank < 2^r+1 receives from node
- *   vrank - 2^r (vrank=1 receives from 0, vrank 0 receives never)
- * - each node sends each round r to node vrank + 2^r
- * - a node stops to send if 2^r > commsize
- */
-#define RANK2VRANK(rank, vrank, root) \
-{ \
-    vrank = rank; \
-    if (rank == 0) vrank = root; \
-    if (rank == root) vrank = 0; \
-}
-#define VRANK2RANK(rank, vrank, root) \
-{ \
-    rank = vrank; \
-    if (vrank == 0) rank = root; \
-    if (vrank == root) rank = 0; \
-}
-static inline int bcast_sched_binomial(int rank, int p, int root, NBC_Schedule *schedule, void *buffer, int count, MPI_Datatype datatype) {
-    int maxr, vrank, peer, res;
-
-    maxr = (int)ceil((log((double)p)/LOG2));
-
-    RANK2VRANK(rank, vrank, root);
-
-    /* receive from the right hosts */
-    if (vrank != 0) {
-        for (int r = 0 ; r < maxr ; ++r) {
-            if ((vrank >= (1 << r)) && (vrank < (1 << (r + 1)))) {
-                VRANK2RANK(peer, vrank - (1 << r), root);
-                res = NBC_Sched_recv (buffer, false, count, datatype, peer, schedule, false);
-                if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-                    return res;
-                }
-            }
-        }
-
-        res = NBC_Sched_barrier (schedule);
-        if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-            return res;
-        }
-    }
-
-    /* now send to the right hosts */
-    for (int r = 0 ; r < maxr ; ++r) {
-        if (((vrank + (1 << r) < p) && (vrank < (1 << r))) || (vrank == 0)) {
-            VRANK2RANK(peer, vrank + (1 << r), root);
-            res = NBC_Sched_send (buffer, false, count, datatype, peer, schedule, false);
-            if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-                return res;
-            }
-        }
-    }
-
-    return OMPI_SUCCESS;
-}
-
-/* simple linear MPI_Ibcast */
-static inline int bcast_sched_linear(int rank, int p, int root, NBC_Schedule *schedule, void *buffer, int count, MPI_Datatype datatype) {
-    int res;
-
-    /* send to all others */
-    if(rank == root) {
-        for (int peer = 0 ; peer < p ; ++peer) {
-            if (peer != root) {
-                /* send msg to peer */
-                res = NBC_Sched_send (buffer, false, count, datatype, peer, schedule, false);
-                if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-                    return res;
-                }
-            }
-        }
-    } else {
-        /* recv msg from root */
-        res = NBC_Sched_recv (buffer, false, count, datatype, root, schedule, false);
-        if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-            return res;
-        }
-    }
-
-    return OMPI_SUCCESS;
-}
-
-/* simple chained MPI_Ibcast */
-static inline int bcast_sched_chain(int rank, int p, int root, NBC_Schedule *schedule, void *buffer, int count, MPI_Datatype datatype, int fragsize, size_t size) {
-    int res, vrank, rpeer, speer, numfrag, fragcount, thiscount;
-    MPI_Aint ext;
-    char *buf;
-
-    RANK2VRANK(rank, vrank, root);
-    VRANK2RANK(rpeer, vrank-1, root);
-    VRANK2RANK(speer, vrank+1, root);
-    res = ompi_datatype_type_extent(datatype, &ext);
-    if (MPI_SUCCESS != res) {
-        NBC_Error("MPI Error in ompi_datatype_type_extent() (%i)", res);
-        return res;
-    }
-
-    if (count == 0) {
-        return OMPI_SUCCESS;
-    }
-
-    numfrag = count * size/fragsize;
-    if ((count * size) % fragsize != 0) {
-        numfrag++;
-    }
-
-    fragcount = count/numfrag;
-
-    for (int fragnum = 0 ; fragnum < numfrag ; ++fragnum) {
-        buf = (char *) buffer + fragnum * fragcount * ext;
-        thiscount = fragcount;
-        if (fragnum == numfrag-1) {
-            /* last fragment may not be full */
-            thiscount = count - fragcount * fragnum;
-        }
-
-        /* root does not receive */
-        if (vrank != 0) {
-            res = NBC_Sched_recv (buf, false, thiscount, datatype, rpeer, schedule, true);
-            if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-                return res;
-            }
-        }
-
-        /* last rank does not send */
-        if (vrank != p-1) {
-            res = NBC_Sched_send (buf, false, thiscount, datatype, speer, schedule, false);
-            if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-                return res;
-            }
-
-            /* this barrier here seems awaward but isn't!!!! */
-            if (vrank == 0) {
-                res = NBC_Sched_barrier (schedule);
-                if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-                    return res;
-                }
-            }
-        }
-    }
-
-    return OMPI_SUCCESS;
-}
 
 int ompi_coll_libnbc_neighbor_allgather_init(const void *sbuf, int scount, MPI_Datatype stype, void *rbuf,
                                              int rcount, MPI_Datatype rtype, struct ompi_communicator_t *comm,
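The block deleted above (duplicated bcast schedule code that had leaked into the neighbor-allgather file during the merge) relies on a virtual-rank trick worth understanding: ranks `0` and `root` swap places so the broadcast root always computes at virtual rank 0, while every other rank is unchanged. The mapping is its own inverse, which a small check can confirm (the macros are copied verbatim from the removed code; `vrank_roundtrip` is a test helper, not part of the source):

```c
/* Verbatim from the removed binomial-bcast code: swap 'root' and 0. */
#define RANK2VRANK(rank, vrank, root) \
{ \
    vrank = rank; \
    if (rank == 0) vrank = root; \
    if (rank == root) vrank = 0; \
}
#define VRANK2RANK(rank, vrank, root) \
{ \
    rank = vrank; \
    if (vrank == 0) rank = root; \
    if (vrank == root) rank = 0; \
}

/* Map rank -> vrank -> rank again; must return the original rank. */
int vrank_roundtrip(int rank, int root)
{
    int vrank, back;
    RANK2VRANK(rank, vrank, root);
    VRANK2RANK(back, vrank, root);
    return back;
}
```

Because the mapping is an involution, the schedule can be computed entirely in vrank space (receive from `vrank - 2^r`, send to `vrank + 2^r`) and translated back only when naming peers.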
@@ -5,8 +5,8 @@
  * Corporation. All rights reserved.
  * Copyright (c) 2006 The Technical University of Chemnitz. All
  * rights reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2015 Los Alamos National Security, LLC. All rights
  * reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
@@ -175,7 +175,7 @@ int ompi_coll_libnbc_ineighbor_allgatherv(const void *sbuf, int scount, MPI_Data
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -5,8 +5,8 @@
  * Corporation. All rights reserved.
  * Copyright (c) 2006 The Technical University of Chemnitz. All
  * rights reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2015 Los Alamos National Security, LLC. All rights
  * reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
@@ -177,7 +177,7 @@ int ompi_coll_libnbc_ineighbor_alltoall(const void *sbuf, int scount, MPI_Dataty
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -5,8 +5,8 @@
  * Corporation. All rights reserved.
  * Copyright (c) 2006 The Technical University of Chemnitz. All
  * rights reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2015 Los Alamos National Security, LLC. All rights
  * reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
@@ -182,7 +182,7 @@ int ompi_coll_libnbc_ineighbor_alltoallv(const void *sbuf, const int *scounts, c
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -5,8 +5,8 @@
  * Corporation. All rights reserved.
  * Copyright (c) 2006 The Technical University of Chemnitz. All
  * rights reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2015 Los Alamos National Security, LLC. All rights
  * reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
@@ -167,7 +167,7 @@ int ompi_coll_libnbc_ineighbor_alltoallw(const void *sbuf, const int *scounts, c
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -516,6 +516,11 @@ static inline int NBC_Unpack(void *src, int srccount, MPI_Datatype srctype, void
     int res;
     ptrdiff_t ext, lb;
 
+    res = ompi_datatype_pack_external_size("external32", srccount, srctype, &size);
+    if (OMPI_SUCCESS != res) {
+        NBC_Error ("MPI Error in ompi_datatype_pack_external_size() (%i)", res);
+        return res;
+    }
 #if OPAL_CUDA_SUPPORT
     if(NBC_Type_intrinsic(srctype) && !(opal_cuda_check_bufs((char *)tgt, (char *)src))) {
 #else
@@ -523,7 +528,6 @@ static inline int NBC_Unpack(void *src, int srccount, MPI_Datatype srctype, void
 #endif /* OPAL_CUDA_SUPPORT */
     /* if we have the same types and they are contiguous (intrinsic
      * types are contiguous), we can just use a single memcpy */
-    res = ompi_datatype_pack_external_size("external32", srccount, srctype, &size);
     res = ompi_datatype_get_extent (srctype, &lb, &ext);
     if (OMPI_SUCCESS != res) {
         NBC_Error ("MPI Error in MPI_Type_extent() (%i)", res);
@@ -7,8 +7,8 @@
  * rights reserved.
  * Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
  * reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
  * Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
  * $COPYRIGHT$
@@ -218,7 +218,7 @@ int ompi_coll_libnbc_ireduce(const void* sendbuf, void* recvbuf, int count, MPI_
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -284,7 +284,7 @@ int ompi_coll_libnbc_ireduce_inter(const void* sendbuf, void* recvbuf, int count
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -7,8 +7,8 @@
  * rights reserved.
  * Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
  * reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2015 The University of Tennessee and The University
  * of Tennessee Research Foundation. All rights
  * reserved.
@@ -219,7 +219,7 @@ int ompi_coll_libnbc_ireduce_scatter (const void* sendbuf, void* recvbuf, const
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -361,7 +361,7 @@ int ompi_coll_libnbc_ireduce_scatter_inter (const void* sendbuf, void* recvbuf,
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -8,8 +8,8 @@
  * Copyright (c) 2012 Sandia National Laboratories. All rights reserved.
  * Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
  * reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
  * Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
  * $COPYRIGHT$
@@ -217,7 +217,7 @@ int ompi_coll_libnbc_ireduce_scatter_block(const void* sendbuf, void* recvbuf, i
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -356,7 +356,7 @@ int ompi_coll_libnbc_ireduce_scatter_block_inter(const void* sendbuf, void* recv
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -5,8 +5,8 @@
  * Corporation. All rights reserved.
  * Copyright (c) 2006 The Technical University of Chemnitz. All
  * rights reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2015 Los Alamos National Security, LLC. All rights
  * reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
@@ -18,8 +18,20 @@
  * Author(s): Torsten Hoefler <htor@cs.indiana.edu>
  *
  */
+#include "opal/include/opal/align.h"
+#include "ompi/op/op.h"
+
 #include "nbc_internal.h"
 
+static inline int scan_sched_linear(
+    int rank, int comm_size, const void *sendbuf, void *recvbuf, int count,
+    MPI_Datatype datatype, MPI_Op op, char inplace, NBC_Schedule *schedule,
+    void *tmpbuf);
+static inline int scan_sched_recursivedoubling(
+    int rank, int comm_size, const void *sendbuf, void *recvbuf,
+    int count, MPI_Datatype datatype, MPI_Op op, char inplace,
+    NBC_Schedule *schedule, void *tmpbuf1, void *tmpbuf2);
+
 #ifdef NBC_CACHE_SCHEDULE
 /* tree comparison function for schedule cache */
 int NBC_Scan_args_compare(NBC_Scan_args *a, NBC_Scan_args *b, void *param) {
@@ -39,27 +51,41 @@ int NBC_Scan_args_compare(NBC_Scan_args *a, NBC_Scan_args *b, void *param) {
 }
 #endif
 
-/* linear iscan
- * working principle:
- * 1. each node (but node 0) receives from left neighbor
- * 2. performs op
- * 3. all but rank p-1 do sends to it's right neighbor and exits
- *
- */
 static int nbc_scan_init(const void* sendbuf, void* recvbuf, int count, MPI_Datatype datatype, MPI_Op op,
                          struct ompi_communicator_t *comm, ompi_request_t ** request,
                          struct mca_coll_base_module_2_3_0_t *module, bool persistent) {
     int rank, p, res;
     ptrdiff_t gap, span;
     NBC_Schedule *schedule;
-    void *tmpbuf = NULL;
-    char inplace;
-    ompi_coll_libnbc_module_t *libnbc_module = (ompi_coll_libnbc_module_t*) module;
+    void *tmpbuf = NULL, *tmpbuf1 = NULL, *tmpbuf2 = NULL;
+    enum { NBC_SCAN_LINEAR, NBC_SCAN_RDBL } alg;
+    char inplace;
+    ompi_coll_libnbc_module_t *libnbc_module = (ompi_coll_libnbc_module_t*) module;
 
     NBC_IN_PLACE(sendbuf, recvbuf, inplace);
 
     rank = ompi_comm_rank (comm);
     p = ompi_comm_size (comm);
 
+    if (count == 0) {
+        return nbc_get_noop_request(persistent, request);
+    }
+
+    span = opal_datatype_span(&datatype->super, count, &gap);
+    if (libnbc_iscan_algorithm == 2) {
+        alg = NBC_SCAN_RDBL;
+        ptrdiff_t span_align = OPAL_ALIGN(span, datatype->super.align, ptrdiff_t);
+        tmpbuf = malloc(span_align + span);
+        if (NULL == tmpbuf) { return OMPI_ERR_OUT_OF_RESOURCE; }
+        tmpbuf1 = (void *)(-gap);
+        tmpbuf2 = (char *)(span_align) - gap;
+    } else {
+        alg = NBC_SCAN_LINEAR;
+        if (rank > 0) {
+            tmpbuf = malloc(span);
+            if (NULL == tmpbuf) { return OMPI_ERR_OUT_OF_RESOURCE; }
+        }
+    }
+
 #ifdef NBC_CACHE_SCHEDULE
     NBC_Scan_args *args, *found, search;
@@ -75,60 +101,28 @@ static int nbc_scan_init(const void* sendbuf, void* recvbuf, int count, MPI_Data
 #endif
     schedule = OBJ_NEW(NBC_Schedule);
     if (OPAL_UNLIKELY(NULL == schedule)) {
-        return OMPI_ERR_OUT_OF_RESOURCE;
-    }
-
-    if (!inplace) {
-        /* copy data to receivebuf */
-        res = NBC_Sched_copy ((void *)sendbuf, false, count, datatype,
-                              recvbuf, false, count, datatype, schedule, false);
-        if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-            OBJ_RELEASE(schedule);
-            return res;
-        }
-    }
-
-    if(rank != 0) {
-        span = opal_datatype_span(&datatype->super, count, &gap);
-        tmpbuf = malloc (span);
-        if (NULL == tmpbuf) {
-            OBJ_RELEASE(schedule);
+        free(tmpbuf);
         return OMPI_ERR_OUT_OF_RESOURCE;
-        }
-
-        /* we have to wait until we have the data */
-        res = NBC_Sched_recv ((void *)(-gap), true, count, datatype, rank-1, schedule, true);
-        if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-            OBJ_RELEASE(schedule);
-            free(tmpbuf);
-            return res;
-        }
-
-        /* perform the reduce in my local buffer */
-        /* this cannot be done until tmpbuf is unused :-( so barrier after the op */
-        res = NBC_Sched_op ((void *)(-gap), true, recvbuf, false, count, datatype, op, schedule,
-                            true);
-        if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-            OBJ_RELEASE(schedule);
-            free(tmpbuf);
-            return res;
-        }
     }
 
-    if (rank != p-1) {
-        res = NBC_Sched_send (recvbuf, false, count, datatype, rank+1, schedule, false);
-        if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-            OBJ_RELEASE(schedule);
-            free(tmpbuf);
-            return res;
-        }
+    if (alg == NBC_SCAN_LINEAR) {
+        res = scan_sched_linear(rank, p, sendbuf, recvbuf, count, datatype,
+                                op, inplace, schedule, tmpbuf);
+    } else {
+        res = scan_sched_recursivedoubling(rank, p, sendbuf, recvbuf, count,
+                                           datatype, op, inplace, schedule, tmpbuf1, tmpbuf2);
     }
 
-    res = NBC_Sched_commit (schedule);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
         OBJ_RELEASE(schedule);
         free(tmpbuf);
         return res;
+    }
+
+    res = NBC_Sched_commit(schedule);
+    if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
+        OBJ_RELEASE(schedule);
+        free(tmpbuf);
+        return res;
     }
 
 #ifdef NBC_CACHE_SCHEDULE
@@ -162,14 +156,160 @@ static int nbc_scan_init(const void* sendbuf, void* recvbuf, int count, MPI_Data
     }
 #endif
 
    res = NBC_Schedule_request(schedule, comm, libnbc_module, persistent, request, tmpbuf);
    if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
        OBJ_RELEASE(schedule);
        free(tmpbuf);
        return res;
    }
 
    return OMPI_SUCCESS;
+}
+
+/*
+ * scan_sched_linear:
+ *
+ * Function: Linear algorithm for inclusive scan.
+ * Accepts: Same as MPI_Iscan
+ * Returns: MPI_SUCCESS or error code
+ *
+ * Working principle:
+ * 1. Each process (but process 0) receives from left neighbor
+ * 2. Performs op
+ * 3. All but rank p-1 do sends to it's right neighbor and exits
+ *
+ * Schedule length: O(1)
+ */
+static inline int scan_sched_linear(
+    int rank, int comm_size, const void *sendbuf, void *recvbuf, int count,
+    MPI_Datatype datatype, MPI_Op op, char inplace, NBC_Schedule *schedule,
+    void *tmpbuf)
+{
+    int res = OMPI_SUCCESS;
+
+    if (!inplace) {
+        /* Copy data to recvbuf */
+        res = NBC_Sched_copy((void *)sendbuf, false, count, datatype,
+                             recvbuf, false, count, datatype, schedule, false);
+        if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
+    }
+
+    if (rank > 0) {
+        ptrdiff_t gap;
+        opal_datatype_span(&datatype->super, count, &gap);
+        /* We have to wait until we have the data */
+        res = NBC_Sched_recv((void *)(-gap), true, count, datatype, rank - 1, schedule, true);
+        if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
+
+        /* Perform the reduce in my local buffer */
+        /* this cannot be done until tmpbuf is unused :-( so barrier after the op */
+        res = NBC_Sched_op((void *)(-gap), true, recvbuf, false, count, datatype, op, schedule,
+                           true);
+        if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
+    }
+
+    if (rank != comm_size - 1) {
+        res = NBC_Sched_send(recvbuf, false, count, datatype, rank + 1, schedule, false);
+        if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
+    }
+
+cleanup_and_return:
+    return res;
+}
+
+/*
+ * scan_sched_recursivedoubling:
+ *
+ * Function: Recursive doubling algorithm for inclusive scan.
+ * Accepts: Same as MPI_Iscan
+ * Returns: MPI_SUCCESS or error code
+ *
+ * Description: Implements recursive doubling algorithm for MPI_Iscan.
+ *              The algorithm preserves order of operations so it can
+ *              be used both by commutative and non-commutative operations.
+ *
+ * Example for 5 processes and commutative operation MPI_SUM:
+ * Process: 0 1 2 3 4
+ * recvbuf: [0] [1] [2] [3] [4]
+ * psend: [0] [1] [2] [3] [4]
+ *
+ * Step 1:
+ * recvbuf: [0] [0+1] [2] [2+3] [4]
+ * psend: [1+0] [0+1] [3+2] [2+3] [4]
+ *
+ * Step 2:
+ * recvbuf: [0] [0+1] [(1+0)+2] [(1+0)+(2+3)] [4]
+ * psend: [(3+2)+(1+0)] [(2+3)+(0+1)] [(1+0)+(3+2)] [(1+0)+(2+3)] [4]
+ *
+ * Step 3:
+ * recvbuf: [0] [0+1] [(1+0)+2] [(1+0)+(2+3)] [((3+2)+(1+0))+4]
+ * psend: [4+((3+2)+(1+0))] [((3+2)+(1+0))+4]
+ *
+ * Time complexity (worst case): \ceil(\log_2(p))(2\alpha + 2m\beta + 2m\gamma)
+ * Memory requirements (per process): 2 * count * typesize = O(count)
+ * Limitations: intra-communicators only
+ * Schedule length: O(log(p))
+ */
+static inline int scan_sched_recursivedoubling(
+    int rank, int comm_size, const void *sendbuf, void *recvbuf, int count,
+    MPI_Datatype datatype, MPI_Op op, char inplace,
+    NBC_Schedule *schedule, void *tmpbuf1, void *tmpbuf2)
+{
+    int res = OMPI_SUCCESS;
+
+    if (!inplace) {
+        res = NBC_Sched_copy((void *)sendbuf, false, count, datatype,
+                             recvbuf, false, count, datatype, schedule, true);
+        if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
+    }
+    if (comm_size < 2)
+        goto cleanup_and_return;
+
+    char *psend = (char *)tmpbuf1;
+    char *precv = (char *)tmpbuf2;
+    res = NBC_Sched_copy(recvbuf, false, count, datatype,
+                         psend, true, count, datatype, schedule, true);
+    if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
+
+    int is_commute = ompi_op_is_commute(op);
+    for (int mask = 1; mask < comm_size; mask <<= 1) {
+        int remote = rank ^ mask;
+        if (remote < comm_size) {
+            res = NBC_Sched_send(psend, true, count, datatype, remote, schedule, false);
+            if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
+            res = NBC_Sched_recv(precv, true, count, datatype, remote, schedule, true);
+            if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
+
+            if (rank > remote) {
+                /* Accumulate prefix reduction: recvbuf = precv <op> recvbuf */
+                res = NBC_Sched_op(precv, true, recvbuf, false, count,
+                                   datatype, op, schedule, false);
+                if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
+                /* Partial result: psend = precv <op> psend */
+                res = NBC_Sched_op(precv, true, psend, true, count,
+                                   datatype, op, schedule, true);
+                if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
+            } else {
+                if (is_commute) {
+                    /* psend = precv <op> psend */
+                    res = NBC_Sched_op(precv, true, psend, true, count,
+                                       datatype, op, schedule, true);
+                    if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
+                } else {
+                    /* precv = psend <op> precv */
+                    res = NBC_Sched_op(psend, true, precv, true, count,
                                       datatype, op, schedule, true);
+                    if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
+                    char *tmp = psend;
+                    psend = precv;
+                    precv = tmp;
+                }
+            }
+        }
+    }
+
+cleanup_and_return:
+    return res;
 }
 
 int ompi_coll_libnbc_iscan(const void* sendbuf, void* recvbuf, int count, MPI_Datatype datatype, MPI_Op op,
@@ -182,7 +322,7 @@ int ompi_coll_libnbc_iscan(const void* sendbuf, void* recvbuf, int count, MPI_Da
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -10,8 +10,8 @@
  * Copyright (c) 2013 The University of Tennessee and The University
  * of Tennessee Research Foundation. All rights
  * reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
  * Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
  * $COPYRIGHT$
@@ -179,7 +179,7 @@ int ompi_coll_libnbc_iscatter (const void* sendbuf, int sendcount, MPI_Datatype
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -258,7 +258,7 @@ int ompi_coll_libnbc_iscatter_inter (const void* sendbuf, int sendcount, MPI_Dat
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -10,8 +10,8 @@
  * Copyright (c) 2013 The University of Tennessee and The University
  * of Tennessee Research Foundation. All rights
  * reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
+ * Copyright (c) 2014-2018 Research Organization for Information Science
  * and Technology (RIST). All rights reserved.
  * Copyright (c) 2017 IBM Corporation. All rights reserved.
  * Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
  * $COPYRIGHT$
@@ -114,7 +114,7 @@ int ompi_coll_libnbc_iscatterv(const void* sendbuf, const int *sendcounts, const
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -192,7 +192,7 @@ int ompi_coll_libnbc_iscatterv_inter(const void* sendbuf, const int *sendcounts,
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -36,7 +36,7 @@ struct mca_coll_monitoring_module_t {
     mca_coll_base_module_t super;
     mca_coll_base_comm_coll_t real;
     mca_monitoring_coll_data_t*data;
-    int32_t is_initialized;
+    opal_atomic_int32_t is_initialized;
 };
 typedef struct mca_coll_monitoring_module_t mca_coll_monitoring_module_t;
 OMPI_DECLSPEC OBJ_CLASS_DECLARATION(mca_coll_monitoring_module_t);
@@ -1,6 +1,7 @@
+/* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
 /*
  * Copyright (c) 2013-2015 Sandia National Laboratories. All rights reserved.
- * Copyright (c) 2015 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2015-2018 Los Alamos National Security, LLC. All rights
  * reserved.
  * Copyright (c) 2015 Bull SAS. All rights reserved.
  * Copyright (c) 2015 Research Organization for Information Science
@@ -91,7 +92,7 @@ typedef struct ompi_coll_portals4_tree_t {
 
 struct mca_coll_portals4_module_t {
     mca_coll_base_module_t super;
-    size_t coll_count;
+    opal_atomic_size_t coll_count;
 
     /* record handlers dedicated to fallback if offloaded operations are not supported */
     mca_coll_base_module_reduce_fn_t previous_reduce;
@@ -114,7 +114,7 @@ BEGIN_C_DECLS
 typedef struct mca_coll_sm_in_use_flag_t {
 /** Number of processes currently using this set of
 segments */
-volatile uint32_t mcsiuf_num_procs_using;
+opal_atomic_uint32_t mcsiuf_num_procs_using;
 /** Must match data->mcb_count */
 volatile uint32_t mcsiuf_operation_count;
 } mca_coll_sm_in_use_flag_t;
@@ -152,7 +152,7 @@ BEGIN_C_DECLS
 /** Pointer to my parent's barrier control pages (will be NULL
 for communicator rank 0; odd index pages are "in", even
 index pages are "out") */
-uint32_t *mcb_barrier_control_parent;
+opal_atomic_uint32_t *mcb_barrier_control_parent;
 
 /** Pointers to my childrens' barrier control pages (they're
 contiguous in memory, so we only point to the base -- the
@@ -56,7 +56,8 @@ int mca_coll_sm_barrier_intra(struct ompi_communicator_t *comm,
 int rank, buffer_set;
 mca_coll_sm_comm_t *data;
 uint32_t i, num_children;
-volatile uint32_t *me_in, *me_out, *parent, *children = NULL;
+volatile uint32_t *me_in, *me_out, *children = NULL;
+opal_atomic_uint32_t *parent;
 int uint_control_size;
 mca_coll_sm_module_t *sm_module = (mca_coll_sm_module_t*) module;
 
@@ -372,7 +372,7 @@ int ompi_coll_sm_lazy_enable(mca_coll_base_module_t *module,
 data->mcb_barrier_control_me = (uint32_t*)
 (base + (rank * control_size * num_barrier_buffers * 2));
 if (data->mcb_tree[rank].mcstn_parent) {
-data->mcb_barrier_control_parent = (uint32_t*)
+data->mcb_barrier_control_parent = (opal_atomic_uint32_t*)
 (base +
 (data->mcb_tree[rank].mcstn_parent->mcstn_id * control_size *
 num_barrier_buffers * 2));
@@ -7,7 +7,7 @@
 * Copyright (c) 2015 Bull SAS. All rights reserved.
 * Copyright (c) 2016-2017 Research Organization for Information Science
 * and Technology (RIST). All rights reserved.
-* Copyright (c) 2017 Los Alamos National Security, LLC. All rights
+* Copyright (c) 2017-2018 Los Alamos National Security, LLC. All rights
 * reserved.
 * $COPYRIGHT$
 *
@@ -34,7 +34,7 @@
 
 /*** Monitoring specific variables ***/
 /* Keep tracks of how many components are currently using the common part */
-static int32_t mca_common_monitoring_hold = 0;
+static opal_atomic_int32_t mca_common_monitoring_hold = 0;
 /* Output parameters */
 int mca_common_monitoring_output_stream_id = -1;
 static opal_output_stream_t mca_common_monitoring_output_stream_obj = {
@@ -61,18 +61,18 @@ static char* mca_common_monitoring_initial_filename = "";
 static char* mca_common_monitoring_current_filename = NULL;
 
 /* array for stroring monitoring data*/
-static size_t* pml_data = NULL;
-static size_t* pml_count = NULL;
-static size_t* filtered_pml_data = NULL;
-static size_t* filtered_pml_count = NULL;
-static size_t* osc_data_s = NULL;
-static size_t* osc_count_s = NULL;
-static size_t* osc_data_r = NULL;
-static size_t* osc_count_r = NULL;
-static size_t* coll_data = NULL;
-static size_t* coll_count = NULL;
+static opal_atomic_size_t* pml_data = NULL;
+static opal_atomic_size_t* pml_count = NULL;
+static opal_atomic_size_t* filtered_pml_data = NULL;
+static opal_atomic_size_t* filtered_pml_count = NULL;
+static opal_atomic_size_t* osc_data_s = NULL;
+static opal_atomic_size_t* osc_count_s = NULL;
+static opal_atomic_size_t* osc_data_r = NULL;
+static opal_atomic_size_t* osc_count_r = NULL;
+static opal_atomic_size_t* coll_data = NULL;
+static opal_atomic_size_t* coll_count = NULL;
 
-static size_t* size_histogram = NULL;
+static opal_atomic_size_t* size_histogram = NULL;
 static const int max_size_histogram = 66;
 static double log10_2 = 0.;
 
@@ -241,7 +241,7 @@ void mca_common_monitoring_finalize( void )
 opal_output_close(mca_common_monitoring_output_stream_id);
 free(mca_common_monitoring_output_stream_obj.lds_prefix);
 /* Free internal data structure */
-free(pml_data); /* a single allocation */
+free((void *) pml_data); /* a single allocation */
 opal_hash_table_remove_all( common_monitoring_translation_ht );
 OBJ_RELEASE(common_monitoring_translation_ht);
 mca_common_monitoring_coll_finalize();
@@ -446,7 +446,7 @@ int mca_common_monitoring_add_procs(struct ompi_proc_t **procs,
 
 if( NULL == pml_data ) {
 int array_size = (10 + max_size_histogram) * nprocs_world;
-pml_data = (size_t*)calloc(array_size, sizeof(size_t));
+pml_data = (opal_atomic_size_t*)calloc(array_size, sizeof(size_t));
 pml_count = pml_data + nprocs_world;
 filtered_pml_data = pml_count + nprocs_world;
 filtered_pml_count = filtered_pml_data + nprocs_world;
@@ -493,7 +493,7 @@ int mca_common_monitoring_add_procs(struct ompi_proc_t **procs,
 static void mca_common_monitoring_reset( void )
 {
 int array_size = (10 + max_size_histogram) * nprocs_world;
-memset(pml_data, 0, array_size * sizeof(size_t));
+memset((void *) pml_data, 0, array_size * sizeof(size_t));
 mca_common_monitoring_coll_reset();
 }
 
@@ -30,12 +30,12 @@ struct mca_monitoring_coll_data_t {
 int world_rank;
 int is_released;
 ompi_communicator_t*p_comm;
-size_t o2a_count;
-size_t o2a_size;
-size_t a2o_count;
-size_t a2o_size;
-size_t a2a_count;
-size_t a2a_size;
+opal_atomic_size_t o2a_count;
+opal_atomic_size_t o2a_size;
+opal_atomic_size_t a2o_count;
+opal_atomic_size_t a2o_size;
+opal_atomic_size_t a2a_count;
+opal_atomic_size_t a2a_size;
 };
 
 /* Collectives operation monitoring */
@@ -4,7 +4,7 @@
 * reserved.
 * Copyright (c) 2013-2017 Inria. All rights reserved.
 * Copyright (c) 2013-2015 Bull SAS. All rights reserved.
-* Copyright (c) 2016 Cisco Systems, Inc. All rights reserved.
+* Copyright (c) 2016-2018 Cisco Systems, Inc. All rights reserved.
 * Copyright (c) 2017 Research Organization for Information Science
 * and Technology (RIST). All rights reserved.
 * $COPYRIGHT$
@@ -42,10 +42,30 @@ writing 4x4 matrix to monitoring_avg.mat
 
 */
 
+#include "ompi_config.h"
 
 #include <stdio.h>
 #include <stdlib.h>
-#include <mpi.h>
 #include <string.h>
+#include <stdbool.h>
 
+#if OMPI_BUILD_FORTRAN_BINDINGS
+// Set these #defines in the same way that
+// ompi/mpi/fortran/mpif-h/Makefile.am does when compiling the real
+// Fortran mpif.h bindings. They set behaviors in the Fortran header
+// files so that we can compile properly.
+#define OMPI_BUILD_MPI_PROFILING 0
+#define OMPI_COMPILING_FORTRAN_WRAPPERS 1
+#endif
 
+#include "opal/threads/thread_usage.h"
 
+#include "ompi/include/mpi.h"
+#include "ompi/mpi/fortran/base/constants.h"
+#include "ompi/mpi/fortran/base/fint_2_int.h"
+#if OMPI_BUILD_FORTRAN_BINDINGS
+#include "ompi/mpi/fortran/mpif-h/bindings.h"
+#endif
 
 static MPI_T_pvar_session session;
 static int comm_world_size;
@@ -383,12 +403,6 @@ int write_mat(char * filename, size_t * mat, unsigned int dim)
 * MPI binding for fortran
 */
 
-#include <stdbool.h>
-#include "ompi_config.h"
-#include "opal/threads/thread_usage.h"
-#include "ompi/mpi/fortran/base/constants.h"
-#include "ompi/mpi/fortran/base/fint_2_int.h"
 
 void monitoring_prof_mpi_init_f2c( MPI_Fint * );
 void monitoring_prof_mpi_finalize_f2c( MPI_Fint * );
 
@@ -423,8 +437,6 @@ void monitoring_prof_mpi_finalize_f2c( MPI_Fint *ierr ) {
 #pragma weak MPI_Finalize_f = monitoring_prof_mpi_finalize_f2c
 #pragma weak MPI_Finalize_f08 = monitoring_prof_mpi_finalize_f2c
 #elif OMPI_BUILD_FORTRAN_BINDINGS
-#define OMPI_F77_PROTOTYPES_MPI_H
-#include "ompi/mpi/fortran/mpif-h/bindings.h"
 
 OMPI_GENERATE_F77_BINDINGS (MPI_INIT,
 mpi_init,
@@ -34,7 +34,7 @@ static opal_mutex_t mca_common_ompio_cuda_mutex; /* lock for thread saf
 static mca_allocator_base_component_t* mca_common_ompio_allocator_component=NULL;
 static mca_allocator_base_module_t* mca_common_ompio_allocator=NULL;
 
-static int32_t mca_common_ompio_cuda_init = 0;
+static opal_atomic_int32_t mca_common_ompio_cuda_init = 0;
 static int32_t mca_common_ompio_pagesize=4096;
 static void* mca_common_ompio_cuda_alloc_seg ( void *ctx, size_t *size );
 static void mca_common_ompio_cuda_free_seg ( void *ctx, void *buf );
@@ -124,7 +124,7 @@ ompi_mtl_ofi_component_register(void)
 MCA_BASE_VAR_SCOPE_READONLY,
 &param_priority);
 
-prov_include = "psm,psm2,gni";
+prov_include = NULL;
 mca_base_component_var_register(&mca_mtl_ofi_component.super.mtl_version,
 "provider_include",
 "Comma-delimited list of OFI providers that are considered for use (e.g., \"psm,psm2\"; an empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_exclude.",
@@ -133,7 +133,7 @@ ompi_mtl_ofi_component_register(void)
 MCA_BASE_VAR_SCOPE_READONLY,
 &prov_include);
 
-prov_exclude = NULL;
+prov_exclude = "shm,sockets,tcp,udp,rstream";
 mca_base_component_var_register(&mca_mtl_ofi_component.super.mtl_version,
 "provider_exclude",
 "Comma-delimited list of OFI providers that are not considered for use (default: \"sockets,mxm\"; empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_include.",
@@ -115,12 +115,12 @@ struct mca_mtl_portals4_module_t {
 opal_mutex_t short_block_mutex;
 
 /** number of send-side operations started */
-uint64_t opcount;
+opal_atomic_uint64_t opcount;
 
 #if OPAL_ENABLE_DEBUG
 /** number of receive-side operations started. Used only for
 debugging */
-uint64_t recv_opcount;
+opal_atomic_uint64_t recv_opcount;
 #endif
 
 #if OMPI_MTL_PORTALS4_FLOW_CONTROL
@@ -1,7 +1,7 @@
 /* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
 /*
 * Copyright (c) 2012 Sandia National Laboratories. All rights reserved.
-* Copyright (c) 2015-2017 Los Alamos National Security, LLC. All rights
+* Copyright (c) 2015-2018 Los Alamos National Security, LLC. All rights
 * reserved.
 * $COPYRIGHT$
 *
@@ -36,7 +36,7 @@ OBJ_CLASS_DECLARATION(ompi_mtl_portals4_pending_request_t);
 struct ompi_mtl_portals4_flowctl_t {
 int32_t flowctl_active;
 
-int32_t send_slots;
+opal_atomic_int32_t send_slots;
 int32_t max_send_slots;
 opal_list_t pending_sends;
 opal_free_list_t pending_fl;
@@ -46,7 +46,7 @@ struct ompi_mtl_portals4_flowctl_t {
 
 /** Flow control epoch counter. Triggered events should be
 based on epoch counter. */
-int64_t epoch_counter;
+opal_atomic_int64_t epoch_counter;
 
 /** Flow control trigger CT. Only has meaning at root. */
 ptl_handle_ct_t trigger_ct_h;
@@ -54,8 +54,8 @@ struct ompi_mtl_portals4_isend_request_t {
 struct ompi_mtl_portals4_pending_request_t *pending;
 #endif
 ptl_size_t length;
-int32_t pending_get;
-uint32_t event_count;
+opal_atomic_int32_t pending_get;
+opal_atomic_uint32_t event_count;
 };
 typedef struct ompi_mtl_portals4_isend_request_t ompi_mtl_portals4_isend_request_t;
 
@@ -76,7 +76,7 @@ struct ompi_mtl_portals4_recv_request_t {
 void *delivery_ptr;
 size_t delivery_len;
 volatile bool req_started;
-int32_t pending_reply;
+opal_atomic_int32_t pending_reply;
 #if OPAL_ENABLE_DEBUG
 uint64_t opcount;
 ptl_hdr_data_t hdr_data;
@@ -50,7 +50,7 @@
 OSC_MONITORING_SET_TEMPLATE_FCT_NAME(template) (ompi_osc_base_module_t*module) \
 { \
 /* Define the ompi_osc_monitoring_module_## template ##_init_done variable */ \
-static int32_t init_done = 0; \
+opal_atomic_int32_t init_done = 0; \
 /* Define and set the ompi_osc_monitoring_## template \
 * ##_template variable. The functions recorded here are \
 * linked to the original functions of the original \
@@ -95,7 +95,7 @@ struct ompi_osc_portals4_module_t {
 ptl_handle_md_t req_md_h; /* memory descriptor with event completion used by this window */
 ptl_handle_me_t data_me_h; /* data match list entry (MB are CID | OSC_PORTALS4_MB_DATA) */
 ptl_handle_me_t control_me_h; /* match list entry for control data (node_state_t). Match bits are (CID | OSC_PORTALS4_MB_CONTROL). */
-int64_t opcount;
+opal_atomic_int64_t opcount;
 ptl_match_bits_t match_bits; /* match bits for module. Same as cid for comm in most cases. */
 
 ptl_iovec_t *origin_iovec_list; /* list of memory segments that compose the noncontiguous region */
@@ -189,7 +189,7 @@ number_of_fragments(ptl_size_t length, ptl_size_t maxlength)
 
 /* put in segments no larger than segment_length */
 static int
-segmentedPut(int64_t *opcount,
+segmentedPut(opal_atomic_int64_t *opcount,
 ptl_handle_md_t md_h,
 ptl_size_t origin_offset,
 ptl_size_t put_length,
@@ -236,7 +236,7 @@ segmentedPut(int64_t *opcount,
 
 /* get in segments no larger than segment_length */
 static int
-segmentedGet(int64_t *opcount,
+segmentedGet(opal_atomic_int64_t *opcount,
 ptl_handle_md_t md_h,
 ptl_size_t origin_offset,
 ptl_size_t get_length,
@@ -280,7 +280,7 @@ segmentedGet(int64_t *opcount,
 
 /* atomic op in segments no larger than segment_length */
 static int
-segmentedAtomic(int64_t *opcount,
+segmentedAtomic(opal_atomic_int64_t *opcount,
 ptl_handle_md_t md_h,
 ptl_size_t origin_offset,
 ptl_size_t length,
@@ -329,7 +329,7 @@ segmentedAtomic(int64_t *opcount,
 
 /* atomic op in segments no larger than segment_length */
 static int
-segmentedFetchAtomic(int64_t *opcount,
+segmentedFetchAtomic(opal_atomic_int64_t *opcount,
 ptl_handle_md_t result_md_h,
 ptl_size_t result_offset,
 ptl_handle_md_t origin_md_h,
@@ -381,7 +381,7 @@ segmentedFetchAtomic(int64_t *opcount,
 
 /* swap in segments no larger than segment_length */
 static int
-segmentedSwap(int64_t *opcount,
+segmentedSwap(opal_atomic_int64_t *opcount,
 ptl_handle_md_t result_md_h,
 ptl_size_t result_offset,
 ptl_handle_md_t origin_md_h,
@@ -1187,7 +1187,7 @@ fetch_atomic_to_iovec(ompi_osc_portals4_module_t *module,
 
 /* put in the largest chunks possible given the noncontiguous restriction */
 static int
-put_to_noncontig(int64_t *opcount,
+put_to_noncontig(opal_atomic_int64_t *opcount,
 ptl_handle_md_t md_h,
 const void *origin_address,
 int origin_count,
@@ -1521,7 +1521,7 @@ atomic_to_noncontig(ompi_osc_portals4_module_t *module,
 
 /* get from a noncontiguous remote to an (non)contiguous local */
 static int
-get_from_noncontig(int64_t *opcount,
+get_from_noncontig(opal_atomic_int64_t *opcount,
 ptl_handle_md_t md_h,
 const void *origin_address,
 int origin_count,
@@ -1,7 +1,7 @@
 /* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
 /*
 * Copyright (c) 2011-2013 Sandia National Laboratories. All rights reserved.
-* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
+* Copyright (c) 2015-2018 Los Alamos National Security, LLC. All rights
 * reserved.
 * $COPYRIGHT$
 *
@@ -18,7 +18,7 @@
 struct ompi_osc_portals4_request_t {
 ompi_request_t super;
 int32_t ops_expected;
-volatile int32_t ops_committed;
+opal_atomic_int32_t ops_committed;
 };
 typedef struct ompi_osc_portals4_request_t ompi_osc_portals4_request_t;
 
@@ -8,7 +8,7 @@
 * University of Stuttgart. All rights reserved.
 * Copyright (c) 2004-2005 The Regents of the University of California.
 * All rights reserved.
-* Copyright (c) 2007-2017 Los Alamos National Security, LLC. All rights
+* Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
 * reserved.
 * Copyright (c) 2010 Cisco Systems, Inc. All rights reserved.
 * Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
@@ -110,7 +110,7 @@ struct ompi_osc_pt2pt_peer_t {
 int rank;
 
 /** pointer to the current send fragment for each outgoing target */
-struct ompi_osc_pt2pt_frag_t *active_frag;
+opal_atomic_intptr_t active_frag;
 
 /** lock for this peer */
 opal_mutex_t lock;
@@ -119,10 +119,10 @@ struct ompi_osc_pt2pt_peer_t {
 opal_list_t queued_frags;
 
 /** number of fragments incomming (negative - expected, positive - unsynchronized) */
-volatile int32_t passive_incoming_frag_count;
+opal_atomic_int32_t passive_incoming_frag_count;
 
 /** peer flags */
-volatile int32_t flags;
+opal_atomic_int32_t flags;
 };
 typedef struct ompi_osc_pt2pt_peer_t ompi_osc_pt2pt_peer_t;
 
@@ -208,16 +208,16 @@ struct ompi_osc_pt2pt_module_t {
 
 /** Nmber of communication fragments started for this epoch, by
 peer. Not in peer data to make fence more manageable. */
-uint32_t *epoch_outgoing_frag_count;
+opal_atomic_uint32_t *epoch_outgoing_frag_count;
 
 /** cyclic counter for a unique tage for long messages. */
-volatile uint32_t tag_counter;
+opal_atomic_uint32_t tag_counter;
 
 /** number of outgoing fragments still to be completed */
-volatile int32_t outgoing_frag_count;
+opal_atomic_int32_t outgoing_frag_count;
 
 /** number of incoming fragments */
-volatile int32_t active_incoming_frag_count;
+opal_atomic_int32_t active_incoming_frag_count;
 
 /** Number of targets locked/being locked */
 unsigned int passive_target_access_epoch;
@@ -230,13 +230,13 @@ struct ompi_osc_pt2pt_module_t {
 
 /** Number of "count" messages from the remote complete group
 we've received */
-volatile int32_t num_complete_msgs;
+opal_atomic_int32_t num_complete_msgs;
 
 /* ********************* LOCK data ************************ */
 
 /** Status of the local window lock. One of 0 (unlocked),
 MPI_LOCK_EXCLUSIVE, or MPI_LOCK_SHARED. */
-int32_t lock_status;
+opal_atomic_int32_t lock_status;
 
 /** lock for locks_pending list */
 opal_mutex_t locks_pending_lock;
@@ -526,7 +526,7 @@ static inline void mark_incoming_completion (ompi_osc_pt2pt_module_t *module, in
 OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output,
 "mark_incoming_completion marking passive incoming complete. module %p, source = %d, count = %d",
 (void *) module, source, (int) peer->passive_incoming_frag_count + 1));
-new_value = OPAL_THREAD_ADD_FETCH32((int32_t *) &peer->passive_incoming_frag_count, 1);
+new_value = OPAL_THREAD_ADD_FETCH32((opal_atomic_int32_t *) &peer->passive_incoming_frag_count, 1);
 if (0 == new_value) {
 OPAL_THREAD_LOCK(&module->lock);
 opal_condition_broadcast(&module->cond);
@@ -550,7 +550,7 @@ static inline void mark_incoming_completion (ompi_osc_pt2pt_module_t *module, in
 */
 static inline void mark_outgoing_completion (ompi_osc_pt2pt_module_t *module)
 {
-int32_t new_value = OPAL_THREAD_ADD_FETCH32((int32_t *) &module->outgoing_frag_count, 1);
+int32_t new_value = OPAL_THREAD_ADD_FETCH32((opal_atomic_int32_t *) &module->outgoing_frag_count, 1);
 OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output,
 "mark_outgoing_completion: outgoing_frag_count = %d", new_value));
 if (new_value >= 0) {
@@ -574,12 +574,12 @@ static inline void mark_outgoing_completion (ompi_osc_pt2pt_module_t *module)
 */
 static inline void ompi_osc_signal_outgoing (ompi_osc_pt2pt_module_t *module, int target, int count)
 {
-OPAL_THREAD_ADD_FETCH32((int32_t *) &module->outgoing_frag_count, -count);
+OPAL_THREAD_ADD_FETCH32((opal_atomic_int32_t *) &module->outgoing_frag_count, -count);
 if (MPI_PROC_NULL != target) {
 OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output,
 "ompi_osc_signal_outgoing_passive: target = %d, count = %d, total = %d", target,
 count, module->epoch_outgoing_frag_count[target] + count));
-OPAL_THREAD_ADD_FETCH32((int32_t *) (module->epoch_outgoing_frag_count + target), count);
+OPAL_THREAD_ADD_FETCH32((opal_atomic_int32_t *) (module->epoch_outgoing_frag_count + target), count);
 }
 }
 
@@ -717,7 +717,7 @@ static inline int get_tag(ompi_osc_pt2pt_module_t *module)
 /* the LSB of the tag is used be the receiver to determine if the
 message is a passive or active target (ie, where to mark
 completion). */
-int32_t tmp = OPAL_THREAD_ADD_FETCH32((opal_atomic_int32_t *) &module->tag_counter, 4);
+int32_t tmp = OPAL_THREAD_ADD_FETCH32((opal_atomic_int32_t *) &module->tag_counter, 4);
 return (tmp & OSC_PT2PT_FRAG_MASK) | !!(module->passive_target_access_epoch);
 }
 
|
@@ -8,7 +8,7 @@
  * University of Stuttgart. All rights reserved.
  * Copyright (c) 2004-2005 The Regents of the University of California.
  *                         All rights reserved.
- * Copyright (c) 2007-2016 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * Copyright (c) 2010-2016 IBM Corporation. All rights reserved.
  * Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
@@ -166,17 +166,16 @@ int ompi_osc_pt2pt_fence(int assert, ompi_win_t *win)
                          "osc pt2pt: fence done sending"));

     /* find out how much data everyone is going to send us. */
-    ret = module->comm->c_coll->coll_reduce_scatter_block (module->epoch_outgoing_frag_count,
+    ret = module->comm->c_coll->coll_reduce_scatter_block ((void *) module->epoch_outgoing_frag_count,
                                                            &incoming_reqs, 1, MPI_UINT32_T,
                                                            MPI_SUM, module->comm,
                                                            module->comm->c_coll->coll_reduce_scatter_block_module);
     if (OMPI_SUCCESS != ret) {
         return ret;
     }

     OPAL_THREAD_LOCK(&module->lock);
-    bzero(module->epoch_outgoing_frag_count,
-          sizeof(uint32_t) * ompi_comm_size(module->comm));
+    bzero ((void *) module->epoch_outgoing_frag_count, sizeof(uint32_t) * ompi_comm_size(module->comm));

     OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output,
                          "osc pt2pt: fence expects %d requests",
@@ -366,8 +365,11 @@ int ompi_osc_pt2pt_complete (ompi_win_t *win)

     /* XXX -- TODO -- since fragment are always delivered in order we do not need to count anything but long
      * requests. once that is done this can be removed. */
-    if (peer->active_frag && (peer->active_frag->remain_len < sizeof (complete_req))) {
-        ++complete_req.frag_count;
+    if (peer->active_frag) {
+        ompi_osc_pt2pt_frag_t *active_frag = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
+
+        if (active_frag->remain_len < sizeof (complete_req)) {
+            ++complete_req.frag_count;
+        }
     }

     OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output,
@@ -501,7 +501,7 @@ static void ompi_osc_pt2pt_peer_construct (ompi_osc_pt2pt_peer_t *peer)
 {
     OBJ_CONSTRUCT(&peer->queued_frags, opal_list_t);
     OBJ_CONSTRUCT(&peer->lock, opal_mutex_t);
-    peer->active_frag = NULL;
+    peer->active_frag = 0;
     peer->passive_incoming_frag_count = 0;
     peer->flags = 0;
 }
@@ -8,7 +8,7 @@
  * University of Stuttgart. All rights reserved.
  * Copyright (c) 2004-2005 The Regents of the University of California.
  *                         All rights reserved.
- * Copyright (c) 2007-2017 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * Copyright (c) 2009-2011 Oracle and/or its affiliates. All rights reserved.
  * Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
@@ -56,7 +56,7 @@ struct osc_pt2pt_accumulate_data_t {
     int peer;
     ompi_datatype_t *datatype;
     ompi_op_t *op;
-    int request_count;
+    opal_atomic_int32_t request_count;
 };
 typedef struct osc_pt2pt_accumulate_data_t osc_pt2pt_accumulate_data_t;

@@ -1,7 +1,7 @@
 /* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
 /*
  * Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
- * Copyright (c) 2014-2017 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2014-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * Copyright (c) 2015      Research Organization for Information Science
  *                         and Technology (RIST). All rights reserved.
@@ -65,7 +65,7 @@ int ompi_osc_pt2pt_frag_start (ompi_osc_pt2pt_module_t *module,
     ompi_osc_pt2pt_peer_t *peer = ompi_osc_pt2pt_peer_lookup (module, frag->target);
     int ret;

-    assert(0 == frag->pending && peer->active_frag != frag);
+    assert(0 == frag->pending && peer->active_frag != (intptr_t) frag);

     /* we need to signal now that a frag is outgoing to ensure the count sent
      * with the unlock message is correct */
@@ -93,7 +93,7 @@ int ompi_osc_pt2pt_frag_start (ompi_osc_pt2pt_module_t *module,

 static int ompi_osc_pt2pt_flush_active_frag (ompi_osc_pt2pt_module_t *module, ompi_osc_pt2pt_peer_t *peer)
 {
-    ompi_osc_pt2pt_frag_t *active_frag = peer->active_frag;
+    ompi_osc_pt2pt_frag_t *active_frag = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
     int ret = OMPI_SUCCESS;

     if (NULL == active_frag) {
@@ -105,7 +105,7 @@ static int ompi_osc_pt2pt_flush_active_frag (ompi_osc_pt2pt_module_t *module, om
                          "osc pt2pt: flushing active fragment to target %d. pending: %d",
                          active_frag->target, active_frag->pending));

-    if (opal_atomic_compare_exchange_strong_ptr (&peer->active_frag, &active_frag, NULL)) {
+    if (opal_atomic_compare_exchange_strong_ptr (&peer->active_frag, (intptr_t *) &active_frag, 0)) {
         if (0 != OPAL_THREAD_ADD_FETCH32(&active_frag->pending, -1)) {
             /* communication going on while synchronizing; this is an rma usage bug */
             return OMPI_ERR_RMA_SYNC;
@@ -33,7 +33,7 @@ struct ompi_osc_pt2pt_frag_t {
     char *top;

     /* Number of operations which have started writing into the frag, but not yet completed doing so */
-    volatile int32_t pending;
+    opal_atomic_int32_t pending;
     int32_t pending_long_sends;
     ompi_osc_pt2pt_frag_header_t *header;
     ompi_osc_pt2pt_module_t *module;
@@ -66,8 +66,8 @@ static inline ompi_osc_pt2pt_frag_t *ompi_osc_pt2pt_frag_alloc_non_buffered (omp
     ompi_osc_pt2pt_frag_t *curr;

     /* to ensure ordering flush the buffer on the peer */
-    curr = peer->active_frag;
-    if (NULL != curr && opal_atomic_compare_exchange_strong_ptr (&peer->active_frag, &curr, NULL)) {
+    curr = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
+    if (NULL != curr && opal_atomic_compare_exchange_strong_ptr (&peer->active_frag, (intptr_t *) &curr, 0)) {
         /* If there's something pending, the pending finish will
            start the buffer. Otherwise, we need to start it now. */
         int ret = ompi_osc_pt2pt_frag_finish (module, curr);
@@ -131,7 +131,7 @@ static inline int _ompi_osc_pt2pt_frag_alloc (ompi_osc_pt2pt_module_t *module, i

     OPAL_THREAD_LOCK(&module->lock);
     if (buffered) {
-        curr = peer->active_frag;
+        curr = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
         if (NULL == curr || curr->remain_len < request_len || (long_send && curr->pending_long_sends == 32)) {
             curr = ompi_osc_pt2pt_frag_alloc_non_buffered (module, peer, request_len);
             if (OPAL_UNLIKELY(NULL == curr)) {
@@ -140,7 +140,7 @@ static inline int _ompi_osc_pt2pt_frag_alloc (ompi_osc_pt2pt_module_t *module, i
             }

             curr->pending_long_sends = long_send;
-            peer->active_frag = curr;
+            peer->active_frag = (uintptr_t) curr;
         } else {
             OPAL_THREAD_ADD_FETCH32(&curr->header->num_ops, 1);
             curr->pending_long_sends += long_send;
@@ -8,7 +8,7 @@
  * University of Stuttgart. All rights reserved.
  * Copyright (c) 2004-2005 The Regents of the University of California.
  *                         All rights reserved.
- * Copyright (c) 2007-2015 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * Copyright (c) 2010      Cisco Systems, Inc. All rights reserved.
  * Copyright (c) 2010      Oracle and/or its affiliates. All rights reserved.
@@ -180,7 +180,7 @@ typedef struct ompi_osc_pt2pt_header_flush_ack_t ompi_osc_pt2pt_header_flush_ack
 struct ompi_osc_pt2pt_frag_header_t {
     ompi_osc_pt2pt_header_base_t base;
     uint32_t source;  /* rank in window of source process */
-    int32_t num_ops;  /* number of operations in this buffer */
+    opal_atomic_int32_t num_ops;  /* number of operations in this buffer */
     uint32_t pad;     /* ensure the fragment header is a multiple of 8 bytes */
 };
 typedef struct ompi_osc_pt2pt_frag_header_t ompi_osc_pt2pt_frag_header_t;
@@ -8,7 +8,7 @@
  * University of Stuttgart. All rights reserved.
  * Copyright (c) 2004-2005 The Regents of the University of California.
  *                         All rights reserved.
- * Copyright (c) 2007-2016 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
  * Copyright (c) 2015      Research Organization for Information Science
@@ -104,13 +104,13 @@ int ompi_osc_pt2pt_free(ompi_win_t *win)
         free (module->recv_frags);
     }

-    if (NULL != module->epoch_outgoing_frag_count) free(module->epoch_outgoing_frag_count);
+    free ((void *) module->epoch_outgoing_frag_count);

     if (NULL != module->comm) {
         ompi_comm_free(&module->comm);
     }

-    if (NULL != module->free_after) free(module->free_after);
+    free ((void *) module->free_after);

     free (module);
@@ -8,7 +8,7 @@
  * University of Stuttgart. All rights reserved.
  * Copyright (c) 2004-2005 The Regents of the University of California.
  *                         All rights reserved.
- * Copyright (c) 2007-2017 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * Copyright (c) 2010-2016 IBM Corporation. All rights reserved.
  * Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
@@ -157,7 +157,7 @@ int ompi_osc_pt2pt_lock_remote (ompi_osc_pt2pt_module_t *module, int target, omp

 static inline int ompi_osc_pt2pt_unlock_remote (ompi_osc_pt2pt_module_t *module, int target, ompi_osc_pt2pt_sync_t *lock)
 {
-    int32_t frag_count = opal_atomic_swap_32 ((int32_t *) module->epoch_outgoing_frag_count + target, -1);
+    int32_t frag_count = opal_atomic_swap_32 ((opal_atomic_int32_t *) module->epoch_outgoing_frag_count + target, -1);
     ompi_osc_pt2pt_peer_t *peer = ompi_osc_pt2pt_peer_lookup (module, target);
     int lock_type = lock->sync.lock.type;
     ompi_osc_pt2pt_header_unlock_t unlock_req;
@@ -178,10 +178,13 @@ static inline int ompi_osc_pt2pt_unlock_remote (ompi_osc_pt2pt_module_t *module,
     unlock_req.lock_ptr = (uint64_t) (uintptr_t) lock;
     OSC_PT2PT_HTON(&unlock_req, module, target);

-    if (peer->active_frag && peer->active_frag->remain_len < sizeof (unlock_req)) {
-        /* the peer should expect one more packet */
-        ++unlock_req.frag_count;
-        --module->epoch_outgoing_frag_count[target];
+    if (peer->active_frag) {
+        ompi_osc_pt2pt_frag_t *active_frag = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
+
+        if (active_frag->remain_len < sizeof (unlock_req)) {
+            /* the peer should expect one more packet */
+            ++unlock_req.frag_count;
+            --module->epoch_outgoing_frag_count[target];
+        }
     }

     OPAL_OUTPUT_VERBOSE((25, ompi_osc_base_framework.framework_output,
@@ -204,7 +207,7 @@ static inline int ompi_osc_pt2pt_flush_remote (ompi_osc_pt2pt_module_t *module,
 {
     ompi_osc_pt2pt_peer_t *peer = ompi_osc_pt2pt_peer_lookup (module, target);
     ompi_osc_pt2pt_header_flush_t flush_req;
-    int32_t frag_count = opal_atomic_swap_32 ((int32_t *) module->epoch_outgoing_frag_count + target, -1);
+    int32_t frag_count = opal_atomic_swap_32 ((opal_atomic_int32_t *) module->epoch_outgoing_frag_count + target, -1);
     int ret;

     (void) OPAL_THREAD_ADD_FETCH32(&lock->sync_expected, 1);
@@ -218,10 +221,13 @@ static inline int ompi_osc_pt2pt_flush_remote (ompi_osc_pt2pt_module_t *module,

     /* XXX -- TODO -- since fragment are always delivered in order we do not need to count anything but long
      * requests. once that is done this can be removed. */
-    if (peer->active_frag && (peer->active_frag->remain_len < sizeof (flush_req))) {
-        /* the peer should expect one more packet */
-        ++flush_req.frag_count;
-        --module->epoch_outgoing_frag_count[target];
+    if (peer->active_frag) {
+        ompi_osc_pt2pt_frag_t *active_frag = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
+
+        if (active_frag->remain_len < sizeof (flush_req)) {
+            /* the peer should expect one more packet */
+            ++flush_req.frag_count;
+            --module->epoch_outgoing_frag_count[target];
+        }
     }

     OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output, "flushing to target %d, frag_count: %d",
@@ -28,7 +28,7 @@ struct ompi_osc_pt2pt_request_t {
    int origin_count;
    struct ompi_datatype_t *origin_dt;
    ompi_osc_pt2pt_module_t* module;
-   int32_t outstanding_requests;
+   opal_atomic_int32_t outstanding_requests;
    bool internal;
 };
 typedef struct ompi_osc_pt2pt_request_t ompi_osc_pt2pt_request_t;
@@ -1,6 +1,6 @@
 /* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
 /*
- * Copyright (c) 2015-2016 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2015-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * $COPYRIGHT$
  *
@@ -74,7 +74,7 @@ struct ompi_osc_pt2pt_sync_t {
     int num_peers;

     /** number of synchronization messages expected */
-    volatile int32_t sync_expected;
+    opal_atomic_int32_t sync_expected;

     /** eager sends are active to all peers in this access epoch */
     volatile bool eager_send_active;
@@ -265,7 +265,7 @@ struct ompi_osc_rdma_module_t {
     unsigned long get_retry_count;

     /** outstanding atomic operations */
-    volatile int32_t pending_ops;
+    opal_atomic_int32_t pending_ops;
 };
 typedef struct ompi_osc_rdma_module_t ompi_osc_rdma_module_t;
 OMPI_MODULE_DECLSPEC extern ompi_osc_rdma_component_t mca_osc_rdma_component;
@@ -259,7 +259,7 @@ static int ompi_osc_rdma_post_peer (ompi_osc_rdma_module_t *module, ompi_osc_rdm
             return ret;
         }
     } else {
-        post_index = ompi_osc_rdma_counter_add ((osc_rdma_counter_t *) (intptr_t) target, 1) - 1;
+        post_index = ompi_osc_rdma_counter_add ((osc_rdma_atomic_counter_t *) (intptr_t) target, 1) - 1;
     }

     post_index &= OMPI_OSC_RDMA_POST_PEER_MAX - 1;
@@ -279,7 +279,7 @@ static int ompi_osc_rdma_post_peer (ompi_osc_rdma_module_t *module, ompi_osc_rdm
             return ret;
         }
     } else {
-        result = !ompi_osc_rdma_lock_compare_exchange ((osc_rdma_counter_t *) target, &_tmp_value,
+        result = !ompi_osc_rdma_lock_compare_exchange ((osc_rdma_atomic_counter_t *) target, &_tmp_value,
                                                        1 + (osc_rdma_counter_t) my_rank);
     }

@@ -491,7 +491,7 @@ int ompi_osc_rdma_complete_atomic (ompi_win_t *win)
         ret = ompi_osc_rdma_lock_btl_op (module, peer, target, MCA_BTL_ATOMIC_ADD, 1, true);
         assert (OMPI_SUCCESS == ret);
     } else {
-        (void) ompi_osc_rdma_counter_add ((osc_rdma_counter_t *) target, 1);
+        (void) ompi_osc_rdma_counter_add ((osc_rdma_atomic_counter_t *) target, 1);
     }
 }
Some files were not shown because too many files have changed in this diff.