Resolve merge conflicts
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
This commit is contained in: 9557fa087f
LICENSE (2 changed lines)
@@ -53,7 +53,7 @@ Copyright (c) 2014-2015 Hewlett-Packard Development Company, LP.  All
 rights reserved.
 Copyright (c) 2013-2017 Research Organization for Information Science (RIST).
 All rights reserved.
-Copyright (c) 2017      Amazon.com, Inc. or its affiliates.  All Rights
+Copyright (c) 2017-2018 Amazon.com, Inc. or its affiliates.  All Rights
 reserved.
 Copyright (c) 2018      DataDirect Networks. All rights reserved.

NEWS (47 changed lines)
@@ -80,6 +80,53 @@ Master (not on release branches yet)
   Currently, this means the Open SHMEM layer will only build if
   a MXM or UCX library is found.

+3.1.2 -- August, 2018
+------------------------
+
+- A subtle race condition bug was discovered in the "vader" BTL
+  (shared memory communications) that, in rare instances, can cause
+  MPI processes to crash or incorrectly classify (or effectively drop)
+  an MPI message sent via shared memory. If you are using the "ob1"
+  PML with "vader" for shared memory communication (note that vader is
+  the default for shared memory communication with ob1), you need to
+  upgrade to v3.1.2 or later to fix this issue. You may also upgrade
+  to the following versions to fix this issue:
+  - Open MPI v2.1.5 (expected end of August, 2018) or later in the
+    v2.1.x series
+  - Open MPI v3.0.1 (released March, 2018) or later in the v3.0.x
+    series
+- Assorted Portals 4.0 bug fixes.
+- Fix for possible data corruption in MPI_BSEND.
+- Move shared memory file for vader btl into /dev/shm on Linux.
+- Fix for MPI_ISCATTER/MPI_ISCATTERV Fortran interfaces with MPI_IN_PLACE.
+- Upgrade PMIx to v2.1.3.
+- Numerous One-sided bug fixes.
+- Fix for race condition in uGNI BTL.
+- Improve handling of large number of interfaces with TCP BTL.
+- Numerous UCX bug fixes.
+
+3.1.1 -- June, 2018
+-------------------
+
+- Fix potential hang in UCX PML during MPI_FINALIZE
+- Update internal PMIx to v2.1.2rc2 to fix forward version compatibility.
+- Add new MCA parameter osc_sm_backing_store to allow users to specify
+  where in the filesystem the backing file for the shared memory
+  one-sided component should live. Defaults to /dev/shm on Linux.
+- Fix potential hang on non-x86 platforms when using builds with
+  optimization flags turned off.
+- Disable osc/pt2pt when using MPI_THREAD_MULTIPLE due to numerous
+  race conditions in the component.
+- Fix dummy variable names for the mpi and mpi_f08 Fortran bindings to
+  match the MPI standard. This may break applications which use
+  name-based parameters in Fortran which used our internal names
+  rather than those documented in the MPI standard.
+- Revamp Java detection to properly handle new Java versions which do
+  not provide a javah wrapper.
+- Fix RMA function signatures for use-mpi-f08 bindings to have the
+  asynchronous property on all buffers.
+- Improved configure logic for finding the UCX library.
+
 3.1.0 -- May, 2018
 ------------------

README (26 changed lines)
@@ -8,7 +8,7 @@ Copyright (c) 2004-2008 High Performance Computing Center Stuttgart,
 University of Stuttgart. All rights reserved.
 Copyright (c) 2004-2007 The Regents of the University of California.
 All rights reserved.
-Copyright (c) 2006-2017 Cisco Systems, Inc. All rights reserved.
+Copyright (c) 2006-2018 Cisco Systems, Inc. All rights reserved.
 Copyright (c) 2006-2011 Mellanox Technologies. All rights reserved.
 Copyright (c) 2006-2012 Oracle and/or its affiliates. All rights reserved.
 Copyright (c) 2007      Myricom, Inc. All rights reserved.
@@ -605,7 +605,6 @@ Network Support
 - Loopback (send-to-self)
 - Shared memory
 - TCP
-- Intel Phi SCIF
 - SMCUDA
 - Cisco usNIC
 - uGNI (Cray Gemini, Aries)
@@ -768,6 +767,26 @@ Open MPI is unable to find relevant support for <foo>, configure will
 assume that it was unable to provide a feature that was specifically
 requested and will abort so that a human can resolve out the issue.

+Additionally, if a search directory is specified in the form
+--with-<foo>=<dir>, Open MPI will:
+
+1. Search for <foo>'s header files in <dir>/include.
+2. Search for <foo>'s library files:
+   2a. If --with-<foo>-libdir=<libdir> was specified, search in
+       <libdir>.
+   2b. Otherwise, search in <dir>/lib, and if they are not found
+       there, search again in <dir>/lib64.
+3. If both the relevant header files and libraries are found:
+   3a. Open MPI will build support for <foo>.
+   3b. If the root path where the <foo> libraries are found is neither
+       "/usr" nor "/usr/local", Open MPI will compile itself with
+       RPATH flags pointing to the directory where <foo>'s libraries
+       are located. Open MPI does not RPATH /usr/lib[64] and
+       /usr/local/lib[64] because many systems already search these
+       directories for run-time libraries by default; adding RPATH for
+       them could have unintended consequences for the search path
+       ordering.
+
 INSTALLATION OPTIONS

 --prefix=<directory>
@@ -1000,9 +1019,6 @@ NETWORKING SUPPORT / OPTIONS
   covers most cases. This option is only needed for special
   configurations.

---with-scif=<dir>
-  Look in directory for Intel SCIF support libraries
-
 --with-verbs=<directory>
   Specify the directory where the verbs (also known as OpenFabrics
   verbs, or Linux verbs, and previously known as OpenIB) libraries and
@@ -61,7 +61,7 @@ my $include_list;
 my $exclude_list;

 # Minimum versions
-my $ompi_automake_version = "1.12.2";
+my $ompi_automake_version = "1.13.4";
 my $ompi_autoconf_version = "2.69";
 my $ompi_libtool_version = "2.4.2";

@@ -1,7 +1,7 @@
 # -*- shell-script -*-
 #
 # Copyright (c) 2009-2017 Cisco Systems, Inc.  All rights reserved
-# Copyright (c) 2017      Research Organization for Information Science
+# Copyright (c) 2017-2018 Research Organization for Information Science
 #                         and Technology (RIST). All rights reserved.
 # Copyright (c) 2018      Los Alamos National Security, LLC. All rights
 #                         reserved.
@@ -38,6 +38,7 @@ AC_DEFUN([OMPI_CONFIG_FILES],[
     ompi/mpi/fortran/use-mpi-ignore-tkr/mpi-ignore-tkr-file-interfaces.h
     ompi/mpi/fortran/use-mpi-ignore-tkr/mpi-ignore-tkr-removed-interfaces.h
     ompi/mpi/fortran/use-mpi-f08/Makefile
     ompi/mpi/fortran/use-mpi-f08/bindings/Makefile
+    ompi/mpi/fortran/use-mpi-f08/mod/Makefile
     ompi/mpi/fortran/mpiext-use-mpi/Makefile
     ompi/mpi/fortran/mpiext-use-mpi-f08/Makefile
@@ -347,7 +347,8 @@ AC_DEFUN([OPAL_CHECK_PMIX],[
                       ], [])],
                      [AC_MSG_RESULT([found])
                       opal_external_pmix_version=4x
-                      opal_external_pmix_version_found=1],
+                      opal_external_pmix_version_found=1
+                      opal_external_pmix_happy=yes],
                      [AC_MSG_RESULT([not found])])])

     AS_IF([test "$opal_external_pmix_version_found" = "0"],
@@ -437,9 +438,11 @@ AC_DEFUN([OPAL_CHECK_PMIX],[
                        [Whether the external PMIx library is v1])
     AM_CONDITIONAL([OPAL_WANT_PRUN], [test "$opal_prun_happy" = "yes"])

-    AS_IF([test "$opal_external_pmix_version" = "1x"],
-          [OPAL_SUMMARY_ADD([[Miscellaneous]],[[PMIx support]], [opal_pmix], [1.2.x: WARNING - DYNAMIC OPS NOT SUPPORTED])],
-          [OPAL_SUMMARY_ADD([[Miscellaneous]],[[PMIx support]], [opal_pmix], [$opal_external_pmix_version])])
+    AS_IF([test "$opal_external_pmix_happy" = "yes"],
+          [AS_IF([test "$opal_external_pmix_version" = "1x"],
+                 [OPAL_SUMMARY_ADD([[Miscellaneous]],[[PMIx support]], [opal_pmix], [External (1.2.5) WARNING - DYNAMIC OPS NOT SUPPORTED])],
+                 [OPAL_SUMMARY_ADD([[Miscellaneous]],[[PMIx support]], [opal_pmix], [External ($opal_external_pmix_version)])])],
+          [OPAL_SUMMARY_ADD([[Miscellaneous]], [[PMIx support]], [opal_pmix], [Internal])])

     OPAL_VAR_SCOPE_POP
 ])
@@ -13,7 +13,7 @@ dnl Copyright (c) 2008-2018 Cisco Systems, Inc.  All rights reserved.
 dnl Copyright (c) 2010      Oracle and/or its affiliates.  All rights reserved.
 dnl Copyright (c) 2015-2017 Research Organization for Information Science
 dnl                         and Technology (RIST). All rights reserved.
-dnl Copyright (c) 2014-2017 Los Alamos National Security, LLC.  All rights
+dnl Copyright (c) 2014-2018 Los Alamos National Security, LLC.  All rights
 dnl                         reserved.
 dnl Copyright (c) 2017      Amazon.com, Inc. or its affiliates.  All Rights
 dnl                         reserved.
@@ -122,6 +122,57 @@ int main(int argc, char** argv)
 }
 ]])

+dnl This is a C test to see if 128-bit __atomic_compare_exchange_n()
+dnl actually works (e.g., it compiles and links successfully on
+dnl ARM64+clang, but returns incorrect answers as of August 2018).
+AC_DEFUN([OPAL_ATOMIC_COMPARE_EXCHANGE_STRONG_TEST_SOURCE],[[
+#include <stdint.h>
+#include <stdbool.h>
+#include <stdlib.h>
+#include <stdatomic.h>
+
+typedef union {
+    uint64_t fake@<:@2@:>@;
+    _Atomic __int128 real;
+} ompi128;
+
+static void test1(void)
+{
+    // As of Aug 2018, we could not figure out a way to assign 128-bit
+    // constants -- the compilers would not accept it.  So use a fake
+    // union to assign 2 uint64_t's to make a single __int128.
+    ompi128 ptr      = { .fake = { 0xFFEEDDCCBBAA0099, 0x8877665544332211 }};
+    ompi128 expected = { .fake = { 0x11EEDDCCBBAA0099, 0x88776655443322FF }};
+    ompi128 desired  = { .fake = { 0x1122DDCCBBAA0099, 0x887766554433EEFF }};
+    bool r = atomic_compare_exchange_strong (&ptr.real, &expected.real,
+                                             desired.real, true,
+                                             atomic_relaxed, atomic_relaxed);
+    if ( !(r == false && ptr.real == expected.real)) {
+        exit(1);
+    }
+}
+
+static void test2(void)
+{
+    ompi128 ptr      = { .fake = { 0xFFEEDDCCBBAA0099, 0x8877665544332211 }};
+    ompi128 expected = ptr;
+    ompi128 desired  = { .fake = { 0x1122DDCCBBAA0099, 0x887766554433EEFF }};
+    bool r = atomic_compare_exchange_strong (&ptr.real, &expected.real,
+                                             desired.real, true,
+                                             atomic_relaxed, atomic_relaxed);
+    if (!(r == true && ptr.real == desired.real)) {
+        exit(2);
+    }
+}
+
+int main(int argc, char** argv)
+{
+    test1();
+    test2();
+    return 0;
+}
+]])
+
 dnl ------------------------------------------------------------------

 dnl
@@ -329,6 +380,71 @@ __atomic_add_fetch(&tmp64, 1, __ATOMIC_RELAXED);],
     OPAL_CHECK_GCC_BUILTIN_CSWAP_INT128
 ])

+AC_DEFUN([OPAL_CHECK_C11_CSWAP_INT128], [
+    OPAL_VAR_SCOPE_PUSH([atomic_compare_exchange_result atomic_compare_exchange_CFLAGS_save atomic_compare_exchange_LIBS_save])
+
+    atomic_compare_exchange_CFLAGS_save=$CFLAGS
+    atomic_compare_exchange_LIBS_save=$LIBS
+
+    # Do we have C11 atomics on 128-bit integers?
+    # Use a special macro because we need to check with a few different
+    # CFLAGS/LIBS.
+    OPAL_ASM_CHECK_ATOMIC_FUNC([atomic_compare_exchange_strong_16],
+                               [AC_LANG_SOURCE(OPAL_ATOMIC_COMPARE_EXCHANGE_STRONG_TEST_SOURCE)],
+                               [atomic_compare_exchange_result=1],
+                               [atomic_compare_exchange_result=0])
+
+    # If we have it and it works, check to make sure it is always lock
+    # free.
+    AS_IF([test $atomic_compare_exchange_result -eq 1],
+          [AC_MSG_CHECKING([if C11 __int128 atomic compare-and-swap is always lock-free])
+           AC_RUN_IFELSE([AC_LANG_PROGRAM([#include <stdatomic.h>], [_Atomic __int128_t x; if (!atomic_is_lock_free(&x)) { return 1; }])],
+                         [AC_MSG_RESULT([yes])],
+                         [atomic_compare_exchange_result=0
+                          # If this test fails, need to reset CFLAGS/LIBS (the
+                          # above tests atomically set CFLAGS/LIBS or not; this
+                          # test is running after the fact, so we have to undo
+                          # the side-effects of setting CFLAGS/LIBS if the above
+                          # tests passed).
+                          CFLAGS=$atomic_compare_exchange_CFLAGS_save
+                          LIBS=$atomic_compare_exchange_LIBS_save
+                          AC_MSG_RESULT([no])],
+                         [AC_MSG_RESULT([cannot test -- assume yes (cross compiling)])])
+          ])
+
+    AC_DEFINE_UNQUOTED([OPAL_HAVE_C11_CSWAP_INT128],
+                       [$atomic_compare_exchange_result],
+                       [Whether C11 atomic compare swap is both supported and lock-free on 128-bit values])
+
+    dnl If we could not find decent support for 128-bits atomic let's
+    dnl try the GCC _sync
+    AS_IF([test $atomic_compare_exchange_result -eq 0],
+          [OPAL_CHECK_SYNC_BUILTIN_CSWAP_INT128])
+
+    OPAL_VAR_SCOPE_POP
+])
+
+AC_DEFUN([OPAL_CHECK_GCC_ATOMIC_BUILTINS], [
+    AC_MSG_CHECKING([for __atomic builtin atomics])
+
+    AC_TRY_LINK([
+#include <stdint.h>
+uint32_t tmp, old = 0;
+uint64_t tmp64, old64 = 0;], [
+__atomic_thread_fence(__ATOMIC_SEQ_CST);
+__atomic_compare_exchange_n(&tmp, &old, 1, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
+__atomic_add_fetch(&tmp, 1, __ATOMIC_RELAXED);
+__atomic_compare_exchange_n(&tmp64, &old64, 1, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
+__atomic_add_fetch(&tmp64, 1, __ATOMIC_RELAXED);],
+        [AC_MSG_RESULT([yes])
+         $1],
+        [AC_MSG_RESULT([no])
+         $2])
+
+    # Check for 128-bit support
+    OPAL_CHECK_GCC_BUILTIN_CSWAP_INT128
+])
+

 dnl #################################################################
 dnl
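For orientation, the pair of checks above amounts to the following standalone C probe: 128-bit compare-and-swap must both exist and be lock-free before Open MPI will rely on it. This is an illustrative sketch, not code from the tree; it assumes a GCC/Clang 64-bit target (the __int128 type is a compiler extension), and on some toolchains it only links with -latomic, in which case atomic_is_lock_free() will typically report 0:

    #include <stdatomic.h>
    #include <stdio.h>

    int main(void)
    {
        /* Mirrors the configure test: a CAS on a 128-bit atomic ... */
        _Atomic __int128 x = 0;
        __int128 expected = 0;
        int swapped = atomic_compare_exchange_strong(&x, &expected, (__int128) 1);

        /* ... plus the requirement that it is implemented without a lock. */
        int lock_free = atomic_is_lock_free(&x);

        printf("cas worked: %d, lock-free: %d\n", swapped, lock_free);
        return lock_free ? 0 : 1;   /* non-zero exit = fall back to __sync */
    }
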
@@ -1020,17 +1136,27 @@ AC_DEFUN([OPAL_CONFIG_ASM],[
     AC_REQUIRE([OPAL_SETUP_CC])
     AC_REQUIRE([AM_PROG_AS])

+    AC_ARG_ENABLE([c11-atomics],[AC_HELP_STRING([--enable-c11-atomics],
+                  [Enable use of C11 atomics if available (default: enabled)])])
+
     AC_ARG_ENABLE([builtin-atomics],
                   [AC_HELP_STRING([--enable-builtin-atomics],
-                  [Enable use of __sync builtin atomics (default: enabled)])])
+                  [Enable use of __sync builtin atomics (default: disabled)])])

-    opal_cv_asm_builtin="BUILTIN_NO"
-    AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" != "no"],
-          [OPAL_CHECK_GCC_ATOMIC_BUILTINS([opal_cv_asm_builtin="BUILTIN_GCC"], [])])
-    AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" != "no"],
-          [OPAL_CHECK_SYNC_BUILTINS([opal_cv_asm_builtin="BUILTIN_SYNC"], [])])
-    AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" = "yes"],
-          [AC_MSG_ERROR([__sync builtin atomics requested but not found.])])
-    OPAL_CHECK_C11_CSWAP_INT128
+    if test "x$enable_c11_atomics" != "xno" && test "$opal_cv_c11_supported" = "yes" ; then
+        opal_cv_asm_builtin="BUILTIN_C11"
+        OPAL_CHECK_C11_CSWAP_INT128
+    else
+        opal_cv_asm_builtin="BUILTIN_NO"
+        AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" = "yes"],
+              [OPAL_CHECK_GCC_ATOMIC_BUILTINS([opal_cv_asm_builtin="BUILTIN_GCC"], [])])
+        AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" = "yes"],
+              [OPAL_CHECK_SYNC_BUILTINS([opal_cv_asm_builtin="BUILTIN_SYNC"], [])])
+        AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" = "yes"],
+              [AC_MSG_ERROR([__sync builtin atomics requested but not found.])])
+    fi

     OPAL_CHECK_ASM_PROC
     OPAL_CHECK_ASM_TEXT
@@ -10,7 +10,7 @@ dnl Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
 dnl                         University of Stuttgart.  All rights reserved.
 dnl Copyright (c) 2004-2005 The Regents of the University of California.
 dnl                         All rights reserved.
-dnl Copyright (c) 2014-2015 Intel, Inc. All rights reserved.
+dnl Copyright (c) 2014-2018 Intel, Inc. All rights reserved.
 dnl Copyright (c) 2015      Cisco Systems, Inc.  All rights reserved.
 dnl $COPYRIGHT$
 dnl
@@ -60,6 +60,8 @@ do
             ;;
         -with-platform=* | --with-platform=*)
             ;;
+        -with*=internal)
+            ;;
         *)
             case $subdir_arg in
             *\'*) subdir_arg=`echo "$subdir_arg" | sed "s/'/'\\\\\\\\''/g"` ;;
@@ -100,7 +100,7 @@ OPAL_VAR_SCOPE_POP
 #
 # Init automake
 #
-AM_INIT_AUTOMAKE([foreign dist-bzip2 subdir-objects no-define 1.12.2 tar-ustar])
+AM_INIT_AUTOMAKE([foreign dist-bzip2 subdir-objects no-define 1.13.4 tar-ustar])

 # SILENT_RULES is new in AM 1.11, but we require 1.11 or higher via
 # autogen.  Limited testing shows that calling SILENT_RULES directly
@@ -858,7 +858,7 @@ OPAL_SEARCH_LIBS_CORE([ceil], [m])
 # -lrt might be needed for clock_gettime
 OPAL_SEARCH_LIBS_CORE([clock_gettime], [rt])

-AC_CHECK_FUNCS([asprintf snprintf vasprintf vsnprintf openpty isatty getpwuid fork waitpid execve pipe ptsname setsid mmap tcgetpgrp posix_memalign strsignal sysconf syslog vsyslog regcmp regexec regfree _NSGetEnviron socketpair strncpy_s usleep mkfifo dbopen dbm_open statfs statvfs setpgid setenv __malloc_initialize_hook __clear_cache])
+AC_CHECK_FUNCS([asprintf snprintf vasprintf vsnprintf openpty isatty getpwuid fork waitpid execve pipe ptsname setsid mmap tcgetpgrp posix_memalign strsignal sysconf syslog vsyslog regcmp regexec regfree _NSGetEnviron socketpair usleep mkfifo dbopen dbm_open statfs statvfs setpgid setenv __malloc_initialize_hook __clear_cache])

 # Sanity check: ensure that we got at least one of statfs or statvfs.
 if test $ac_cv_func_statfs = no && test $ac_cv_func_statvfs = no; then
@@ -88,12 +88,8 @@ EXTRA_DIST = \
 	platform/lanl/darwin/mic-common \
 	platform/lanl/darwin/debug \
 	platform/lanl/darwin/debug.conf \
-	platform/lanl/darwin/debug-mic \
-	platform/lanl/darwin/debug-mic.conf \
 	platform/lanl/darwin/optimized \
 	platform/lanl/darwin/optimized.conf \
-	platform/lanl/darwin/optimized-mic \
-	platform/lanl/darwin/optimized-mic.conf \
 	platform/snl/portals4-m5 \
 	platform/snl/portals4-orte \
 	platform/ibm/debug-ppc32-gcc \
@@ -10,7 +10,7 @@

 m4=1.4.16
 ac=2.69
-am=1.12.2
+am=1.13.4
 lt=2.4.2
 flex=2.5.35

contrib/dist/linux/buildrpm.sh (1 changed line, vendored)
@@ -267,7 +267,6 @@ fi
 # Find where the top RPM-building directory is
 #

-rpmtopdir=
 file=~/.rpmmacros
 if test -r $file; then
     rpmtopdir=${rpmtopdir:-"`grep %_topdir $file | awk '{ print $2 }'`"}
@@ -1,100 +0,0 @@
-#
-# Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
-#                         University Research and Technology
-#                         Corporation.  All rights reserved.
-# Copyright (c) 2004-2005 The University of Tennessee and The University
-#                         of Tennessee Research Foundation.  All rights
-#                         reserved.
-# Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
-#                         University of Stuttgart.  All rights reserved.
-# Copyright (c) 2004-2005 The Regents of the University of California.
-#                         All rights reserved.
-# Copyright (c) 2006      Cisco Systems, Inc.  All rights reserved.
-# Copyright (c) 2011-2013 Los Alamos National Security, LLC.
-#                         All rights reserved.
-# $COPYRIGHT$
-#
-# Additional copyrights may follow
-#
-# $HEADER$
-#
-
-# This is the default system-wide MCA parameters defaults file.
-# Specifically, the MCA parameter "mca_param_files" defaults to a
-# value of
-# "$HOME/.openmpi/mca-params.conf:$sysconf/openmpi-mca-params.conf"
-# (this file is the latter of the two).  So if the default value of
-# mca_param_files is not changed, this file is used to set system-wide
-# MCA parameters.  This file can therefore be used to set system-wide
-# default MCA parameters for all users.  Of course, users can override
-# these values if they want, but this file is an excellent location
-# for setting system-specific MCA parameters for those users who don't
-# know / care enough to investigate the proper values for them.
-
-# Note that this file is only applicable where it is visible (in a
-# filesystem sense).  Specifically, MPI processes each read this file
-# during their startup to determine what default values for MCA
-# parameters should be used.  mpirun does not bundle up the values in
-# this file from the node where it was run and send them to all nodes;
-# the default value decisions are effectively distributed.  Hence,
-# these values are only applicable on nodes that "see" this file.  If
-# $sysconf is a directory on a local disk, it is likely that changes
-# to this file will need to be propagated to other nodes.  If $sysconf
-# is a directory that is shared via a networked filesystem, changes to
-# this file will be visible to all nodes that share this $sysconf.
-
-# The format is straightforward: one per line, mca_param_name =
-# rvalue.  Quoting is ignored (so if you use quotes or escape
-# characters, they'll be included as part of the value).  For example:
-
-# Disable run-time MPI parameter checking
-# mpi_param_check = 0
-
-# Note that the value "~/" will be expanded to the current user's home
-# directory.  For example:
-
-# Change component loading path
-# component_path = /usr/local/lib/openmpi:~/my_openmpi_components
-
-# See "ompi_info --param all all" for a full listing of Open MPI MCA
-# parameters available and their default values.
-#
-
-# Basic behavior to smooth startup
-mca_base_component_show_load_errors = 0
-opal_set_max_sys_limits = 1
-orte_report_launch_progress = 1
-
-# Define timeout for daemons to report back during launch
-orte_startup_timeout = 10000
-
-## Protect the shared file systems
-orte_no_session_dirs = /panfs,/scratch,/users,/usr/projects
-orte_tmpdir_base = /tmp
-
-## Require an allocation to run - protects the frontend
-## from inadvertent job executions
-orte_allocation_required = 1
-
-## Add the interface for out-of-band communication
-## and set it up
-oob_tcp_if_include=mic0
-oob_tcp_peer_retries = 1000
-oob_tcp_sndbuf = 32768
-oob_tcp_rcvbuf = 32768
-
-## Define the MPI interconnects
-btl = sm,scif,openib,self
-
-## Setup OpenIB - just in case
-btl_openib_want_fork_support = 0
-btl_openib_receive_queues = S,4096,1024:S,12288,512:S,65536,512
-
-## Enable cpu affinity
-hwloc_base_binding_policy = core
-
-## Setup MPI options
-mpi_show_handle_leaks = 1
-mpi_warn_on_fork = 1
-#mpi_abort_print_stack = 1
@@ -10,7 +10,7 @@
 # Copyright (c) 2004-2005 The Regents of the University of California.
 #                         All rights reserved.
 # Copyright (c) 2006      Cisco Systems, Inc.  All rights reserved.
-# Copyright (c) 2011-2013 Los Alamos National Security, LLC.
+# Copyright (c) 2011-2018 Los Alamos National Security, LLC.
 #                         All rights reserved.
 # $COPYRIGHT$
 #
@@ -84,7 +84,7 @@ oob_tcp_sndbuf = 32768
 oob_tcp_rcvbuf = 32768

 ## Define the MPI interconnects
-btl = sm,scif,openib,self
+btl = sm,openib,self

 ## Setup OpenIB - just in case
 btl_openib_want_fork_support = 0
@@ -1,100 +0,0 @@
-#
-# Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
-#                         University Research and Technology
-#                         Corporation.  All rights reserved.
-# Copyright (c) 2004-2005 The University of Tennessee and The University
-#                         of Tennessee Research Foundation.  All rights
-#                         reserved.
-# Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
-#                         University of Stuttgart.  All rights reserved.
-# Copyright (c) 2004-2005 The Regents of the University of California.
-#                         All rights reserved.
-# Copyright (c) 2006      Cisco Systems, Inc.  All rights reserved.
-# Copyright (c) 2011-2013 Los Alamos National Security, LLC. All rights
-#                         reserved.
-# $COPYRIGHT$
-#
-# Additional copyrights may follow
-#
-# $HEADER$
-#
-
-# This is the default system-wide MCA parameters defaults file.
-# Specifically, the MCA parameter "mca_param_files" defaults to a
-# value of
-# "$HOME/.openmpi/mca-params.conf:$sysconf/openmpi-mca-params.conf"
-# (this file is the latter of the two).  So if the default value of
-# mca_param_files is not changed, this file is used to set system-wide
-# MCA parameters.  This file can therefore be used to set system-wide
-# default MCA parameters for all users.  Of course, users can override
-# these values if they want, but this file is an excellent location
-# for setting system-specific MCA parameters for those users who don't
-# know / care enough to investigate the proper values for them.
-
-# Note that this file is only applicable where it is visible (in a
-# filesystem sense).  Specifically, MPI processes each read this file
-# during their startup to determine what default values for MCA
-# parameters should be used.  mpirun does not bundle up the values in
-# this file from the node where it was run and send them to all nodes;
-# the default value decisions are effectively distributed.  Hence,
-# these values are only applicable on nodes that "see" this file.  If
-# $sysconf is a directory on a local disk, it is likely that changes
-# to this file will need to be propagated to other nodes.  If $sysconf
-# is a directory that is shared via a networked filesystem, changes to
-# this file will be visible to all nodes that share this $sysconf.
-
-# The format is straightforward: one per line, mca_param_name =
-# rvalue.  Quoting is ignored (so if you use quotes or escape
-# characters, they'll be included as part of the value).  For example:
-
-# Disable run-time MPI parameter checking
-# mpi_param_check = 0
-
-# Note that the value "~/" will be expanded to the current user's home
-# directory.  For example:
-
-# Change component loading path
-# component_path = /usr/local/lib/openmpi:~/my_openmpi_components
-
-# See "ompi_info --param all all" for a full listing of Open MPI MCA
-# parameters available and their default values.
-#
-
-# Basic behavior to smooth startup
-mca_base_component_show_load_errors = 0
-opal_set_max_sys_limits = 1
-orte_report_launch_progress = 1
-
-# Define timeout for daemons to report back during launch
-orte_startup_timeout = 10000
-
-## Protect the shared file systems
-orte_no_session_dirs = /panfs,/scratch,/users,/usr/projects
-orte_tmpdir_base = /tmp
-
-## Require an allocation to run - protects the frontend
-## from inadvertent job executions
-orte_allocation_required = 1
-
-## Add the interface for out-of-band communication
-## and set it up
-oob_tcp_if_include = mic0
-oob_tcp_peer_retries = 1000
-oob_tcp_sndbuf = 32768
-oob_tcp_rcvbuf = 32768
-
-## Define the MPI interconnects
-btl = sm,scif,openib,self
-
-## Setup OpenIB - just in case
-btl_openib_want_fork_support = 0
-btl_openib_receive_queues = S,4096,1024:S,12288,512:S,65536,512
-
-## Enable cpu affinity
-hwloc_base_binding_policy = core
-
-## Setup MPI options
-mpi_show_handle_leaks = 0
-mpi_warn_on_fork = 1
-#mpi_abort_print_stack = 0
@@ -10,7 +10,7 @@
 # Copyright (c) 2004-2005 The Regents of the University of California.
 #                         All rights reserved.
 # Copyright (c) 2006      Cisco Systems, Inc.  All rights reserved.
-# Copyright (c) 2011-2013 Los Alamos National Security, LLC. All rights
+# Copyright (c) 2011-2018 Los Alamos National Security, LLC. All rights
 #                         reserved.
 # $COPYRIGHT$
 #
@@ -84,7 +84,7 @@ oob_tcp_sndbuf = 32768
 oob_tcp_rcvbuf = 32768

 ## Define the MPI interconnects
-btl = sm,scif,openib,self
+btl = sm,openib,self

 ## Setup OpenIB - just in case
 btl_openib_want_fork_support = 0
@@ -23,26 +23,11 @@ if [ "$mellanox_autodetect" == "yes" ]; then
         with_ucx=$ucx_dir
     fi

-    mxm_dir=${mxm_dir:="$(pkg-config --variable=prefix mxm)"}
-    if [ -d $mxm_dir ]; then
-        with_mxm=$mxm_dir
-    fi
-
-    fca_dir=${fca_dir:="$(pkg-config --variable=prefix fca)"}
-    if [ -d $fca_dir ]; then
-        with_fca=$fca_dir
-    fi
-
     hcoll_dir=${hcoll_dir:="$(pkg-config --variable=prefix hcoll)"}
     if [ -d $hcoll_dir ]; then
         with_hcoll=$hcoll_dir
     fi

-    knem_dir=${knem_dir:="$(pkg-config --variable=prefix knem)"}
-    if [ -d $knem_dir ]; then
-        with_knem=$knem_dir
-    fi
-
     slurm_dir=${slurm_dir:="/usr"}
     if [ -f $slurm_dir/include/slurm/slurm.h ]; then
         with_slurm=$slurm_dir
@@ -56,12 +56,10 @@

 # See "ompi_info --param all all" for a full listing of Open MPI MCA
 # parameters available and their default values.
-coll_fca_enable = 0
-scoll_fca_enable = 0
 #rmaps_base_mapping_policy = dist:auto
 coll = ^ml
 hwloc_base_binding_policy = core
-btl = vader,openib,self
+btl = self
 # Basic behavior to smooth startup
 mca_base_component_show_load_errors = 0
 orte_abort_timeout = 10
@@ -77,3 +75,6 @@ oob_tcp_sndbuf = 32768
 oob_tcp_rcvbuf = 32768

 opal_event_include=epoll
+
+bml_r2_show_unreach_errors = 0
+
@@ -15,7 +15,7 @@
 # Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
 #                         reserved.
 # Copyright (c) 2015-2017 Intel, Inc. All rights reserved.
-# Copyright (c) 2015-2017 Research Organization for Information Science
+# Copyright (c) 2015-2018 Research Organization for Information Science
 #                         and Technology (RIST). All rights reserved.
 # Copyright (c) 2016      IBM Corporation. All rights reserved.
 # Copyright (c) 2018      FUJITSU LIMITED. All rights reserved.
@@ -93,6 +93,7 @@ SUBDIRS = \
 	$(OMPI_FORTRAN_USEMPI_DIR) \
 	mpi/fortran/mpiext-use-mpi \
+	mpi/fortran/use-mpi-f08/mod \
 	mpi/fortran/use-mpi-f08/bindings \
 	$(OMPI_MPIEXT_USEMPIF08_DIRS) \
 	mpi/fortran/use-mpi-f08 \
 	mpi/fortran/mpiext-use-mpi-f08 \
@@ -124,6 +125,7 @@ DIST_SUBDIRS = \
 	mpi/fortran/mpiext-use-mpi \
 	mpi/fortran/use-mpi-f08 \
+	mpi/fortran/use-mpi-f08/mod \
 	mpi/fortran/use-mpi-f08/bindings \
 	mpi/fortran/mpiext-use-mpi-f08 \
 	mpi/java \
 	$(OMPI_MPIEXT_ALL_SUBDIRS) \
@@ -1,6 +1,6 @@
 /* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
 /*
- * Copyright (c) 2013-2016 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2013-2018 Los Alamos National Security, LLC. All rights
  *                         reseved.
  * Copyright (c) 2015      Research Organization for Information Science
  *                         and Technology (RIST). All rights reserved.
@@ -99,7 +99,7 @@ int ompi_comm_request_schedule_append (ompi_comm_request_t *request, ompi_comm_r
 static int ompi_comm_request_progress (void)
 {
     ompi_comm_request_t *request, *next;
-    static int32_t progressing = 0;
+    static opal_atomic_int32_t progressing = 0;

     /* don't allow re-entry */
    if (opal_atomic_swap_32 (&progressing, 1)) {
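The guard above is a swap-based re-entry lock: the flag becomes an explicitly atomic type so the swap is well-defined under concurrent progress calls. A minimal C11 sketch of the same pattern (an illustration; opal_atomic_swap_32 is Open MPI's wrapper, replaced here by the standard atomic_exchange):

    #include <stdatomic.h>

    static atomic_int progressing;   /* zero-initialized, like the static flag above */

    int progress(void)
    {
        /* atomic_exchange returns the previous value: exactly one caller
           observes 0 and enters; recursive or concurrent callers see 1
           and back off immediately. */
        if (atomic_exchange(&progressing, 1)) {
            return 0;                   /* someone is already progressing */
        }
        /* ... walk and complete pending requests ... */
        atomic_store(&progressing, 0);  /* reopen the gate */
        return 1;
    }
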
@@ -75,7 +75,7 @@ struct ompi_datatype_t {
     struct opal_hash_table_t *d_keyhash;    /**< Attribute fields */

     void*                args;                 /**< Data description for the user */
-    void*                packed_description;   /**< Packed description of the datatype */
+    opal_atomic_intptr_t packed_description;   /**< Packed description of the datatype */
     uint64_t             pml_data;             /**< PML-specific information */
     /* --- cacheline 6 boundary (384 bytes) --- */
     char                 name[MPI_MAX_OBJECT_NAME];  /**< Externally visible name */
@@ -45,7 +45,7 @@ __ompi_datatype_create_from_args( int32_t* i, ptrdiff_t * a,
                                   ompi_datatype_t** d, int32_t type );

 typedef struct __dt_args {
-    int32_t             ref_count;
+    opal_atomic_int32_t ref_count;
     int32_t             create_type;
     size_t              total_pack_size;
     int32_t             ci;
@@ -104,7 +104,7 @@ typedef struct __dt_args {
         pArgs->total_pack_size = (4 + (IC) + (DC)) * sizeof(int) + \
             (AC) * sizeof(ptrdiff_t);                              \
         (PDATA)->args = (void*)pArgs;                              \
-        (PDATA)->packed_description = NULL;                        \
+        (PDATA)->packed_description = 0;                           \
     } while(0)

@@ -483,12 +483,12 @@ int ompi_datatype_get_pack_description( ompi_datatype_t* datatype,
 {
     ompi_datatype_args_t* args = (ompi_datatype_args_t*)datatype->args;
     int next_index = OMPI_DATATYPE_MAX_PREDEFINED;
-    void *packed_description = datatype->packed_description;
+    void *packed_description = (void *) datatype->packed_description;
     void* recursive_buffer;

     if (NULL == packed_description) {
         void *_tmp_ptr = NULL;
-        if (opal_atomic_compare_exchange_strong_ptr (&datatype->packed_description, (void *) &_tmp_ptr, (void *) 1)) {
+        if (opal_atomic_compare_exchange_strong_ptr (&datatype->packed_description, (intptr_t *) &_tmp_ptr, 1)) {
             if( ompi_datatype_is_predefined(datatype) ) {
                 packed_description = malloc(2 * sizeof(int));
             } else if( NULL == args ) {
@@ -510,10 +510,10 @@ int ompi_datatype_get_pack_description( ompi_datatype_t* datatype,
             }

             opal_atomic_wmb ();
-            datatype->packed_description = packed_description;
+            datatype->packed_description = (intptr_t) packed_description;
         } else {
             /* another thread beat us to it */
-            packed_description = datatype->packed_description;
+            packed_description = (void *) datatype->packed_description;
         }
     }

@@ -521,11 +521,11 @@ int ompi_datatype_get_pack_description( ompi_datatype_t* datatype,
         struct timespec interval = {.tv_sec = 0, .tv_nsec = 1000};

         /* wait until the packed description is updated */
-        while ((void *) 1 == datatype->packed_description) {
+        while (1 == datatype->packed_description) {
             nanosleep (&interval, NULL);
         }

-        packed_description = datatype->packed_description;
+        packed_description = (void *) datatype->packed_description;
     }

     *packed_buffer = (const void *) packed_description;
@@ -534,7 +534,7 @@ int ompi_datatype_get_pack_description( ompi_datatype_t* datatype,

 size_t ompi_datatype_pack_description_length( ompi_datatype_t* datatype )
 {
-    void *packed_description = datatype->packed_description;
+    void *packed_description = (void *) datatype->packed_description;

     if( ompi_datatype_is_predefined(datatype) ) {
         return 2 * sizeof(int);
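The change above is why packed_description becomes an atomic intptr_t rather than a plain pointer: the field is used as a three-state latch (0 = uninitialized, 1 = someone is building it, otherwise = published pointer), and 0/1 are only valid states for an integer, not a pointer type. A reduced C11 sketch of the pattern (names hypothetical; Open MPI uses its opal_atomic_* wrappers and sleeps between spins):

    #include <stdatomic.h>
    #include <stdint.h>
    #include <stdlib.h>

    /* Hypothetical expensive one-time construction step. */
    static void *build_description(void) { return malloc(64); }

    void *get_description(_Atomic intptr_t *slot)
    {
        intptr_t val = atomic_load(slot);
        if (0 == val) {
            intptr_t expected = 0;
            /* CAS 0 -> 1: the winner claims the right to build. */
            if (atomic_compare_exchange_strong(slot, &expected, (intptr_t) 1)) {
                void *desc = build_description();
                atomic_store(slot, (intptr_t) desc);   /* publish real pointer */
                return desc;
            }
        }
        /* Losers spin past the sentinel until the winner publishes. */
        while (1 == atomic_load(slot)) { /* nanosleep in the real code */ }
        return (void *) atomic_load(slot);
    }
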
@@ -36,7 +36,7 @@ static void __ompi_datatype_allocate( ompi_datatype_t* datatype )
     datatype->id = -1;
     datatype->d_keyhash = NULL;
     datatype->name[0] = '\0';
-    datatype->packed_description = NULL;
+    datatype->packed_description = 0;
     datatype->pml_data = 0;
 }

@@ -46,10 +46,10 @@ static void __ompi_datatype_release(ompi_datatype_t * datatype)
         ompi_datatype_release_args( datatype );
         datatype->args = NULL;
     }
-    if( NULL != datatype->packed_description ) {
-        free( datatype->packed_description );
-        datatype->packed_description = NULL;
-    }
+
+    free ((void *) datatype->packed_description );
+    datatype->packed_description = 0;

     if( datatype->d_f_to_c_index >= 0 ) {
         opal_pointer_array_set_item( &ompi_datatype_f_to_c_table, datatype->d_f_to_c_index, NULL );
         datatype->d_f_to_c_index = -1;
@@ -406,7 +406,7 @@ extern const ompi_datatype_t* ompi_datatype_basicDatatypes[OMPI_DATATYPE_MPI_MAX
     .d_f_to_c_index = -1,                   \
     .d_keyhash = NULL,                      \
     .args = NULL,                           \
-    .packed_description = NULL,             \
+    .packed_description = 0,                \
     .name = "MPI_" # NAME

 #define OMPI_DATATYPE_INITIALIZER_UNAVAILABLE(FLAGS) \
@@ -383,7 +383,7 @@ opal_pointer_array_t ompi_datatype_f_to_c_table = {{0}};
     (PDST)->super.desc = (PSRC)->super.desc;                 \
     (PDST)->super.opt_desc = (PSRC)->super.opt_desc;         \
     (PDST)->packed_description = (PSRC)->packed_description; \
-    (PSRC)->packed_description = NULL;                       \
+    (PSRC)->packed_description = 0;                          \
     /* transfer the ptypes */                                \
     (PDST)->super.ptypes = (PSRC)->super.ptypes;             \
     (PSRC)->super.ptypes = NULL;                             \
@@ -1,6 +1,6 @@
 /* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
 /*
- * Copyright (c) 2007-2016 Cisco Systems, Inc.  All rights reserved.
+ * Copyright (c) 2007-2018 Cisco Systems, Inc.  All rights reserved.
  * Copyright (c) 2004-2010 The University of Tennessee and The University
  *                         of Tennessee Research Foundation.  All rights
  *                         reserved.
@@ -1157,8 +1157,18 @@ static int fetch_request( mqs_process *proc, mpi_process_info *p_info,
             mqs_fetch_data( proc, ompi_datatype + i_info->ompi_datatype_t.offset.name,
                             64, data_name );
             if( '\0' != data_name[0] ) {
-                snprintf( (char*)res->extra_text[1], 64, "Data: %d * %s",
-                          (int)res->desired_length, data_name );
+                // res->extra_text[x] is only 64 chars long -- same as
+                // data_name.  If you try to snprintf it into
+                // res->extra_text with additional text, some compilers
+                // will warn that we might truncate the string (because it
+                // can see the static char array lengths).  So just put
+                // data_name in res->extra_text[2] (vs. extra_text[1]),
+                // where it is guaranteed to fit.
+                data_name[4] = '\0';
+                snprintf( (char*)res->extra_text[1], 64, "Data: %d",
+                          (int)res->desired_length);
+                snprintf( (char*)res->extra_text[2], 64, "%s",
+                          data_name );
             }
             /* And now compute the real length as specified by the user */
             res->desired_length *=
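The warning this hunk works around comes from compilers (e.g., GCC 7+ with -Wformat-truncation) that can see both array bounds at compile time. A condensed C illustration of the problem shape, with hypothetical buffers that mirror the debugger code:

    #include <stdio.h>

    void show(int len, char name[64])
    {
        char text[64];

        /* Warns: "Data: %d * %s" may need up to 64 + prefix bytes,
           which cannot fit in a 64-byte destination:
        snprintf(text, sizeof text, "Data: %d * %s", len, name);  */

        /* Splitting the output into two buffers removes any possible
           truncation, so the compiler stays quiet: */
        snprintf(text, sizeof text, "Data: %d", len);
        (void) text;
    }
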
@@ -202,7 +202,7 @@ ompi_errhandler_t *ompi_errhandler_create(ompi_errhandler_type_t object_type,
       new_errhandler->eh_comm_fn = (MPI_Comm_errhandler_function *)func;
       break;
   case (OMPI_ERRHANDLER_TYPE_FILE):
-      new_errhandler->eh_file_fn = (ompi_file_errhandler_fn *)func;
+      new_errhandler->eh_file_fn = (ompi_file_errhandler_function *)func;
       break;
   case (OMPI_ERRHANDLER_TYPE_WIN):
       new_errhandler->eh_win_fn = (MPI_Win_errhandler_function *)func;
@@ -117,7 +117,7 @@ struct ompi_errhandler_t {
      can be invoked on any MPI object type, so we need callbacks for
      all of three. */
   MPI_Comm_errhandler_function *eh_comm_fn;
-  ompi_file_errhandler_fn *eh_file_fn;
+  ompi_file_errhandler_function *eh_file_fn;
   MPI_Win_errhandler_function *eh_win_fn;
   ompi_errhandler_fortran_handler_fn_t *eh_fort_fn;

@@ -356,7 +356,8 @@ static inline struct ompi_proc_t *ompi_group_dense_lookup (ompi_group_t *group,
             ompi_proc_t *real_proc =
               (ompi_proc_t *) ompi_proc_for_name (ompi_proc_sentinel_to_name ((uintptr_t) proc));

-            if (opal_atomic_compare_exchange_strong_ptr (group->grp_proc_pointers + peer_id, &proc, real_proc)) {
+            if (opal_atomic_compare_exchange_strong_ptr ((opal_atomic_intptr_t *)(group->grp_proc_pointers + peer_id),
+                                                         (intptr_t *) &proc, (intptr_t) real_proc)) {
                 OBJ_RETAIN(real_proc);
             }

@@ -385,11 +385,11 @@ typedef int (MPI_Datarep_conversion_function)(void *, MPI_Datatype,
 typedef void (MPI_Comm_errhandler_function)(MPI_Comm *, int *, ...);

 /* This is a little hackish, but errhandler.h needs space for a
-   MPI_File_errhandler_fn.  While it could just be removed, this
+   MPI_File_errhandler_function.  While it could just be removed, this
    allows us to maintain a stable ABI within OMPI, at least for
    apps that don't use MPI I/O. */
-typedef void (ompi_file_errhandler_fn)(MPI_File *, int *, ...);
-typedef ompi_file_errhandler_fn MPI_File_errhandler_function;
+typedef void (ompi_file_errhandler_function)(MPI_File *, int *, ...);
+typedef ompi_file_errhandler_function MPI_File_errhandler_function;
 typedef void (MPI_Win_errhandler_function)(MPI_Win *, int *, ...);
 typedef void (MPI_User_function)(void *, void *, int *, MPI_Datatype *);
 typedef int (MPI_Comm_copy_attr_function)(MPI_Comm, int, void *,
@@ -412,7 +412,7 @@ typedef int (MPI_Grequest_cancel_function)(void *, int);
  */
 typedef MPI_Comm_errhandler_function MPI_Comm_errhandler_fn
     __mpi_interface_removed__("MPI_Comm_errhandler_fn was removed in MPI-3.0; use MPI_Comm_errhandler_function instead");
-typedef ompi_file_errhandler_fn MPI_File_errhandler_fn
+typedef ompi_file_errhandler_function MPI_File_errhandler_fn
     __mpi_interface_removed__("MPI_File_errhandler_fn was removed in MPI-3.0; use MPI_File_errhandler_function instead");
 typedef MPI_Win_errhandler_function MPI_Win_errhandler_fn
     __mpi_interface_removed__("MPI_Win_errhandler_fn was removed in MPI-3.0; use MPI_Win_errhandler_function instead");
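In spirit, the __mpi_interface_removed__ annotation seen here keeps the old typedef name available for the ABI while making the compiler complain when user code touches it. A minimal sketch of that technique using GCC/Clang attributes (an assumption for illustration; the actual macro definition in mpi.h is configure-dependent and does not appear in this diff):

    /* New, spec-conforming name: the real definition. */
    typedef void (file_errhandler_function)(void *file, int *err, ...);

    /* Old name kept for source compatibility, but flagged with a message
       so any use surfaces as a diagnostic at compile time. */
    typedef file_errhandler_function file_errhandler_fn
        __attribute__((deprecated("removed in MPI-3.0; use the _function name")));

    /* Declaring "file_errhandler_fn *cb;" still compiles, but warns. */
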
@@ -1088,8 +1088,13 @@ OMPI_DECLSPEC extern struct ompi_predefined_datatype_t ompi_mpi_ub __mpi_interfa
 #define MPI_LONG_INT OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_long_int)
 #define MPI_SHORT_INT OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_short_int)
 #define MPI_2INT OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_2int)
+#if !OMPI_OMIT_MPI1_COMPAT_DECLS
+/*
+ * Removed datatypes
+ */
 #define MPI_UB OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_ub)
 #define MPI_LB OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_lb)
+#endif
 #define MPI_WCHAR OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_wchar)
 #if OPAL_HAVE_LONG_LONG
 #define MPI_LONG_LONG_INT OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_long_long_int)
@@ -90,7 +90,7 @@ int ompi_coll_base_allgather_intra_bruck(const void *sbuf, int scount,
                                          mca_coll_base_module_t *module)
 {
    int line = -1, rank, size, sendto, recvfrom, distance, blockcount, err = 0;
-   ptrdiff_t slb, rlb, sext, rext;
+   ptrdiff_t rlb, rext;
    char *tmpsend = NULL, *tmprecv = NULL;

    size = ompi_comm_size(comm);
@@ -99,9 +99,6 @@ int ompi_coll_base_allgather_intra_bruck(const void *sbuf, int scount,
    OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
                 "coll:base:allgather_intra_bruck rank %d", rank));

-   err = ompi_datatype_get_extent (sdtype, &slb, &sext);
-   if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
-
    err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
    if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }

@@ -262,7 +259,7 @@ ompi_coll_base_allgather_intra_recursivedoubling(const void *sbuf, int scount,
 {
    int line = -1, rank, size, pow2size, err;
    int remote, distance, sendblocklocation;
-   ptrdiff_t slb, rlb, sext, rext;
+   ptrdiff_t rlb, rext;
    char *tmpsend = NULL, *tmprecv = NULL;

    size = ompi_comm_size(comm);
@@ -289,9 +286,6 @@ ompi_coll_base_allgather_intra_recursivedoubling(const void *sbuf, int scount,
                 "coll:base:allgather_intra_recursivedoubling rank %d, size %d",
                 rank, size));

-   err = ompi_datatype_get_extent (sdtype, &slb, &sext);
-   if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
-
    err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
    if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }

@@ -369,7 +363,7 @@ int ompi_coll_base_allgather_intra_ring(const void *sbuf, int scount,
                                         mca_coll_base_module_t *module)
 {
    int line = -1, rank, size, err, sendto, recvfrom, i, recvdatafrom, senddatafrom;
-   ptrdiff_t slb, rlb, sext, rext;
+   ptrdiff_t rlb, rext;
    char *tmpsend = NULL, *tmprecv = NULL;

    size = ompi_comm_size(comm);
@@ -378,9 +372,6 @@ int ompi_coll_base_allgather_intra_ring(const void *sbuf, int scount,
    OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
                 "coll:base:allgather_intra_ring rank %d", rank));

-   err = ompi_datatype_get_extent (sdtype, &slb, &sext);
-   if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
-
    err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
    if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }

@@ -499,7 +490,7 @@ ompi_coll_base_allgather_intra_neighborexchange(const void *sbuf, int scount,
 {
    int line = -1, rank, size, i, even_rank, err;
    int neighbor[2], offset_at_step[2], recv_data_from[2], send_data_from;
-   ptrdiff_t slb, rlb, sext, rext;
+   ptrdiff_t rlb, rext;
    char *tmpsend = NULL, *tmprecv = NULL;

    size = ompi_comm_size(comm);
@@ -517,9 +508,6 @@ ompi_coll_base_allgather_intra_neighborexchange(const void *sbuf, int scount,
    OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
                 "coll:base:allgather_intra_neighborexchange rank %d", rank));

-   err = ompi_datatype_get_extent (sdtype, &slb, &sext);
-   if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
-
    err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
    if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }

@@ -616,7 +604,7 @@ int ompi_coll_base_allgather_intra_two_procs(const void *sbuf, int scount,
 {
    int line = -1, err, rank, remote;
    char *tmpsend = NULL, *tmprecv = NULL;
-   ptrdiff_t sext, rext, lb;
+   ptrdiff_t rext, lb;

    rank = ompi_comm_rank(comm);

@@ -627,9 +615,6 @@ int ompi_coll_base_allgather_intra_two_procs(const void *sbuf, int scount,
        return MPI_ERR_UNSUPPORTED_OPERATION;
    }

-   err = ompi_datatype_get_extent (sdtype, &lb, &sext);
-   if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
-
    err = ompi_datatype_get_extent (rdtype, &lb, &rext);
    if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }

@@ -100,7 +100,7 @@ int ompi_coll_base_allgatherv_intra_bruck(const void *sbuf, int scount,
 {
    int line = -1, err = 0, rank, size, sendto, recvfrom, distance, blockcount, i;
    int *new_rcounts = NULL, *new_rdispls = NULL, *new_scounts = NULL, *new_sdispls = NULL;
-   ptrdiff_t slb, rlb, sext, rext;
+   ptrdiff_t rlb, rext;
    char *tmpsend = NULL, *tmprecv = NULL;
    struct ompi_datatype_t *new_rdtype, *new_sdtype;

@@ -110,9 +110,6 @@ int ompi_coll_base_allgatherv_intra_bruck(const void *sbuf, int scount,
    OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
                 "coll:base:allgather_intra_bruck rank %d", rank));

-   err = ompi_datatype_get_extent (sdtype, &slb, &sext);
-   if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
-
    err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
    if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }

@@ -229,7 +226,7 @@ int ompi_coll_base_allgatherv_intra_ring(const void *sbuf, int scount,
                                          mca_coll_base_module_t *module)
 {
    int line = -1, rank, size, sendto, recvfrom, i, recvdatafrom, senddatafrom, err = 0;
-   ptrdiff_t slb, rlb, sext, rext;
+   ptrdiff_t rlb, rext;
    char *tmpsend = NULL, *tmprecv = NULL;

    size = ompi_comm_size(comm);
@@ -238,9 +235,6 @@ int ompi_coll_base_allgatherv_intra_ring(const void *sbuf, int scount,
    OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
                 "coll:base:allgatherv_intra_ring rank %d", rank));

-   err = ompi_datatype_get_extent (sdtype, &slb, &sext);
-   if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
-
    err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
    if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }

@@ -361,7 +355,7 @@ ompi_coll_base_allgatherv_intra_neighborexchange(const void *sbuf, int scount,
    int line = -1, rank, size, i, even_rank, err = 0;
    int neighbor[2], offset_at_step[2], recv_data_from[2], send_data_from;
    int new_scounts[2], new_sdispls[2], new_rcounts[2], new_rdispls[2];
-   ptrdiff_t slb, rlb, sext, rext;
+   ptrdiff_t rlb, rext;
    char *tmpsend = NULL, *tmprecv = NULL;
    struct ompi_datatype_t *new_rdtype, *new_sdtype;

@@ -381,9 +375,6 @@ ompi_coll_base_allgatherv_intra_neighborexchange(const void *sbuf, int scount,
    OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
                 "coll:base:allgatherv_intra_neighborexchange rank %d", rank));

-   err = ompi_datatype_get_extent (sdtype, &slb, &sext);
-   if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
-
    err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
    if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }

@@ -509,7 +500,7 @@ int ompi_coll_base_allgatherv_intra_two_procs(const void *sbuf, int scount,
 {
    int line = -1, err = 0, rank, remote;
    char *tmpsend = NULL, *tmprecv = NULL;
-   ptrdiff_t sext, rext, lb;
+   ptrdiff_t rext, lb;

    rank = ompi_comm_rank(comm);

@@ -520,9 +511,6 @@ int ompi_coll_base_allgatherv_intra_two_procs(const void *sbuf, int scount,
        return MPI_ERR_UNSUPPORTED_OPERATION;
    }

-   err = ompi_datatype_get_extent (sdtype, &lb, &sext);
-   if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
-
    err = ompi_datatype_get_extent (rdtype, &lb, &rext);
    if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }

@@ -350,7 +350,7 @@ ompi_coll_base_allreduce_intra_ring(const void *sbuf, void *rbuf, int count,
    char *tmpsend = NULL, *tmprecv = NULL, *inbuf[2] = {NULL, NULL};
    ptrdiff_t true_lb, true_extent, lb, extent;
    ptrdiff_t block_offset, max_real_segsize;
-   ompi_request_t *reqs[2] = {NULL, NULL};
+   ompi_request_t *reqs[2] = {MPI_REQUEST_NULL, MPI_REQUEST_NULL};

    size = ompi_comm_size(comm);
    rank = ompi_comm_rank(comm);
@@ -528,6 +528,7 @@ ompi_coll_base_allreduce_intra_ring(const void *sbuf, void *rbuf, int count,
  error_hndl:
    OPAL_OUTPUT((ompi_coll_base_framework.framework_output, "%s:%4d\tRank %d Error occurred %d\n",
                 __FILE__, line, rank, ret));
+   ompi_coll_base_free_reqs(reqs, 2);
    (void)line;  // silence compiler warning
    if (NULL != inbuf[0]) free(inbuf[0]);
    if (NULL != inbuf[1]) free(inbuf[1]);
@@ -627,7 +628,7 @@ ompi_coll_base_allreduce_intra_ring_segmented(const void *sbuf, void *rbuf, int
    size_t typelng;
    char *tmpsend = NULL, *tmprecv = NULL, *inbuf[2] = {NULL, NULL};
    ptrdiff_t block_offset, max_real_segsize;
-   ompi_request_t *reqs[2] = {NULL, NULL};
+   ompi_request_t *reqs[2] = {MPI_REQUEST_NULL, MPI_REQUEST_NULL};
    ptrdiff_t lb, extent, gap;

    size = ompi_comm_size(comm);
@@ -847,6 +848,7 @@ ompi_coll_base_allreduce_intra_ring_segmented(const void *sbuf, void *rbuf, int
  error_hndl:
    OPAL_OUTPUT((ompi_coll_base_framework.framework_output, "%s:%4d\tRank %d Error occurred %d\n",
                 __FILE__, line, rank, ret));
+   ompi_coll_base_free_reqs(reqs, 2);
    (void)line;  // silence compiler warning
    if (NULL != inbuf[0]) free(inbuf[0]);
    if (NULL != inbuf[1]) free(inbuf[1]);

@@ -393,6 +393,7 @@ int ompi_coll_base_alltoall_intra_linear_sync(const void *sbuf, int scount,
     if (0 < total_reqs) {
         reqs = ompi_coll_base_comm_get_reqs(module->base_data, 2 * total_reqs);
         if (NULL == reqs) { error = -1; line = __LINE__; goto error_hndl; }
+        reqs[0] = reqs[1] = MPI_REQUEST_NULL;
     }

     prcv = (char *) rbuf;
@@ -468,6 +469,15 @@ int ompi_coll_base_alltoall_intra_linear_sync(const void *sbuf, int scount,
     return MPI_SUCCESS;

  error_hndl:
+    /* find a real error code */
+    if (MPI_ERR_IN_STATUS == error) {
+        for( ri = 0; ri < nreqs; ri++ ) {
+            if (MPI_REQUEST_NULL == reqs[ri]) continue;
+            if (MPI_ERR_PENDING == reqs[ri]->req_status.MPI_ERROR) continue;
+            error = reqs[ri]->req_status.MPI_ERROR;
+            break;
+        }
+    }
     OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
                  "%s:%4d\tError occurred %d, rank %2d", __FILE__, line, error,
                  rank));
@@ -661,7 +671,16 @@ int ompi_coll_base_alltoall_intra_basic_linear(const void *sbuf, int scount,
     if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }

  err_hndl:
-    if( MPI_SUCCESS != err ) {
+    if (MPI_SUCCESS != err) {
+        /* find a real error code */
+        if (MPI_ERR_IN_STATUS == err) {
+            for( i = 0; i < nreqs; i++ ) {
+                if (MPI_REQUEST_NULL == req[i]) continue;
+                if (MPI_ERR_PENDING == req[i]->req_status.MPI_ERROR) continue;
+                err = req[i]->req_status.MPI_ERROR;
+                break;
+            }
+        }
         OPAL_OUTPUT( (ompi_coll_base_framework.framework_output,"%s:%4d\tError occurred %d, rank %2d",
                       __FILE__, line, err, rank) );
         (void)line;  // silence compiler warning
@@ -3,7 +3,7 @@
  * Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
  *                         University Research and Technology
  *                         Corporation.  All rights reserved.
- * Copyright (c) 2004-2016 The University of Tennessee and The University
+ * Copyright (c) 2004-2017 The University of Tennessee and The University
  *                         of Tennessee Research Foundation.  All rights
  *                         reserved.
  * Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
@@ -276,6 +276,15 @@ ompi_coll_base_alltoallv_intra_basic_linear(const void *sbuf, const int *scounts
     err = ompi_request_wait_all(nreqs, reqs, MPI_STATUSES_IGNORE);

  err_hndl:
+    /* find a real error code */
+    if (MPI_ERR_IN_STATUS == err) {
+        for( i = 0; i < nreqs; i++ ) {
+            if (MPI_REQUEST_NULL == reqs[i]) continue;
+            if (MPI_ERR_PENDING == reqs[i]->req_status.MPI_ERROR) continue;
+            err = reqs[i]->req_status.MPI_ERROR;
+            break;
+        }
+    }
     /* Free the requests in all cases as they are persistent */
     ompi_coll_base_free_reqs(reqs, nreqs);

@ -3,7 +3,7 @@
* Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
* University Research and Technology
* Corporation. All rights reserved.
* Copyright (c) 2004-2016 The University of Tennessee and The University
* Copyright (c) 2004-2017 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
@ -102,8 +102,10 @@ int ompi_coll_base_barrier_intra_doublering(struct ompi_communicator_t *comm,
{
int rank, size, err = 0, line = 0, left, right;

rank = ompi_comm_rank(comm);
size = ompi_comm_size(comm);
if( 1 == size )
return OMPI_SUCCESS;
rank = ompi_comm_rank(comm);

OPAL_OUTPUT((ompi_coll_base_framework.framework_output,"ompi_coll_base_barrier_intra_doublering rank %d", rank));

@ -172,8 +174,10 @@ int ompi_coll_base_barrier_intra_recursivedoubling(struct ompi_communicator_t *c
{
int rank, size, adjsize, err, line, mask, remote;

rank = ompi_comm_rank(comm);
size = ompi_comm_size(comm);
if( 1 == size )
return OMPI_SUCCESS;
rank = ompi_comm_rank(comm);
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
"ompi_coll_base_barrier_intra_recursivedoubling rank %d",
rank));
@ -251,8 +255,10 @@ int ompi_coll_base_barrier_intra_bruck(struct ompi_communicator_t *comm,
{
int rank, size, distance, to, from, err, line = 0;

rank = ompi_comm_rank(comm);
size = ompi_comm_size(comm);
if( 1 == size )
return MPI_SUCCESS;
rank = ompi_comm_rank(comm);
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
"ompi_coll_base_barrier_intra_bruck rank %d", rank));

@ -285,16 +291,19 @@ int ompi_coll_base_barrier_intra_bruck(struct ompi_communicator_t *comm,
int ompi_coll_base_barrier_intra_two_procs(struct ompi_communicator_t *comm,
mca_coll_base_module_t *module)
{
int remote, err;
int remote, size, err;

size = ompi_comm_size(comm);
if( 1 == size )
return MPI_SUCCESS;
if( 2 != ompi_comm_size(comm) ) {
return MPI_ERR_UNSUPPORTED_OPERATION;
}

remote = ompi_comm_rank(comm);
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
"ompi_coll_base_barrier_intra_two_procs rank %d", remote));

if (2 != ompi_comm_size(comm)) {
return MPI_ERR_UNSUPPORTED_OPERATION;
}

remote = (remote + 1) & 0x1;

err = ompi_coll_base_sendrecv_zero(remote, MCA_COLL_BASE_TAG_BARRIER,
@ -324,8 +333,10 @@ int ompi_coll_base_barrier_intra_basic_linear(struct ompi_communicator_t *comm,
int i, err, rank, size, line;
ompi_request_t** requests = NULL;

rank = ompi_comm_rank(comm);
size = ompi_comm_size(comm);
if( 1 == size )
return MPI_SUCCESS;
rank = ompi_comm_rank(comm);

/* All non-root send & receive zero-length message. */
if (rank > 0) {
@ -367,11 +378,21 @@ int ompi_coll_base_barrier_intra_basic_linear(struct ompi_communicator_t *comm,
/* All done */
return MPI_SUCCESS;
err_hndl:
if( NULL != requests ) {
/* find a real error code */
if (MPI_ERR_IN_STATUS == err) {
for( i = 0; i < size; i++ ) {
if (MPI_REQUEST_NULL == requests[i]) continue;
if (MPI_ERR_PENDING == requests[i]->req_status.MPI_ERROR) continue;
err = requests[i]->req_status.MPI_ERROR;
break;
}
}
ompi_coll_base_free_reqs(requests, size);
}
OPAL_OUTPUT( (ompi_coll_base_framework.framework_output,"%s:%4d\tError occurred %d, rank %2d",
__FILE__, line, err, rank) );
(void)line; // silence compiler warning
if( NULL != requests )
ompi_coll_base_free_reqs(requests, size);
return err;
}
/* copied function (with appropriate renaming) ends here */
@ -385,8 +406,10 @@ int ompi_coll_base_barrier_intra_tree(struct ompi_communicator_t *comm,
{
int rank, size, depth, err, jump, partner;

rank = ompi_comm_rank(comm);
size = ompi_comm_size(comm);
if( 1 == size )
return MPI_SUCCESS;
rank = ompi_comm_rank(comm);
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
"ompi_coll_base_barrier_intra_tree %d",
rank));

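All of the barrier variants above get the same two guards: the communicator size is computed first, a single-process communicator returns success immediately, and (for the two-process algorithm) communicators of any other size are rejected before the rank is used. A user-level sketch of the two-process case, assuming only standard MPI:

    #include <mpi.h>

    /* Zero-byte sendrecv with the only peer; mirrors the logic of
     * ompi_coll_base_barrier_intra_two_procs above. */
    static int two_proc_barrier(MPI_Comm comm)
    {
        int rank, size;
        MPI_Comm_size(comm, &size);
        if (1 == size) return MPI_SUCCESS;            /* nothing to wait for */
        if (2 != size) return MPI_ERR_UNSUPPORTED_OPERATION;
        MPI_Comm_rank(comm, &rank);
        int peer = (rank + 1) & 0x1;                  /* the other process */
        return MPI_Sendrecv(NULL, 0, MPI_BYTE, peer, 0,
                            NULL, 0, MPI_BYTE, peer, 0,
                            comm, MPI_STATUS_IGNORE);
    }
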
@ -3,7 +3,7 @@
* Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
* University Research and Technology
* Corporation. All rights reserved.
* Copyright (c) 2004-2016 The University of Tennessee and The University
* Copyright (c) 2004-2017 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
@ -214,13 +214,29 @@ ompi_coll_base_bcast_intra_generic( void* buffer,
return (MPI_SUCCESS);

error_hndl:
if (MPI_ERR_IN_STATUS == err) {
for( req_index = 0; req_index < 2; req_index++ ) {
if (MPI_REQUEST_NULL == recv_reqs[req_index]) continue;
if (MPI_ERR_PENDING == recv_reqs[req_index]->req_status.MPI_ERROR) continue;
err = recv_reqs[req_index]->req_status.MPI_ERROR;
break;
}
}
ompi_coll_base_free_reqs( recv_reqs, 2);
if( NULL != send_reqs ) {
if (MPI_ERR_IN_STATUS == err) {
for( req_index = 0; req_index < tree->tree_nextsize; req_index++ ) {
if (MPI_REQUEST_NULL == send_reqs[req_index]) continue;
if (MPI_ERR_PENDING == send_reqs[req_index]->req_status.MPI_ERROR) continue;
err = send_reqs[req_index]->req_status.MPI_ERROR;
break;
}
}
ompi_coll_base_free_reqs(send_reqs, tree->tree_nextsize);
}
OPAL_OUTPUT( (ompi_coll_base_framework.framework_output,"%s:%4d\tError occurred %d, rank %2d",
__FILE__, line, err, rank) );
(void)line; // silence compiler warnings
ompi_coll_base_free_reqs( recv_reqs, 2);
if( NULL != send_reqs ) {
ompi_coll_base_free_reqs(send_reqs, tree->tree_nextsize);
}

return err;
}
@ -649,12 +665,21 @@ ompi_coll_base_bcast_intra_basic_linear(void *buff, int count,
* care what the error was -- just that there *was* an error. The
* PML will finish all requests, even if one or more of them fail.
* i.e., by the end of this call, all the requests are free-able.
* So free them anyway -- even if there was an error, and return
* the error after we free everything. */
* So free them anyway -- even if there was an error.
* Note we still need to get the actual error, as collective
* operations cannot return MPI_ERR_IN_STATUS.
*/

err = ompi_request_wait_all(i, reqs, MPI_STATUSES_IGNORE);
err_hndl:
if( MPI_SUCCESS != err ) { /* Free the reqs */
/* first find the real error code */
for( preq = reqs; preq < reqs+i; preq++ ) {
if (MPI_REQUEST_NULL == *preq) continue;
if (MPI_ERR_PENDING == (*preq)->req_status.MPI_ERROR) continue;
err = (*preq)->req_status.MPI_ERROR;
break;
}
ompi_coll_base_free_reqs(reqs, i);
}

|
||||
return MPI_SUCCESS;
|
||||
error_hndl:
|
||||
if (NULL != reqs) {
|
||||
/* find a real error code */
|
||||
if (MPI_ERR_IN_STATUS == ret) {
|
||||
for( i = 0; i < size; i++ ) {
|
||||
if (MPI_REQUEST_NULL == reqs[i]) continue;
|
||||
if (MPI_ERR_PENDING == reqs[i]->req_status.MPI_ERROR) continue;
|
||||
ret = reqs[i]->req_status.MPI_ERROR;
|
||||
break;
|
||||
}
|
||||
}
|
||||
ompi_coll_base_free_reqs(reqs, size);
|
||||
}
|
||||
OPAL_OUTPUT (( ompi_coll_base_framework.framework_output,
|
||||
|
@ -338,16 +338,34 @@ int ompi_coll_base_reduce_generic( const void* sendbuf, void* recvbuf, int origi
|
||||
return OMPI_SUCCESS;
|
||||
|
||||
error_hndl: /* error handler */
|
||||
/* find a real error code */
|
||||
if (MPI_ERR_IN_STATUS == ret) {
|
||||
for( i = 0; i < 2; i++ ) {
|
||||
if (MPI_REQUEST_NULL == reqs[i]) continue;
|
||||
if (MPI_ERR_PENDING == reqs[i]->req_status.MPI_ERROR) continue;
|
||||
ret = reqs[i]->req_status.MPI_ERROR;
|
||||
break;
|
||||
}
|
||||
}
|
||||
ompi_coll_base_free_reqs(reqs, 2);
|
||||
if( NULL != sreq ) {
|
||||
if (MPI_ERR_IN_STATUS == ret) {
|
||||
for( i = 0; i < max_outstanding_reqs; i++ ) {
|
||||
if (MPI_REQUEST_NULL == sreq[i]) continue;
|
||||
if (MPI_ERR_PENDING == sreq[i]->req_status.MPI_ERROR) continue;
|
||||
ret = sreq[i]->req_status.MPI_ERROR;
|
||||
break;
|
||||
}
|
||||
}
|
||||
ompi_coll_base_free_reqs(sreq, max_outstanding_reqs);
|
||||
}
|
||||
if( inbuf_free[0] != NULL ) free(inbuf_free[0]);
|
||||
if( inbuf_free[1] != NULL ) free(inbuf_free[1]);
|
||||
if( accumbuf_free != NULL ) free(accumbuf);
|
||||
OPAL_OUTPUT (( ompi_coll_base_framework.framework_output,
|
||||
"ERROR_HNDL: node %d file %s line %d error %d\n",
|
||||
rank, __FILE__, line, ret ));
|
||||
(void)line; // silence compiler warning
|
||||
if( inbuf_free[0] != NULL ) free(inbuf_free[0]);
|
||||
if( inbuf_free[1] != NULL ) free(inbuf_free[1]);
|
||||
if( accumbuf_free != NULL ) free(accumbuf);
|
||||
if( NULL != sreq ) {
|
||||
ompi_coll_base_free_reqs(sreq, max_outstanding_reqs);
|
||||
}
|
||||
return ret;
|
||||
}
|
||||
|
||||
|
@ -464,7 +464,7 @@ ompi_coll_base_reduce_scatter_intra_ring( const void *sbuf, void *rbuf, const in
|
||||
char *tmpsend = NULL, *tmprecv = NULL, *accumbuf = NULL, *accumbuf_free = NULL;
|
||||
char *inbuf_free[2] = {NULL, NULL}, *inbuf[2] = {NULL, NULL};
|
||||
ptrdiff_t extent, max_real_segsize, dsize, gap = 0;
|
||||
ompi_request_t *reqs[2] = {NULL, NULL};
|
||||
ompi_request_t *reqs[2] = {MPI_REQUEST_NULL, MPI_REQUEST_NULL};
|
||||
|
||||
size = ompi_comm_size(comm);
|
||||
rank = ompi_comm_rank(comm);
|
||||
|
@ -41,7 +41,7 @@ int ompi_coll_base_sendrecv_actual( const void* sendbuf, size_t scount,
|
||||
{ /* post receive first, then send, then wait... should be fast (I hope) */
|
||||
int err, line = 0;
|
||||
size_t rtypesize, stypesize;
|
||||
ompi_request_t *req;
|
||||
ompi_request_t *req = MPI_REQUEST_NULL;
|
||||
ompi_status_public_t rstatus;
|
||||
|
||||
/* post new irecv */
|
||||
|
@ -71,12 +71,13 @@ BEGIN_C_DECLS
|
||||
|
||||
extern bool libnbc_ibcast_skip_dt_decision;
|
||||
extern int libnbc_iexscan_algorithm;
|
||||
extern int libnbc_iscan_algorithm;
|
||||
|
||||
struct ompi_coll_libnbc_component_t {
|
||||
mca_coll_base_component_2_0_0_t super;
|
||||
opal_free_list_t requests;
|
||||
opal_list_t active_requests;
|
||||
int32_t active_comms;
|
||||
opal_atomic_int32_t active_comms;
|
||||
opal_mutex_t lock; /* protect access to the active_requests list */
|
||||
};
|
||||
typedef struct ompi_coll_libnbc_component_t ompi_coll_libnbc_component_t;
|
||||
|
@ -54,6 +54,14 @@ static mca_base_var_enum_value_t iexscan_algorithms[] = {
|
||||
{0, NULL}
|
||||
};
|
||||
|
||||
int libnbc_iscan_algorithm = 0; /* iscan user forced algorithm */
|
||||
static mca_base_var_enum_value_t iscan_algorithms[] = {
|
||||
{0, "ignore"},
|
||||
{1, "linear"},
|
||||
{2, "recursive_doubling"},
|
||||
{0, NULL}
|
||||
};
|
||||
|
||||
static int libnbc_open(void);
|
||||
static int libnbc_close(void);
|
||||
static int libnbc_register(void);
|
||||
@ -177,6 +185,16 @@ libnbc_register(void)
|
||||
&libnbc_iexscan_algorithm);
|
||||
OBJ_RELEASE(new_enum);
|
||||
|
||||
libnbc_iscan_algorithm = 0;
|
||||
(void) mca_base_var_enum_create("coll_libnbc_iscan_algorithms", iscan_algorithms, &new_enum);
|
||||
mca_base_component_var_register(&mca_coll_libnbc_component.super.collm_version,
|
||||
"iscan_algorithm",
|
||||
"Which iscan algorithm is used: 0 ignore, 1 linear, 2 recursive_doubling",
|
||||
MCA_BASE_VAR_TYPE_INT, new_enum, 0, MCA_BASE_VAR_FLAG_SETTABLE,
|
||||
OPAL_INFO_LVL_5, MCA_BASE_VAR_SCOPE_ALL,
|
||||
&libnbc_iscan_algorithm);
|
||||
OBJ_RELEASE(new_enum);
|
||||
|
||||
return OMPI_SUCCESS;
|
||||
}
|
||||
|
||||
|
@ -62,7 +62,6 @@ struct dict {
|
||||
int (*_insert) __P((void *obj, void *k, void *d, int ow));
|
||||
int (*_probe) __P((void *obj, void *key, void **dat));
|
||||
void *(*_search) __P((void *obj, const void *k));
|
||||
const void *(*_csearch) __P((const void *obj, const void *k));
|
||||
int (*_remove) __P((void *obj, const void *key, int del));
|
||||
void (*_walk) __P((void *obj, dict_vis_func func));
|
||||
unsigned (*_count) __P((const void *obj));
|
||||
@ -75,7 +74,6 @@ struct dict {
|
||||
#define dict_insert(dct,k,d,o) (dct)->_insert((dct)->_object, (k), (d), (o))
|
||||
#define dict_probe(dct,k,d) (dct)->_probe((dct)->_object, (k), (d))
|
||||
#define dict_search(dct,k) (dct)->_search((dct)->_object, (k))
|
||||
#define dict_csearch(dct,k) (dct)->_csearch((dct)->_object, (k))
|
||||
#define dict_remove(dct,k,del) (dct)->_remove((dct)->_object, (k), (del))
|
||||
#define dict_walk(dct,f) (dct)->_walk((dct)->_object, (f))
|
||||
#define dict_count(dct) (dct)->_count((dct)->_object)
|
||||
|
@ -15,7 +15,6 @@
|
||||
typedef int (*insert_func) __P((void *, void *k, void *d, int o));
|
||||
typedef int (*probe_func) __P((void *, void *k, void **d));
|
||||
typedef void *(*search_func) __P((void *, const void *k));
|
||||
typedef const void *(*csearch_func) __P((const void *, const void *k));
|
||||
typedef int (*remove_func) __P((void *, const void *k, int d));
|
||||
typedef void (*walk_func) __P((void *, dict_vis_func visit));
|
||||
typedef unsigned (*count_func) __P((const void *));
|
||||
|
@ -90,7 +90,6 @@ hb_dict_new(dict_cmp_func key_cmp, dict_del_func key_del,
|
||||
dct->_insert = (insert_func)hb_tree_insert;
|
||||
dct->_probe = (probe_func)hb_tree_probe;
|
||||
dct->_search = (search_func)hb_tree_search;
|
||||
dct->_csearch = (csearch_func)hb_tree_csearch;
|
||||
dct->_remove = (remove_func)hb_tree_remove;
|
||||
dct->_empty = (empty_func)hb_tree_empty;
|
||||
dct->_walk = (walk_func)hb_tree_walk;
|
||||
@ -170,12 +169,6 @@ hb_tree_search(hb_tree *tree, const void *key)
|
||||
return NULL;
|
||||
}
|
||||
|
||||
const void *
|
||||
hb_tree_csearch(const hb_tree *tree, const void *key)
|
||||
{
|
||||
return hb_tree_csearch((hb_tree *)tree, key);
|
||||
}
|
||||
|
||||
int
|
||||
hb_tree_insert(hb_tree *tree, void *key, void *dat, int overwrite)
|
||||
{
|
||||
|
@ -26,7 +26,6 @@ void hb_tree_destroy __P((hb_tree *tree, int del));
|
||||
int hb_tree_insert __P((hb_tree *tree, void *key, void *dat, int overwrite));
|
||||
int hb_tree_probe __P((hb_tree *tree, void *key, void **dat));
|
||||
void *hb_tree_search __P((hb_tree *tree, const void *key));
|
||||
const void *hb_tree_csearch __P((const hb_tree *tree, const void *key));
|
||||
int hb_tree_remove __P((hb_tree *tree, const void *key, int del));
|
||||
void hb_tree_empty __P((hb_tree *tree, int del));
|
||||
void hb_tree_walk __P((hb_tree *tree, dict_vis_func visit));
|
||||
|
@ -11,8 +11,8 @@
|
||||
* Copyright (c) 2012 Oracle and/or its affiliates. All rights reserved.
|
||||
* Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2017 IBM Corporation. All rights reserved.
|
||||
* Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
|
||||
* $COPYRIGHT$
|
||||
@ -130,7 +130,7 @@ int ompi_coll_libnbc_iallgatherv(const void* sendbuf, int sendcount, MPI_Datatyp
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
@ -209,7 +209,7 @@ int ompi_coll_libnbc_iallgatherv_inter(const void* sendbuf, int sendcount, MPI_D
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -7,8 +7,8 @@
|
||||
* rights reserved.
|
||||
* Copyright (c) 2013-2017 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2017 IBM Corporation. All rights reserved.
|
||||
* Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
|
||||
* $COPYRIGHT$
|
||||
@ -206,7 +206,7 @@ int ompi_coll_libnbc_iallreduce(const void* sendbuf, void* recvbuf, int count, M
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
@ -289,7 +289,7 @@ int ompi_coll_libnbc_iallreduce_inter(const void* sendbuf, void* recvbuf, int co
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -292,7 +292,7 @@ int ompi_coll_libnbc_ialltoall(const void* sendbuf, int sendcount, MPI_Datatype
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
@ -376,7 +376,7 @@ int ompi_coll_libnbc_ialltoall_inter (const void* sendbuf, int sendcount, MPI_Da
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -5,8 +5,8 @@
|
||||
* Corporation. All rights reserved.
|
||||
* Copyright (c) 2006 The Technical University of Chemnitz. All
|
||||
* rights reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2015-2017 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2017 IBM Corporation. All rights reserved.
|
||||
@ -153,7 +153,7 @@ int ompi_coll_libnbc_ialltoallv(const void* sendbuf, const int *sendcounts, cons
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
@ -241,7 +241,7 @@ int ompi_coll_libnbc_ialltoallv_inter (const void* sendbuf, const int *sendcount
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -5,8 +5,8 @@
|
||||
* Corporation. All rights reserved.
|
||||
* Copyright (c) 2006 The Technical University of Chemnitz. All
|
||||
* rights reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2015-2017 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2017 IBM Corporation. All rights reserved.
|
||||
@ -139,7 +139,7 @@ int ompi_coll_libnbc_ialltoallw(const void* sendbuf, const int *sendcounts, cons
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
@ -214,7 +214,7 @@ int ompi_coll_libnbc_ialltoallw_inter(const void* sendbuf, const int *sendcounts
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -7,8 +7,8 @@
|
||||
* rights reserved.
|
||||
* Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2015 Mellanox Technologies. All rights reserved.
|
||||
* Copyright (c) 2017 IBM Corporation. All rights reserved.
|
||||
* Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
|
||||
@ -108,7 +108,7 @@ int ompi_coll_libnbc_ibarrier(struct ompi_communicator_t *comm, ompi_request_t *
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
@ -195,7 +195,7 @@ int ompi_coll_libnbc_ibarrier_inter(struct ompi_communicator_t *comm, ompi_reque
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -5,8 +5,8 @@
|
||||
* Corporation. All rights reserved.
|
||||
* Copyright (c) 2006 The Technical University of Chemnitz. All
|
||||
* rights reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2016-2017 IBM Corporation. All rights reserved.
|
||||
@ -182,7 +182,7 @@ int ompi_coll_libnbc_ibcast(void *buffer, int count, MPI_Datatype datatype, int
|
||||
}
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
@ -405,7 +405,7 @@ int ompi_coll_libnbc_ibcast_inter(void *buffer, int count, MPI_Datatype datatype
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -7,8 +7,8 @@
|
||||
* rights reserved.
|
||||
* Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2017 IBM Corporation. All rights reserved.
|
||||
* Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
|
||||
* $COPYRIGHT$
|
||||
@ -176,7 +176,7 @@ int ompi_coll_libnbc_iexscan(const void* sendbuf, void* recvbuf, int count, MPI_
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -8,8 +8,8 @@
|
||||
* Copyright (c) 2013 The University of Tennessee and The University
|
||||
* of Tennessee Research Foundation. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2017 IBM Corporation. All rights reserved.
|
||||
@ -185,7 +185,7 @@ int ompi_coll_libnbc_igather(const void* sendbuf, int sendcount, MPI_Datatype se
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
@ -265,7 +265,7 @@ int ompi_coll_libnbc_igather_inter(const void* sendbuf, int sendcount, MPI_Datat
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -8,8 +8,8 @@
|
||||
* Copyright (c) 2013 The University of Tennessee and The University
|
||||
* of Tennessee Research Foundation. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2015 Mellanox Technologies. All rights reserved.
|
||||
@ -117,7 +117,7 @@ int ompi_coll_libnbc_igatherv(const void* sendbuf, int sendcount, MPI_Datatype s
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
@ -197,7 +197,7 @@ int ompi_coll_libnbc_igatherv_inter(const void* sendbuf, int sendcount, MPI_Data
|
||||
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -5,8 +5,8 @@
|
||||
* Corporation. All rights reserved.
|
||||
* Copyright (c) 2006 The Technical University of Chemnitz. All
|
||||
* rights reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2017 IBM Corporation. All rights reserved.
|
||||
@ -173,7 +173,7 @@ int ompi_coll_libnbc_ineighbor_allgather(const void *sbuf, int scount, MPI_Datat
|
||||
}
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
@ -181,157 +181,6 @@ int ompi_coll_libnbc_ineighbor_allgather(const void *sbuf, int scount, MPI_Datat
|
||||
return OMPI_SUCCESS;
|
||||
}
|
||||
|
||||
/* better binomial bcast
|
||||
* working principle:
|
||||
* - each node gets a virtual rank vrank
|
||||
* - the 'root' node get vrank 0
|
||||
* - node 0 gets the vrank of the 'root'
|
||||
* - all other ranks stay identical (they do not matter)
|
||||
*
|
||||
* Algorithm:
|
||||
* - each node with vrank > 2^r and vrank < 2^r+1 receives from node
|
||||
* vrank - 2^r (vrank=1 receives from 0, vrank 0 receives never)
|
||||
* - each node sends each round r to node vrank + 2^r
|
||||
* - a node stops to send if 2^r > commsize
|
||||
*/
|
||||
#define RANK2VRANK(rank, vrank, root) \
|
||||
{ \
|
||||
vrank = rank; \
|
||||
if (rank == 0) vrank = root; \
|
||||
if (rank == root) vrank = 0; \
|
||||
}
|
||||
#define VRANK2RANK(rank, vrank, root) \
|
||||
{ \
|
||||
rank = vrank; \
|
||||
if (vrank == 0) rank = root; \
|
||||
if (vrank == root) rank = 0; \
|
||||
}
|
||||
static inline int bcast_sched_binomial(int rank, int p, int root, NBC_Schedule *schedule, void *buffer, int count, MPI_Datatype datatype) {
|
||||
int maxr, vrank, peer, res;
|
||||
|
||||
maxr = (int)ceil((log((double)p)/LOG2));
|
||||
|
||||
RANK2VRANK(rank, vrank, root);
|
||||
|
||||
/* receive from the right hosts */
|
||||
if (vrank != 0) {
|
||||
for (int r = 0 ; r < maxr ; ++r) {
|
||||
if ((vrank >= (1 << r)) && (vrank < (1 << (r + 1)))) {
|
||||
VRANK2RANK(peer, vrank - (1 << r), root);
|
||||
res = NBC_Sched_recv (buffer, false, count, datatype, peer, schedule, false);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
return res;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
res = NBC_Sched_barrier (schedule);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
return res;
|
||||
}
|
||||
}
|
||||
|
||||
/* now send to the right hosts */
|
||||
for (int r = 0 ; r < maxr ; ++r) {
|
||||
if (((vrank + (1 << r) < p) && (vrank < (1 << r))) || (vrank == 0)) {
|
||||
VRANK2RANK(peer, vrank + (1 << r), root);
|
||||
res = NBC_Sched_send (buffer, false, count, datatype, peer, schedule, false);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
return res;
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return OMPI_SUCCESS;
|
||||
}
|
||||
|
||||
/* simple linear MPI_Ibcast */
|
||||
static inline int bcast_sched_linear(int rank, int p, int root, NBC_Schedule *schedule, void *buffer, int count, MPI_Datatype datatype) {
|
||||
int res;
|
||||
|
||||
/* send to all others */
|
||||
if(rank == root) {
|
||||
for (int peer = 0 ; peer < p ; ++peer) {
|
||||
if (peer != root) {
|
||||
/* send msg to peer */
|
||||
res = NBC_Sched_send (buffer, false, count, datatype, peer, schedule, false);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
return res;
|
||||
}
|
||||
}
|
||||
}
|
||||
} else {
|
||||
/* recv msg from root */
|
||||
res = NBC_Sched_recv (buffer, false, count, datatype, root, schedule, false);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
return res;
|
||||
}
|
||||
}
|
||||
|
||||
return OMPI_SUCCESS;
|
||||
}
|
||||
|
||||
/* simple chained MPI_Ibcast */
|
||||
static inline int bcast_sched_chain(int rank, int p, int root, NBC_Schedule *schedule, void *buffer, int count, MPI_Datatype datatype, int fragsize, size_t size) {
|
||||
int res, vrank, rpeer, speer, numfrag, fragcount, thiscount;
|
||||
MPI_Aint ext;
|
||||
char *buf;
|
||||
|
||||
RANK2VRANK(rank, vrank, root);
|
||||
VRANK2RANK(rpeer, vrank-1, root);
|
||||
VRANK2RANK(speer, vrank+1, root);
|
||||
res = ompi_datatype_type_extent(datatype, &ext);
|
||||
if (MPI_SUCCESS != res) {
|
||||
NBC_Error("MPI Error in ompi_datatype_type_extent() (%i)", res);
|
||||
return res;
|
||||
}
|
||||
|
||||
if (count == 0) {
|
||||
return OMPI_SUCCESS;
|
||||
}
|
||||
|
||||
numfrag = count * size/fragsize;
|
||||
if ((count * size) % fragsize != 0) {
|
||||
numfrag++;
|
||||
}
|
||||
|
||||
fragcount = count/numfrag;
|
||||
|
||||
for (int fragnum = 0 ; fragnum < numfrag ; ++fragnum) {
|
||||
buf = (char *) buffer + fragnum * fragcount * ext;
|
||||
thiscount = fragcount;
|
||||
if (fragnum == numfrag-1) {
|
||||
/* last fragment may not be full */
|
||||
thiscount = count - fragcount * fragnum;
|
||||
}
|
||||
|
||||
/* root does not receive */
|
||||
if (vrank != 0) {
|
||||
res = NBC_Sched_recv (buf, false, thiscount, datatype, rpeer, schedule, true);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
return res;
|
||||
}
|
||||
}
|
||||
|
||||
/* last rank does not send */
|
||||
if (vrank != p-1) {
|
||||
res = NBC_Sched_send (buf, false, thiscount, datatype, speer, schedule, false);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
return res;
|
||||
}
|
||||
|
||||
/* this barrier here seems awaward but isn't!!!! */
|
||||
if (vrank == 0) {
|
||||
res = NBC_Sched_barrier (schedule);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
return res;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return OMPI_SUCCESS;
|
||||
}
|
||||
|
||||
int ompi_coll_libnbc_neighbor_allgather_init(const void *sbuf, int scount, MPI_Datatype stype, void *rbuf,
|
||||
int rcount, MPI_Datatype rtype, struct ompi_communicator_t *comm,
|
||||
|
@ -5,8 +5,8 @@
|
||||
* Corporation. All rights reserved.
|
||||
* Copyright (c) 2006 The Technical University of Chemnitz. All
|
||||
* rights reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2017 IBM Corporation. All rights reserved.
|
||||
@ -175,7 +175,7 @@ int ompi_coll_libnbc_ineighbor_allgatherv(const void *sbuf, int scount, MPI_Data
|
||||
}
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -5,8 +5,8 @@
|
||||
* Corporation. All rights reserved.
|
||||
* Copyright (c) 2006 The Technical University of Chemnitz. All
|
||||
* rights reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2017 IBM Corporation. All rights reserved.
|
||||
@ -177,7 +177,7 @@ int ompi_coll_libnbc_ineighbor_alltoall(const void *sbuf, int scount, MPI_Dataty
|
||||
}
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -5,8 +5,8 @@
|
||||
* Corporation. All rights reserved.
|
||||
* Copyright (c) 2006 The Technical University of Chemnitz. All
|
||||
* rights reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2017 IBM Corporation. All rights reserved.
|
||||
@ -182,7 +182,7 @@ int ompi_coll_libnbc_ineighbor_alltoallv(const void *sbuf, const int *scounts, c
|
||||
}
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -5,8 +5,8 @@
|
||||
* Corporation. All rights reserved.
|
||||
* Copyright (c) 2006 The Technical University of Chemnitz. All
|
||||
* rights reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2017 IBM Corporation. All rights reserved.
|
||||
@ -167,7 +167,7 @@ int ompi_coll_libnbc_ineighbor_alltoallw(const void *sbuf, const int *scounts, c
|
||||
}
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -516,6 +516,11 @@ static inline int NBC_Unpack(void *src, int srccount, MPI_Datatype srctype, void
|
||||
int res;
|
||||
ptrdiff_t ext, lb;
|
||||
|
||||
res = ompi_datatype_pack_external_size("external32", srccount, srctype, &size);
|
||||
if (OMPI_SUCCESS != res) {
|
||||
NBC_Error ("MPI Error in ompi_datatype_pack_external_size() (%i)", res);
|
||||
return res;
|
||||
}
|
||||
#if OPAL_CUDA_SUPPORT
|
||||
if(NBC_Type_intrinsic(srctype) && !(opal_cuda_check_bufs((char *)tgt, (char *)src))) {
|
||||
#else
|
||||
@ -523,7 +528,6 @@ static inline int NBC_Unpack(void *src, int srccount, MPI_Datatype srctype, void
|
||||
#endif /* OPAL_CUDA_SUPPORT */
|
||||
/* if we have the same types and they are contiguous (intrinsic
|
||||
* types are contiguous), we can just use a single memcpy */
|
||||
res = ompi_datatype_pack_external_size("external32", srccount, srctype, &size);
|
||||
res = ompi_datatype_get_extent (srctype, &lb, &ext);
|
||||
if (OMPI_SUCCESS != res) {
|
||||
NBC_Error ("MPI Error in MPI_Type_extent() (%i)", res);
|
||||
|
@ -7,8 +7,8 @@
|
||||
* rights reserved.
|
||||
* Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2017 IBM Corporation. All rights reserved.
|
||||
* Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
|
||||
* $COPYRIGHT$
|
||||
@ -218,7 +218,7 @@ int ompi_coll_libnbc_ireduce(const void* sendbuf, void* recvbuf, int count, MPI_
|
||||
}
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
@ -284,7 +284,7 @@ int ompi_coll_libnbc_ireduce_inter(const void* sendbuf, void* recvbuf, int count
|
||||
}
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -7,8 +7,8 @@
|
||||
* rights reserved.
|
||||
* Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2015 The University of Tennessee and The University
|
||||
* of Tennessee Research Foundation. All rights
|
||||
* reserved.
|
||||
@ -219,7 +219,7 @@ int ompi_coll_libnbc_ireduce_scatter (const void* sendbuf, void* recvbuf, const
|
||||
}
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
@ -361,7 +361,7 @@ int ompi_coll_libnbc_ireduce_scatter_inter (const void* sendbuf, void* recvbuf,
|
||||
}
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -8,8 +8,8 @@
|
||||
* Copyright (c) 2012 Sandia National Laboratories. All rights reserved.
|
||||
* Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2017 IBM Corporation. All rights reserved.
|
||||
* Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
|
||||
* $COPYRIGHT$
|
||||
@ -217,7 +217,7 @@ int ompi_coll_libnbc_ireduce_scatter_block(const void* sendbuf, void* recvbuf, i
|
||||
}
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
@ -356,7 +356,7 @@ int ompi_coll_libnbc_ireduce_scatter_block_inter(const void* sendbuf, void* recv
|
||||
}
|
||||
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
|
||||
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
|
||||
*request = &ompi_request_null.request;
|
||||
return res;
|
||||
}
|
||||
|
@ -5,8 +5,8 @@
|
||||
* Corporation. All rights reserved.
|
||||
* Copyright (c) 2006 The Technical University of Chemnitz. All
|
||||
* rights reserved.
|
||||
* Copyright (c) 2014-2017 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2014-2018 Research Organization for Information Science
|
||||
* and Technology (RIST). All rights reserved.
|
||||
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
|
||||
* reserved.
|
||||
* Copyright (c) 2017 IBM Corporation. All rights reserved.
|
||||
@ -18,8 +18,20 @@
|
||||
* Author(s): Torsten Hoefler <htor@cs.indiana.edu>
|
||||
*
|
||||
*/
|
||||
#include "opal/include/opal/align.h"
|
||||
#include "ompi/op/op.h"
|
||||
|
||||
#include "nbc_internal.h"
|
||||
|
||||
static inline int scan_sched_linear(
|
||||
int rank, int comm_size, const void *sendbuf, void *recvbuf, int count,
|
||||
MPI_Datatype datatype, MPI_Op op, char inplace, NBC_Schedule *schedule,
|
||||
void *tmpbuf);
|
||||
static inline int scan_sched_recursivedoubling(
|
||||
int rank, int comm_size, const void *sendbuf, void *recvbuf,
|
||||
int count, MPI_Datatype datatype, MPI_Op op, char inplace,
|
||||
NBC_Schedule *schedule, void *tmpbuf1, void *tmpbuf2);
|
||||
|
||||
#ifdef NBC_CACHE_SCHEDULE
|
||||
/* tree comparison function for schedule cache */
|
||||
int NBC_Scan_args_compare(NBC_Scan_args *a, NBC_Scan_args *b, void *param) {
|
||||
@ -39,27 +51,41 @@ int NBC_Scan_args_compare(NBC_Scan_args *a, NBC_Scan_args *b, void *param) {
|
||||
}
|
||||
#endif
|
||||
|
||||
/* linear iscan
|
||||
* working principle:
|
||||
* 1. each node (but node 0) receives from left neighbor
|
||||
* 2. performs op
|
||||
* 3. all but rank p-1 do sends to it's right neighbor and exits
|
||||
*
|
||||
*/
|
||||
static int nbc_scan_init(const void* sendbuf, void* recvbuf, int count, MPI_Datatype datatype, MPI_Op op,
|
||||
struct ompi_communicator_t *comm, ompi_request_t ** request,
|
||||
struct mca_coll_base_module_2_3_0_t *module, bool persistent) {
|
||||
int rank, p, res;
|
||||
ptrdiff_t gap, span;
|
||||
NBC_Schedule *schedule;
|
||||
void *tmpbuf = NULL;
|
||||
char inplace;
|
||||
ompi_coll_libnbc_module_t *libnbc_module = (ompi_coll_libnbc_module_t*) module;
|
||||
int rank, p, res;
|
||||
ptrdiff_t gap, span;
|
||||
NBC_Schedule *schedule;
|
||||
void *tmpbuf = NULL, *tmpbuf1 = NULL, *tmpbuf2 = NULL;
|
||||
enum { NBC_SCAN_LINEAR, NBC_SCAN_RDBL } alg;
|
||||
char inplace;
|
||||
ompi_coll_libnbc_module_t *libnbc_module = (ompi_coll_libnbc_module_t*) module;
|
||||
|
||||
NBC_IN_PLACE(sendbuf, recvbuf, inplace);
|
||||
NBC_IN_PLACE(sendbuf, recvbuf, inplace);
|
||||
|
||||
rank = ompi_comm_rank (comm);
|
||||
p = ompi_comm_size (comm);
|
||||
rank = ompi_comm_rank (comm);
|
||||
p = ompi_comm_size (comm);
|
||||
|
||||
if (count == 0) {
|
||||
return nbc_get_noop_request(persistent, request);
|
||||
}
|
||||
|
||||
span = opal_datatype_span(&datatype->super, count, &gap);
|
||||
if (libnbc_iscan_algorithm == 2) {
|
||||
alg = NBC_SCAN_RDBL;
|
||||
ptrdiff_t span_align = OPAL_ALIGN(span, datatype->super.align, ptrdiff_t);
|
||||
tmpbuf = malloc(span_align + span);
|
||||
if (NULL == tmpbuf) { return OMPI_ERR_OUT_OF_RESOURCE; }
|
||||
tmpbuf1 = (void *)(-gap);
|
||||
tmpbuf2 = (char *)(span_align) - gap;
|
||||
} else {
|
||||
alg = NBC_SCAN_LINEAR;
|
||||
if (rank > 0) {
|
||||
tmpbuf = malloc(span);
|
||||
if (NULL == tmpbuf) { return OMPI_ERR_OUT_OF_RESOURCE; }
|
||||
}
|
||||
}
|
||||
|
||||
#ifdef NBC_CACHE_SCHEDULE
|
||||
NBC_Scan_args *args, *found, search;
|
||||
@ -75,60 +101,28 @@ static int nbc_scan_init(const void* sendbuf, void* recvbuf, int count, MPI_Data
|
||||
#endif
|
||||
schedule = OBJ_NEW(NBC_Schedule);
|
||||
if (OPAL_UNLIKELY(NULL == schedule)) {
|
||||
return OMPI_ERR_OUT_OF_RESOURCE;
|
||||
}
|
||||
|
||||
if (!inplace) {
|
||||
/* copy data to receivebuf */
|
||||
res = NBC_Sched_copy ((void *)sendbuf, false, count, datatype,
|
||||
recvbuf, false, count, datatype, schedule, false);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
OBJ_RELEASE(schedule);
|
||||
return res;
|
||||
}
|
||||
}
|
||||
|
||||
if(rank != 0) {
|
||||
span = opal_datatype_span(&datatype->super, count, &gap);
|
||||
tmpbuf = malloc (span);
|
||||
if (NULL == tmpbuf) {
|
||||
OBJ_RELEASE(schedule);
|
||||
free(tmpbuf);
|
||||
return OMPI_ERR_OUT_OF_RESOURCE;
|
||||
}
|
||||
|
||||
/* we have to wait until we have the data */
|
||||
res = NBC_Sched_recv ((void *)(-gap), true, count, datatype, rank-1, schedule, true);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
OBJ_RELEASE(schedule);
|
||||
free(tmpbuf);
|
||||
return res;
|
||||
}
|
||||
|
||||
/* perform the reduce in my local buffer */
|
||||
/* this cannot be done until tmpbuf is unused :-( so barrier after the op */
|
||||
res = NBC_Sched_op ((void *)(-gap), true, recvbuf, false, count, datatype, op, schedule,
|
||||
true);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
OBJ_RELEASE(schedule);
|
||||
free(tmpbuf);
|
||||
return res;
|
||||
}
|
||||
}
|
||||
|
||||
if (rank != p-1) {
|
||||
res = NBC_Sched_send (recvbuf, false, count, datatype, rank+1, schedule, false);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
OBJ_RELEASE(schedule);
|
||||
free(tmpbuf);
|
||||
return res;
|
||||
}
|
||||
if (alg == NBC_SCAN_LINEAR) {
|
||||
res = scan_sched_linear(rank, p, sendbuf, recvbuf, count, datatype,
|
||||
op, inplace, schedule, tmpbuf);
|
||||
} else {
|
||||
res = scan_sched_recursivedoubling(rank, p, sendbuf, recvbuf, count,
|
||||
datatype, op, inplace, schedule, tmpbuf1, tmpbuf2);
|
||||
}
|
||||
|
||||
res = NBC_Sched_commit (schedule);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
OBJ_RELEASE(schedule);
|
||||
free(tmpbuf);
|
||||
return res;
|
||||
OBJ_RELEASE(schedule);
|
||||
free(tmpbuf);
|
||||
return res;
|
||||
}
|
||||
|
||||
res = NBC_Sched_commit(schedule);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
OBJ_RELEASE(schedule);
|
||||
free(tmpbuf);
|
||||
return res;
|
||||
}
|
||||
|
||||
#ifdef NBC_CACHE_SCHEDULE
|
||||
@ -162,14 +156,160 @@ static int nbc_scan_init(const void* sendbuf, void* recvbuf, int count, MPI_Data
|
||||
}
|
||||
#endif
|
||||
|
||||
res = NBC_Schedule_request(schedule, comm, libnbc_module, persistent, request, tmpbuf);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
OBJ_RELEASE(schedule);
|
||||
free(tmpbuf);
|
||||
return res;
|
||||
}
|
||||
res = NBC_Schedule_request(schedule, comm, libnbc_module, persistent, request, tmpbuf);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
|
||||
OBJ_RELEASE(schedule);
|
||||
free(tmpbuf);
|
||||
return res;
|
||||
}
|
||||
|
||||
return OMPI_SUCCESS;
|
||||
return OMPI_SUCCESS;
|
||||
}
|
||||
|
||||
/*
|
||||
* scan_sched_linear:
|
||||
*
|
||||
* Function: Linear algorithm for inclusive scan.
|
||||
* Accepts: Same as MPI_Iscan
|
||||
* Returns: MPI_SUCCESS or error code
|
||||
*
|
||||
* Working principle:
|
||||
* 1. Each process (but process 0) receives from left neighbor
|
||||
* 2. Performs op
|
||||
* 3. All but rank p-1 do sends to it's right neighbor and exits
|
||||
*
|
||||
* Schedule length: O(1)
|
||||
*/
|
||||
static inline int scan_sched_linear(
|
||||
int rank, int comm_size, const void *sendbuf, void *recvbuf, int count,
|
||||
MPI_Datatype datatype, MPI_Op op, char inplace, NBC_Schedule *schedule,
|
||||
void *tmpbuf)
|
||||
{
|
||||
int res = OMPI_SUCCESS;
|
||||
|
||||
if (!inplace) {
|
||||
/* Copy data to recvbuf */
|
||||
res = NBC_Sched_copy((void *)sendbuf, false, count, datatype,
|
||||
recvbuf, false, count, datatype, schedule, false);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
|
||||
}
|
||||
|
||||
if (rank > 0) {
|
||||
ptrdiff_t gap;
|
||||
opal_datatype_span(&datatype->super, count, &gap);
|
||||
/* We have to wait until we have the data */
|
||||
res = NBC_Sched_recv((void *)(-gap), true, count, datatype, rank - 1, schedule, true);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
|
||||
|
||||
/* Perform the reduce in my local buffer */
|
||||
/* this cannot be done until tmpbuf is unused :-( so barrier after the op */
|
||||
res = NBC_Sched_op((void *)(-gap), true, recvbuf, false, count, datatype, op, schedule,
|
||||
true);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
|
||||
}
|
||||
|
||||
if (rank != comm_size - 1) {
|
||||
res = NBC_Sched_send(recvbuf, false, count, datatype, rank + 1, schedule, false);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
|
||||
}
|
||||
|
||||
cleanup_and_return:
|
||||
return res;
|
||||
}
|
||||
|
||||
/*
|
||||
* scan_sched_recursivedoubling:
|
||||
*
|
||||
* Function: Recursive doubling algorithm for inclusive scan.
|
||||
* Accepts: Same as MPI_Iscan
|
||||
* Returns: MPI_SUCCESS or error code
|
||||
*
|
||||
* Description: Implements recursive doubling algorithm for MPI_Iscan.
|
||||
* The algorithm preserves order of operations so it can
|
||||
* be used both by commutative and non-commutative operations.
|
||||
*
|
||||
* Example for 5 processes and commutative operation MPI_SUM:
|
||||
* Process: 0 1 2 3 4
|
||||
* recvbuf: [0] [1] [2] [3] [4]
|
||||
* psend: [0] [1] [2] [3] [4]
|
||||
*
|
||||
* Step 1:
|
||||
* recvbuf: [0] [0+1] [2] [2+3] [4]
|
||||
* psend: [1+0] [0+1] [3+2] [2+3] [4]
|
||||
*
|
||||
* Step 2:
|
||||
* recvbuf: [0] [0+1] [(1+0)+2] [(1+0)+(2+3)] [4]
|
||||
* psend: [(3+2)+(1+0)] [(2+3)+(0+1)] [(1+0)+(3+2)] [(1+0)+(2+3)] [4]
|
||||
*
|
||||
* Step 3:
|
||||
* recvbuf: [0] [0+1] [(1+0)+2] [(1+0)+(2+3)] [((3+2)+(1+0))+4]
|
||||
* psend: [4+((3+2)+(1+0))] [((3+2)+(1+0))+4]
|
||||
*
|
||||
* Time complexity (worst case): \ceil(\log_2(p))(2\alpha + 2m\beta + 2m\gamma)
|
||||
* Memory requirements (per process): 2 * count * typesize = O(count)
|
||||
* Limitations: intra-communicators only
|
||||
* Schedule length: O(log(p))
|
||||
*/
|
||||
static inline int scan_sched_recursivedoubling(
|
||||
int rank, int comm_size, const void *sendbuf, void *recvbuf, int count,
|
||||
MPI_Datatype datatype, MPI_Op op, char inplace,
|
||||
NBC_Schedule *schedule, void *tmpbuf1, void *tmpbuf2)
|
||||
{
|
||||
int res = OMPI_SUCCESS;
|
||||
|
||||
if (!inplace) {
|
||||
res = NBC_Sched_copy((void *)sendbuf, false, count, datatype,
|
||||
recvbuf, false, count, datatype, schedule, true);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
|
||||
}
|
||||
if (comm_size < 2)
|
||||
goto cleanup_and_return;
|
||||
|
||||
char *psend = (char *)tmpbuf1;
|
||||
char *precv = (char *)tmpbuf2;
|
||||
res = NBC_Sched_copy(recvbuf, false, count, datatype,
|
||||
psend, true, count, datatype, schedule, true);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
|
||||
|
||||
int is_commute = ompi_op_is_commute(op);
|
||||
for (int mask = 1; mask < comm_size; mask <<= 1) {
|
||||
int remote = rank ^ mask;
|
||||
if (remote < comm_size) {
|
||||
res = NBC_Sched_send(psend, true, count, datatype, remote, schedule, false);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
|
||||
res = NBC_Sched_recv(precv, true, count, datatype, remote, schedule, true);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
|
||||
|
||||
if (rank > remote) {
|
||||
/* Accumulate prefix reduction: recvbuf = precv <op> recvbuf */
|
||||
res = NBC_Sched_op(precv, true, recvbuf, false, count,
|
||||
datatype, op, schedule, false);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
|
||||
/* Partial result: psend = precv <op> psend */
|
||||
res = NBC_Sched_op(precv, true, psend, true, count,
|
||||
datatype, op, schedule, true);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
|
||||
} else {
|
||||
if (is_commute) {
|
||||
/* psend = precv <op> psend */
|
||||
res = NBC_Sched_op(precv, true, psend, true, count,
|
||||
datatype, op, schedule, true);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
|
||||
} else {
|
||||
/* precv = psend <op> precv */
|
||||
res = NBC_Sched_op(psend, true, precv, true, count,
|
||||
datatype, op, schedule, true);
|
||||
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
|
||||
char *tmp = psend;
|
||||
psend = precv;
|
||||
precv = tmp;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
cleanup_and_return:
|
||||
return res;
|
||||
}
|
||||
|
||||
int ompi_coll_libnbc_iscan(const void* sendbuf, void* recvbuf, int count, MPI_Datatype datatype, MPI_Op op,

@@ -182,7 +322,7 @@ int ompi_coll_libnbc_iscan(const void* sendbuf, void* recvbuf, int count, MPI_Da
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
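The NBC_Return_handle change in the hunk above (and repeated in the iscatter/iscatterv hunks below) fixes a pointer-level bug: `request` points to the caller's request handle, so the old cast handed NBC_Return_handle the address of that variable rather than the handle stored in it. A small self-contained sketch of the same mistake, using hypothetical names:

#include <stdio.h>

typedef struct { int id; } handle_t;

static void release(handle_t *h) { printf("releasing handle %d\n", h->id); }

int main(void)
{
    handle_t real = { 42 };
    handle_t *req = &real;   /* the caller's handle variable */
    void *request = &req;    /* the API receives a pointer TO that variable */

    /* Wrong: (handle_t *)request aliases the stack slot holding 'req',
       not the handle itself -- release() would read garbage. */
    /* release((handle_t *)request); */

    /* Right: dereference once to recover the stored handle. */
    release(*(handle_t **)request);
    return 0;
}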
@@ -10,8 +10,8 @@
  * Copyright (c) 2013      The University of Tennessee and The University
  *                         of Tennessee Research Foundation.  All rights
  *                         reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
- *                         and Technology (RIST).  All rights reserved.
+ * Copyright (c) 2014-2018 Research Organization for Information Science
+ *                         and Technology (RIST).  All rights reserved.
  * Copyright (c) 2017      IBM Corporation.  All rights reserved.
  * Copyright (c) 2018      FUJITSU LIMITED.  All rights reserved.
  * $COPYRIGHT$
@@ -179,7 +179,7 @@ int ompi_coll_libnbc_iscatter (const void* sendbuf, int sendcount, MPI_Datatype
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -258,7 +258,7 @@ int ompi_coll_libnbc_iscatter_inter (const void* sendbuf, int sendcount, MPI_Dat
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -10,8 +10,8 @@
  * Copyright (c) 2013      The University of Tennessee and The University
  *                         of Tennessee Research Foundation.  All rights
  *                         reserved.
- * Copyright (c) 2014-2017 Research Organization for Information Science
- *                         and Technology (RIST).  All rights reserved.
+ * Copyright (c) 2014-2018 Research Organization for Information Science
+ *                         and Technology (RIST).  All rights reserved.
  * Copyright (c) 2017      IBM Corporation.  All rights reserved.
  * Copyright (c) 2018      FUJITSU LIMITED.  All rights reserved.
  * $COPYRIGHT$
@@ -114,7 +114,7 @@ int ompi_coll_libnbc_iscatterv(const void* sendbuf, const int *sendcounts, const
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -192,7 +192,7 @@ int ompi_coll_libnbc_iscatterv_inter(const void* sendbuf, const int *sendcounts,
     }
     res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
     if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
-        NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
+        NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
         *request = &ompi_request_null.request;
         return res;
     }
@@ -36,7 +36,7 @@ struct mca_coll_monitoring_module_t {
     mca_coll_base_module_t super;
     mca_coll_base_comm_coll_t real;
    mca_monitoring_coll_data_t*data;
-    int32_t is_initialized;
+    opal_atomic_int32_t is_initialized;
 };
 typedef struct mca_coll_monitoring_module_t mca_coll_monitoring_module_t;
 OMPI_DECLSPEC OBJ_CLASS_DECLARATION(mca_coll_monitoring_module_t);
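This hunk is the first of many in this commit that retype shared counters from plain or volatile integers to the opal_atomic_* types. volatile only forbids the compiler from caching the value; it does not make a read-modify-write such as ++ atomic. A standalone sketch of the difference using C11 <stdatomic.h> (an assumption for illustration -- OPAL supplies its own wrappers, but the semantics match):

#include <stdatomic.h>
#include <pthread.h>
#include <stdio.h>

static volatile int racy_count = 0;  /* updates can be lost between threads */
static _Atomic int safe_count = 0;   /* well-defined concurrent increments */

static void *worker(void *arg)
{
    (void) arg;
    for (int i = 0; i < 100000; ++i) {
        racy_count++;  /* compiles to load, add, store: not atomic */
        atomic_fetch_add_explicit(&safe_count, 1, memory_order_relaxed);
    }
    return NULL;
}

int main(void)
{
    pthread_t threads[4];
    for (int i = 0; i < 4; ++i) pthread_create(&threads[i], NULL, worker, NULL);
    for (int i = 0; i < 4; ++i) pthread_join(threads[i], NULL);
    printf("volatile: %d (often < 400000), atomic: %d (always 400000)\n",
           racy_count, atomic_load(&safe_count));
    return 0;
}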
@@ -1,6 +1,7 @@
 /* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
 /*
  * Copyright (c) 2013-2015 Sandia National Laboratories. All rights reserved.
- * Copyright (c) 2015      Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2015-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * Copyright (c) 2015      Bull SAS. All rights reserved.
  * Copyright (c) 2015      Research Organization for Information Science
@@ -91,7 +92,7 @@ typedef struct ompi_coll_portals4_tree_t {

 struct mca_coll_portals4_module_t {
     mca_coll_base_module_t super;
-    size_t coll_count;
+    opal_atomic_size_t coll_count;

     /* record handlers dedicated to fallback if offloaded operations are not supported */
     mca_coll_base_module_reduce_fn_t previous_reduce;
@@ -114,7 +114,7 @@ BEGIN_C_DECLS
 typedef struct mca_coll_sm_in_use_flag_t {
     /** Number of processes currently using this set of
         segments */
-    volatile uint32_t mcsiuf_num_procs_using;
+    opal_atomic_uint32_t mcsiuf_num_procs_using;
     /** Must match data->mcb_count */
     volatile uint32_t mcsiuf_operation_count;
 } mca_coll_sm_in_use_flag_t;
@@ -152,7 +152,7 @@ BEGIN_C_DECLS
     /** Pointer to my parent's barrier control pages (will be NULL
         for communicator rank 0; odd index pages are "in", even
         index pages are "out") */
-    uint32_t *mcb_barrier_control_parent;
+    opal_atomic_uint32_t *mcb_barrier_control_parent;

     /** Pointers to my children's barrier control pages (they're
         contiguous in memory, so we only point to the base -- the
@@ -56,7 +56,8 @@ int mca_coll_sm_barrier_intra(struct ompi_communicator_t *comm,
     int rank, buffer_set;
     mca_coll_sm_comm_t *data;
     uint32_t i, num_children;
-    volatile uint32_t *me_in, *me_out, *parent, *children = NULL;
+    volatile uint32_t *me_in, *me_out, *children = NULL;
+    opal_atomic_uint32_t *parent;
     int uint_control_size;
     mca_coll_sm_module_t *sm_module = (mca_coll_sm_module_t*) module;

@@ -372,7 +372,7 @@ int ompi_coll_sm_lazy_enable(mca_coll_base_module_t *module,
     data->mcb_barrier_control_me = (uint32_t*)
         (base + (rank * control_size * num_barrier_buffers * 2));
     if (data->mcb_tree[rank].mcstn_parent) {
-        data->mcb_barrier_control_parent = (uint32_t*)
+        data->mcb_barrier_control_parent = (opal_atomic_uint32_t*)
             (base +
              (data->mcb_tree[rank].mcstn_parent->mcstn_id * control_size *
               num_barrier_buffers * 2));
@@ -7,7 +7,7 @@
  * Copyright (c) 2015      Bull SAS. All rights reserved.
  * Copyright (c) 2016-2017 Research Organization for Information Science
  *                         and Technology (RIST). All rights reserved.
- * Copyright (c) 2017      Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2017-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * $COPYRIGHT$
  *
@@ -34,7 +34,7 @@

 /*** Monitoring specific variables ***/
 /* Keeps track of how many components are currently using the common part */
-static int32_t mca_common_monitoring_hold = 0;
+static opal_atomic_int32_t mca_common_monitoring_hold = 0;
 /* Output parameters */
 int mca_common_monitoring_output_stream_id = -1;
 static opal_output_stream_t mca_common_monitoring_output_stream_obj = {
@@ -61,18 +61,18 @@ static char* mca_common_monitoring_initial_filename = "";
 static char* mca_common_monitoring_current_filename = NULL;

 /* array for storing monitoring data */
-static size_t* pml_data = NULL;
-static size_t* pml_count = NULL;
-static size_t* filtered_pml_data = NULL;
-static size_t* filtered_pml_count = NULL;
-static size_t* osc_data_s = NULL;
-static size_t* osc_count_s = NULL;
-static size_t* osc_data_r = NULL;
-static size_t* osc_count_r = NULL;
-static size_t* coll_data = NULL;
-static size_t* coll_count = NULL;
+static opal_atomic_size_t* pml_data = NULL;
+static opal_atomic_size_t* pml_count = NULL;
+static opal_atomic_size_t* filtered_pml_data = NULL;
+static opal_atomic_size_t* filtered_pml_count = NULL;
+static opal_atomic_size_t* osc_data_s = NULL;
+static opal_atomic_size_t* osc_count_s = NULL;
+static opal_atomic_size_t* osc_data_r = NULL;
+static opal_atomic_size_t* osc_count_r = NULL;
+static opal_atomic_size_t* coll_data = NULL;
+static opal_atomic_size_t* coll_count = NULL;

-static size_t* size_histogram = NULL;
+static opal_atomic_size_t* size_histogram = NULL;
 static const int max_size_histogram = 66;
 static double log10_2 = 0.;
@@ -241,7 +241,7 @@ void mca_common_monitoring_finalize( void )
     opal_output_close(mca_common_monitoring_output_stream_id);
     free(mca_common_monitoring_output_stream_obj.lds_prefix);
     /* Free internal data structure */
-    free(pml_data); /* a single allocation */
+    free((void *) pml_data); /* a single allocation */
     opal_hash_table_remove_all( common_monitoring_translation_ht );
     OBJ_RELEASE(common_monitoring_translation_ht);
     mca_common_monitoring_coll_finalize();
@@ -446,7 +446,7 @@ int mca_common_monitoring_add_procs(struct ompi_proc_t **procs,

     if( NULL == pml_data ) {
         int array_size = (10 + max_size_histogram) * nprocs_world;
-        pml_data = (size_t*)calloc(array_size, sizeof(size_t));
+        pml_data = (opal_atomic_size_t*)calloc(array_size, sizeof(size_t));
         pml_count = pml_data + nprocs_world;
         filtered_pml_data = pml_count + nprocs_world;
         filtered_pml_count = filtered_pml_data + nprocs_world;
@@ -493,7 +493,7 @@ int mca_common_monitoring_add_procs(struct ompi_proc_t **procs,
 static void mca_common_monitoring_reset( void )
 {
     int array_size = (10 + max_size_histogram) * nprocs_world;
-    memset(pml_data, 0, array_size * sizeof(size_t));
+    memset((void *) pml_data, 0, array_size * sizeof(size_t));
     mca_common_monitoring_coll_reset();
 }
@@ -30,12 +30,12 @@ struct mca_monitoring_coll_data_t {
     int world_rank;
     int is_released;
    ompi_communicator_t*p_comm;
-    size_t o2a_count;
-    size_t o2a_size;
-    size_t a2o_count;
-    size_t a2o_size;
-    size_t a2a_count;
-    size_t a2a_size;
+    opal_atomic_size_t o2a_count;
+    opal_atomic_size_t o2a_size;
+    opal_atomic_size_t a2o_count;
+    opal_atomic_size_t a2o_size;
+    opal_atomic_size_t a2a_count;
+    opal_atomic_size_t a2a_size;
 };

 /* Collectives operation monitoring */
@@ -4,7 +4,7 @@
  *                         reserved.
  * Copyright (c) 2013-2017 Inria. All rights reserved.
  * Copyright (c) 2013-2015 Bull SAS. All rights reserved.
- * Copyright (c) 2016      Cisco Systems, Inc. All rights reserved.
+ * Copyright (c) 2016-2018 Cisco Systems, Inc. All rights reserved.
  * Copyright (c) 2017      Research Organization for Information Science
  *                         and Technology (RIST). All rights reserved.
  * $COPYRIGHT$
@@ -42,10 +42,30 @@ writing 4x4 matrix to monitoring_avg.mat

 */

 #include "ompi_config.h"

 #include <stdio.h>
 #include <stdlib.h>
 #include <mpi.h>
 #include <string.h>
+#include <stdbool.h>
+
+#if OMPI_BUILD_FORTRAN_BINDINGS
+// Set these #defines in the same way that
+// ompi/mpi/fortran/mpif-h/Makefile.am does when compiling the real
+// Fortran mpif.h bindings.  They set behaviors in the Fortran header
+// files so that we can compile properly.
+#define OMPI_BUILD_MPI_PROFILING 0
+#define OMPI_COMPILING_FORTRAN_WRAPPERS 1
+#endif
+
+#include "opal/threads/thread_usage.h"
+
+#include "ompi/include/mpi.h"
+#include "ompi/mpi/fortran/base/constants.h"
+#include "ompi/mpi/fortran/base/fint_2_int.h"
+#if OMPI_BUILD_FORTRAN_BINDINGS
+#include "ompi/mpi/fortran/mpif-h/bindings.h"
+#endif

 static MPI_T_pvar_session session;
 static int comm_world_size;
@@ -383,12 +403,6 @@ int write_mat(char * filename, size_t * mat, unsigned int dim)
  * MPI binding for fortran
  */

-#include <stdbool.h>
-#include "ompi_config.h"
-#include "opal/threads/thread_usage.h"
-#include "ompi/mpi/fortran/base/constants.h"
-#include "ompi/mpi/fortran/base/fint_2_int.h"

 void monitoring_prof_mpi_init_f2c( MPI_Fint * );
 void monitoring_prof_mpi_finalize_f2c( MPI_Fint * );

@@ -423,8 +437,6 @@ void monitoring_prof_mpi_finalize_f2c( MPI_Fint *ierr ) {
 #pragma weak MPI_Finalize_f = monitoring_prof_mpi_finalize_f2c
 #pragma weak MPI_Finalize_f08 = monitoring_prof_mpi_finalize_f2c
 #elif OMPI_BUILD_FORTRAN_BINDINGS
-#define OMPI_F77_PROTOTYPES_MPI_H
-#include "ompi/mpi/fortran/mpif-h/bindings.h"

 OMPI_GENERATE_F77_BINDINGS (MPI_INIT,
                             mpi_init,
@@ -34,7 +34,7 @@ static opal_mutex_t mca_common_ompio_cuda_mutex; /* lock for thread saf
 static mca_allocator_base_component_t* mca_common_ompio_allocator_component=NULL;
 static mca_allocator_base_module_t* mca_common_ompio_allocator=NULL;

-static int32_t mca_common_ompio_cuda_init = 0;
+static opal_atomic_int32_t mca_common_ompio_cuda_init = 0;
 static int32_t mca_common_ompio_pagesize=4096;
 static void* mca_common_ompio_cuda_alloc_seg ( void *ctx, size_t *size );
 static void mca_common_ompio_cuda_free_seg ( void *ctx, void *buf );
@@ -124,7 +124,7 @@ ompi_mtl_ofi_component_register(void)
                                     MCA_BASE_VAR_SCOPE_READONLY,
                                     &param_priority);

-    prov_include = "psm,psm2,gni";
+    prov_include = NULL;
     mca_base_component_var_register(&mca_mtl_ofi_component.super.mtl_version,
                                     "provider_include",
                                     "Comma-delimited list of OFI providers that are considered for use (e.g., \"psm,psm2\"; an empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_exclude.",
@@ -133,7 +133,7 @@ ompi_mtl_ofi_component_register(void)
                                     MCA_BASE_VAR_SCOPE_READONLY,
                                     &prov_include);

-    prov_exclude = NULL;
+    prov_exclude = "shm,sockets,tcp,udp,rstream";
     mca_base_component_var_register(&mca_mtl_ofi_component.super.mtl_version,
                                     "provider_exclude",
                                     "Comma-delimited list of OFI providers that are not considered for use (default: \"sockets,mxm\"; empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_include.",
@@ -115,12 +115,12 @@ struct mca_mtl_portals4_module_t {
     opal_mutex_t short_block_mutex;

     /** number of send-side operations started */
-    uint64_t opcount;
+    opal_atomic_uint64_t opcount;

 #if OPAL_ENABLE_DEBUG
     /** number of receive-side operations started. Used only for
         debugging */
-    uint64_t recv_opcount;
+    opal_atomic_uint64_t recv_opcount;
 #endif

 #if OMPI_MTL_PORTALS4_FLOW_CONTROL
@@ -1,7 +1,7 @@
 /* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
 /*
  * Copyright (c) 2012      Sandia National Laboratories. All rights reserved.
- * Copyright (c) 2015-2017 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2015-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * $COPYRIGHT$
  *

@@ -36,7 +36,7 @@ OBJ_CLASS_DECLARATION(ompi_mtl_portals4_pending_request_t);
 struct ompi_mtl_portals4_flowctl_t {
     int32_t flowctl_active;

-    int32_t send_slots;
+    opal_atomic_int32_t send_slots;
     int32_t max_send_slots;
     opal_list_t pending_sends;
     opal_free_list_t pending_fl;
@@ -46,7 +46,7 @@ struct ompi_mtl_portals4_flowctl_t {

     /** Flow control epoch counter. Triggered events should be
         based on epoch counter. */
-    int64_t epoch_counter;
+    opal_atomic_int64_t epoch_counter;

     /** Flow control trigger CT. Only has meaning at root. */
     ptl_handle_ct_t trigger_ct_h;

@@ -54,8 +54,8 @@ struct ompi_mtl_portals4_isend_request_t {
     struct ompi_mtl_portals4_pending_request_t *pending;
 #endif
     ptl_size_t length;
-    int32_t pending_get;
-    uint32_t event_count;
+    opal_atomic_int32_t pending_get;
+    opal_atomic_uint32_t event_count;
 };
 typedef struct ompi_mtl_portals4_isend_request_t ompi_mtl_portals4_isend_request_t;

@@ -76,7 +76,7 @@ struct ompi_mtl_portals4_recv_request_t {
     void *delivery_ptr;
     size_t delivery_len;
     volatile bool req_started;
-    int32_t pending_reply;
+    opal_atomic_int32_t pending_reply;
 #if OPAL_ENABLE_DEBUG
     uint64_t opcount;
     ptl_hdr_data_t hdr_data;
@@ -50,7 +50,7 @@
     OSC_MONITORING_SET_TEMPLATE_FCT_NAME(template) (ompi_osc_base_module_t*module) \
     {                                                                   \
         /* Define the ompi_osc_monitoring_module_## template ##_init_done variable */ \
-        static int32_t init_done = 0;                                   \
+        static opal_atomic_int32_t init_done = 0;                       \
         /* Define and set the ompi_osc_monitoring_## template           \
          * ##_template variable. The functions recorded here are        \
          * linked to the original functions of the original             \
@@ -95,7 +95,7 @@ struct ompi_osc_portals4_module_t {
     ptl_handle_md_t req_md_h;     /* memory descriptor with event completion used by this window */
     ptl_handle_me_t data_me_h;    /* data match list entry (MB are CID | OSC_PORTALS4_MB_DATA) */
     ptl_handle_me_t control_me_h; /* match list entry for control data (node_state_t). Match bits are (CID | OSC_PORTALS4_MB_CONTROL). */
-    int64_t opcount;
+    opal_atomic_int64_t opcount;
     ptl_match_bits_t match_bits;  /* match bits for module. Same as cid for comm in most cases. */

     ptl_iovec_t *origin_iovec_list; /* list of memory segments that compose the noncontiguous region */

@@ -189,7 +189,7 @@ number_of_fragments(ptl_size_t length, ptl_size_t maxlength)

 /* put in segments no larger than segment_length */
 static int
-segmentedPut(int64_t *opcount,
+segmentedPut(opal_atomic_int64_t *opcount,
              ptl_handle_md_t md_h,
              ptl_size_t origin_offset,
              ptl_size_t put_length,
@@ -236,7 +236,7 @@ segmentedPut(int64_t *opcount,

 /* get in segments no larger than segment_length */
 static int
-segmentedGet(int64_t *opcount,
+segmentedGet(opal_atomic_int64_t *opcount,
              ptl_handle_md_t md_h,
              ptl_size_t origin_offset,
              ptl_size_t get_length,
@@ -280,7 +280,7 @@ segmentedGet(int64_t *opcount,

 /* atomic op in segments no larger than segment_length */
 static int
-segmentedAtomic(int64_t *opcount,
+segmentedAtomic(opal_atomic_int64_t *opcount,
                 ptl_handle_md_t md_h,
                 ptl_size_t origin_offset,
                 ptl_size_t length,
@@ -329,7 +329,7 @@ segmentedAtomic(int64_t *opcount,

 /* atomic op in segments no larger than segment_length */
 static int
-segmentedFetchAtomic(int64_t *opcount,
+segmentedFetchAtomic(opal_atomic_int64_t *opcount,
                      ptl_handle_md_t result_md_h,
                      ptl_size_t result_offset,
                      ptl_handle_md_t origin_md_h,
@@ -381,7 +381,7 @@ segmentedFetchAtomic(int64_t *opcount,

 /* swap in segments no larger than segment_length */
 static int
-segmentedSwap(int64_t *opcount,
+segmentedSwap(opal_atomic_int64_t *opcount,
               ptl_handle_md_t result_md_h,
               ptl_size_t result_offset,
               ptl_handle_md_t origin_md_h,
@@ -1187,7 +1187,7 @@ fetch_atomic_to_iovec(ompi_osc_portals4_module_t *module,

 /* put in the largest chunks possible given the noncontiguous restriction */
 static int
-put_to_noncontig(int64_t *opcount,
+put_to_noncontig(opal_atomic_int64_t *opcount,
                  ptl_handle_md_t md_h,
                  const void *origin_address,
                  int origin_count,
@@ -1521,7 +1521,7 @@ atomic_to_noncontig(ompi_osc_portals4_module_t *module,

 /* get from a noncontiguous remote to an (non)contiguous local */
 static int
-get_from_noncontig(int64_t *opcount,
+get_from_noncontig(opal_atomic_int64_t *opcount,
                    ptl_handle_md_t md_h,
                    const void *origin_address,
                    int origin_count,
@@ -1,7 +1,7 @@
 /* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
 /*
  * Copyright (c) 2011-2013 Sandia National Laboratories. All rights reserved.
- * Copyright (c) 2015      Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2015-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * $COPYRIGHT$
  *
@@ -18,7 +18,7 @@
 struct ompi_osc_portals4_request_t {
     ompi_request_t super;
     int32_t ops_expected;
-    volatile int32_t ops_committed;
+    opal_atomic_int32_t ops_committed;
 };
 typedef struct ompi_osc_portals4_request_t ompi_osc_portals4_request_t;
@@ -8,7 +8,7 @@
  *                         University of Stuttgart. All rights reserved.
  * Copyright (c) 2004-2005 The Regents of the University of California.
  *                         All rights reserved.
- * Copyright (c) 2007-2017 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * Copyright (c) 2010      Cisco Systems, Inc. All rights reserved.
  * Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
@@ -110,7 +110,7 @@ struct ompi_osc_pt2pt_peer_t {
     int rank;

     /** pointer to the current send fragment for each outgoing target */
-    struct ompi_osc_pt2pt_frag_t *active_frag;
+    opal_atomic_intptr_t active_frag;

     /** lock for this peer */
     opal_mutex_t lock;
@@ -119,10 +119,10 @@ struct ompi_osc_pt2pt_peer_t {
     opal_list_t queued_frags;

     /** number of fragments incoming (negative - expected, positive - unsynchronized) */
-    volatile int32_t passive_incoming_frag_count;
+    opal_atomic_int32_t passive_incoming_frag_count;

     /** peer flags */
-    volatile int32_t flags;
+    opal_atomic_int32_t flags;
 };
 typedef struct ompi_osc_pt2pt_peer_t ompi_osc_pt2pt_peer_t;

@@ -208,16 +208,16 @@ struct ompi_osc_pt2pt_module_t {

     /** Number of communication fragments started for this epoch, by
         peer. Not in peer data to make fence more manageable. */
-    uint32_t *epoch_outgoing_frag_count;
+    opal_atomic_uint32_t *epoch_outgoing_frag_count;

     /** cyclic counter for a unique tag for long messages. */
-    volatile uint32_t tag_counter;
+    opal_atomic_uint32_t tag_counter;

     /** number of outgoing fragments still to be completed */
-    volatile int32_t outgoing_frag_count;
+    opal_atomic_int32_t outgoing_frag_count;

     /** number of incoming fragments */
-    volatile int32_t active_incoming_frag_count;
+    opal_atomic_int32_t active_incoming_frag_count;

     /** Number of targets locked/being locked */
     unsigned int passive_target_access_epoch;
@@ -230,13 +230,13 @@ struct ompi_osc_pt2pt_module_t {

     /** Number of "count" messages from the remote complete group
         we've received */
-    volatile int32_t num_complete_msgs;
+    opal_atomic_int32_t num_complete_msgs;

     /* ********************* LOCK data ************************ */

     /** Status of the local window lock. One of 0 (unlocked),
         MPI_LOCK_EXCLUSIVE, or MPI_LOCK_SHARED. */
-    int32_t lock_status;
+    opal_atomic_int32_t lock_status;

     /** lock for locks_pending list */
     opal_mutex_t locks_pending_lock;
@@ -526,7 +526,7 @@ static inline void mark_incoming_completion (ompi_osc_pt2pt_module_t *module, in
         OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output,
                              "mark_incoming_completion marking passive incoming complete. module %p, source = %d, count = %d",
                              (void *) module, source, (int) peer->passive_incoming_frag_count + 1));
-        new_value = OPAL_THREAD_ADD_FETCH32((int32_t *) &peer->passive_incoming_frag_count, 1);
+        new_value = OPAL_THREAD_ADD_FETCH32((opal_atomic_int32_t *) &peer->passive_incoming_frag_count, 1);
         if (0 == new_value) {
             OPAL_THREAD_LOCK(&module->lock);
             opal_condition_broadcast(&module->cond);
@@ -550,7 +550,7 @@ static inline void mark_incoming_completion (ompi_osc_pt2pt_module_t *module, in
  */
 static inline void mark_outgoing_completion (ompi_osc_pt2pt_module_t *module)
 {
-    int32_t new_value = OPAL_THREAD_ADD_FETCH32((int32_t *) &module->outgoing_frag_count, 1);
+    int32_t new_value = OPAL_THREAD_ADD_FETCH32((opal_atomic_int32_t *) &module->outgoing_frag_count, 1);
     OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output,
                          "mark_outgoing_completion: outgoing_frag_count = %d", new_value));
     if (new_value >= 0) {
@@ -574,12 +574,12 @@ static inline void mark_outgoing_completion (ompi_osc_pt2pt_module_t *module)
  */
 static inline void ompi_osc_signal_outgoing (ompi_osc_pt2pt_module_t *module, int target, int count)
 {
-    OPAL_THREAD_ADD_FETCH32((int32_t *) &module->outgoing_frag_count, -count);
+    OPAL_THREAD_ADD_FETCH32((opal_atomic_int32_t *) &module->outgoing_frag_count, -count);
     if (MPI_PROC_NULL != target) {
         OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output,
                              "ompi_osc_signal_outgoing_passive: target = %d, count = %d, total = %d", target,
                              count, module->epoch_outgoing_frag_count[target] + count));
-        OPAL_THREAD_ADD_FETCH32((int32_t *) (module->epoch_outgoing_frag_count + target), count);
+        OPAL_THREAD_ADD_FETCH32((opal_atomic_int32_t *) (module->epoch_outgoing_frag_count + target), count);
     }
 }
@@ -717,7 +717,7 @@ static inline int get_tag(ompi_osc_pt2pt_module_t *module)
     /* the LSB of the tag is used by the receiver to determine if the
        message is a passive or active target (ie, where to mark
        completion). */
-    int32_t tmp = OPAL_THREAD_ADD_FETCH32((volatile int32_t *) &module->tag_counter, 4);
+    int32_t tmp = OPAL_THREAD_ADD_FETCH32((opal_atomic_int32_t *) &module->tag_counter, 4);
     return (tmp & OSC_PT2PT_FRAG_MASK) | !!(module->passive_target_access_epoch);
 }
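get_tag advances the counter by 4 so the low bits stay free, then ors the passive/active flag into the LSB. A hypothetical single-threaded sketch of the same encoding (the mask value here is made up for illustration and stands in for OSC_PT2PT_FRAG_MASK):

#include <stdint.h>
#include <stdio.h>

#define FRAG_MASK 0x7ffffff0  /* hypothetical stand-in for OSC_PT2PT_FRAG_MASK */

static int32_t make_tag(int32_t *counter, int passive)
{
    int32_t tmp = (*counter += 4);         /* step by 4: low bits stay clear */
    return (tmp & FRAG_MASK) | !!passive;  /* LSB carries the epoch flag */
}

int main(void)
{
    int32_t counter = 0;
    int32_t active  = make_tag(&counter, 0);
    int32_t passive = make_tag(&counter, 1);
    printf("active tag %d (LSB %d), passive tag %d (LSB %d)\n",
           active, active & 1, passive, passive & 1);
    return 0;
}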
@@ -8,7 +8,7 @@
  *                         University of Stuttgart. All rights reserved.
  * Copyright (c) 2004-2005 The Regents of the University of California.
  *                         All rights reserved.
- * Copyright (c) 2007-2016 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * Copyright (c) 2010-2016 IBM Corporation. All rights reserved.
  * Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
@@ -166,17 +166,16 @@ int ompi_osc_pt2pt_fence(int assert, ompi_win_t *win)
                          "osc pt2pt: fence done sending"));

     /* find out how much data everyone is going to send us. */
-    ret = module->comm->c_coll->coll_reduce_scatter_block (module->epoch_outgoing_frag_count,
-                                                           &incoming_reqs, 1, MPI_UINT32_T,
-                                                           MPI_SUM, module->comm,
-                                                           module->comm->c_coll->coll_reduce_scatter_block_module);
+    ret = module->comm->c_coll->coll_reduce_scatter_block ((void *) module->epoch_outgoing_frag_count,
+                                                           &incoming_reqs, 1, MPI_UINT32_T,
+                                                           MPI_SUM, module->comm,
+                                                           module->comm->c_coll->coll_reduce_scatter_block_module);
     if (OMPI_SUCCESS != ret) {
         return ret;
     }

     OPAL_THREAD_LOCK(&module->lock);
-    bzero(module->epoch_outgoing_frag_count,
-          sizeof(uint32_t) * ompi_comm_size(module->comm));
+    bzero ((void *) module->epoch_outgoing_frag_count, sizeof(uint32_t) * ompi_comm_size(module->comm));

     OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output,
                          "osc pt2pt: fence expects %d requests",
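The fence path above relies on MPI_Reduce_scatter_block semantics: every rank contributes a vector of per-peer fragment counts, and each rank receives back the sum of the entries addressed to it. A minimal sketch of that counting idiom (toy values, not the module's actual bookkeeping):

#include <mpi.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);
    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* outgoing[i] = fragments this rank will send to rank i (toy values) */
    uint32_t *outgoing = calloc(size, sizeof(uint32_t));
    for (int i = 0; i < size; ++i) outgoing[i] = (uint32_t)(rank + i);

    uint32_t incoming = 0;  /* sum over all ranks of their outgoing[rank] */
    MPI_Reduce_scatter_block(outgoing, &incoming, 1, MPI_UINT32_T,
                             MPI_SUM, MPI_COMM_WORLD);

    printf("rank %d expects %u incoming fragments\n", rank, incoming);
    free(outgoing);
    MPI_Finalize();
    return 0;
}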
@@ -366,8 +365,11 @@ int ompi_osc_pt2pt_complete (ompi_win_t *win)

     /* XXX -- TODO -- since fragments are always delivered in order we do not need to count anything but long
      * requests. once that is done this can be removed. */
-    if (peer->active_frag && (peer->active_frag->remain_len < sizeof (complete_req))) {
-        ++complete_req.frag_count;
+    if (peer->active_frag) {
+        ompi_osc_pt2pt_frag_t *active_frag = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
+        if (active_frag->remain_len < sizeof (complete_req)) {
+            ++complete_req.frag_count;
+        }
     }

     OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output,
@@ -501,7 +501,7 @@ static void ompi_osc_pt2pt_peer_construct (ompi_osc_pt2pt_peer_t *peer)
 {
     OBJ_CONSTRUCT(&peer->queued_frags, opal_list_t);
     OBJ_CONSTRUCT(&peer->lock, opal_mutex_t);
-    peer->active_frag = NULL;
+    peer->active_frag = 0;
     peer->passive_incoming_frag_count = 0;
     peer->flags = 0;
 }
@@ -8,7 +8,7 @@
  *                         University of Stuttgart. All rights reserved.
  * Copyright (c) 2004-2005 The Regents of the University of California.
  *                         All rights reserved.
- * Copyright (c) 2007-2017 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * Copyright (c) 2009-2011 Oracle and/or its affiliates. All rights reserved.
  * Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
@@ -56,7 +56,7 @@ struct osc_pt2pt_accumulate_data_t {
     int peer;
     ompi_datatype_t *datatype;
     ompi_op_t *op;
-    int request_count;
+    opal_atomic_int32_t request_count;
 };
 typedef struct osc_pt2pt_accumulate_data_t osc_pt2pt_accumulate_data_t;

@@ -1,7 +1,7 @@
 /* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
 /*
  * Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
- * Copyright (c) 2014-2017 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2014-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * Copyright (c) 2015      Research Organization for Information Science
  *                         and Technology (RIST). All rights reserved.
@@ -65,7 +65,7 @@ int ompi_osc_pt2pt_frag_start (ompi_osc_pt2pt_module_t *module,
     ompi_osc_pt2pt_peer_t *peer = ompi_osc_pt2pt_peer_lookup (module, frag->target);
     int ret;

-    assert(0 == frag->pending && peer->active_frag != frag);
+    assert(0 == frag->pending && peer->active_frag != (intptr_t) frag);

     /* we need to signal now that a frag is outgoing to ensure the count sent
      * with the unlock message is correct */
@@ -93,7 +93,7 @@ int ompi_osc_pt2pt_frag_start (ompi_osc_pt2pt_module_t *module,

 static int ompi_osc_pt2pt_flush_active_frag (ompi_osc_pt2pt_module_t *module, ompi_osc_pt2pt_peer_t *peer)
 {
-    ompi_osc_pt2pt_frag_t *active_frag = peer->active_frag;
+    ompi_osc_pt2pt_frag_t *active_frag = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
     int ret = OMPI_SUCCESS;

     if (NULL == active_frag) {
@@ -105,7 +105,7 @@ static int ompi_osc_pt2pt_flush_active_frag (ompi_osc_pt2pt_module_t *module, om
                          "osc pt2pt: flushing active fragment to target %d. pending: %d",
                          active_frag->target, active_frag->pending));

-    if (opal_atomic_compare_exchange_strong_ptr (&peer->active_frag, &active_frag, NULL)) {
+    if (opal_atomic_compare_exchange_strong_ptr (&peer->active_frag, (intptr_t *) &active_frag, 0)) {
         if (0 != OPAL_THREAD_ADD_FETCH32(&active_frag->pending, -1)) {
             /* communication going on while synchronizing; this is an rma usage bug */
             return OMPI_ERR_RMA_SYNC;
@@ -33,7 +33,7 @@ struct ompi_osc_pt2pt_frag_t {
     char *top;

     /* Number of operations which have started writing into the frag, but not yet completed doing so */
-    volatile int32_t pending;
+    opal_atomic_int32_t pending;
     int32_t pending_long_sends;
     ompi_osc_pt2pt_frag_header_t *header;
     ompi_osc_pt2pt_module_t *module;
@@ -66,8 +66,8 @@ static inline ompi_osc_pt2pt_frag_t *ompi_osc_pt2pt_frag_alloc_non_buffered (omp
     ompi_osc_pt2pt_frag_t *curr;

     /* to ensure ordering flush the buffer on the peer */
-    curr = peer->active_frag;
-    if (NULL != curr && opal_atomic_compare_exchange_strong_ptr (&peer->active_frag, &curr, NULL)) {
+    curr = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
+    if (NULL != curr && opal_atomic_compare_exchange_strong_ptr (&peer->active_frag, (intptr_t *) &curr, 0)) {
         /* If there's something pending, the pending finish will
            start the buffer. Otherwise, we need to start it now. */
         int ret = ompi_osc_pt2pt_frag_finish (module, curr);
@@ -131,7 +131,7 @@ static inline int _ompi_osc_pt2pt_frag_alloc (ompi_osc_pt2pt_module_t *module, i

     OPAL_THREAD_LOCK(&module->lock);
     if (buffered) {
-        curr = peer->active_frag;
+        curr = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
         if (NULL == curr || curr->remain_len < request_len || (long_send && curr->pending_long_sends == 32)) {
             curr = ompi_osc_pt2pt_frag_alloc_non_buffered (module, peer, request_len);
             if (OPAL_UNLIKELY(NULL == curr)) {
@@ -140,7 +140,7 @@ static inline int _ompi_osc_pt2pt_frag_alloc (ompi_osc_pt2pt_module_t *module, i
             }

             curr->pending_long_sends = long_send;
-            peer->active_frag = curr;
+            peer->active_frag = (uintptr_t) curr;
         } else {
             OPAL_THREAD_ADD_FETCH32(&curr->header->num_ops, 1);
             curr->pending_long_sends += long_send;
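Storing active_frag as opal_atomic_intptr_t rather than a plain pointer is what lets the claim-and-clear above be a single compare-and-swap. A standalone sketch of the idiom with C11 atomic_intptr_t (again an illustrative stand-in for the OPAL primitives, not the module's code):

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

typedef struct frag { int remain_len; } frag_t;

static atomic_intptr_t active_frag = 0;  /* pointer stored as an atomic integer */

/* Atomically take ownership of the current fragment, leaving 0 behind. */
static frag_t *claim_active_frag(void)
{
    intptr_t cur = atomic_load(&active_frag);
    while (cur != 0 &&
           !atomic_compare_exchange_weak(&active_frag, &cur, (intptr_t) 0)) {
        /* cur was refreshed by the failed CAS; retry until we win or see 0 */
    }
    return (frag_t *) cur;
}

int main(void)
{
    frag_t f = { 128 };
    atomic_store(&active_frag, (intptr_t) &f);
    frag_t *mine = claim_active_frag();
    printf("claimed frag, remain_len = %d\n", mine ? mine->remain_len : -1);
    return 0;
}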
@@ -8,7 +8,7 @@
  *                         University of Stuttgart. All rights reserved.
  * Copyright (c) 2004-2005 The Regents of the University of California.
  *                         All rights reserved.
- * Copyright (c) 2007-2015 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * Copyright (c) 2010      Cisco Systems, Inc. All rights reserved.
  * Copyright (c) 2010      Oracle and/or its affiliates. All rights reserved.
@@ -180,7 +180,7 @@ typedef struct ompi_osc_pt2pt_header_flush_ack_t ompi_osc_pt2pt_header_flush_ack
 struct ompi_osc_pt2pt_frag_header_t {
     ompi_osc_pt2pt_header_base_t base;
     uint32_t source; /* rank in window of source process */
-    int32_t num_ops; /* number of operations in this buffer */
+    opal_atomic_int32_t num_ops; /* number of operations in this buffer */
     uint32_t pad; /* ensure the fragment header is a multiple of 8 bytes */
 };
 typedef struct ompi_osc_pt2pt_frag_header_t ompi_osc_pt2pt_frag_header_t;
@@ -8,7 +8,7 @@
  *                         University of Stuttgart. All rights reserved.
  * Copyright (c) 2004-2005 The Regents of the University of California.
  *                         All rights reserved.
- * Copyright (c) 2007-2016 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
  * Copyright (c) 2015      Research Organization for Information Science
@@ -104,13 +104,13 @@ int ompi_osc_pt2pt_free(ompi_win_t *win)
         free (module->recv_frags);
     }

-    if (NULL != module->epoch_outgoing_frag_count) free(module->epoch_outgoing_frag_count);
+    free ((void *) module->epoch_outgoing_frag_count);

     if (NULL != module->comm) {
         ompi_comm_free(&module->comm);
     }

-    if (NULL != module->free_after) free(module->free_after);
+    free ((void *) module->free_after);

     free (module);
@@ -8,7 +8,7 @@
  *                         University of Stuttgart. All rights reserved.
  * Copyright (c) 2004-2005 The Regents of the University of California.
  *                         All rights reserved.
- * Copyright (c) 2007-2017 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * Copyright (c) 2010-2016 IBM Corporation. All rights reserved.
  * Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
@@ -157,7 +157,7 @@ int ompi_osc_pt2pt_lock_remote (ompi_osc_pt2pt_module_t *module, int target, omp

 static inline int ompi_osc_pt2pt_unlock_remote (ompi_osc_pt2pt_module_t *module, int target, ompi_osc_pt2pt_sync_t *lock)
 {
-    int32_t frag_count = opal_atomic_swap_32 ((int32_t *) module->epoch_outgoing_frag_count + target, -1);
+    int32_t frag_count = opal_atomic_swap_32 ((opal_atomic_int32_t *) module->epoch_outgoing_frag_count + target, -1);
     ompi_osc_pt2pt_peer_t *peer = ompi_osc_pt2pt_peer_lookup (module, target);
     int lock_type = lock->sync.lock.type;
     ompi_osc_pt2pt_header_unlock_t unlock_req;
@@ -178,10 +178,13 @@ static inline int ompi_osc_pt2pt_unlock_remote (ompi_osc_pt2pt_module_t *module,
     unlock_req.lock_ptr = (uint64_t) (uintptr_t) lock;
     OSC_PT2PT_HTON(&unlock_req, module, target);

-    if (peer->active_frag && peer->active_frag->remain_len < sizeof (unlock_req)) {
-        /* the peer should expect one more packet */
-        ++unlock_req.frag_count;
-        --module->epoch_outgoing_frag_count[target];
+    if (peer->active_frag) {
+        ompi_osc_pt2pt_frag_t *active_frag = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
+        if (active_frag->remain_len < sizeof (unlock_req)) {
+            /* the peer should expect one more packet */
+            ++unlock_req.frag_count;
+            --module->epoch_outgoing_frag_count[target];
+        }
     }

     OPAL_OUTPUT_VERBOSE((25, ompi_osc_base_framework.framework_output,
@@ -204,7 +207,7 @@ static inline int ompi_osc_pt2pt_flush_remote (ompi_osc_pt2pt_module_t *module,
 {
     ompi_osc_pt2pt_peer_t *peer = ompi_osc_pt2pt_peer_lookup (module, target);
     ompi_osc_pt2pt_header_flush_t flush_req;
-    int32_t frag_count = opal_atomic_swap_32 ((int32_t *) module->epoch_outgoing_frag_count + target, -1);
+    int32_t frag_count = opal_atomic_swap_32 ((opal_atomic_int32_t *) module->epoch_outgoing_frag_count + target, -1);
     int ret;

     (void) OPAL_THREAD_ADD_FETCH32(&lock->sync_expected, 1);
@@ -218,10 +221,13 @@ static inline int ompi_osc_pt2pt_flush_remote (ompi_osc_pt2pt_module_t *module,

     /* XXX -- TODO -- since fragments are always delivered in order we do not need to count anything but long
      * requests. once that is done this can be removed. */
-    if (peer->active_frag && (peer->active_frag->remain_len < sizeof (flush_req))) {
-        /* the peer should expect one more packet */
-        ++flush_req.frag_count;
-        --module->epoch_outgoing_frag_count[target];
+    if (peer->active_frag) {
+        ompi_osc_pt2pt_frag_t *active_frag = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
+        if (active_frag->remain_len < sizeof (flush_req)) {
+            /* the peer should expect one more packet */
+            ++flush_req.frag_count;
+            --module->epoch_outgoing_frag_count[target];
+        }
     }

     OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output, "flushing to target %d, frag_count: %d",
@@ -28,7 +28,7 @@ struct ompi_osc_pt2pt_request_t {
     int origin_count;
     struct ompi_datatype_t *origin_dt;
     ompi_osc_pt2pt_module_t* module;
-    int32_t outstanding_requests;
+    opal_atomic_int32_t outstanding_requests;
     bool internal;
 };
 typedef struct ompi_osc_pt2pt_request_t ompi_osc_pt2pt_request_t;
@@ -1,6 +1,6 @@
 /* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
 /*
- * Copyright (c) 2015-2016 Los Alamos National Security, LLC. All rights
+ * Copyright (c) 2015-2018 Los Alamos National Security, LLC. All rights
  *                         reserved.
  * $COPYRIGHT$
  *
@@ -74,7 +74,7 @@ struct ompi_osc_pt2pt_sync_t {
     int num_peers;

     /** number of synchronization messages expected */
-    volatile int32_t sync_expected;
+    opal_atomic_int32_t sync_expected;

     /** eager sends are active to all peers in this access epoch */
     volatile bool eager_send_active;
@@ -265,7 +265,7 @@ struct ompi_osc_rdma_module_t {
     unsigned long get_retry_count;

     /** outstanding atomic operations */
-    volatile int32_t pending_ops;
+    opal_atomic_int32_t pending_ops;
 };
 typedef struct ompi_osc_rdma_module_t ompi_osc_rdma_module_t;
 OMPI_MODULE_DECLSPEC extern ompi_osc_rdma_component_t mca_osc_rdma_component;
@@ -259,7 +259,7 @@ static int ompi_osc_rdma_post_peer (ompi_osc_rdma_module_t *module, ompi_osc_rdm
             return ret;
         }
     } else {
-        post_index = ompi_osc_rdma_counter_add ((osc_rdma_counter_t *) (intptr_t) target, 1) - 1;
+        post_index = ompi_osc_rdma_counter_add ((osc_rdma_atomic_counter_t *) (intptr_t) target, 1) - 1;
     }

     post_index &= OMPI_OSC_RDMA_POST_PEER_MAX - 1;
@@ -279,7 +279,7 @@ static int ompi_osc_rdma_post_peer (ompi_osc_rdma_module_t *module, ompi_osc_rdm
             return ret;
         }
     } else {
-        result = !ompi_osc_rdma_lock_compare_exchange ((osc_rdma_counter_t *) target, &_tmp_value,
+        result = !ompi_osc_rdma_lock_compare_exchange ((osc_rdma_atomic_counter_t *) target, &_tmp_value,
                                                        1 + (osc_rdma_counter_t) my_rank);
     }

@@ -491,7 +491,7 @@ int ompi_osc_rdma_complete_atomic (ompi_win_t *win)
             ret = ompi_osc_rdma_lock_btl_op (module, peer, target, MCA_BTL_ATOMIC_ADD, 1, true);
             assert (OMPI_SUCCESS == ret);
         } else {
-            (void) ompi_osc_rdma_counter_add ((osc_rdma_counter_t *) target, 1);
+            (void) ompi_osc_rdma_counter_add ((osc_rdma_atomic_counter_t *) target, 1);
         }
     }