Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
This commit is contained in:
Mikhail Kurnosov 2018-10-05 21:40:27 +07:00
parent dfe203e167 5f1c940c8b
commit 9557fa087f
1457 changed files with 7515 additions and 18232 deletions

View file

@ -53,7 +53,7 @@ Copyright (c) 2014-2015 Hewlett-Packard Development Company, LP. All
rights reserved.
Copyright (c) 2013-2017 Research Organization for Information Science (RIST).
All rights reserved.
Copyright (c) 2017 Amazon.com, Inc. or its affiliates. All Rights
Copyright (c) 2017-2018 Amazon.com, Inc. or its affiliates. All Rights
reserved.
Copyright (c) 2018 DataDirect Networks. All rights reserved.

47
NEWS
View file

@ -80,6 +80,53 @@ Master (not on release branches yet)
Currently, this means the Open SHMEM layer will only build if
a MXM or UCX library is found.
3.1.2 -- August, 2018
------------------------
- A subtle race condition bug was discovered in the "vader" BTL
(shared memory communications) that, in rare instances, can cause
MPI processes to crash or incorrectly classify (or effectively drop)
an MPI message sent via shared memory. If you are using the "ob1"
PML with "vader" for shared memory communication (note that vader is
the default for shared memory communication with ob1), you need to
upgrade to v3.1.2 or later to fix this issue. You may also upgrade
to the following versions to fix this issue:
- Open MPI v2.1.5 (expected end of August, 2018) or later in the
v2.1.x series
- Open MPI v3.0.1 (released March, 2018) or later in the v3.0.x
series
- Assorted Portals 4.0 bug fixes.
- Fix for possible data corruption in MPI_BSEND.
- Move shared memory file for vader btl into /dev/shm on Linux.
- Fix for MPI_ISCATTER/MPI_ISCATTERV Fortran interfaces with MPI_IN_PLACE.
- Upgrade PMIx to v2.1.3.
- Numerous One-sided bug fixes.
- Fix for race condition in uGNI BTL.
- Improve handling of large number of interfaces with TCP BTL.
- Numerous UCX bug fixes.
3.1.1 -- June, 2018
-------------------
- Fix potential hang in UCX PML during MPI_FINALIZE
- Update internal PMIx to v2.1.2rc2 to fix forward version compatibility.
- Add new MCA parameter osc_sm_backing_store to allow users to specify
where in the filesystem the backing file for the shared memory
one-sided component should live. Defaults to /dev/shm on Linux.
- Fix potential hang on non-x86 platforms when using builds with
optimization flags turned off.
- Disable osc/pt2pt when using MPI_THREAD_MULTIPLE due to numerous
race conditions in the component.
- Fix dummy variable names for the mpi and mpi_f08 Fortran bindings to
match the MPI standard. This may break applications which use
name-based parameters in Fortran which used our internal names
rather than those documented in the MPI standard.
- Revamp Java detection to properly handle new Java versions which do
not provide a javah wrapper.
- Fix RMA function signatures for use-mpi-f08 bindings to have the
asynchronous property on all buffers.
- Improved configure logic for finding the UCX library.
3.1.0 -- May, 2018
------------------

26
README
View file

@ -8,7 +8,7 @@ Copyright (c) 2004-2008 High Performance Computing Center Stuttgart,
University of Stuttgart. All rights reserved.
Copyright (c) 2004-2007 The Regents of the University of California.
All rights reserved.
Copyright (c) 2006-2017 Cisco Systems, Inc. All rights reserved.
Copyright (c) 2006-2018 Cisco Systems, Inc. All rights reserved.
Copyright (c) 2006-2011 Mellanox Technologies. All rights reserved.
Copyright (c) 2006-2012 Oracle and/or its affiliates. All rights reserved.
Copyright (c) 2007 Myricom, Inc. All rights reserved.
@ -605,7 +605,6 @@ Network Support
- Loopback (send-to-self)
- Shared memory
- TCP
- Intel Phi SCIF
- SMCUDA
- Cisco usNIC
- uGNI (Cray Gemini, Aries)
@ -768,6 +767,26 @@ Open MPI is unable to find relevant support for <foo>, configure will
assume that it was unable to provide a feature that was specifically
requested and will abort so that a human can resolve the issue.
Additionally, if a search directory is specified in the form
--with-<foo>=<dir>, Open MPI will:
1. Search for <foo>'s header files in <dir>/include.
2. Search for <foo>'s library files:
2a. If --with-<foo>-libdir=<libdir> was specified, search in
<libdir>.
2b. Otherwise, search in <dir>/lib, and if they are not found
there, search again in <dir>/lib64.
3. If both the relevant header files and libraries are found:
3a. Open MPI will build support for <foo>.
3b. If the root path where the <foo> libraries are found is neither
"/usr" nor "/usr/local", Open MPI will compile itself with
RPATH flags pointing to the directory where <foo>'s libraries
are located. Open MPI does not RPATH /usr/lib[64] and
/usr/local/lib[64] because many systems already search these
directories for run-time libraries by default; adding RPATH for
them could have unintended consequences for the search path
ordering.
INSTALLATION OPTIONS
--prefix=<directory>
@ -1000,9 +1019,6 @@ NETWORKING SUPPORT / OPTIONS
covers most cases. This option is only needed for special
configurations.
--with-scif=<dir>
Look in directory for Intel SCIF support libraries
--with-verbs=<directory>
Specify the directory where the verbs (also known as OpenFabrics
verbs, or Linux verbs, and previously known as OpenIB) libraries and

View file

@ -61,7 +61,7 @@ my $include_list;
my $exclude_list;
# Minimum versions
my $ompi_automake_version = "1.12.2";
my $ompi_automake_version = "1.13.4";
my $ompi_autoconf_version = "2.69";
my $ompi_libtool_version = "2.4.2";

View file

@ -1,7 +1,7 @@
# -*- shell-script -*-
#
# Copyright (c) 2009-2017 Cisco Systems, Inc. All rights reserved
# Copyright (c) 2017 Research Organization for Information Science
# Copyright (c) 2017-2018 Research Organization for Information Science
# and Technology (RIST). All rights reserved.
# Copyright (c) 2018 Los Alamos National Security, LLC. All rights
# reserved.
@ -38,6 +38,7 @@ AC_DEFUN([OMPI_CONFIG_FILES],[
ompi/mpi/fortran/use-mpi-ignore-tkr/mpi-ignore-tkr-file-interfaces.h
ompi/mpi/fortran/use-mpi-ignore-tkr/mpi-ignore-tkr-removed-interfaces.h
ompi/mpi/fortran/use-mpi-f08/Makefile
ompi/mpi/fortran/use-mpi-f08/bindings/Makefile
ompi/mpi/fortran/use-mpi-f08/mod/Makefile
ompi/mpi/fortran/mpiext-use-mpi/Makefile
ompi/mpi/fortran/mpiext-use-mpi-f08/Makefile

View file

@ -347,7 +347,8 @@ AC_DEFUN([OPAL_CHECK_PMIX],[
], [])],
[AC_MSG_RESULT([found])
opal_external_pmix_version=4x
opal_external_pmix_version_found=1],
opal_external_pmix_version_found=1
opal_external_pmix_happy=yes],
[AC_MSG_RESULT([not found])])])
AS_IF([test "$opal_external_pmix_version_found" = "0"],
@ -437,9 +438,11 @@ AC_DEFUN([OPAL_CHECK_PMIX],[
[Whether the external PMIx library is v1])
AM_CONDITIONAL([OPAL_WANT_PRUN], [test "$opal_prun_happy" = "yes"])
AS_IF([test "$opal_external_pmix_version" = "1x"],
[OPAL_SUMMARY_ADD([[Miscellaneous]],[[PMIx support]], [opal_pmix], [1.2.x: WARNING - DYNAMIC OPS NOT SUPPORTED])],
[OPAL_SUMMARY_ADD([[Miscellaneous]],[[PMIx support]], [opal_pmix], [$opal_external_pmix_version])])
AS_IF([test "$opal_external_pmix_happy" = "yes"],
[AS_IF([test "$opal_external_pmix_version" = "1x"],
[OPAL_SUMMARY_ADD([[Miscellaneous]],[[PMIx support]], [opal_pmix], [External (1.2.5) WARNING - DYNAMIC OPS NOT SUPPORTED])],
[OPAL_SUMMARY_ADD([[Miscellaneous]],[[PMIx support]], [opal_pmix], [External ($opal_external_pmix_version)])])],
[OPAL_SUMMARY_ADD([[Miscellaneous]], [[PMIx support]], [opal_pmix], [Internal])])
OPAL_VAR_SCOPE_POP
])

View file

@ -13,7 +13,7 @@ dnl Copyright (c) 2008-2018 Cisco Systems, Inc. All rights reserved.
dnl Copyright (c) 2010 Oracle and/or its affiliates. All rights reserved.
dnl Copyright (c) 2015-2017 Research Organization for Information Science
dnl and Technology (RIST). All rights reserved.
dnl Copyright (c) 2014-2017 Los Alamos National Security, LLC. All rights
dnl Copyright (c) 2014-2018 Los Alamos National Security, LLC. All rights
dnl reserved.
dnl Copyright (c) 2017 Amazon.com, Inc. or its affiliates. All Rights
dnl reserved.
@ -122,6 +122,57 @@ int main(int argc, char** argv)
}
]])
dnl This is a C test to see if 128-bit __atomic_compare_exchange_n()
dnl actually works (e.g., it compiles and links successfully on
dnl ARM64+clang, but returns incorrect answers as of August 2018).
AC_DEFUN([OPAL_ATOMIC_COMPARE_EXCHANGE_STRONG_TEST_SOURCE],[[
#include <stdint.h>
#include <stdbool.h>
#include <stdlib.h>
#include <stdatomic.h>
typedef union {
uint64_t fake@<:@2@:>@;
_Atomic __int128 real;
} ompi128;
static void test1(void)
{
// As of Aug 2018, we could not figure out a way to assign 128-bit
// constants -- the compilers would not accept it. So use a fake
// union to assign 2 uint64_t's to make a single __int128.
ompi128 ptr = { .fake = { 0xFFEEDDCCBBAA0099, 0x8877665544332211 }};
ompi128 expected = { .fake = { 0x11EEDDCCBBAA0099, 0x88776655443322FF }};
ompi128 desired = { .fake = { 0x1122DDCCBBAA0099, 0x887766554433EEFF }};
bool r = atomic_compare_exchange_strong (&ptr.real, &expected.real,
desired.real, true,
atomic_relaxed, atomic_relaxed);
if ( !(r == false && ptr.real == expected.real)) {
exit(1);
}
}
static void test2(void)
{
ompi128 ptr = { .fake = { 0xFFEEDDCCBBAA0099, 0x8877665544332211 }};
ompi128 expected = ptr;
ompi128 desired = { .fake = { 0x1122DDCCBBAA0099, 0x887766554433EEFF }};
bool r = atomic_compare_exchange_strong (&ptr.real, &expected.real,
desired.real, true,
atomic_relaxed, atomic_relaxed);
if (!(r == true && ptr.real == desired.real)) {
exit(2);
}
}
int main(int argc, char** argv)
{
test1();
test2();
return 0;
}
]])
dnl ------------------------------------------------------------------
dnl
@ -329,6 +380,71 @@ __atomic_add_fetch(&tmp64, 1, __ATOMIC_RELAXED);],
OPAL_CHECK_GCC_BUILTIN_CSWAP_INT128
])
AC_DEFUN([OPAL_CHECK_C11_CSWAP_INT128], [
OPAL_VAR_SCOPE_PUSH([atomic_compare_exchange_result atomic_compare_exchange_CFLAGS_save atomic_compare_exchange_LIBS_save])
atomic_compare_exchange_CFLAGS_save=$CFLAGS
atomic_compare_exchange_LIBS_save=$LIBS
# Do we have C11 atomics on 128-bit integers?
# Use a special macro because we need to check with a few different
# CFLAGS/LIBS.
OPAL_ASM_CHECK_ATOMIC_FUNC([atomic_compare_exchange_strong_16],
[AC_LANG_SOURCE(OPAL_ATOMIC_COMPARE_EXCHANGE_STRONG_TEST_SOURCE)],
[atomic_compare_exchange_result=1],
[atomic_compare_exchange_result=0])
# If we have it and it works, check to make sure it is always lock
# free.
AS_IF([test $atomic_compare_exchange_result -eq 1],
[AC_MSG_CHECKING([if C11 __int128 atomic compare-and-swap is always lock-free])
AC_RUN_IFELSE([AC_LANG_PROGRAM([#include <stdatomic.h>], [_Atomic __int128_t x; if (!atomic_is_lock_free(&x)) { return 1; }])],
[AC_MSG_RESULT([yes])],
[atomic_compare_exchange_result=0
# If this test fails, need to reset CFLAGS/LIBS (the
# above tests atomically set CFLAGS/LIBS or not; this
# test is running after the fact, so we have to undo
# the side-effects of setting CFLAGS/LIBS if the above
# tests passed).
CFLAGS=$atomic_compare_exchange_CFLAGS_save
LIBS=$atomic_compare_exchange_LIBS_save
AC_MSG_RESULT([no])],
[AC_MSG_RESULT([cannot test -- assume yes (cross compiling)])])
])
AC_DEFINE_UNQUOTED([OPAL_HAVE_C11_CSWAP_INT128],
[$atomic_compare_exchange_result],
[Whether C11 atomic compare swap is both supported and lock-free on 128-bit values])
dnl If we could not find decent support for 128-bits atomic let's
dnl try the GCC _sync
AS_IF([test $atomic_compare_exchange_result -eq 0],
[OPAL_CHECK_SYNC_BUILTIN_CSWAP_INT128])
OPAL_VAR_SCOPE_POP
])
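For reference, a rough standalone sketch of what this probe exercises (not the literal conftest program; __int128 is a GCC/Clang extension and linking may require -latomic, which is exactly why the macro retries with different CFLAGS/LIBS):

    #include <stdatomic.h>
    #include <stdio.h>

    /* Two things OPAL_CHECK_C11_CSWAP_INT128 wants to know: a failing CAS
     * reports failure and writes back the observed value, and the 128-bit
     * atomic type is lock-free (no hidden lock inside libatomic). */
    int main(void)
    {
        _Atomic __int128 v = 42;
        __int128 expected = 7;          /* deliberately wrong */

        if (atomic_compare_exchange_strong(&v, &expected, (__int128) 99) ||
            expected != 42) {
            puts("128-bit CAS misbehaves");
            return 1;
        }
        if (!atomic_is_lock_free(&v)) {
            puts("128-bit atomics are not lock-free");
            return 1;
        }
        puts("C11 128-bit CAS looks usable");
        return 0;
    }

If either check fails, the macro above falls back to the __sync builtin test.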
AC_DEFUN([OPAL_CHECK_GCC_ATOMIC_BUILTINS], [
AC_MSG_CHECKING([for __atomic builtin atomics])
AC_TRY_LINK([
#include <stdint.h>
uint32_t tmp, old = 0;
uint64_t tmp64, old64 = 0;], [
__atomic_thread_fence(__ATOMIC_SEQ_CST);
__atomic_compare_exchange_n(&tmp, &old, 1, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
__atomic_add_fetch(&tmp, 1, __ATOMIC_RELAXED);
__atomic_compare_exchange_n(&tmp64, &old64, 1, 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
__atomic_add_fetch(&tmp64, 1, __ATOMIC_RELAXED);],
[AC_MSG_RESULT([yes])
$1],
[AC_MSG_RESULT([no])
$2])
# Check for 128-bit support
OPAL_CHECK_GCC_BUILTIN_CSWAP_INT128
])
dnl #################################################################
dnl
@ -1020,17 +1136,27 @@ AC_DEFUN([OPAL_CONFIG_ASM],[
AC_REQUIRE([OPAL_SETUP_CC])
AC_REQUIRE([AM_PROG_AS])
AC_ARG_ENABLE([c11-atomics],[AC_HELP_STRING([--enable-c11-atomics],
[Enable use of C11 atomics if available (default: enabled)])])
AC_ARG_ENABLE([builtin-atomics],
[AC_HELP_STRING([--enable-builtin-atomics],
[Enable use of __sync builtin atomics (default: enabled)])])
[Enable use of __sync builtin atomics (default: disabled)])])
opal_cv_asm_builtin="BUILTIN_NO"
AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" != "no"],
[OPAL_CHECK_GCC_ATOMIC_BUILTINS([opal_cv_asm_builtin="BUILTIN_GCC"], [])])
AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" != "no"],
[OPAL_CHECK_SYNC_BUILTINS([opal_cv_asm_builtin="BUILTIN_SYNC"], [])])
AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" = "yes"],
[AC_MSG_ERROR([__sync builtin atomics requested but not found.])])
OPAL_CHECK_C11_CSWAP_INT128
if test "x$enable_c11_atomics" != "xno" && test "$opal_cv_c11_supported" = "yes" ; then
opal_cv_asm_builtin="BUILTIN_C11"
OPAL_CHECK_C11_CSWAP_INT128
else
opal_cv_asm_builtin="BUILTIN_NO"
AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" = "yes"],
[OPAL_CHECK_GCC_ATOMIC_BUILTINS([opal_cv_asm_builtin="BUILTIN_GCC"], [])])
AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" = "yes"],
[OPAL_CHECK_SYNC_BUILTINS([opal_cv_asm_builtin="BUILTIN_SYNC"], [])])
AS_IF([test "$opal_cv_asm_builtin" = "BUILTIN_NO" && test "$enable_builtin_atomics" = "yes"],
[AC_MSG_ERROR([__sync builtin atomics requested but not found.])])
fi
OPAL_CHECK_ASM_PROC
OPAL_CHECK_ASM_TEXT

View file

@ -10,7 +10,7 @@ dnl Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
dnl University of Stuttgart. All rights reserved.
dnl Copyright (c) 2004-2005 The Regents of the University of California.
dnl All rights reserved.
dnl Copyright (c) 2014-2015 Intel, Inc. All rights reserved.
dnl Copyright (c) 2014-2018 Intel, Inc. All rights reserved.
dnl Copyright (c) 2015 Cisco Systems, Inc. All rights reserved.
dnl $COPYRIGHT$
dnl
@ -60,6 +60,8 @@ do
;;
-with-platform=* | --with-platform=*)
;;
-with*=internal)
;;
*)
case $subdir_arg in
*\'*) subdir_arg=`echo "$subdir_arg" | sed "s/'/'\\\\\\\\''/g"` ;;

View file

@ -100,7 +100,7 @@ OPAL_VAR_SCOPE_POP
#
# Init automake
#
AM_INIT_AUTOMAKE([foreign dist-bzip2 subdir-objects no-define 1.12.2 tar-ustar])
AM_INIT_AUTOMAKE([foreign dist-bzip2 subdir-objects no-define 1.13.4 tar-ustar])
# SILENT_RULES is new in AM 1.11, but we require 1.11 or higher via
# autogen. Limited testing shows that calling SILENT_RULES directly
@ -858,7 +858,7 @@ OPAL_SEARCH_LIBS_CORE([ceil], [m])
# -lrt might be needed for clock_gettime
OPAL_SEARCH_LIBS_CORE([clock_gettime], [rt])
AC_CHECK_FUNCS([asprintf snprintf vasprintf vsnprintf openpty isatty getpwuid fork waitpid execve pipe ptsname setsid mmap tcgetpgrp posix_memalign strsignal sysconf syslog vsyslog regcmp regexec regfree _NSGetEnviron socketpair strncpy_s usleep mkfifo dbopen dbm_open statfs statvfs setpgid setenv __malloc_initialize_hook __clear_cache])
AC_CHECK_FUNCS([asprintf snprintf vasprintf vsnprintf openpty isatty getpwuid fork waitpid execve pipe ptsname setsid mmap tcgetpgrp posix_memalign strsignal sysconf syslog vsyslog regcmp regexec regfree _NSGetEnviron socketpair usleep mkfifo dbopen dbm_open statfs statvfs setpgid setenv __malloc_initialize_hook __clear_cache])
# Sanity check: ensure that we got at least one of statfs or statvfs.
if test $ac_cv_func_statfs = no && test $ac_cv_func_statvfs = no; then

View file

@ -88,12 +88,8 @@ EXTRA_DIST = \
platform/lanl/darwin/mic-common \
platform/lanl/darwin/debug \
platform/lanl/darwin/debug.conf \
platform/lanl/darwin/debug-mic \
platform/lanl/darwin/debug-mic.conf \
platform/lanl/darwin/optimized \
platform/lanl/darwin/optimized.conf \
platform/lanl/darwin/optimized-mic \
platform/lanl/darwin/optimized-mic.conf \
platform/snl/portals4-m5 \
platform/snl/portals4-orte \
platform/ibm/debug-ppc32-gcc \

View file

@ -10,7 +10,7 @@
m4=1.4.16
ac=2.69
am=1.12.2
am=1.13.4
lt=2.4.2
flex=2.5.35

1
contrib/dist/linux/buildrpm.sh vendored
View file

@ -267,7 +267,6 @@ fi
# Find where the top RPM-building directory is
#
rpmtopdir=
file=~/.rpmmacros
if test -r $file; then
rpmtopdir=${rpmtopdir:-"`grep %_topdir $file | awk '{ print $2 }'`"}

View file

@ -1,100 +0,0 @@
#
# Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
# University Research and Technology
# Corporation. All rights reserved.
# Copyright (c) 2004-2005 The University of Tennessee and The University
# of Tennessee Research Foundation. All rights
# reserved.
# Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
# University of Stuttgart. All rights reserved.
# Copyright (c) 2004-2005 The Regents of the University of California.
# All rights reserved.
# Copyright (c) 2006 Cisco Systems, Inc. All rights reserved.
# Copyright (c) 2011-2013 Los Alamos National Security, LLC.
# All rights reserved.
# $COPYRIGHT$
#
# Additional copyrights may follow
#
# $HEADER$
#
# This is the default system-wide MCA parameters defaults file.
# Specifically, the MCA parameter "mca_param_files" defaults to a
# value of
# "$HOME/.openmpi/mca-params.conf:$sysconf/openmpi-mca-params.conf"
# (this file is the latter of the two). So if the default value of
# mca_param_files is not changed, this file is used to set system-wide
# MCA parameters. This file can therefore be used to set system-wide
# default MCA parameters for all users. Of course, users can override
# these values if they want, but this file is an excellent location
# for setting system-specific MCA parameters for those users who don't
# know / care enough to investigate the proper values for them.
# Note that this file is only applicable where it is visible (in a
# filesystem sense). Specifically, MPI processes each read this file
# during their startup to determine what default values for MCA
# parameters should be used. mpirun does not bundle up the values in
# this file from the node where it was run and send them to all nodes;
# the default value decisions are effectively distributed. Hence,
# these values are only applicable on nodes that "see" this file. If
# $sysconf is a directory on a local disk, it is likely that changes
# to this file will need to be propagated to other nodes. If $sysconf
# is a directory that is shared via a networked filesystem, changes to
# this file will be visible to all nodes that share this $sysconf.
# The format is straightforward: one per line, mca_param_name =
# rvalue. Quoting is ignored (so if you use quotes or escape
# characters, they'll be included as part of the value). For example:
# Disable run-time MPI parameter checking
# mpi_param_check = 0
# Note that the value "~/" will be expanded to the current user's home
# directory. For example:
# Change component loading path
# component_path = /usr/local/lib/openmpi:~/my_openmpi_components
# See "ompi_info --param all all" for a full listing of Open MPI MCA
# parameters available and their default values.
#
# Basic behavior to smooth startup
mca_base_component_show_load_errors = 0
opal_set_max_sys_limits = 1
orte_report_launch_progress = 1
# Define timeout for daemons to report back during launch
orte_startup_timeout = 10000
## Protect the shared file systems
orte_no_session_dirs = /panfs,/scratch,/users,/usr/projects
orte_tmpdir_base = /tmp
## Require an allocation to run - protects the frontend
## from inadvertent job executions
orte_allocation_required = 1
## Add the interface for out-of-band communication
## and set it up
oob_tcp_if_include=mic0
oob_tcp_peer_retries = 1000
oob_tcp_sndbuf = 32768
oob_tcp_rcvbuf = 32768
## Define the MPI interconnects
btl = sm,scif,openib,self
## Setup OpenIB - just in case
btl_openib_want_fork_support = 0
btl_openib_receive_queues = S,4096,1024:S,12288,512:S,65536,512
## Enable cpu affinity
hwloc_base_binding_policy = core
## Setup MPI options
mpi_show_handle_leaks = 1
mpi_warn_on_fork = 1
#mpi_abort_print_stack = 1

View file

@ -10,7 +10,7 @@
# Copyright (c) 2004-2005 The Regents of the University of California.
# All rights reserved.
# Copyright (c) 2006 Cisco Systems, Inc. All rights reserved.
# Copyright (c) 2011-2013 Los Alamos National Security, LLC.
# Copyright (c) 2011-2018 Los Alamos National Security, LLC.
# All rights reserved.
# $COPYRIGHT$
#
@ -84,7 +84,7 @@ oob_tcp_sndbuf = 32768
oob_tcp_rcvbuf = 32768
## Define the MPI interconnects
btl = sm,scif,openib,self
btl = sm,openib,self
## Setup OpenIB - just in case
btl_openib_want_fork_support = 0

View file

@ -1,100 +0,0 @@
#
# Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
# University Research and Technology
# Corporation. All rights reserved.
# Copyright (c) 2004-2005 The University of Tennessee and The University
# of Tennessee Research Foundation. All rights
# reserved.
# Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
# University of Stuttgart. All rights reserved.
# Copyright (c) 2004-2005 The Regents of the University of California.
# All rights reserved.
# Copyright (c) 2006 Cisco Systems, Inc. All rights reserved.
# Copyright (c) 2011-2013 Los Alamos National Security, LLC. All rights
# reserved.
# $COPYRIGHT$
#
# Additional copyrights may follow
#
# $HEADER$
#
# This is the default system-wide MCA parameters defaults file.
# Specifically, the MCA parameter "mca_param_files" defaults to a
# value of
# "$HOME/.openmpi/mca-params.conf:$sysconf/openmpi-mca-params.conf"
# (this file is the latter of the two). So if the default value of
# mca_param_files is not changed, this file is used to set system-wide
# MCA parameters. This file can therefore be used to set system-wide
# default MCA parameters for all users. Of course, users can override
# these values if they want, but this file is an excellent location
# for setting system-specific MCA parameters for those users who don't
# know / care enough to investigate the proper values for them.
# Note that this file is only applicable where it is visible (in a
# filesystem sense). Specifically, MPI processes each read this file
# during their startup to determine what default values for MCA
# parameters should be used. mpirun does not bundle up the values in
# this file from the node where it was run and send them to all nodes;
# the default value decisions are effectively distributed. Hence,
# these values are only applicable on nodes that "see" this file. If
# $sysconf is a directory on a local disk, it is likely that changes
# to this file will need to be propagated to other nodes. If $sysconf
# is a directory that is shared via a networked filesystem, changes to
# this file will be visible to all nodes that share this $sysconf.
# The format is straightforward: one per line, mca_param_name =
# rvalue. Quoting is ignored (so if you use quotes or escape
# characters, they'll be included as part of the value). For example:
# Disable run-time MPI parameter checking
# mpi_param_check = 0
# Note that the value "~/" will be expanded to the current user's home
# directory. For example:
# Change component loading path
# component_path = /usr/local/lib/openmpi:~/my_openmpi_components
# See "ompi_info --param all all" for a full listing of Open MPI MCA
# parameters available and their default values.
#
# Basic behavior to smooth startup
mca_base_component_show_load_errors = 0
opal_set_max_sys_limits = 1
orte_report_launch_progress = 1
# Define timeout for daemons to report back during launch
orte_startup_timeout = 10000
## Protect the shared file systems
orte_no_session_dirs = /panfs,/scratch,/users,/usr/projects
orte_tmpdir_base = /tmp
## Require an allocation to run - protects the frontend
## from inadvertent job executions
orte_allocation_required = 1
## Add the interface for out-of-band communication
## and set it up
oob_tcp_if_include = mic0
oob_tcp_peer_retries = 1000
oob_tcp_sndbuf = 32768
oob_tcp_rcvbuf = 32768
## Define the MPI interconnects
btl = sm,scif,openib,self
## Setup OpenIB - just in case
btl_openib_want_fork_support = 0
btl_openib_receive_queues = S,4096,1024:S,12288,512:S,65536,512
## Enable cpu affinity
hwloc_base_binding_policy = core
## Setup MPI options
mpi_show_handle_leaks = 0
mpi_warn_on_fork = 1
#mpi_abort_print_stack = 0

View file

@ -10,7 +10,7 @@
# Copyright (c) 2004-2005 The Regents of the University of California.
# All rights reserved.
# Copyright (c) 2006 Cisco Systems, Inc. All rights reserved.
# Copyright (c) 2011-2013 Los Alamos National Security, LLC. All rights
# Copyright (c) 2011-2018 Los Alamos National Security, LLC. All rights
# reserved.
# $COPYRIGHT$
#
@ -84,7 +84,7 @@ oob_tcp_sndbuf = 32768
oob_tcp_rcvbuf = 32768
## Define the MPI interconnects
btl = sm,scif,openib,self
btl = sm,openib,self
## Setup OpenIB - just in case
btl_openib_want_fork_support = 0

View file

@ -23,26 +23,11 @@ if [ "$mellanox_autodetect" == "yes" ]; then
with_ucx=$ucx_dir
fi
mxm_dir=${mxm_dir:="$(pkg-config --variable=prefix mxm)"}
if [ -d $mxm_dir ]; then
with_mxm=$mxm_dir
fi
fca_dir=${fca_dir:="$(pkg-config --variable=prefix fca)"}
if [ -d $fca_dir ]; then
with_fca=$fca_dir
fi
hcoll_dir=${hcoll_dir:="$(pkg-config --variable=prefix hcoll)"}
if [ -d $hcoll_dir ]; then
with_hcoll=$hcoll_dir
fi
knem_dir=${knem_dir:="$(pkg-config --variable=prefix knem)"}
if [ -d $knem_dir ]; then
with_knem=$knem_dir
fi
slurm_dir=${slurm_dir:="/usr"}
if [ -f $slurm_dir/include/slurm/slurm.h ]; then
with_slurm=$slurm_dir

View file

@ -56,12 +56,10 @@
# See "ompi_info --param all all" for a full listing of Open MPI MCA
# parameters available and their default values.
coll_fca_enable = 0
scoll_fca_enable = 0
#rmaps_base_mapping_policy = dist:auto
coll = ^ml
hwloc_base_binding_policy = core
btl = vader,openib,self
btl = self
# Basic behavior to smooth startup
mca_base_component_show_load_errors = 0
orte_abort_timeout = 10
@ -77,3 +75,6 @@ oob_tcp_sndbuf = 32768
oob_tcp_rcvbuf = 32768
opal_event_include=epoll
bml_r2_show_unreach_errors = 0

View file

@ -15,7 +15,7 @@
# Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
# reserved.
# Copyright (c) 2015-2017 Intel, Inc. All rights reserved.
# Copyright (c) 2015-2017 Research Organization for Information Science
# Copyright (c) 2015-2018 Research Organization for Information Science
# and Technology (RIST). All rights reserved.
# Copyright (c) 2016 IBM Corporation. All rights reserved.
# Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
@ -93,6 +93,7 @@ SUBDIRS = \
$(OMPI_FORTRAN_USEMPI_DIR) \
mpi/fortran/mpiext-use-mpi \
mpi/fortran/use-mpi-f08/mod \
mpi/fortran/use-mpi-f08/bindings \
$(OMPI_MPIEXT_USEMPIF08_DIRS) \
mpi/fortran/use-mpi-f08 \
mpi/fortran/mpiext-use-mpi-f08 \
@ -124,6 +125,7 @@ DIST_SUBDIRS = \
mpi/fortran/mpiext-use-mpi \
mpi/fortran/use-mpi-f08 \
mpi/fortran/use-mpi-f08/mod \
mpi/fortran/use-mpi-f08/bindings \
mpi/fortran/mpiext-use-mpi-f08 \
mpi/java \
$(OMPI_MPIEXT_ALL_SUBDIRS) \

Просмотреть файл

@ -1,6 +1,6 @@
/* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
/*
* Copyright (c) 2013-2016 Los Alamos National Security, LLC. All rights
* Copyright (c) 2013-2018 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2015 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
@ -99,7 +99,7 @@ int ompi_comm_request_schedule_append (ompi_comm_request_t *request, ompi_comm_r
static int ompi_comm_request_progress (void)
{
ompi_comm_request_t *request, *next;
static int32_t progressing = 0;
static opal_atomic_int32_t progressing = 0;
/* don't allow re-entry */
if (opal_atomic_swap_32 (&progressing, 1)) {

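The guard above swaps 1 into a static flag and returns early when the old value was already 1, so only one thread at a time runs the progress loop. A minimal sketch of the same idiom in plain C11 (atomic_exchange standing in for opal_atomic_swap_32, and a hypothetical do_progress_work() for the real body):

    #include <stdatomic.h>

    static void do_progress_work(void) { /* placeholder for the real work */ }

    static int progress_once(void)
    {
        static atomic_int progressing = 0;

        /* Swap 1 in; a non-zero old value means another thread is already
         * progressing, so do not re-enter. */
        if (atomic_exchange(&progressing, 1)) {
            return 0;
        }

        do_progress_work();

        /* Drop the guard so the next caller may progress again. */
        atomic_store(&progressing, 0);
        return 1;
    }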
View file

@ -75,7 +75,7 @@ struct ompi_datatype_t {
struct opal_hash_table_t *d_keyhash; /**< Attribute fields */
void* args; /**< Data description for the user */
void* packed_description; /**< Packed description of the datatype */
opal_atomic_intptr_t packed_description; /**< Packed description of the datatype */
uint64_t pml_data; /**< PML-specific information */
/* --- cacheline 6 boundary (384 bytes) --- */
char name[MPI_MAX_OBJECT_NAME];/**< Externally visible name */

View file

@ -45,7 +45,7 @@ __ompi_datatype_create_from_args( int32_t* i, ptrdiff_t * a,
ompi_datatype_t** d, int32_t type );
typedef struct __dt_args {
int32_t ref_count;
opal_atomic_int32_t ref_count;
int32_t create_type;
size_t total_pack_size;
int32_t ci;
@ -104,7 +104,7 @@ typedef struct __dt_args {
pArgs->total_pack_size = (4 + (IC) + (DC)) * sizeof(int) + \
(AC) * sizeof(ptrdiff_t); \
(PDATA)->args = (void*)pArgs; \
(PDATA)->packed_description = NULL; \
(PDATA)->packed_description = 0; \
} while(0)
@ -483,12 +483,12 @@ int ompi_datatype_get_pack_description( ompi_datatype_t* datatype,
{
ompi_datatype_args_t* args = (ompi_datatype_args_t*)datatype->args;
int next_index = OMPI_DATATYPE_MAX_PREDEFINED;
void *packed_description = datatype->packed_description;
void *packed_description = (void *) datatype->packed_description;
void* recursive_buffer;
if (NULL == packed_description) {
void *_tmp_ptr = NULL;
if (opal_atomic_compare_exchange_strong_ptr (&datatype->packed_description, (void *) &_tmp_ptr, (void *) 1)) {
if (opal_atomic_compare_exchange_strong_ptr (&datatype->packed_description, (intptr_t *) &_tmp_ptr, 1)) {
if( ompi_datatype_is_predefined(datatype) ) {
packed_description = malloc(2 * sizeof(int));
} else if( NULL == args ) {
@ -510,10 +510,10 @@ int ompi_datatype_get_pack_description( ompi_datatype_t* datatype,
}
opal_atomic_wmb ();
datatype->packed_description = packed_description;
datatype->packed_description = (intptr_t) packed_description;
} else {
/* another thread beat us to it */
packed_description = datatype->packed_description;
packed_description = (void *) datatype->packed_description;
}
}
@ -521,11 +521,11 @@ int ompi_datatype_get_pack_description( ompi_datatype_t* datatype,
struct timespec interval = {.tv_sec = 0, .tv_nsec = 1000};
/* wait until the packed description is updated */
while ((void *) 1 == datatype->packed_description) {
while (1 == datatype->packed_description) {
nanosleep (&interval, NULL);
}
packed_description = datatype->packed_description;
packed_description = (void *) datatype->packed_description;
}
*packed_buffer = (const void *) packed_description;
@ -534,7 +534,7 @@ int ompi_datatype_get_pack_description( ompi_datatype_t* datatype,
size_t ompi_datatype_pack_description_length( ompi_datatype_t* datatype )
{
void *packed_description = datatype->packed_description;
void *packed_description = (void *) datatype->packed_description;
if( ompi_datatype_is_predefined(datatype) ) {
return 2 * sizeof(int);

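The switch to opal_atomic_intptr_t turns packed_description into a small state machine: 0 means not built yet, 1 means another thread is building it, and any other value is the pointer to the finished description. A hedged sketch of that once-initialization protocol in plain C11 atomics (build_description() is a hypothetical stand-in for the real packing code; nanosleep is the same back-off the code above uses):

    #include <stdatomic.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <time.h>        /* POSIX nanosleep */

    static void *build_description(void) { return malloc(64); }  /* placeholder */

    static void *get_packed_description(atomic_intptr_t *slot)
    {
        intptr_t expected = 0;

        /* Win the race: 0 -> 1 marks "build in progress". */
        if (atomic_compare_exchange_strong(slot, &expected, (intptr_t) 1)) {
            void *desc = build_description();
            /* Publish with release ordering so the built contents are
             * visible before the slot stops reading as 1. */
            atomic_store_explicit(slot, (intptr_t) desc, memory_order_release);
            return desc;
        }

        /* Lost the race (or already built): wait out the "1" state. */
        while (atomic_load_explicit(slot, memory_order_acquire) == 1) {
            struct timespec interval = { .tv_sec = 0, .tv_nsec = 1000 };
            nanosleep(&interval, NULL);
        }
        return (void *) atomic_load_explicit(slot, memory_order_acquire);
    }

The same compare-and-swap shape shows up again below in ompi_group_dense_lookup, where a sentinel value is atomically replaced by the resolved proc pointer.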
View file

@ -36,7 +36,7 @@ static void __ompi_datatype_allocate( ompi_datatype_t* datatype )
datatype->id = -1;
datatype->d_keyhash = NULL;
datatype->name[0] = '\0';
datatype->packed_description = NULL;
datatype->packed_description = 0;
datatype->pml_data = 0;
}
@ -46,10 +46,10 @@ static void __ompi_datatype_release(ompi_datatype_t * datatype)
ompi_datatype_release_args( datatype );
datatype->args = NULL;
}
if( NULL != datatype->packed_description ) {
free( datatype->packed_description );
datatype->packed_description = NULL;
}
free ((void *) datatype->packed_description );
datatype->packed_description = 0;
if( datatype->d_f_to_c_index >= 0 ) {
opal_pointer_array_set_item( &ompi_datatype_f_to_c_table, datatype->d_f_to_c_index, NULL );
datatype->d_f_to_c_index = -1;

View file

@ -406,7 +406,7 @@ extern const ompi_datatype_t* ompi_datatype_basicDatatypes[OMPI_DATATYPE_MPI_MAX
.d_f_to_c_index = -1, \
.d_keyhash = NULL, \
.args = NULL, \
.packed_description = NULL, \
.packed_description = 0, \
.name = "MPI_" # NAME
#define OMPI_DATATYPE_INITIALIZER_UNAVAILABLE(FLAGS) \

View file

@ -383,7 +383,7 @@ opal_pointer_array_t ompi_datatype_f_to_c_table = {{0}};
(PDST)->super.desc = (PSRC)->super.desc; \
(PDST)->super.opt_desc = (PSRC)->super.opt_desc; \
(PDST)->packed_description = (PSRC)->packed_description; \
(PSRC)->packed_description = NULL; \
(PSRC)->packed_description = 0; \
/* transfer the ptypes */ \
(PDST)->super.ptypes = (PSRC)->super.ptypes; \
(PSRC)->super.ptypes = NULL; \

View file

@ -1,6 +1,6 @@
/* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
/*
* Copyright (c) 2007-2016 Cisco Systems, Inc. All rights reserved.
* Copyright (c) 2007-2018 Cisco Systems, Inc. All rights reserved.
* Copyright (c) 2004-2010 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
@ -1157,8 +1157,18 @@ static int fetch_request( mqs_process *proc, mpi_process_info *p_info,
mqs_fetch_data( proc, ompi_datatype + i_info->ompi_datatype_t.offset.name,
64, data_name );
if( '\0' != data_name[0] ) {
snprintf( (char*)res->extra_text[1], 64, "Data: %d * %s",
(int)res->desired_length, data_name );
// res->extra_text[x] is only 64 chars long -- same as
// data_name. If you try to snprintf it into
// res->extra_text with additional text, some compilers
// will warn that we might truncate the string (because it
// can see the static char array lengths). So just put
// data_name in res->extra_text[2] (vs. extra_text[1]),
// where it is guaranteed to fit.
data_name[4] = '\0';
snprintf( (char*)res->extra_text[1], 64, "Data: %d",
(int)res->desired_length);
snprintf( (char*)res->extra_text[2], 64, "%s",
data_name );
}
/* And now compute the real length as specified by the user */
res->desired_length *=

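The reshuffle above exists purely to silence -Wformat-truncation: data_name and res->extra_text[n] are both 64-byte arrays, so a format like "Data: %d * %s" cannot be proven to fit. A small hedged illustration with illustrative buffer names (not the real structures):

    #include <stdio.h>

    void fill_extra_text(char extra1[64], char extra2[64],
                         int desired_length, const char data_name[64])
    {
        /* Would warn under -Wformat-truncation: "Data: %d * " plus a
         * possibly 63-character data_name need not fit in 64 bytes. */
        /* snprintf(extra1, 64, "Data: %d * %s", desired_length, data_name); */

        /* Split form: each destination provably holds its content. */
        snprintf(extra1, 64, "Data: %d", desired_length);
        snprintf(extra2, 64, "%s", data_name);
    }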
View file

@ -202,7 +202,7 @@ ompi_errhandler_t *ompi_errhandler_create(ompi_errhandler_type_t object_type,
new_errhandler->eh_comm_fn = (MPI_Comm_errhandler_function *)func;
break;
case (OMPI_ERRHANDLER_TYPE_FILE):
new_errhandler->eh_file_fn = (ompi_file_errhandler_fn *)func;
new_errhandler->eh_file_fn = (ompi_file_errhandler_function *)func;
break;
case (OMPI_ERRHANDLER_TYPE_WIN):
new_errhandler->eh_win_fn = (MPI_Win_errhandler_function *)func;

View file

@ -117,7 +117,7 @@ struct ompi_errhandler_t {
can be invoked on any MPI object type, so we need callbacks for
all of three. */
MPI_Comm_errhandler_function *eh_comm_fn;
ompi_file_errhandler_fn *eh_file_fn;
ompi_file_errhandler_function *eh_file_fn;
MPI_Win_errhandler_function *eh_win_fn;
ompi_errhandler_fortran_handler_fn_t *eh_fort_fn;

View file

@ -356,7 +356,8 @@ static inline struct ompi_proc_t *ompi_group_dense_lookup (ompi_group_t *group,
ompi_proc_t *real_proc =
(ompi_proc_t *) ompi_proc_for_name (ompi_proc_sentinel_to_name ((uintptr_t) proc));
if (opal_atomic_compare_exchange_strong_ptr (group->grp_proc_pointers + peer_id, &proc, real_proc)) {
if (opal_atomic_compare_exchange_strong_ptr ((opal_atomic_intptr_t *)(group->grp_proc_pointers + peer_id),
(intptr_t *) &proc, (intptr_t) real_proc)) {
OBJ_RETAIN(real_proc);
}

View file

@ -385,11 +385,11 @@ typedef int (MPI_Datarep_conversion_function)(void *, MPI_Datatype,
typedef void (MPI_Comm_errhandler_function)(MPI_Comm *, int *, ...);
/* This is a little hackish, but errhandler.h needs space for a
MPI_File_errhandler_fn. While it could just be removed, this
MPI_File_errhandler_function. While it could just be removed, this
allows us to maintain a stable ABI within OMPI, at least for
apps that don't use MPI I/O. */
typedef void (ompi_file_errhandler_fn)(MPI_File *, int *, ...);
typedef ompi_file_errhandler_fn MPI_File_errhandler_function;
typedef void (ompi_file_errhandler_function)(MPI_File *, int *, ...);
typedef ompi_file_errhandler_function MPI_File_errhandler_function;
typedef void (MPI_Win_errhandler_function)(MPI_Win *, int *, ...);
typedef void (MPI_User_function)(void *, void *, int *, MPI_Datatype *);
typedef int (MPI_Comm_copy_attr_function)(MPI_Comm, int, void *,
@ -412,7 +412,7 @@ typedef int (MPI_Grequest_cancel_function)(void *, int);
*/
typedef MPI_Comm_errhandler_function MPI_Comm_errhandler_fn
__mpi_interface_removed__("MPI_Comm_errhandler_fn was removed in MPI-3.0; use MPI_Comm_errhandler_function instead");
typedef ompi_file_errhandler_fn MPI_File_errhandler_fn
typedef ompi_file_errhandler_function MPI_File_errhandler_fn
__mpi_interface_removed__("MPI_File_errhandler_fn was removed in MPI-3.0; use MPI_File_errhandler_function instead");
typedef MPI_Win_errhandler_function MPI_Win_errhandler_fn
__mpi_interface_removed__("MPI_Win_errhandler_fn was removed in MPI-3.0; use MPI_Win_errhandler_function instead");
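For user code, the practical upshot is to use the MPI-3.0 name MPI_File_errhandler_function; the old MPI_File_errhandler_fn alias now exists only to keep old sources compiling and carries a removal warning. A hedged usage sketch (the handler body is illustrative):

    #include <mpi.h>
    #include <stdio.h>

    /* Matches MPI_File_errhandler_function: void (MPI_File *, int *, ...). */
    static void my_file_errors(MPI_File *fh, int *errcode, ...)
    {
        char msg[MPI_MAX_ERROR_STRING];
        int len;
        (void) fh;
        MPI_Error_string(*errcode, msg, &len);
        fprintf(stderr, "MPI-IO error: %s\n", msg);
    }

    int main(int argc, char **argv)
    {
        MPI_Errhandler eh;

        MPI_Init(&argc, &argv);
        MPI_File_create_errhandler(my_file_errors, &eh);
        /* Setting it on MPI_FILE_NULL changes the default error handler
         * for files opened afterwards. */
        MPI_File_set_errhandler(MPI_FILE_NULL, eh);
        MPI_Errhandler_free(&eh);
        MPI_Finalize();
        return 0;
    }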
@ -1088,8 +1088,13 @@ OMPI_DECLSPEC extern struct ompi_predefined_datatype_t ompi_mpi_ub __mpi_interfa
#define MPI_LONG_INT OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_long_int)
#define MPI_SHORT_INT OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_short_int)
#define MPI_2INT OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_2int)
#if !OMPI_OMIT_MPI1_COMPAT_DECLS
/*
* Removed datatypes
*/
#define MPI_UB OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_ub)
#define MPI_LB OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_lb)
#endif
#define MPI_WCHAR OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_wchar)
#if OPAL_HAVE_LONG_LONG
#define MPI_LONG_LONG_INT OMPI_PREDEFINED_GLOBAL(MPI_Datatype, ompi_mpi_long_long_int)

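When OMPI_OMIT_MPI1_COMPAT_DECLS is set, the removed MPI_LB/MPI_UB markers disappear from mpi.h; code that used them to pad a datatype's extent should use MPI_Type_create_resized instead. A hedged sketch of the replacement:

    #include <mpi.h>

    /* Give a 3-int contiguous type a 4-int extent -- the job MPI_UB
     * markers used to do before their removal in MPI-3.0. */
    static MPI_Datatype make_padded_type(void)
    {
        MPI_Datatype contig, padded;

        MPI_Type_contiguous(3, MPI_INT, &contig);
        MPI_Type_create_resized(contig, 0, (MPI_Aint)(4 * sizeof(int)), &padded);
        MPI_Type_commit(&padded);
        MPI_Type_free(&contig);
        return padded;
    }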
View file

@ -90,7 +90,7 @@ int ompi_coll_base_allgather_intra_bruck(const void *sbuf, int scount,
mca_coll_base_module_t *module)
{
int line = -1, rank, size, sendto, recvfrom, distance, blockcount, err = 0;
ptrdiff_t slb, rlb, sext, rext;
ptrdiff_t rlb, rext;
char *tmpsend = NULL, *tmprecv = NULL;
size = ompi_comm_size(comm);
@ -99,9 +99,6 @@ int ompi_coll_base_allgather_intra_bruck(const void *sbuf, int scount,
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
"coll:base:allgather_intra_bruck rank %d", rank));
err = ompi_datatype_get_extent (sdtype, &slb, &sext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
@ -262,7 +259,7 @@ ompi_coll_base_allgather_intra_recursivedoubling(const void *sbuf, int scount,
{
int line = -1, rank, size, pow2size, err;
int remote, distance, sendblocklocation;
ptrdiff_t slb, rlb, sext, rext;
ptrdiff_t rlb, rext;
char *tmpsend = NULL, *tmprecv = NULL;
size = ompi_comm_size(comm);
@ -289,9 +286,6 @@ ompi_coll_base_allgather_intra_recursivedoubling(const void *sbuf, int scount,
"coll:base:allgather_intra_recursivedoubling rank %d, size %d",
rank, size));
err = ompi_datatype_get_extent (sdtype, &slb, &sext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
@ -369,7 +363,7 @@ int ompi_coll_base_allgather_intra_ring(const void *sbuf, int scount,
mca_coll_base_module_t *module)
{
int line = -1, rank, size, err, sendto, recvfrom, i, recvdatafrom, senddatafrom;
ptrdiff_t slb, rlb, sext, rext;
ptrdiff_t rlb, rext;
char *tmpsend = NULL, *tmprecv = NULL;
size = ompi_comm_size(comm);
@ -378,9 +372,6 @@ int ompi_coll_base_allgather_intra_ring(const void *sbuf, int scount,
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
"coll:base:allgather_intra_ring rank %d", rank));
err = ompi_datatype_get_extent (sdtype, &slb, &sext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
@ -499,7 +490,7 @@ ompi_coll_base_allgather_intra_neighborexchange(const void *sbuf, int scount,
{
int line = -1, rank, size, i, even_rank, err;
int neighbor[2], offset_at_step[2], recv_data_from[2], send_data_from;
ptrdiff_t slb, rlb, sext, rext;
ptrdiff_t rlb, rext;
char *tmpsend = NULL, *tmprecv = NULL;
size = ompi_comm_size(comm);
@ -517,9 +508,6 @@ ompi_coll_base_allgather_intra_neighborexchange(const void *sbuf, int scount,
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
"coll:base:allgather_intra_neighborexchange rank %d", rank));
err = ompi_datatype_get_extent (sdtype, &slb, &sext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
@ -616,7 +604,7 @@ int ompi_coll_base_allgather_intra_two_procs(const void *sbuf, int scount,
{
int line = -1, err, rank, remote;
char *tmpsend = NULL, *tmprecv = NULL;
ptrdiff_t sext, rext, lb;
ptrdiff_t rext, lb;
rank = ompi_comm_rank(comm);
@ -627,9 +615,6 @@ int ompi_coll_base_allgather_intra_two_procs(const void *sbuf, int scount,
return MPI_ERR_UNSUPPORTED_OPERATION;
}
err = ompi_datatype_get_extent (sdtype, &lb, &sext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
err = ompi_datatype_get_extent (rdtype, &lb, &rext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }

View file

@ -100,7 +100,7 @@ int ompi_coll_base_allgatherv_intra_bruck(const void *sbuf, int scount,
{
int line = -1, err = 0, rank, size, sendto, recvfrom, distance, blockcount, i;
int *new_rcounts = NULL, *new_rdispls = NULL, *new_scounts = NULL, *new_sdispls = NULL;
ptrdiff_t slb, rlb, sext, rext;
ptrdiff_t rlb, rext;
char *tmpsend = NULL, *tmprecv = NULL;
struct ompi_datatype_t *new_rdtype, *new_sdtype;
@ -110,9 +110,6 @@ int ompi_coll_base_allgatherv_intra_bruck(const void *sbuf, int scount,
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
"coll:base:allgather_intra_bruck rank %d", rank));
err = ompi_datatype_get_extent (sdtype, &slb, &sext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
@ -229,7 +226,7 @@ int ompi_coll_base_allgatherv_intra_ring(const void *sbuf, int scount,
mca_coll_base_module_t *module)
{
int line = -1, rank, size, sendto, recvfrom, i, recvdatafrom, senddatafrom, err = 0;
ptrdiff_t slb, rlb, sext, rext;
ptrdiff_t rlb, rext;
char *tmpsend = NULL, *tmprecv = NULL;
size = ompi_comm_size(comm);
@ -238,9 +235,6 @@ int ompi_coll_base_allgatherv_intra_ring(const void *sbuf, int scount,
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
"coll:base:allgatherv_intra_ring rank %d", rank));
err = ompi_datatype_get_extent (sdtype, &slb, &sext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
@ -361,7 +355,7 @@ ompi_coll_base_allgatherv_intra_neighborexchange(const void *sbuf, int scount,
int line = -1, rank, size, i, even_rank, err = 0;
int neighbor[2], offset_at_step[2], recv_data_from[2], send_data_from;
int new_scounts[2], new_sdispls[2], new_rcounts[2], new_rdispls[2];
ptrdiff_t slb, rlb, sext, rext;
ptrdiff_t rlb, rext;
char *tmpsend = NULL, *tmprecv = NULL;
struct ompi_datatype_t *new_rdtype, *new_sdtype;
@ -381,9 +375,6 @@ ompi_coll_base_allgatherv_intra_neighborexchange(const void *sbuf, int scount,
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
"coll:base:allgatherv_intra_neighborexchange rank %d", rank));
err = ompi_datatype_get_extent (sdtype, &slb, &sext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
err = ompi_datatype_get_extent (rdtype, &rlb, &rext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
@ -509,7 +500,7 @@ int ompi_coll_base_allgatherv_intra_two_procs(const void *sbuf, int scount,
{
int line = -1, err = 0, rank, remote;
char *tmpsend = NULL, *tmprecv = NULL;
ptrdiff_t sext, rext, lb;
ptrdiff_t rext, lb;
rank = ompi_comm_rank(comm);
@ -520,9 +511,6 @@ int ompi_coll_base_allgatherv_intra_two_procs(const void *sbuf, int scount,
return MPI_ERR_UNSUPPORTED_OPERATION;
}
err = ompi_datatype_get_extent (sdtype, &lb, &sext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
err = ompi_datatype_get_extent (rdtype, &lb, &rext);
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }

View file

@ -350,7 +350,7 @@ ompi_coll_base_allreduce_intra_ring(const void *sbuf, void *rbuf, int count,
char *tmpsend = NULL, *tmprecv = NULL, *inbuf[2] = {NULL, NULL};
ptrdiff_t true_lb, true_extent, lb, extent;
ptrdiff_t block_offset, max_real_segsize;
ompi_request_t *reqs[2] = {NULL, NULL};
ompi_request_t *reqs[2] = {MPI_REQUEST_NULL, MPI_REQUEST_NULL};
size = ompi_comm_size(comm);
rank = ompi_comm_rank(comm);
@ -528,6 +528,7 @@ ompi_coll_base_allreduce_intra_ring(const void *sbuf, void *rbuf, int count,
error_hndl:
OPAL_OUTPUT((ompi_coll_base_framework.framework_output, "%s:%4d\tRank %d Error occurred %d\n",
__FILE__, line, rank, ret));
ompi_coll_base_free_reqs(reqs, 2);
(void)line; // silence compiler warning
if (NULL != inbuf[0]) free(inbuf[0]);
if (NULL != inbuf[1]) free(inbuf[1]);
@ -627,7 +628,7 @@ ompi_coll_base_allreduce_intra_ring_segmented(const void *sbuf, void *rbuf, int
size_t typelng;
char *tmpsend = NULL, *tmprecv = NULL, *inbuf[2] = {NULL, NULL};
ptrdiff_t block_offset, max_real_segsize;
ompi_request_t *reqs[2] = {NULL, NULL};
ompi_request_t *reqs[2] = {MPI_REQUEST_NULL, MPI_REQUEST_NULL};
ptrdiff_t lb, extent, gap;
size = ompi_comm_size(comm);
@ -847,6 +848,7 @@ ompi_coll_base_allreduce_intra_ring_segmented(const void *sbuf, void *rbuf, int
error_hndl:
OPAL_OUTPUT((ompi_coll_base_framework.framework_output, "%s:%4d\tRank %d Error occurred %d\n",
__FILE__, line, rank, ret));
ompi_coll_base_free_reqs(reqs, 2);
(void)line; // silence compiler warning
if (NULL != inbuf[0]) free(inbuf[0]);
if (NULL != inbuf[1]) free(inbuf[1]);

View file

@ -393,6 +393,7 @@ int ompi_coll_base_alltoall_intra_linear_sync(const void *sbuf, int scount,
if (0 < total_reqs) {
reqs = ompi_coll_base_comm_get_reqs(module->base_data, 2 * total_reqs);
if (NULL == reqs) { error = -1; line = __LINE__; goto error_hndl; }
reqs[0] = reqs[1] = MPI_REQUEST_NULL;
}
prcv = (char *) rbuf;
@ -468,6 +469,15 @@ int ompi_coll_base_alltoall_intra_linear_sync(const void *sbuf, int scount,
return MPI_SUCCESS;
error_hndl:
/* find a real error code */
if (MPI_ERR_IN_STATUS == error) {
for( ri = 0; ri < nreqs; ri++ ) {
if (MPI_REQUEST_NULL == reqs[ri]) continue;
if (MPI_ERR_PENDING == reqs[ri]->req_status.MPI_ERROR) continue;
error = reqs[ri]->req_status.MPI_ERROR;
break;
}
}
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
"%s:%4d\tError occurred %d, rank %2d", __FILE__, line, error,
rank));
@ -661,7 +671,16 @@ int ompi_coll_base_alltoall_intra_basic_linear(const void *sbuf, int scount,
if (MPI_SUCCESS != err) { line = __LINE__; goto err_hndl; }
err_hndl:
if( MPI_SUCCESS != err ) {
if (MPI_SUCCESS != err) {
/* find a real error code */
if (MPI_ERR_IN_STATUS == err) {
for( i = 0; i < nreqs; i++ ) {
if (MPI_REQUEST_NULL == req[i]) continue;
if (MPI_ERR_PENDING == req[i]->req_status.MPI_ERROR) continue;
err = req[i]->req_status.MPI_ERROR;
break;
}
}
OPAL_OUTPUT( (ompi_coll_base_framework.framework_output,"%s:%4d\tError occurred %d, rank %2d",
__FILE__, line, err, rank) );
(void)line; // silence compiler warning

View file

@ -3,7 +3,7 @@
* Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
* University Research and Technology
* Corporation. All rights reserved.
* Copyright (c) 2004-2016 The University of Tennessee and The University
* Copyright (c) 2004-2017 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
@ -276,6 +276,15 @@ ompi_coll_base_alltoallv_intra_basic_linear(const void *sbuf, const int *scounts
err = ompi_request_wait_all(nreqs, reqs, MPI_STATUSES_IGNORE);
err_hndl:
/* find a real error code */
if (MPI_ERR_IN_STATUS == err) {
for( i = 0; i < nreqs; i++ ) {
if (MPI_REQUEST_NULL == reqs[i]) continue;
if (MPI_ERR_PENDING == reqs[i]->req_status.MPI_ERROR) continue;
err = reqs[i]->req_status.MPI_ERROR;
break;
}
}
/* Free the requests in all cases as they are persistent */
ompi_coll_base_free_reqs(reqs, nreqs);

View file

@ -3,7 +3,7 @@
* Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
* University Research and Technology
* Corporation. All rights reserved.
* Copyright (c) 2004-2016 The University of Tennessee and The University
* Copyright (c) 2004-2017 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
@ -102,8 +102,10 @@ int ompi_coll_base_barrier_intra_doublering(struct ompi_communicator_t *comm,
{
int rank, size, err = 0, line = 0, left, right;
rank = ompi_comm_rank(comm);
size = ompi_comm_size(comm);
if( 1 == size )
return OMPI_SUCCESS;
rank = ompi_comm_rank(comm);
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,"ompi_coll_base_barrier_intra_doublering rank %d", rank));
@ -172,8 +174,10 @@ int ompi_coll_base_barrier_intra_recursivedoubling(struct ompi_communicator_t *c
{
int rank, size, adjsize, err, line, mask, remote;
rank = ompi_comm_rank(comm);
size = ompi_comm_size(comm);
if( 1 == size )
return OMPI_SUCCESS;
rank = ompi_comm_rank(comm);
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
"ompi_coll_base_barrier_intra_recursivedoubling rank %d",
rank));
@ -251,8 +255,10 @@ int ompi_coll_base_barrier_intra_bruck(struct ompi_communicator_t *comm,
{
int rank, size, distance, to, from, err, line = 0;
rank = ompi_comm_rank(comm);
size = ompi_comm_size(comm);
if( 1 == size )
return MPI_SUCCESS;
rank = ompi_comm_rank(comm);
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
"ompi_coll_base_barrier_intra_bruck rank %d", rank));
@ -285,16 +291,19 @@ int ompi_coll_base_barrier_intra_bruck(struct ompi_communicator_t *comm,
int ompi_coll_base_barrier_intra_two_procs(struct ompi_communicator_t *comm,
mca_coll_base_module_t *module)
{
int remote, err;
int remote, size, err;
size = ompi_comm_size(comm);
if( 1 == size )
return MPI_SUCCESS;
if( 2 != ompi_comm_size(comm) ) {
return MPI_ERR_UNSUPPORTED_OPERATION;
}
remote = ompi_comm_rank(comm);
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
"ompi_coll_base_barrier_intra_two_procs rank %d", remote));
if (2 != ompi_comm_size(comm)) {
return MPI_ERR_UNSUPPORTED_OPERATION;
}
remote = (remote + 1) & 0x1;
err = ompi_coll_base_sendrecv_zero(remote, MCA_COLL_BASE_TAG_BARRIER,
@ -324,8 +333,10 @@ int ompi_coll_base_barrier_intra_basic_linear(struct ompi_communicator_t *comm,
int i, err, rank, size, line;
ompi_request_t** requests = NULL;
rank = ompi_comm_rank(comm);
size = ompi_comm_size(comm);
if( 1 == size )
return MPI_SUCCESS;
rank = ompi_comm_rank(comm);
/* All non-root send & receive zero-length message. */
if (rank > 0) {
@ -367,11 +378,21 @@ int ompi_coll_base_barrier_intra_basic_linear(struct ompi_communicator_t *comm,
/* All done */
return MPI_SUCCESS;
err_hndl:
if( NULL != requests ) {
/* find a real error code */
if (MPI_ERR_IN_STATUS == err) {
for( i = 0; i < size; i++ ) {
if (MPI_REQUEST_NULL == requests[i]) continue;
if (MPI_ERR_PENDING == requests[i]->req_status.MPI_ERROR) continue;
err = requests[i]->req_status.MPI_ERROR;
break;
}
}
ompi_coll_base_free_reqs(requests, size);
}
OPAL_OUTPUT( (ompi_coll_base_framework.framework_output,"%s:%4d\tError occurred %d, rank %2d",
__FILE__, line, err, rank) );
(void)line; // silence compiler warning
if( NULL != requests )
ompi_coll_base_free_reqs(requests, size);
return err;
}
/* copied function (with appropriate renaming) ends here */
@ -385,8 +406,10 @@ int ompi_coll_base_barrier_intra_tree(struct ompi_communicator_t *comm,
{
int rank, size, depth, err, jump, partner;
rank = ompi_comm_rank(comm);
size = ompi_comm_size(comm);
if( 1 == size )
return MPI_SUCCESS;
rank = ompi_comm_rank(comm);
OPAL_OUTPUT((ompi_coll_base_framework.framework_output,
"ompi_coll_base_barrier_intra_tree %d",
rank));

View file

@ -3,7 +3,7 @@
* Copyright (c) 2004-2005 The Trustees of Indiana University and Indiana
* University Research and Technology
* Corporation. All rights reserved.
* Copyright (c) 2004-2016 The University of Tennessee and The University
* Copyright (c) 2004-2017 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2004-2005 High Performance Computing Center Stuttgart,
@ -214,13 +214,29 @@ ompi_coll_base_bcast_intra_generic( void* buffer,
return (MPI_SUCCESS);
error_hndl:
if (MPI_ERR_IN_STATUS == err) {
for( req_index = 0; req_index < 2; req_index++ ) {
if (MPI_REQUEST_NULL == recv_reqs[req_index]) continue;
if (MPI_ERR_PENDING == recv_reqs[req_index]->req_status.MPI_ERROR) continue;
err = recv_reqs[req_index]->req_status.MPI_ERROR;
break;
}
}
ompi_coll_base_free_reqs( recv_reqs, 2);
if( NULL != send_reqs ) {
if (MPI_ERR_IN_STATUS == err) {
for( req_index = 0; req_index < tree->tree_nextsize; req_index++ ) {
if (MPI_REQUEST_NULL == send_reqs[req_index]) continue;
if (MPI_ERR_PENDING == send_reqs[req_index]->req_status.MPI_ERROR) continue;
err = send_reqs[req_index]->req_status.MPI_ERROR;
break;
}
}
ompi_coll_base_free_reqs(send_reqs, tree->tree_nextsize);
}
OPAL_OUTPUT( (ompi_coll_base_framework.framework_output,"%s:%4d\tError occurred %d, rank %2d",
__FILE__, line, err, rank) );
(void)line; // silence compiler warnings
ompi_coll_base_free_reqs( recv_reqs, 2);
if( NULL != send_reqs ) {
ompi_coll_base_free_reqs(send_reqs, tree->tree_nextsize);
}
return err;
}
@ -649,12 +665,21 @@ ompi_coll_base_bcast_intra_basic_linear(void *buff, int count,
* care what the error was -- just that there *was* an error. The
* PML will finish all requests, even if one or more of them fail.
* i.e., by the end of this call, all the requests are free-able.
* So free them anyway -- even if there was an error, and return
* the error after we free everything. */
* So free them anyway -- even if there was an error.
* Note we still need to get the actual error, as collective
* operations cannot return MPI_ERR_IN_STATUS.
*/
err = ompi_request_wait_all(i, reqs, MPI_STATUSES_IGNORE);
err_hndl:
if( MPI_SUCCESS != err ) { /* Free the reqs */
/* first find the real error code */
for( preq = reqs; preq < reqs+i; preq++ ) {
if (MPI_REQUEST_NULL == *preq) continue;
if (MPI_ERR_PENDING == (*preq)->req_status.MPI_ERROR) continue;
err = (*preq)->req_status.MPI_ERROR;
break;
}
ompi_coll_base_free_reqs(reqs, i);
}
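Each of these error paths repeats the same recovery: when the waitall reports MPI_ERR_IN_STATUS, scan the requests for the first status carrying a concrete error (skipping null requests and MPI_ERR_PENDING), because a collective must hand back a real error code rather than MPI_ERR_IN_STATUS. A hypothetical helper, shown only to summarize that repeated loop (it is not part of this change and assumes the surrounding ompi headers for ompi_request_t):

    static int coll_base_first_real_error(ompi_request_t **reqs, int nreqs, int err)
    {
        if (MPI_ERR_IN_STATUS != err) {
            return err;                 /* already a concrete error code */
        }
        for (int i = 0; i < nreqs; i++) {
            if (MPI_REQUEST_NULL == reqs[i]) continue;
            if (MPI_ERR_PENDING == reqs[i]->req_status.MPI_ERROR) continue;
            return reqs[i]->req_status.MPI_ERROR;
        }
        return err;
    }

Initializing the request arrays to MPI_REQUEST_NULL (as the allreduce changes above and the reduce_scatter change further below do) is what makes this scan, and the unconditional ompi_coll_base_free_reqs afterwards, safe.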

View file

@ -326,6 +326,15 @@ ompi_coll_base_gather_intra_linear_sync(const void *sbuf, int scount,
return MPI_SUCCESS;
error_hndl:
if (NULL != reqs) {
/* find a real error code */
if (MPI_ERR_IN_STATUS == ret) {
for( i = 0; i < size; i++ ) {
if (MPI_REQUEST_NULL == reqs[i]) continue;
if (MPI_ERR_PENDING == reqs[i]->req_status.MPI_ERROR) continue;
ret = reqs[i]->req_status.MPI_ERROR;
break;
}
}
ompi_coll_base_free_reqs(reqs, size);
}
OPAL_OUTPUT (( ompi_coll_base_framework.framework_output,

View file

@ -338,16 +338,34 @@ int ompi_coll_base_reduce_generic( const void* sendbuf, void* recvbuf, int origi
return OMPI_SUCCESS;
error_hndl: /* error handler */
/* find a real error code */
if (MPI_ERR_IN_STATUS == ret) {
for( i = 0; i < 2; i++ ) {
if (MPI_REQUEST_NULL == reqs[i]) continue;
if (MPI_ERR_PENDING == reqs[i]->req_status.MPI_ERROR) continue;
ret = reqs[i]->req_status.MPI_ERROR;
break;
}
}
ompi_coll_base_free_reqs(reqs, 2);
if( NULL != sreq ) {
if (MPI_ERR_IN_STATUS == ret) {
for( i = 0; i < max_outstanding_reqs; i++ ) {
if (MPI_REQUEST_NULL == sreq[i]) continue;
if (MPI_ERR_PENDING == sreq[i]->req_status.MPI_ERROR) continue;
ret = sreq[i]->req_status.MPI_ERROR;
break;
}
}
ompi_coll_base_free_reqs(sreq, max_outstanding_reqs);
}
if( inbuf_free[0] != NULL ) free(inbuf_free[0]);
if( inbuf_free[1] != NULL ) free(inbuf_free[1]);
if( accumbuf_free != NULL ) free(accumbuf);
OPAL_OUTPUT (( ompi_coll_base_framework.framework_output,
"ERROR_HNDL: node %d file %s line %d error %d\n",
rank, __FILE__, line, ret ));
(void)line; // silence compiler warning
if( inbuf_free[0] != NULL ) free(inbuf_free[0]);
if( inbuf_free[1] != NULL ) free(inbuf_free[1]);
if( accumbuf_free != NULL ) free(accumbuf);
if( NULL != sreq ) {
ompi_coll_base_free_reqs(sreq, max_outstanding_reqs);
}
return ret;
}

View file

@ -464,7 +464,7 @@ ompi_coll_base_reduce_scatter_intra_ring( const void *sbuf, void *rbuf, const in
char *tmpsend = NULL, *tmprecv = NULL, *accumbuf = NULL, *accumbuf_free = NULL;
char *inbuf_free[2] = {NULL, NULL}, *inbuf[2] = {NULL, NULL};
ptrdiff_t extent, max_real_segsize, dsize, gap = 0;
ompi_request_t *reqs[2] = {NULL, NULL};
ompi_request_t *reqs[2] = {MPI_REQUEST_NULL, MPI_REQUEST_NULL};
size = ompi_comm_size(comm);
rank = ompi_comm_rank(comm);

View file

@ -41,7 +41,7 @@ int ompi_coll_base_sendrecv_actual( const void* sendbuf, size_t scount,
{ /* post receive first, then send, then wait... should be fast (I hope) */
int err, line = 0;
size_t rtypesize, stypesize;
ompi_request_t *req;
ompi_request_t *req = MPI_REQUEST_NULL;
ompi_status_public_t rstatus;
/* post new irecv */


@ -71,12 +71,13 @@ BEGIN_C_DECLS
extern bool libnbc_ibcast_skip_dt_decision;
extern int libnbc_iexscan_algorithm;
extern int libnbc_iscan_algorithm;
struct ompi_coll_libnbc_component_t {
mca_coll_base_component_2_0_0_t super;
opal_free_list_t requests;
opal_list_t active_requests;
int32_t active_comms;
opal_atomic_int32_t active_comms;
opal_mutex_t lock; /* protect access to the active_requests list */
};
typedef struct ompi_coll_libnbc_component_t ompi_coll_libnbc_component_t;


@ -54,6 +54,14 @@ static mca_base_var_enum_value_t iexscan_algorithms[] = {
{0, NULL}
};
int libnbc_iscan_algorithm = 0; /* iscan user forced algorithm */
static mca_base_var_enum_value_t iscan_algorithms[] = {
{0, "ignore"},
{1, "linear"},
{2, "recursive_doubling"},
{0, NULL}
};
static int libnbc_open(void);
static int libnbc_close(void);
static int libnbc_register(void);
@ -177,6 +185,16 @@ libnbc_register(void)
&libnbc_iexscan_algorithm);
OBJ_RELEASE(new_enum);
libnbc_iscan_algorithm = 0;
(void) mca_base_var_enum_create("coll_libnbc_iscan_algorithms", iscan_algorithms, &new_enum);
mca_base_component_var_register(&mca_coll_libnbc_component.super.collm_version,
"iscan_algorithm",
"Which iscan algorithm is used: 0 ignore, 1 linear, 2 recursive_doubling",
MCA_BASE_VAR_TYPE_INT, new_enum, 0, MCA_BASE_VAR_FLAG_SETTABLE,
OPAL_INFO_LVL_5, MCA_BASE_VAR_SCOPE_ALL,
&libnbc_iscan_algorithm);
OBJ_RELEASE(new_enum);
return OMPI_SUCCESS;
}
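Once registered as above, the algorithm choice behaves like any other MCA variable. Assuming the usual framework_component_variable naming, it would typically be selected at run time with something like mpirun --mca coll_libnbc_iscan_algorithm recursive_doubling ./app; the full parameter name and the application name here are illustrative, derived from the registration shown above rather than quoted from documentation.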


@ -62,7 +62,6 @@ struct dict {
int (*_insert) __P((void *obj, void *k, void *d, int ow));
int (*_probe) __P((void *obj, void *key, void **dat));
void *(*_search) __P((void *obj, const void *k));
const void *(*_csearch) __P((const void *obj, const void *k));
int (*_remove) __P((void *obj, const void *key, int del));
void (*_walk) __P((void *obj, dict_vis_func func));
unsigned (*_count) __P((const void *obj));
@ -75,7 +74,6 @@ struct dict {
#define dict_insert(dct,k,d,o) (dct)->_insert((dct)->_object, (k), (d), (o))
#define dict_probe(dct,k,d) (dct)->_probe((dct)->_object, (k), (d))
#define dict_search(dct,k) (dct)->_search((dct)->_object, (k))
#define dict_csearch(dct,k) (dct)->_csearch((dct)->_object, (k))
#define dict_remove(dct,k,del) (dct)->_remove((dct)->_object, (k), (del))
#define dict_walk(dct,f) (dct)->_walk((dct)->_object, (f))
#define dict_count(dct) (dct)->_count((dct)->_object)


@ -15,7 +15,6 @@
typedef int (*insert_func) __P((void *, void *k, void *d, int o));
typedef int (*probe_func) __P((void *, void *k, void **d));
typedef void *(*search_func) __P((void *, const void *k));
typedef const void *(*csearch_func) __P((const void *, const void *k));
typedef int (*remove_func) __P((void *, const void *k, int d));
typedef void (*walk_func) __P((void *, dict_vis_func visit));
typedef unsigned (*count_func) __P((const void *));


@ -90,7 +90,6 @@ hb_dict_new(dict_cmp_func key_cmp, dict_del_func key_del,
dct->_insert = (insert_func)hb_tree_insert;
dct->_probe = (probe_func)hb_tree_probe;
dct->_search = (search_func)hb_tree_search;
dct->_csearch = (csearch_func)hb_tree_csearch;
dct->_remove = (remove_func)hb_tree_remove;
dct->_empty = (empty_func)hb_tree_empty;
dct->_walk = (walk_func)hb_tree_walk;
@ -170,12 +169,6 @@ hb_tree_search(hb_tree *tree, const void *key)
return NULL;
}
const void *
hb_tree_csearch(const hb_tree *tree, const void *key)
{
return hb_tree_csearch((hb_tree *)tree, key);
}
int
hb_tree_insert(hb_tree *tree, void *key, void *dat, int overwrite)
{


@ -26,7 +26,6 @@ void hb_tree_destroy __P((hb_tree *tree, int del));
int hb_tree_insert __P((hb_tree *tree, void *key, void *dat, int overwrite));
int hb_tree_probe __P((hb_tree *tree, void *key, void **dat));
void *hb_tree_search __P((hb_tree *tree, const void *key));
const void *hb_tree_csearch __P((const hb_tree *tree, const void *key));
int hb_tree_remove __P((hb_tree *tree, const void *key, int del));
void hb_tree_empty __P((hb_tree *tree, int del));
void hb_tree_walk __P((hb_tree *tree, dict_vis_func visit));


@ -11,8 +11,8 @@
* Copyright (c) 2012 Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
* Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
* $COPYRIGHT$
@ -130,7 +130,7 @@ int ompi_coll_libnbc_iallgatherv(const void* sendbuf, int sendcount, MPI_Datatyp
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}
@ -209,7 +209,7 @@ int ompi_coll_libnbc_iallgatherv_inter(const void* sendbuf, int sendcount, MPI_D
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -7,8 +7,8 @@
* rights reserved.
* Copyright (c) 2013-2017 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
* Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
* $COPYRIGHT$
@ -206,7 +206,7 @@ int ompi_coll_libnbc_iallreduce(const void* sendbuf, void* recvbuf, int count, M
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}
@ -289,7 +289,7 @@ int ompi_coll_libnbc_iallreduce_inter(const void* sendbuf, void* recvbuf, int co
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -292,7 +292,7 @@ int ompi_coll_libnbc_ialltoall(const void* sendbuf, int sendcount, MPI_Datatype
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}
@ -376,7 +376,7 @@ int ompi_coll_libnbc_ialltoall_inter (const void* sendbuf, int sendcount, MPI_Da
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -5,8 +5,8 @@
* Corporation. All rights reserved.
* Copyright (c) 2006 The Technical University of Chemnitz. All
* rights reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2015-2017 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
@ -153,7 +153,7 @@ int ompi_coll_libnbc_ialltoallv(const void* sendbuf, const int *sendcounts, cons
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}
@ -241,7 +241,7 @@ int ompi_coll_libnbc_ialltoallv_inter (const void* sendbuf, const int *sendcount
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -5,8 +5,8 @@
* Corporation. All rights reserved.
* Copyright (c) 2006 The Technical University of Chemnitz. All
* rights reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2015-2017 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
@ -139,7 +139,7 @@ int ompi_coll_libnbc_ialltoallw(const void* sendbuf, const int *sendcounts, cons
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}
@ -214,7 +214,7 @@ int ompi_coll_libnbc_ialltoallw_inter(const void* sendbuf, const int *sendcounts
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -7,8 +7,8 @@
* rights reserved.
* Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2015 Mellanox Technologies. All rights reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
* Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
@ -108,7 +108,7 @@ int ompi_coll_libnbc_ibarrier(struct ompi_communicator_t *comm, ompi_request_t *
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}
@ -195,7 +195,7 @@ int ompi_coll_libnbc_ibarrier_inter(struct ompi_communicator_t *comm, ompi_reque
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -5,8 +5,8 @@
* Corporation. All rights reserved.
* Copyright (c) 2006 The Technical University of Chemnitz. All
* rights reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2016-2017 IBM Corporation. All rights reserved.
@ -182,7 +182,7 @@ int ompi_coll_libnbc_ibcast(void *buffer, int count, MPI_Datatype datatype, int
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}
@ -405,7 +405,7 @@ int ompi_coll_libnbc_ibcast_inter(void *buffer, int count, MPI_Datatype datatype
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -7,8 +7,8 @@
* rights reserved.
* Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
* Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
* $COPYRIGHT$
@ -176,7 +176,7 @@ int ompi_coll_libnbc_iexscan(const void* sendbuf, void* recvbuf, int count, MPI_
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -8,8 +8,8 @@
* Copyright (c) 2013 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
@ -185,7 +185,7 @@ int ompi_coll_libnbc_igather(const void* sendbuf, int sendcount, MPI_Datatype se
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}
@ -265,7 +265,7 @@ int ompi_coll_libnbc_igather_inter(const void* sendbuf, int sendcount, MPI_Datat
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -8,8 +8,8 @@
* Copyright (c) 2013 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2015 Mellanox Technologies. All rights reserved.
@ -117,7 +117,7 @@ int ompi_coll_libnbc_igatherv(const void* sendbuf, int sendcount, MPI_Datatype s
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}
@ -197,7 +197,7 @@ int ompi_coll_libnbc_igatherv_inter(const void* sendbuf, int sendcount, MPI_Data
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -5,8 +5,8 @@
* Corporation. All rights reserved.
* Copyright (c) 2006 The Technical University of Chemnitz. All
* rights reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
@ -173,7 +173,7 @@ int ompi_coll_libnbc_ineighbor_allgather(const void *sbuf, int scount, MPI_Datat
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}
@ -181,157 +181,6 @@ int ompi_coll_libnbc_ineighbor_allgather(const void *sbuf, int scount, MPI_Datat
return OMPI_SUCCESS;
}
/* better binomial bcast
* working principle:
* - each node gets a virtual rank vrank
* - the 'root' node get vrank 0
* - node 0 gets the vrank of the 'root'
* - all other ranks stay identical (they do not matter)
*
* Algorithm:
* - each node with vrank > 2^r and vrank < 2^r+1 receives from node
* vrank - 2^r (vrank=1 receives from 0, vrank 0 receives never)
* - each node sends each round r to node vrank + 2^r
* - a node stops to send if 2^r > commsize
*/
#define RANK2VRANK(rank, vrank, root) \
{ \
vrank = rank; \
if (rank == 0) vrank = root; \
if (rank == root) vrank = 0; \
}
#define VRANK2RANK(rank, vrank, root) \
{ \
rank = vrank; \
if (vrank == 0) rank = root; \
if (vrank == root) rank = 0; \
}
static inline int bcast_sched_binomial(int rank, int p, int root, NBC_Schedule *schedule, void *buffer, int count, MPI_Datatype datatype) {
int maxr, vrank, peer, res;
maxr = (int)ceil((log((double)p)/LOG2));
RANK2VRANK(rank, vrank, root);
/* receive from the right hosts */
if (vrank != 0) {
for (int r = 0 ; r < maxr ; ++r) {
if ((vrank >= (1 << r)) && (vrank < (1 << (r + 1)))) {
VRANK2RANK(peer, vrank - (1 << r), root);
res = NBC_Sched_recv (buffer, false, count, datatype, peer, schedule, false);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
return res;
}
}
}
res = NBC_Sched_barrier (schedule);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
return res;
}
}
/* now send to the right hosts */
for (int r = 0 ; r < maxr ; ++r) {
if (((vrank + (1 << r) < p) && (vrank < (1 << r))) || (vrank == 0)) {
VRANK2RANK(peer, vrank + (1 << r), root);
res = NBC_Sched_send (buffer, false, count, datatype, peer, schedule, false);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
return res;
}
}
}
return OMPI_SUCCESS;
}
/* simple linear MPI_Ibcast */
static inline int bcast_sched_linear(int rank, int p, int root, NBC_Schedule *schedule, void *buffer, int count, MPI_Datatype datatype) {
int res;
/* send to all others */
if(rank == root) {
for (int peer = 0 ; peer < p ; ++peer) {
if (peer != root) {
/* send msg to peer */
res = NBC_Sched_send (buffer, false, count, datatype, peer, schedule, false);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
return res;
}
}
}
} else {
/* recv msg from root */
res = NBC_Sched_recv (buffer, false, count, datatype, root, schedule, false);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
return res;
}
}
return OMPI_SUCCESS;
}
/* simple chained MPI_Ibcast */
static inline int bcast_sched_chain(int rank, int p, int root, NBC_Schedule *schedule, void *buffer, int count, MPI_Datatype datatype, int fragsize, size_t size) {
int res, vrank, rpeer, speer, numfrag, fragcount, thiscount;
MPI_Aint ext;
char *buf;
RANK2VRANK(rank, vrank, root);
VRANK2RANK(rpeer, vrank-1, root);
VRANK2RANK(speer, vrank+1, root);
res = ompi_datatype_type_extent(datatype, &ext);
if (MPI_SUCCESS != res) {
NBC_Error("MPI Error in ompi_datatype_type_extent() (%i)", res);
return res;
}
if (count == 0) {
return OMPI_SUCCESS;
}
numfrag = count * size/fragsize;
if ((count * size) % fragsize != 0) {
numfrag++;
}
fragcount = count/numfrag;
for (int fragnum = 0 ; fragnum < numfrag ; ++fragnum) {
buf = (char *) buffer + fragnum * fragcount * ext;
thiscount = fragcount;
if (fragnum == numfrag-1) {
/* last fragment may not be full */
thiscount = count - fragcount * fragnum;
}
/* root does not receive */
if (vrank != 0) {
res = NBC_Sched_recv (buf, false, thiscount, datatype, rpeer, schedule, true);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
return res;
}
}
/* last rank does not send */
if (vrank != p-1) {
res = NBC_Sched_send (buf, false, thiscount, datatype, speer, schedule, false);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
return res;
}
/* this barrier here seems awaward but isn't!!!! */
if (vrank == 0) {
res = NBC_Sched_barrier (schedule);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
return res;
}
}
}
}
return OMPI_SUCCESS;
}
int ompi_coll_libnbc_neighbor_allgather_init(const void *sbuf, int scount, MPI_Datatype stype, void *rbuf,
int rcount, MPI_Datatype rtype, struct ompi_communicator_t *comm,


@ -5,8 +5,8 @@
* Corporation. All rights reserved.
* Copyright (c) 2006 The Technical University of Chemnitz. All
* rights reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
@ -175,7 +175,7 @@ int ompi_coll_libnbc_ineighbor_allgatherv(const void *sbuf, int scount, MPI_Data
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -5,8 +5,8 @@
* Corporation. All rights reserved.
* Copyright (c) 2006 The Technical University of Chemnitz. All
* rights reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
@ -177,7 +177,7 @@ int ompi_coll_libnbc_ineighbor_alltoall(const void *sbuf, int scount, MPI_Dataty
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -5,8 +5,8 @@
* Corporation. All rights reserved.
* Copyright (c) 2006 The Technical University of Chemnitz. All
* rights reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
@ -182,7 +182,7 @@ int ompi_coll_libnbc_ineighbor_alltoallv(const void *sbuf, const int *scounts, c
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -5,8 +5,8 @@
* Corporation. All rights reserved.
* Copyright (c) 2006 The Technical University of Chemnitz. All
* rights reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
@ -167,7 +167,7 @@ int ompi_coll_libnbc_ineighbor_alltoallw(const void *sbuf, const int *scounts, c
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -516,6 +516,11 @@ static inline int NBC_Unpack(void *src, int srccount, MPI_Datatype srctype, void
int res;
ptrdiff_t ext, lb;
res = ompi_datatype_pack_external_size("external32", srccount, srctype, &size);
if (OMPI_SUCCESS != res) {
NBC_Error ("MPI Error in ompi_datatype_pack_external_size() (%i)", res);
return res;
}
#if OPAL_CUDA_SUPPORT
if(NBC_Type_intrinsic(srctype) && !(opal_cuda_check_bufs((char *)tgt, (char *)src))) {
#else
@ -523,7 +528,6 @@ static inline int NBC_Unpack(void *src, int srccount, MPI_Datatype srctype, void
#endif /* OPAL_CUDA_SUPPORT */
/* if we have the same types and they are contiguous (intrinsic
* types are contiguous), we can just use a single memcpy */
res = ompi_datatype_pack_external_size("external32", srccount, srctype, &size);
res = ompi_datatype_get_extent (srctype, &lb, &ext);
if (OMPI_SUCCESS != res) {
NBC_Error ("MPI Error in MPI_Type_extent() (%i)", res);


@ -7,8 +7,8 @@
* rights reserved.
* Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
* Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
* $COPYRIGHT$
@ -218,7 +218,7 @@ int ompi_coll_libnbc_ireduce(const void* sendbuf, void* recvbuf, int count, MPI_
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}
@ -284,7 +284,7 @@ int ompi_coll_libnbc_ireduce_inter(const void* sendbuf, void* recvbuf, int count
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -7,8 +7,8 @@
* rights reserved.
* Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2015 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
@ -219,7 +219,7 @@ int ompi_coll_libnbc_ireduce_scatter (const void* sendbuf, void* recvbuf, const
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}
@ -361,7 +361,7 @@ int ompi_coll_libnbc_ireduce_scatter_inter (const void* sendbuf, void* recvbuf,
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -8,8 +8,8 @@
* Copyright (c) 2012 Sandia National Laboratories. All rights reserved.
* Copyright (c) 2013-2015 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
* Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
* $COPYRIGHT$
@ -217,7 +217,7 @@ int ompi_coll_libnbc_ireduce_scatter_block(const void* sendbuf, void* recvbuf, i
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}
@ -356,7 +356,7 @@ int ompi_coll_libnbc_ireduce_scatter_block_inter(const void* sendbuf, void* recv
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -5,8 +5,8 @@
* Corporation. All rights reserved.
* Copyright (c) 2006 The Technical University of Chemnitz. All
* rights reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
@ -18,8 +18,20 @@
* Author(s): Torsten Hoefler <htor@cs.indiana.edu>
*
*/
#include "opal/include/opal/align.h"
#include "ompi/op/op.h"
#include "nbc_internal.h"
static inline int scan_sched_linear(
int rank, int comm_size, const void *sendbuf, void *recvbuf, int count,
MPI_Datatype datatype, MPI_Op op, char inplace, NBC_Schedule *schedule,
void *tmpbuf);
static inline int scan_sched_recursivedoubling(
int rank, int comm_size, const void *sendbuf, void *recvbuf,
int count, MPI_Datatype datatype, MPI_Op op, char inplace,
NBC_Schedule *schedule, void *tmpbuf1, void *tmpbuf2);
#ifdef NBC_CACHE_SCHEDULE
/* tree comparison function for schedule cache */
int NBC_Scan_args_compare(NBC_Scan_args *a, NBC_Scan_args *b, void *param) {
@ -39,27 +51,41 @@ int NBC_Scan_args_compare(NBC_Scan_args *a, NBC_Scan_args *b, void *param) {
}
#endif
/* linear iscan
* working principle:
* 1. each node (but node 0) receives from left neighbor
* 2. performs op
* 3. all but rank p-1 do sends to it's right neighbor and exits
*
*/
static int nbc_scan_init(const void* sendbuf, void* recvbuf, int count, MPI_Datatype datatype, MPI_Op op,
struct ompi_communicator_t *comm, ompi_request_t ** request,
struct mca_coll_base_module_2_3_0_t *module, bool persistent) {
int rank, p, res;
ptrdiff_t gap, span;
NBC_Schedule *schedule;
void *tmpbuf = NULL;
char inplace;
ompi_coll_libnbc_module_t *libnbc_module = (ompi_coll_libnbc_module_t*) module;
int rank, p, res;
ptrdiff_t gap, span;
NBC_Schedule *schedule;
void *tmpbuf = NULL, *tmpbuf1 = NULL, *tmpbuf2 = NULL;
enum { NBC_SCAN_LINEAR, NBC_SCAN_RDBL } alg;
char inplace;
ompi_coll_libnbc_module_t *libnbc_module = (ompi_coll_libnbc_module_t*) module;
NBC_IN_PLACE(sendbuf, recvbuf, inplace);
NBC_IN_PLACE(sendbuf, recvbuf, inplace);
rank = ompi_comm_rank (comm);
p = ompi_comm_size (comm);
rank = ompi_comm_rank (comm);
p = ompi_comm_size (comm);
if (count == 0) {
return nbc_get_noop_request(persistent, request);
}
span = opal_datatype_span(&datatype->super, count, &gap);
if (libnbc_iscan_algorithm == 2) {
alg = NBC_SCAN_RDBL;
ptrdiff_t span_align = OPAL_ALIGN(span, datatype->super.align, ptrdiff_t);
tmpbuf = malloc(span_align + span);
if (NULL == tmpbuf) { return OMPI_ERR_OUT_OF_RESOURCE; }
tmpbuf1 = (void *)(-gap);
tmpbuf2 = (char *)(span_align) - gap;
} else {
alg = NBC_SCAN_LINEAR;
if (rank > 0) {
tmpbuf = malloc(span);
if (NULL == tmpbuf) { return OMPI_ERR_OUT_OF_RESOURCE; }
}
}
#ifdef NBC_CACHE_SCHEDULE
NBC_Scan_args *args, *found, search;
@ -75,60 +101,28 @@ static int nbc_scan_init(const void* sendbuf, void* recvbuf, int count, MPI_Data
#endif
schedule = OBJ_NEW(NBC_Schedule);
if (OPAL_UNLIKELY(NULL == schedule)) {
return OMPI_ERR_OUT_OF_RESOURCE;
}
if (!inplace) {
/* copy data to receivebuf */
res = NBC_Sched_copy ((void *)sendbuf, false, count, datatype,
recvbuf, false, count, datatype, schedule, false);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
OBJ_RELEASE(schedule);
return res;
}
}
if(rank != 0) {
span = opal_datatype_span(&datatype->super, count, &gap);
tmpbuf = malloc (span);
if (NULL == tmpbuf) {
OBJ_RELEASE(schedule);
free(tmpbuf);
return OMPI_ERR_OUT_OF_RESOURCE;
}
/* we have to wait until we have the data */
res = NBC_Sched_recv ((void *)(-gap), true, count, datatype, rank-1, schedule, true);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
OBJ_RELEASE(schedule);
free(tmpbuf);
return res;
}
/* perform the reduce in my local buffer */
/* this cannot be done until tmpbuf is unused :-( so barrier after the op */
res = NBC_Sched_op ((void *)(-gap), true, recvbuf, false, count, datatype, op, schedule,
true);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
OBJ_RELEASE(schedule);
free(tmpbuf);
return res;
}
}
if (rank != p-1) {
res = NBC_Sched_send (recvbuf, false, count, datatype, rank+1, schedule, false);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
OBJ_RELEASE(schedule);
free(tmpbuf);
return res;
}
if (alg == NBC_SCAN_LINEAR) {
res = scan_sched_linear(rank, p, sendbuf, recvbuf, count, datatype,
op, inplace, schedule, tmpbuf);
} else {
res = scan_sched_recursivedoubling(rank, p, sendbuf, recvbuf, count,
datatype, op, inplace, schedule, tmpbuf1, tmpbuf2);
}
res = NBC_Sched_commit (schedule);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
OBJ_RELEASE(schedule);
free(tmpbuf);
return res;
OBJ_RELEASE(schedule);
free(tmpbuf);
return res;
}
res = NBC_Sched_commit(schedule);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
OBJ_RELEASE(schedule);
free(tmpbuf);
return res;
}
#ifdef NBC_CACHE_SCHEDULE
@ -162,14 +156,160 @@ static int nbc_scan_init(const void* sendbuf, void* recvbuf, int count, MPI_Data
}
#endif
res = NBC_Schedule_request(schedule, comm, libnbc_module, persistent, request, tmpbuf);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
OBJ_RELEASE(schedule);
free(tmpbuf);
return res;
}
res = NBC_Schedule_request(schedule, comm, libnbc_module, persistent, request, tmpbuf);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
OBJ_RELEASE(schedule);
free(tmpbuf);
return res;
}
return OMPI_SUCCESS;
return OMPI_SUCCESS;
}
/*
* scan_sched_linear:
*
* Function: Linear algorithm for inclusive scan.
* Accepts: Same as MPI_Iscan
* Returns: MPI_SUCCESS or error code
*
* Working principle:
* 1. Each process (except process 0) receives from its left neighbor
* 2. Performs op
* 3. All but rank p-1 send to their right neighbor and exit
*
* Schedule length: O(1)
*/
static inline int scan_sched_linear(
int rank, int comm_size, const void *sendbuf, void *recvbuf, int count,
MPI_Datatype datatype, MPI_Op op, char inplace, NBC_Schedule *schedule,
void *tmpbuf)
{
int res = OMPI_SUCCESS;
if (!inplace) {
/* Copy data to recvbuf */
res = NBC_Sched_copy((void *)sendbuf, false, count, datatype,
recvbuf, false, count, datatype, schedule, false);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
}
if (rank > 0) {
ptrdiff_t gap;
opal_datatype_span(&datatype->super, count, &gap);
/* We have to wait until we have the data */
res = NBC_Sched_recv((void *)(-gap), true, count, datatype, rank - 1, schedule, true);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
/* Perform the reduce in my local buffer */
/* this cannot be done until tmpbuf is unused :-( so barrier after the op */
res = NBC_Sched_op((void *)(-gap), true, recvbuf, false, count, datatype, op, schedule,
true);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
}
if (rank != comm_size - 1) {
res = NBC_Sched_send(recvbuf, false, count, datatype, rank + 1, schedule, false);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
}
cleanup_and_return:
return res;
}
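Read as ordinary blocking MPI, the schedule built above amounts to the following sketch. It is illustrative only: scan_linear_model is a made-up name, error handling is omitted, and plain MPI calls stand in for the NBC_Sched_* entries.

/* Illustration of the linear scan pattern: receive the prefix of ranks 0..r-1,
 * combine it into my own contribution, and forward the result to the right. */
static void scan_linear_model(void *recvbuf, void *tmpbuf, int count,
                              MPI_Datatype dtype, MPI_Op op,
                              int rank, int comm_size, MPI_Comm comm)
{
    if (rank > 0) {
        MPI_Recv(tmpbuf, count, dtype, rank - 1, 0, comm, MPI_STATUS_IGNORE);
        MPI_Reduce_local(tmpbuf, recvbuf, count, dtype, op); /* recvbuf = tmpbuf op recvbuf */
    }
    if (rank < comm_size - 1) {
        MPI_Send(recvbuf, count, dtype, rank + 1, 0, comm);
    }
}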
/*
* scan_sched_recursivedoubling:
*
* Function: Recursive doubling algorithm for inclusive scan.
* Accepts: Same as MPI_Iscan
* Returns: MPI_SUCCESS or error code
*
* Description: Implements recursive doubling algorithm for MPI_Iscan.
* The algorithm preserves the order of operations, so it can
* be used with both commutative and non-commutative operations.
*
* Example for 5 processes and commutative operation MPI_SUM:
* Process: 0 1 2 3 4
* recvbuf: [0] [1] [2] [3] [4]
* psend: [0] [1] [2] [3] [4]
*
* Step 1:
* recvbuf: [0] [0+1] [2] [2+3] [4]
* psend: [1+0] [0+1] [3+2] [2+3] [4]
*
* Step 2:
* recvbuf: [0] [0+1] [(1+0)+2] [(1+0)+(2+3)] [4]
* psend: [(3+2)+(1+0)] [(2+3)+(0+1)] [(1+0)+(3+2)] [(1+0)+(2+3)] [4]
*
* Step 3:
* recvbuf: [0] [0+1] [(1+0)+2] [(1+0)+(2+3)] [((3+2)+(1+0))+4]
* psend: [4+((3+2)+(1+0))] [((3+2)+(1+0))+4]
*
* Time complexity (worst case): \ceil(\log_2(p))(2\alpha + 2m\beta + 2m\gamma)
* Memory requirements (per process): 2 * count * typesize = O(count)
* Limitations: intra-communicators only
* Schedule length: O(log(p))
*/
static inline int scan_sched_recursivedoubling(
int rank, int comm_size, const void *sendbuf, void *recvbuf, int count,
MPI_Datatype datatype, MPI_Op op, char inplace,
NBC_Schedule *schedule, void *tmpbuf1, void *tmpbuf2)
{
int res = OMPI_SUCCESS;
if (!inplace) {
res = NBC_Sched_copy((void *)sendbuf, false, count, datatype,
recvbuf, false, count, datatype, schedule, true);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
}
if (comm_size < 2)
goto cleanup_and_return;
char *psend = (char *)tmpbuf1;
char *precv = (char *)tmpbuf2;
res = NBC_Sched_copy(recvbuf, false, count, datatype,
psend, true, count, datatype, schedule, true);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
int is_commute = ompi_op_is_commute(op);
for (int mask = 1; mask < comm_size; mask <<= 1) {
int remote = rank ^ mask;
if (remote < comm_size) {
res = NBC_Sched_send(psend, true, count, datatype, remote, schedule, false);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
res = NBC_Sched_recv(precv, true, count, datatype, remote, schedule, true);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
if (rank > remote) {
/* Accumulate prefix reduction: recvbuf = precv <op> recvbuf */
res = NBC_Sched_op(precv, true, recvbuf, false, count,
datatype, op, schedule, false);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
/* Partial result: psend = precv <op> psend */
res = NBC_Sched_op(precv, true, psend, true, count,
datatype, op, schedule, true);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
} else {
if (is_commute) {
/* psend = precv <op> psend */
res = NBC_Sched_op(precv, true, psend, true, count,
datatype, op, schedule, true);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
} else {
/* precv = psend <op> precv */
res = NBC_Sched_op(psend, true, precv, true, count,
datatype, op, schedule, true);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) { goto cleanup_and_return; }
char *tmp = psend;
psend = precv;
precv = tmp;
}
}
}
}
cleanup_and_return:
return res;
}
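As a rough mental model of the schedule built by scan_sched_recursivedoubling, the same exchange-and-accumulate pattern can be written with ordinary blocking MPI calls. This is a sketch for illustration only: it assumes a commutative op, ignores errors, and uses MPI_Sendrecv/MPI_Reduce_local instead of NBC_Sched_* entries; the function name scan_rdbl_model is made up.

/* Illustration of the recursive-doubling scan pattern (commutative op assumed).
 * On entry, recvbuf holds this rank's contribution and psend holds a copy of it;
 * precv is scratch space of the same size. */
static void scan_rdbl_model(void *recvbuf, void *psend, void *precv, int count,
                            MPI_Datatype dtype, MPI_Op op,
                            int rank, int comm_size, MPI_Comm comm)
{
    for (int mask = 1; mask < comm_size; mask <<= 1) {
        int remote = rank ^ mask;
        if (remote >= comm_size) continue;
        /* pairwise exchange of the running partial results */
        MPI_Sendrecv(psend, count, dtype, remote, 0,
                     precv, count, dtype, remote, 0,
                     comm, MPI_STATUS_IGNORE);
        if (rank > remote) {
            /* the lower half of the pair contributes to my prefix result */
            MPI_Reduce_local(precv, recvbuf, count, dtype, op);
        }
        /* fold the remote contribution into my running partial result */
        MPI_Reduce_local(precv, psend, count, dtype, op);
    }
}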
int ompi_coll_libnbc_iscan(const void* sendbuf, void* recvbuf, int count, MPI_Datatype datatype, MPI_Op op,
@ -182,7 +322,7 @@ int ompi_coll_libnbc_iscan(const void* sendbuf, void* recvbuf, int count, MPI_Da
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -10,8 +10,8 @@
* Copyright (c) 2013 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
* Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
* $COPYRIGHT$
@ -179,7 +179,7 @@ int ompi_coll_libnbc_iscatter (const void* sendbuf, int sendcount, MPI_Datatype
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}
@ -258,7 +258,7 @@ int ompi_coll_libnbc_iscatter_inter (const void* sendbuf, int sendcount, MPI_Dat
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -10,8 +10,8 @@
* Copyright (c) 2013 The University of Tennessee and The University
* of Tennessee Research Foundation. All rights
* reserved.
* Copyright (c) 2014-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2014-2018 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2017 IBM Corporation. All rights reserved.
* Copyright (c) 2018 FUJITSU LIMITED. All rights reserved.
* $COPYRIGHT$
@ -114,7 +114,7 @@ int ompi_coll_libnbc_iscatterv(const void* sendbuf, const int *sendcounts, const
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}
@ -192,7 +192,7 @@ int ompi_coll_libnbc_iscatterv_inter(const void* sendbuf, const int *sendcounts,
}
res = NBC_Start(*(ompi_coll_libnbc_request_t **)request);
if (OPAL_UNLIKELY(OMPI_SUCCESS != res)) {
NBC_Return_handle ((ompi_coll_libnbc_request_t *)request);
NBC_Return_handle (*(ompi_coll_libnbc_request_t **)request);
*request = &ompi_request_null.request;
return res;
}


@ -36,7 +36,7 @@ struct mca_coll_monitoring_module_t {
mca_coll_base_module_t super;
mca_coll_base_comm_coll_t real;
mca_monitoring_coll_data_t*data;
int32_t is_initialized;
opal_atomic_int32_t is_initialized;
};
typedef struct mca_coll_monitoring_module_t mca_coll_monitoring_module_t;
OMPI_DECLSPEC OBJ_CLASS_DECLARATION(mca_coll_monitoring_module_t);


@ -1,6 +1,7 @@
/* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
/*
* Copyright (c) 2013-2015 Sandia National Laboratories. All rights reserved.
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
* Copyright (c) 2015-2018 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2015 Bull SAS. All rights reserved.
* Copyright (c) 2015 Research Organization for Information Science
@ -91,7 +92,7 @@ typedef struct ompi_coll_portals4_tree_t {
struct mca_coll_portals4_module_t {
mca_coll_base_module_t super;
size_t coll_count;
opal_atomic_size_t coll_count;
/* record handlers dedicated to fallback if offloaded operations are not supported */
mca_coll_base_module_reduce_fn_t previous_reduce;


@ -114,7 +114,7 @@ BEGIN_C_DECLS
typedef struct mca_coll_sm_in_use_flag_t {
/** Number of processes currently using this set of
segments */
volatile uint32_t mcsiuf_num_procs_using;
opal_atomic_uint32_t mcsiuf_num_procs_using;
/** Must match data->mcb_count */
volatile uint32_t mcsiuf_operation_count;
} mca_coll_sm_in_use_flag_t;
@ -152,7 +152,7 @@ BEGIN_C_DECLS
/** Pointer to my parent's barrier control pages (will be NULL
for communicator rank 0; odd index pages are "in", even
index pages are "out") */
uint32_t *mcb_barrier_control_parent;
opal_atomic_uint32_t *mcb_barrier_control_parent;
/** Pointers to my children's barrier control pages (they're
contiguous in memory, so we only point to the base -- the


@ -56,7 +56,8 @@ int mca_coll_sm_barrier_intra(struct ompi_communicator_t *comm,
int rank, buffer_set;
mca_coll_sm_comm_t *data;
uint32_t i, num_children;
volatile uint32_t *me_in, *me_out, *parent, *children = NULL;
volatile uint32_t *me_in, *me_out, *children = NULL;
opal_atomic_uint32_t *parent;
int uint_control_size;
mca_coll_sm_module_t *sm_module = (mca_coll_sm_module_t*) module;


@ -372,7 +372,7 @@ int ompi_coll_sm_lazy_enable(mca_coll_base_module_t *module,
data->mcb_barrier_control_me = (uint32_t*)
(base + (rank * control_size * num_barrier_buffers * 2));
if (data->mcb_tree[rank].mcstn_parent) {
data->mcb_barrier_control_parent = (uint32_t*)
data->mcb_barrier_control_parent = (opal_atomic_uint32_t*)
(base +
(data->mcb_tree[rank].mcstn_parent->mcstn_id * control_size *
num_barrier_buffers * 2));


@ -7,7 +7,7 @@
* Copyright (c) 2015 Bull SAS. All rights reserved.
* Copyright (c) 2016-2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* Copyright (c) 2017 Los Alamos National Security, LLC. All rights
* Copyright (c) 2017-2018 Los Alamos National Security, LLC. All rights
* reserved.
* $COPYRIGHT$
*
@ -34,7 +34,7 @@
/*** Monitoring specific variables ***/
/* Keeps track of how many components are currently using the common part */
static int32_t mca_common_monitoring_hold = 0;
static opal_atomic_int32_t mca_common_monitoring_hold = 0;
/* Output parameters */
int mca_common_monitoring_output_stream_id = -1;
static opal_output_stream_t mca_common_monitoring_output_stream_obj = {
@ -61,18 +61,18 @@ static char* mca_common_monitoring_initial_filename = "";
static char* mca_common_monitoring_current_filename = NULL;
/* array for storing monitoring data */
static size_t* pml_data = NULL;
static size_t* pml_count = NULL;
static size_t* filtered_pml_data = NULL;
static size_t* filtered_pml_count = NULL;
static size_t* osc_data_s = NULL;
static size_t* osc_count_s = NULL;
static size_t* osc_data_r = NULL;
static size_t* osc_count_r = NULL;
static size_t* coll_data = NULL;
static size_t* coll_count = NULL;
static opal_atomic_size_t* pml_data = NULL;
static opal_atomic_size_t* pml_count = NULL;
static opal_atomic_size_t* filtered_pml_data = NULL;
static opal_atomic_size_t* filtered_pml_count = NULL;
static opal_atomic_size_t* osc_data_s = NULL;
static opal_atomic_size_t* osc_count_s = NULL;
static opal_atomic_size_t* osc_data_r = NULL;
static opal_atomic_size_t* osc_count_r = NULL;
static opal_atomic_size_t* coll_data = NULL;
static opal_atomic_size_t* coll_count = NULL;
static size_t* size_histogram = NULL;
static opal_atomic_size_t* size_histogram = NULL;
static const int max_size_histogram = 66;
static double log10_2 = 0.;
@ -241,7 +241,7 @@ void mca_common_monitoring_finalize( void )
opal_output_close(mca_common_monitoring_output_stream_id);
free(mca_common_monitoring_output_stream_obj.lds_prefix);
/* Free internal data structure */
free(pml_data); /* a single allocation */
free((void *) pml_data); /* a single allocation */
opal_hash_table_remove_all( common_monitoring_translation_ht );
OBJ_RELEASE(common_monitoring_translation_ht);
mca_common_monitoring_coll_finalize();
@ -446,7 +446,7 @@ int mca_common_monitoring_add_procs(struct ompi_proc_t **procs,
if( NULL == pml_data ) {
int array_size = (10 + max_size_histogram) * nprocs_world;
pml_data = (size_t*)calloc(array_size, sizeof(size_t));
pml_data = (opal_atomic_size_t*)calloc(array_size, sizeof(size_t));
pml_count = pml_data + nprocs_world;
filtered_pml_data = pml_count + nprocs_world;
filtered_pml_count = filtered_pml_data + nprocs_world;
@ -493,7 +493,7 @@ int mca_common_monitoring_add_procs(struct ompi_proc_t **procs,
static void mca_common_monitoring_reset( void )
{
int array_size = (10 + max_size_histogram) * nprocs_world;
memset(pml_data, 0, array_size * sizeof(size_t));
memset((void *) pml_data, 0, array_size * sizeof(size_t));
mca_common_monitoring_coll_reset();
}


@ -30,12 +30,12 @@ struct mca_monitoring_coll_data_t {
int world_rank;
int is_released;
ompi_communicator_t*p_comm;
size_t o2a_count;
size_t o2a_size;
size_t a2o_count;
size_t a2o_size;
size_t a2a_count;
size_t a2a_size;
opal_atomic_size_t o2a_count;
opal_atomic_size_t o2a_size;
opal_atomic_size_t a2o_count;
opal_atomic_size_t a2o_size;
opal_atomic_size_t a2a_count;
opal_atomic_size_t a2a_size;
};
/* Collectives operation monitoring */


@ -4,7 +4,7 @@
* reserved.
* Copyright (c) 2013-2017 Inria. All rights reserved.
* Copyright (c) 2013-2015 Bull SAS. All rights reserved.
* Copyright (c) 2016 Cisco Systems, Inc. All rights reserved.
* Copyright (c) 2016-2018 Cisco Systems, Inc. All rights reserved.
* Copyright (c) 2017 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
* $COPYRIGHT$
@ -42,10 +42,30 @@ writing 4x4 matrix to monitoring_avg.mat
*/
#include "ompi_config.h"
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <string.h>
#include <stdbool.h>
#if OMPI_BUILD_FORTRAN_BINDINGS
// Set these #defines in the same way that
// ompi/mpi/fortran/mpif-h/Makefile.am does when compiling the real
// Fortran mpif.h bindings. They set behaviors in the Fortran header
// files so that we can compile properly.
#define OMPI_BUILD_MPI_PROFILING 0
#define OMPI_COMPILING_FORTRAN_WRAPPERS 1
#endif
#include "opal/threads/thread_usage.h"
#include "ompi/include/mpi.h"
#include "ompi/mpi/fortran/base/constants.h"
#include "ompi/mpi/fortran/base/fint_2_int.h"
#if OMPI_BUILD_FORTRAN_BINDINGS
#include "ompi/mpi/fortran/mpif-h/bindings.h"
#endif
static MPI_T_pvar_session session;
static int comm_world_size;
@ -383,12 +403,6 @@ int write_mat(char * filename, size_t * mat, unsigned int dim)
* MPI binding for fortran
*/
#include <stdbool.h>
#include "ompi_config.h"
#include "opal/threads/thread_usage.h"
#include "ompi/mpi/fortran/base/constants.h"
#include "ompi/mpi/fortran/base/fint_2_int.h"
void monitoring_prof_mpi_init_f2c( MPI_Fint * );
void monitoring_prof_mpi_finalize_f2c( MPI_Fint * );
@ -423,8 +437,6 @@ void monitoring_prof_mpi_finalize_f2c( MPI_Fint *ierr ) {
#pragma weak MPI_Finalize_f = monitoring_prof_mpi_finalize_f2c
#pragma weak MPI_Finalize_f08 = monitoring_prof_mpi_finalize_f2c
#elif OMPI_BUILD_FORTRAN_BINDINGS
#define OMPI_F77_PROTOTYPES_MPI_H
#include "ompi/mpi/fortran/mpif-h/bindings.h"
OMPI_GENERATE_F77_BINDINGS (MPI_INIT,
mpi_init,


@ -34,7 +34,7 @@ static opal_mutex_t mca_common_ompio_cuda_mutex; /* lock for thread saf
static mca_allocator_base_component_t* mca_common_ompio_allocator_component=NULL;
static mca_allocator_base_module_t* mca_common_ompio_allocator=NULL;
static int32_t mca_common_ompio_cuda_init = 0;
static opal_atomic_int32_t mca_common_ompio_cuda_init = 0;
static int32_t mca_common_ompio_pagesize=4096;
static void* mca_common_ompio_cuda_alloc_seg ( void *ctx, size_t *size );
static void mca_common_ompio_cuda_free_seg ( void *ctx, void *buf );


@ -124,7 +124,7 @@ ompi_mtl_ofi_component_register(void)
MCA_BASE_VAR_SCOPE_READONLY,
&param_priority);
prov_include = "psm,psm2,gni";
prov_include = NULL;
mca_base_component_var_register(&mca_mtl_ofi_component.super.mtl_version,
"provider_include",
"Comma-delimited list of OFI providers that are considered for use (e.g., \"psm,psm2\"; an empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_exclude.",
@ -133,7 +133,7 @@ ompi_mtl_ofi_component_register(void)
MCA_BASE_VAR_SCOPE_READONLY,
&prov_include);
prov_exclude = NULL;
prov_exclude = "shm,sockets,tcp,udp,rstream";
mca_base_component_var_register(&mca_mtl_ofi_component.super.mtl_version,
"provider_exclude",
"Comma-delimited list of OFI providers that are not considered for use (default: \"sockets,mxm\"; empty value means that all providers will be considered). Mutually exclusive with mtl_ofi_provider_include.",


@ -115,12 +115,12 @@ struct mca_mtl_portals4_module_t {
opal_mutex_t short_block_mutex;
/** number of send-side operations started */
uint64_t opcount;
opal_atomic_uint64_t opcount;
#if OPAL_ENABLE_DEBUG
/** number of receive-side operations started. Used only for
debugging */
uint64_t recv_opcount;
opal_atomic_uint64_t recv_opcount;
#endif
#if OMPI_MTL_PORTALS4_FLOW_CONTROL


@ -1,7 +1,7 @@
/* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
/*
* Copyright (c) 2012 Sandia National Laboratories. All rights reserved.
* Copyright (c) 2015-2017 Los Alamos National Security, LLC. All rights
* Copyright (c) 2015-2018 Los Alamos National Security, LLC. All rights
* reserved.
* $COPYRIGHT$
*


@ -36,7 +36,7 @@ OBJ_CLASS_DECLARATION(ompi_mtl_portals4_pending_request_t);
struct ompi_mtl_portals4_flowctl_t {
int32_t flowctl_active;
int32_t send_slots;
opal_atomic_int32_t send_slots;
int32_t max_send_slots;
opal_list_t pending_sends;
opal_free_list_t pending_fl;
@ -46,7 +46,7 @@ struct ompi_mtl_portals4_flowctl_t {
/** Flow control epoch counter. Triggered events should be
based on epoch counter. */
int64_t epoch_counter;
opal_atomic_int64_t epoch_counter;
/** Flow control trigger CT. Only has meaning at root. */
ptl_handle_ct_t trigger_ct_h;


@ -54,8 +54,8 @@ struct ompi_mtl_portals4_isend_request_t {
struct ompi_mtl_portals4_pending_request_t *pending;
#endif
ptl_size_t length;
int32_t pending_get;
uint32_t event_count;
opal_atomic_int32_t pending_get;
opal_atomic_uint32_t event_count;
};
typedef struct ompi_mtl_portals4_isend_request_t ompi_mtl_portals4_isend_request_t;
@ -76,7 +76,7 @@ struct ompi_mtl_portals4_recv_request_t {
void *delivery_ptr;
size_t delivery_len;
volatile bool req_started;
int32_t pending_reply;
opal_atomic_int32_t pending_reply;
#if OPAL_ENABLE_DEBUG
uint64_t opcount;
ptl_hdr_data_t hdr_data;


@ -50,7 +50,7 @@
OSC_MONITORING_SET_TEMPLATE_FCT_NAME(template) (ompi_osc_base_module_t*module) \
{ \
/* Define the ompi_osc_monitoring_module_## template ##_init_done variable */ \
static int32_t init_done = 0; \
opal_atomic_int32_t init_done = 0; \
/* Define and set the ompi_osc_monitoring_## template \
* ##_template variable. The functions recorded here are \
* linked to the original functions of the original \


@ -95,7 +95,7 @@ struct ompi_osc_portals4_module_t {
ptl_handle_md_t req_md_h; /* memory descriptor with event completion used by this window */
ptl_handle_me_t data_me_h; /* data match list entry (MB are CID | OSC_PORTALS4_MB_DATA) */
ptl_handle_me_t control_me_h; /* match list entry for control data (node_state_t). Match bits are (CID | OSC_PORTALS4_MB_CONTROL). */
int64_t opcount;
opal_atomic_int64_t opcount;
ptl_match_bits_t match_bits; /* match bits for module. Same as cid for comm in most cases. */
ptl_iovec_t *origin_iovec_list; /* list of memory segments that compose the noncontiguous region */


@ -189,7 +189,7 @@ number_of_fragments(ptl_size_t length, ptl_size_t maxlength)
/* put in segments no larger than segment_length */
static int
segmentedPut(int64_t *opcount,
segmentedPut(opal_atomic_int64_t *opcount,
ptl_handle_md_t md_h,
ptl_size_t origin_offset,
ptl_size_t put_length,
@ -236,7 +236,7 @@ segmentedPut(int64_t *opcount,
/* get in segments no larger than segment_length */
static int
segmentedGet(int64_t *opcount,
segmentedGet(opal_atomic_int64_t *opcount,
ptl_handle_md_t md_h,
ptl_size_t origin_offset,
ptl_size_t get_length,
@ -280,7 +280,7 @@ segmentedGet(int64_t *opcount,
/* atomic op in segments no larger than segment_length */
static int
segmentedAtomic(int64_t *opcount,
segmentedAtomic(opal_atomic_int64_t *opcount,
ptl_handle_md_t md_h,
ptl_size_t origin_offset,
ptl_size_t length,
@ -329,7 +329,7 @@ segmentedAtomic(int64_t *opcount,
/* atomic op in segments no larger than segment_length */
static int
segmentedFetchAtomic(int64_t *opcount,
segmentedFetchAtomic(opal_atomic_int64_t *opcount,
ptl_handle_md_t result_md_h,
ptl_size_t result_offset,
ptl_handle_md_t origin_md_h,
@ -381,7 +381,7 @@ segmentedFetchAtomic(int64_t *opcount,
/* swap in segments no larger than segment_length */
static int
segmentedSwap(int64_t *opcount,
segmentedSwap(opal_atomic_int64_t *opcount,
ptl_handle_md_t result_md_h,
ptl_size_t result_offset,
ptl_handle_md_t origin_md_h,
@ -1187,7 +1187,7 @@ fetch_atomic_to_iovec(ompi_osc_portals4_module_t *module,
/* put in the largest chunks possible given the noncontiguous restriction */
static int
put_to_noncontig(int64_t *opcount,
put_to_noncontig(opal_atomic_int64_t *opcount,
ptl_handle_md_t md_h,
const void *origin_address,
int origin_count,
@ -1521,7 +1521,7 @@ atomic_to_noncontig(ompi_osc_portals4_module_t *module,
/* get from a noncontiguous remote to an (non)contiguous local */
static int
get_from_noncontig(int64_t *opcount,
get_from_noncontig(opal_atomic_int64_t *opcount,
ptl_handle_md_t md_h,
const void *origin_address,
int origin_count,

View file

@ -1,7 +1,7 @@
/* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
/*
* Copyright (c) 2011-2013 Sandia National Laboratories. All rights reserved.
* Copyright (c) 2015 Los Alamos National Security, LLC. All rights
* Copyright (c) 2015-2018 Los Alamos National Security, LLC. All rights
* reserved.
* $COPYRIGHT$
*
@ -18,7 +18,7 @@
struct ompi_osc_portals4_request_t {
ompi_request_t super;
int32_t ops_expected;
volatile int32_t ops_committed;
opal_atomic_int32_t ops_committed;
};
typedef struct ompi_osc_portals4_request_t ompi_osc_portals4_request_t;
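
ops_committed is bumped from completion/event-handler context while another thread waits on it, which is why it moves from volatile int32_t to opal_atomic_int32_t (and why the segmented* helpers above now take an opal_atomic_int64_t *opcount). A minimal C11 sketch of that producer/waiter pattern, using standard atomics rather than the OPAL wrappers:

/* Not OPAL code: completion counter owned by a request, bumped per event. */
#include <stdatomic.h>
#include <stdint.h>

typedef struct {
    int32_t ops_expected;
    _Atomic int32_t ops_committed;   /* stand-in for opal_atomic_int32_t */
} request_t;

/* called once per completed network event, possibly from another thread */
static void event_completed(request_t *req)
{
    atomic_fetch_add_explicit(&req->ops_committed, 1, memory_order_release);
}

/* spin until every expected operation has committed
 * (real code would drive the progress engine instead of busy-waiting) */
static void wait_for_request(request_t *req)
{
    while (atomic_load_explicit(&req->ops_committed, memory_order_acquire)
           < req->ops_expected) {
        ;
    }
}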

View file

@ -8,7 +8,7 @@
* University of Stuttgart. All rights reserved.
* Copyright (c) 2004-2005 The Regents of the University of California.
* All rights reserved.
* Copyright (c) 2007-2017 Los Alamos National Security, LLC. All rights
* Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2010 Cisco Systems, Inc. All rights reserved.
* Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
@ -110,7 +110,7 @@ struct ompi_osc_pt2pt_peer_t {
int rank;
/** pointer to the current send fragment for each outgoing target */
struct ompi_osc_pt2pt_frag_t *active_frag;
opal_atomic_intptr_t active_frag;
/** lock for this peer */
opal_mutex_t lock;
@ -119,10 +119,10 @@ struct ompi_osc_pt2pt_peer_t {
opal_list_t queued_frags;
/** number of fragments incoming (negative - expected, positive - unsynchronized) */
volatile int32_t passive_incoming_frag_count;
opal_atomic_int32_t passive_incoming_frag_count;
/** peer flags */
volatile int32_t flags;
opal_atomic_int32_t flags;
};
typedef struct ompi_osc_pt2pt_peer_t ompi_osc_pt2pt_peer_t;
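
Most of the fields above were previously volatile int32_t. volatile only forces the compiler to re-read memory; it gives no atomicity or ordering, so concurrent read-modify-write updates such as setting a flag bit can still be lost, which is what the opal_atomic_* types are meant to fix. A small C11 illustration of the difference, independent of the OPAL macros:

/* Not OPAL code: volatile vs. atomic read-modify-write. */
#include <stdatomic.h>
#include <stdint.h>

static volatile int32_t flags_volatile;   /* re-read from memory, but NOT atomic */
static _Atomic int32_t  flags_atomic;     /* what opal_atomic_int32_t is assumed to provide */

static void set_flag_racy(int32_t bit)
{
    /* load, or, store: two threads can interleave here and drop a bit */
    flags_volatile |= bit;
}

static void set_flag_safe(int32_t bit)
{
    /* one indivisible read-modify-write */
    atomic_fetch_or_explicit(&flags_atomic, bit, memory_order_acq_rel);
}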
@ -208,16 +208,16 @@ struct ompi_osc_pt2pt_module_t {
/** Number of communication fragments started for this epoch, by
peer. Not in peer data to make fence more manageable. */
uint32_t *epoch_outgoing_frag_count;
opal_atomic_uint32_t *epoch_outgoing_frag_count;
/** cyclic counter for a unique tag for long messages. */
volatile uint32_t tag_counter;
opal_atomic_uint32_t tag_counter;
/** number of outgoing fragments still to be completed */
volatile int32_t outgoing_frag_count;
opal_atomic_int32_t outgoing_frag_count;
/** number of incoming fragments */
volatile int32_t active_incoming_frag_count;
opal_atomic_int32_t active_incoming_frag_count;
/** Number of targets locked/being locked */
unsigned int passive_target_access_epoch;
@ -230,13 +230,13 @@ struct ompi_osc_pt2pt_module_t {
/** Number of "count" messages from the remote complete group
we've received */
volatile int32_t num_complete_msgs;
opal_atomic_int32_t num_complete_msgs;
/* ********************* LOCK data ************************ */
/** Status of the local window lock. One of 0 (unlocked),
MPI_LOCK_EXCLUSIVE, or MPI_LOCK_SHARED. */
int32_t lock_status;
opal_atomic_int32_t lock_status;
/** lock for locks_pending list */
opal_mutex_t locks_pending_lock;
@ -526,7 +526,7 @@ static inline void mark_incoming_completion (ompi_osc_pt2pt_module_t *module, in
OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output,
"mark_incoming_completion marking passive incoming complete. module %p, source = %d, count = %d",
(void *) module, source, (int) peer->passive_incoming_frag_count + 1));
new_value = OPAL_THREAD_ADD_FETCH32((int32_t *) &peer->passive_incoming_frag_count, 1);
new_value = OPAL_THREAD_ADD_FETCH32((opal_atomic_int32_t *) &peer->passive_incoming_frag_count, 1);
if (0 == new_value) {
OPAL_THREAD_LOCK(&module->lock);
opal_condition_broadcast(&module->cond);
@ -550,7 +550,7 @@ static inline void mark_incoming_completion (ompi_osc_pt2pt_module_t *module, in
*/
static inline void mark_outgoing_completion (ompi_osc_pt2pt_module_t *module)
{
int32_t new_value = OPAL_THREAD_ADD_FETCH32((int32_t *) &module->outgoing_frag_count, 1);
int32_t new_value = OPAL_THREAD_ADD_FETCH32((opal_atomic_int32_t *) &module->outgoing_frag_count, 1);
OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output,
"mark_outgoing_completion: outgoing_frag_count = %d", new_value));
if (new_value >= 0) {
@ -574,12 +574,12 @@ static inline void mark_outgoing_completion (ompi_osc_pt2pt_module_t *module)
*/
static inline void ompi_osc_signal_outgoing (ompi_osc_pt2pt_module_t *module, int target, int count)
{
OPAL_THREAD_ADD_FETCH32((int32_t *) &module->outgoing_frag_count, -count);
OPAL_THREAD_ADD_FETCH32((opal_atomic_int32_t *) &module->outgoing_frag_count, -count);
if (MPI_PROC_NULL != target) {
OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output,
"ompi_osc_signal_outgoing_passive: target = %d, count = %d, total = %d", target,
count, module->epoch_outgoing_frag_count[target] + count));
OPAL_THREAD_ADD_FETCH32((int32_t *) (module->epoch_outgoing_frag_count + target), count);
OPAL_THREAD_ADD_FETCH32((opal_atomic_int32_t *) (module->epoch_outgoing_frag_count + target), count);
}
}
@ -717,7 +717,7 @@ static inline int get_tag(ompi_osc_pt2pt_module_t *module)
/* the LSB of the tag is used by the receiver to determine if the
message is a passive or active target (ie, where to mark
completion). */
int32_t tmp = OPAL_THREAD_ADD_FETCH32((volatile int32_t *) &module->tag_counter, 4);
int32_t tmp = OPAL_THREAD_ADD_FETCH32((opal_atomic_int32_t *) &module->tag_counter, 4);
return (tmp & OSC_PT2PT_FRAG_MASK) | !!(module->passive_target_access_epoch);
}
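
The casts in the hunks above only adapt the fields' new opal_atomic_* types to the pointer type OPAL_THREAD_ADD_FETCH32 expects; the macro itself appears to add to a 32-bit location and return the updated value. A rough standalone C11 equivalent of get_tag under that assumption (the mask value and helper names are illustrative, not the real constants):

/* Not OPAL code: add-and-fetch based tag generation. */
#include <stdatomic.h>
#include <stdint.h>

#define TAG_MASK 0x7ffffff0   /* stand-in for OSC_PT2PT_FRAG_MASK; real value not shown here */

static _Atomic int32_t tag_counter;

/* returns the value *after* the addition, the semantics assumed for OPAL_THREAD_ADD_FETCH32 */
static int32_t add_fetch_32(_Atomic int32_t *addr, int32_t delta)
{
    return atomic_fetch_add_explicit(addr, delta, memory_order_acq_rel) + delta;
}

static int32_t get_tag_sketch(int passive_epoch_active)
{
    int32_t tmp = add_fetch_32(&tag_counter, 4);
    /* the low bit flags passive vs. active target, as in the code above */
    return (tmp & TAG_MASK) | !!passive_epoch_active;
}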

View file

@ -8,7 +8,7 @@
* University of Stuttgart. All rights reserved.
* Copyright (c) 2004-2005 The Regents of the University of California.
* All rights reserved.
* Copyright (c) 2007-2016 Los Alamos National Security, LLC. All rights
* Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2010-2016 IBM Corporation. All rights reserved.
* Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
@ -166,17 +166,16 @@ int ompi_osc_pt2pt_fence(int assert, ompi_win_t *win)
"osc pt2pt: fence done sending"));
/* find out how much data everyone is going to send us. */
ret = module->comm->c_coll->coll_reduce_scatter_block (module->epoch_outgoing_frag_count,
&incoming_reqs, 1, MPI_UINT32_T,
MPI_SUM, module->comm,
module->comm->c_coll->coll_reduce_scatter_block_module);
ret = module->comm->c_coll->coll_reduce_scatter_block ((void *) module->epoch_outgoing_frag_count,
&incoming_reqs, 1, MPI_UINT32_T,
MPI_SUM, module->comm,
module->comm->c_coll->coll_reduce_scatter_block_module);
if (OMPI_SUCCESS != ret) {
return ret;
}
OPAL_THREAD_LOCK(&module->lock);
bzero(module->epoch_outgoing_frag_count,
sizeof(uint32_t) * ompi_comm_size(module->comm));
bzero ((void *) module->epoch_outgoing_frag_count, sizeof(uint32_t) * ompi_comm_size(module->comm));
OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output,
"osc pt2pt: fence expects %d requests",
@ -366,8 +365,11 @@ int ompi_osc_pt2pt_complete (ompi_win_t *win)
/* XXX -- TODO -- since fragments are always delivered in order we do not need to count anything but long
* requests. once that is done this can be removed. */
if (peer->active_frag && (peer->active_frag->remain_len < sizeof (complete_req))) {
++complete_req.frag_count;
if (peer->active_frag) {
ompi_osc_pt2pt_frag_t *active_frag = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
if (active_frag->remain_len < sizeof (complete_req)) {
++complete_req.frag_count;
}
}
OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output,

View file

@ -501,7 +501,7 @@ static void ompi_osc_pt2pt_peer_construct (ompi_osc_pt2pt_peer_t *peer)
{
OBJ_CONSTRUCT(&peer->queued_frags, opal_list_t);
OBJ_CONSTRUCT(&peer->lock, opal_mutex_t);
peer->active_frag = NULL;
peer->active_frag = 0;
peer->passive_incoming_frag_count = 0;
peer->flags = 0;
}
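
Because active_frag is now an opal_atomic_intptr_t rather than a struct pointer, its empty value is 0 instead of NULL, and later hunks take or clear it with opal_atomic_compare_exchange_strong_ptr. A small C11 sketch of that take-ownership pattern, assuming compare-exchange semantics on an intptr_t slot:

/* Not OPAL code: atomically detach the active fragment so only one caller finishes it. */
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef struct frag { int remain_len; } frag_t;

static _Atomic intptr_t active_frag;   /* 0 means "no active fragment" */

static frag_t *take_active_frag(void)
{
    intptr_t expected = atomic_load_explicit(&active_frag, memory_order_acquire);
    while (0 != expected) {
        if (atomic_compare_exchange_strong(&active_frag, &expected, (intptr_t) 0)) {
            return (frag_t *) expected;   /* we own the fragment now */
        }
        /* expected was refreshed with the current value; retry unless it became 0 */
    }
    return NULL;
}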

View file

@ -8,7 +8,7 @@
* University of Stuttgart. All rights reserved.
* Copyright (c) 2004-2005 The Regents of the University of California.
* All rights reserved.
* Copyright (c) 2007-2017 Los Alamos National Security, LLC. All rights
* Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2009-2011 Oracle and/or its affiliates. All rights reserved.
* Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
@ -56,7 +56,7 @@ struct osc_pt2pt_accumulate_data_t {
int peer;
ompi_datatype_t *datatype;
ompi_op_t *op;
int request_count;
opal_atomic_int32_t request_count;
};
typedef struct osc_pt2pt_accumulate_data_t osc_pt2pt_accumulate_data_t;

View file

@ -1,7 +1,7 @@
/* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
/*
* Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
* Copyright (c) 2014-2017 Los Alamos National Security, LLC. All rights
* Copyright (c) 2014-2018 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2015 Research Organization for Information Science
* and Technology (RIST). All rights reserved.
@ -65,7 +65,7 @@ int ompi_osc_pt2pt_frag_start (ompi_osc_pt2pt_module_t *module,
ompi_osc_pt2pt_peer_t *peer = ompi_osc_pt2pt_peer_lookup (module, frag->target);
int ret;
assert(0 == frag->pending && peer->active_frag != frag);
assert(0 == frag->pending && peer->active_frag != (intptr_t) frag);
/* we need to signal now that a frag is outgoing to ensure the count sent
* with the unlock message is correct */
@ -93,7 +93,7 @@ int ompi_osc_pt2pt_frag_start (ompi_osc_pt2pt_module_t *module,
static int ompi_osc_pt2pt_flush_active_frag (ompi_osc_pt2pt_module_t *module, ompi_osc_pt2pt_peer_t *peer)
{
ompi_osc_pt2pt_frag_t *active_frag = peer->active_frag;
ompi_osc_pt2pt_frag_t *active_frag = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
int ret = OMPI_SUCCESS;
if (NULL == active_frag) {
@ -105,7 +105,7 @@ static int ompi_osc_pt2pt_flush_active_frag (ompi_osc_pt2pt_module_t *module, om
"osc pt2pt: flushing active fragment to target %d. pending: %d",
active_frag->target, active_frag->pending));
if (opal_atomic_compare_exchange_strong_ptr (&peer->active_frag, &active_frag, NULL)) {
if (opal_atomic_compare_exchange_strong_ptr (&peer->active_frag, (intptr_t *) &active_frag, 0)) {
if (0 != OPAL_THREAD_ADD_FETCH32(&active_frag->pending, -1)) {
/* communication going on while synchronizing; this is an rma usage bug */
return OMPI_ERR_RMA_SYNC;

View file

@ -33,7 +33,7 @@ struct ompi_osc_pt2pt_frag_t {
char *top;
/* Number of operations which have started writing into the frag, but not yet completed doing so */
volatile int32_t pending;
opal_atomic_int32_t pending;
int32_t pending_long_sends;
ompi_osc_pt2pt_frag_header_t *header;
ompi_osc_pt2pt_module_t *module;
@ -66,8 +66,8 @@ static inline ompi_osc_pt2pt_frag_t *ompi_osc_pt2pt_frag_alloc_non_buffered (omp
ompi_osc_pt2pt_frag_t *curr;
/* to ensure ordering flush the buffer on the peer */
curr = peer->active_frag;
if (NULL != curr && opal_atomic_compare_exchange_strong_ptr (&peer->active_frag, &curr, NULL)) {
curr = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
if (NULL != curr && opal_atomic_compare_exchange_strong_ptr (&peer->active_frag, (intptr_t *) &curr, 0)) {
/* If there's something pending, the pending finish will
start the buffer. Otherwise, we need to start it now. */
int ret = ompi_osc_pt2pt_frag_finish (module, curr);
@ -131,7 +131,7 @@ static inline int _ompi_osc_pt2pt_frag_alloc (ompi_osc_pt2pt_module_t *module, i
OPAL_THREAD_LOCK(&module->lock);
if (buffered) {
curr = peer->active_frag;
curr = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
if (NULL == curr || curr->remain_len < request_len || (long_send && curr->pending_long_sends == 32)) {
curr = ompi_osc_pt2pt_frag_alloc_non_buffered (module, peer, request_len);
if (OPAL_UNLIKELY(NULL == curr)) {
@ -140,7 +140,7 @@ static inline int _ompi_osc_pt2pt_frag_alloc (ompi_osc_pt2pt_module_t *module, i
}
curr->pending_long_sends = long_send;
peer->active_frag = curr;
peer->active_frag = (uintptr_t) curr;
} else {
OPAL_THREAD_ADD_FETCH32(&curr->header->num_ops, 1);
curr->pending_long_sends += long_send;

View file

@ -8,7 +8,7 @@
* University of Stuttgart. All rights reserved.
* Copyright (c) 2004-2005 The Regents of the University of California.
* All rights reserved.
* Copyright (c) 2007-2015 Los Alamos National Security, LLC. All rights
* Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2010 Cisco Systems, Inc. All rights reserved.
* Copyright (c) 2010 Oracle and/or its affiliates. All rights reserved.
@ -180,7 +180,7 @@ typedef struct ompi_osc_pt2pt_header_flush_ack_t ompi_osc_pt2pt_header_flush_ack
struct ompi_osc_pt2pt_frag_header_t {
ompi_osc_pt2pt_header_base_t base;
uint32_t source; /* rank in window of source process */
int32_t num_ops; /* number of operations in this buffer */
opal_atomic_int32_t num_ops; /* number of operations in this buffer */
uint32_t pad; /* ensure the fragment header is a multiple of 8 bytes */
};
typedef struct ompi_osc_pt2pt_frag_header_t ompi_osc_pt2pt_frag_header_t;

View file

@ -8,7 +8,7 @@
* University of Stuttgart. All rights reserved.
* Copyright (c) 2004-2005 The Regents of the University of California.
* All rights reserved.
* Copyright (c) 2007-2016 Los Alamos National Security, LLC. All rights
* Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
* Copyright (c) 2015 Research Organization for Information Science
@ -104,13 +104,13 @@ int ompi_osc_pt2pt_free(ompi_win_t *win)
free (module->recv_frags);
}
if (NULL != module->epoch_outgoing_frag_count) free(module->epoch_outgoing_frag_count);
free ((void *) module->epoch_outgoing_frag_count);
if (NULL != module->comm) {
ompi_comm_free(&module->comm);
}
if (NULL != module->free_after) free(module->free_after);
free ((void *) module->free_after);
free (module);
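
epoch_outgoing_frag_count now points at atomic-qualified storage, so it is cast to void * before free() to avoid a qualifier-discard diagnostic, and the NULL guards can be dropped because free(NULL) is defined as a no-op. A tiny C11 illustration (the variable name is a placeholder):

/* Not OPAL code: freeing through an atomic-qualified pointer. */
#include <stdatomic.h>
#include <stdint.h>
#include <stdlib.h>

int main(void)
{
    _Atomic uint32_t *counts = calloc(4, sizeof(*counts));

    /* free(counts) would discard the _Atomic qualifier and typically draws a
     * compiler diagnostic; the explicit cast keeps the call clean, and
     * free(NULL) is harmless so no NULL check is needed. */
    free((void *) counts);
    return 0;
}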

View file

@ -8,7 +8,7 @@
* University of Stuttgart. All rights reserved.
* Copyright (c) 2004-2005 The Regents of the University of California.
* All rights reserved.
* Copyright (c) 2007-2017 Los Alamos National Security, LLC. All rights
* Copyright (c) 2007-2018 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2010-2016 IBM Corporation. All rights reserved.
* Copyright (c) 2012-2013 Sandia National Laboratories. All rights reserved.
@ -157,7 +157,7 @@ int ompi_osc_pt2pt_lock_remote (ompi_osc_pt2pt_module_t *module, int target, omp
static inline int ompi_osc_pt2pt_unlock_remote (ompi_osc_pt2pt_module_t *module, int target, ompi_osc_pt2pt_sync_t *lock)
{
int32_t frag_count = opal_atomic_swap_32 ((int32_t *) module->epoch_outgoing_frag_count + target, -1);
int32_t frag_count = opal_atomic_swap_32 ((opal_atomic_int32_t *) module->epoch_outgoing_frag_count + target, -1);
ompi_osc_pt2pt_peer_t *peer = ompi_osc_pt2pt_peer_lookup (module, target);
int lock_type = lock->sync.lock.type;
ompi_osc_pt2pt_header_unlock_t unlock_req;
@ -178,10 +178,13 @@ static inline int ompi_osc_pt2pt_unlock_remote (ompi_osc_pt2pt_module_t *module,
unlock_req.lock_ptr = (uint64_t) (uintptr_t) lock;
OSC_PT2PT_HTON(&unlock_req, module, target);
if (peer->active_frag && peer->active_frag->remain_len < sizeof (unlock_req)) {
/* the peer should expect one more packet */
++unlock_req.frag_count;
--module->epoch_outgoing_frag_count[target];
if (peer->active_frag) {
ompi_osc_pt2pt_frag_t *active_frag = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
if (active_frag->remain_len < sizeof (unlock_req)) {
/* the peer should expect one more packet */
++unlock_req.frag_count;
--module->epoch_outgoing_frag_count[target];
}
}
OPAL_OUTPUT_VERBOSE((25, ompi_osc_base_framework.framework_output,
@ -204,7 +207,7 @@ static inline int ompi_osc_pt2pt_flush_remote (ompi_osc_pt2pt_module_t *module,
{
ompi_osc_pt2pt_peer_t *peer = ompi_osc_pt2pt_peer_lookup (module, target);
ompi_osc_pt2pt_header_flush_t flush_req;
int32_t frag_count = opal_atomic_swap_32 ((int32_t *) module->epoch_outgoing_frag_count + target, -1);
int32_t frag_count = opal_atomic_swap_32 ((opal_atomic_int32_t *) module->epoch_outgoing_frag_count + target, -1);
int ret;
(void) OPAL_THREAD_ADD_FETCH32(&lock->sync_expected, 1);
@ -218,10 +221,13 @@ static inline int ompi_osc_pt2pt_flush_remote (ompi_osc_pt2pt_module_t *module,
/* XXX -- TODO -- since fragments are always delivered in order we do not need to count anything but long
* requests. once that is done this can be removed. */
if (peer->active_frag && (peer->active_frag->remain_len < sizeof (flush_req))) {
/* the peer should expect one more packet */
++flush_req.frag_count;
--module->epoch_outgoing_frag_count[target];
if (peer->active_frag) {
ompi_osc_pt2pt_frag_t *active_frag = (ompi_osc_pt2pt_frag_t *) peer->active_frag;
if (active_frag->remain_len < sizeof (flush_req)) {
/* the peer should expect one more packet */
++flush_req.frag_count;
--module->epoch_outgoing_frag_count[target];
}
}
OPAL_OUTPUT_VERBOSE((50, ompi_osc_base_framework.framework_output, "flushing to target %d, frag_count: %d",

View file

@ -28,7 +28,7 @@ struct ompi_osc_pt2pt_request_t {
int origin_count;
struct ompi_datatype_t *origin_dt;
ompi_osc_pt2pt_module_t* module;
int32_t outstanding_requests;
opal_atomic_int32_t outstanding_requests;
bool internal;
};
typedef struct ompi_osc_pt2pt_request_t ompi_osc_pt2pt_request_t;

View file

@ -1,6 +1,6 @@
/* -*- Mode: C; c-basic-offset:4 ; indent-tabs-mode:nil -*- */
/*
* Copyright (c) 2015-2016 Los Alamos National Security, LLC. All rights
* Copyright (c) 2015-2018 Los Alamos National Security, LLC. All rights
* reserved.
* $COPYRIGHT$
*
@ -74,7 +74,7 @@ struct ompi_osc_pt2pt_sync_t {
int num_peers;
/** number of synchronization messages expected */
volatile int32_t sync_expected;
opal_atomic_int32_t sync_expected;
/** eager sends are active to all peers in this access epoch */
volatile bool eager_send_active;

View file

@ -265,7 +265,7 @@ struct ompi_osc_rdma_module_t {
unsigned long get_retry_count;
/** outstanding atomic operations */
volatile int32_t pending_ops;
opal_atomic_int32_t pending_ops;
};
typedef struct ompi_osc_rdma_module_t ompi_osc_rdma_module_t;
OMPI_MODULE_DECLSPEC extern ompi_osc_rdma_component_t mca_osc_rdma_component;

View file

@ -259,7 +259,7 @@ static int ompi_osc_rdma_post_peer (ompi_osc_rdma_module_t *module, ompi_osc_rdm
return ret;
}
} else {
post_index = ompi_osc_rdma_counter_add ((osc_rdma_counter_t *) (intptr_t) target, 1) - 1;
post_index = ompi_osc_rdma_counter_add ((osc_rdma_atomic_counter_t *) (intptr_t) target, 1) - 1;
}
post_index &= OMPI_OSC_RDMA_POST_PEER_MAX - 1;
@ -279,7 +279,7 @@ static int ompi_osc_rdma_post_peer (ompi_osc_rdma_module_t *module, ompi_osc_rdm
return ret;
}
} else {
result = !ompi_osc_rdma_lock_compare_exchange ((osc_rdma_counter_t *) target, &_tmp_value,
result = !ompi_osc_rdma_lock_compare_exchange ((osc_rdma_atomic_counter_t *) target, &_tmp_value,
1 + (osc_rdma_counter_t) my_rank);
}
@ -491,7 +491,7 @@ int ompi_osc_rdma_complete_atomic (ompi_win_t *win)
ret = ompi_osc_rdma_lock_btl_op (module, peer, target, MCA_BTL_ATOMIC_ADD, 1, true);
assert (OMPI_SUCCESS == ret);
} else {
(void) ompi_osc_rdma_counter_add ((osc_rdma_counter_t *) target, 1);
(void) ompi_osc_rdma_counter_add ((osc_rdma_atomic_counter_t *) target, 1);
}
}
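
The casts here switch to osc_rdma_atomic_counter_t *, and ompi_osc_rdma_counter_add appears to add to the counter and return the new value, which the post code then folds into a power-of-two ring index. A standalone C11 sketch of that claim-an-index pattern (the type name, ring size, and add-and-fetch semantics are assumptions):

/* Not OPAL code: claim the next slot in a fixed-size post ring. */
#include <stdatomic.h>
#include <stdint.h>

#define POST_PEER_MAX 32   /* assumed power-of-two ring size */

typedef _Atomic int64_t atomic_counter_t;   /* stand-in for osc_rdma_atomic_counter_t */

/* add and return the new value, the semantics assumed for ompi_osc_rdma_counter_add */
static int64_t counter_add(atomic_counter_t *counter, int64_t delta)
{
    return atomic_fetch_add_explicit(counter, delta, memory_order_acq_rel) + delta;
}

static int64_t claim_post_index(atomic_counter_t *post_counter)
{
    int64_t post_index = counter_add(post_counter, 1) - 1;
    return post_index & (POST_PEER_MAX - 1);
}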

Some files were not shown because too many files changed in this diff.