1
1
Граф коммитов

26832 Коммитов

Автор SHA1 Сообщение Дата
Jeff Squyres
802deb685f Merge pull request #3090 from jsquyres/pr/mpi-wtick
MPI_Wtick: may return a higher resolution than 10e-6 these days
2017-03-02 10:43:07 -05:00
Jeff Squyres
dc53cd5f74 MPI_Wtick: may return a higher resolution than 10e-6 these days
Thanks to Mark Dixon (@ccaamad) for reporting the error.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-02 10:39:28 -05:00
KAWASHIMA Takahiro
c4ca5e703d coll: Update ompi/mca/coll/base/coll_base_functions.h
- Support MPI-2.2 and MPI-3.0 COLL features.

  * `MPI_REDUCE_SCATTER_BLOCK`
  * neighborhood collective communication
  * nonblocking collective communication

- Add `*_BASE_ARGS` and `*_BASE_ARG_NAMES` for convenience.

- Use parameter names used in the MPI Standard.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-03-02 17:58:02 +09:00
KAWASHIMA Takahiro
96aa0d90c1 pml/bfo: Correct a function name and header filenames
These lines were incorrectly modified in 90f2940.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-03-02 16:02:53 +09:00
Jeff Squyres
a88ae331ae Merge pull request #3077 from jsquyres/pr/tcp-btl-make-the-output-optional
btl/tcp: direct a warning message to show_help
2017-03-01 20:59:26 -05:00
Jeff Squyres
5b484c91f4 btl/tcp: use show_help to print the dropped-TCP warning
Make the message more friendly / more detailed, and de-duplicate it
(just in case it happens a lot).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-01 16:31:29 -08:00
Ralph Castain
4f810d5931 Merge pull request #3076 from rhc54/topic/doublefree
Fix double-free in rml/ofi shutdown
2017-03-01 14:57:59 -08:00
Ralph Castain
c757c3d260 Fix double-free in rml/ofi shutdown
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-01 11:53:46 -08:00
George Bosilca
b0f8d2c460
Never free the statically allocated buffer.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-03-01 13:21:03 -05:00
George Bosilca
ec4a235e6a
Allow a TCP proc release during the create.
This is mostly for error cases, where we need to release the
newly created proc. Currently the code deadlocks because the endpoint
lock is help at the release and the lock is not recursive.

Aslo added some code to print the IP addresses that don't match during
the TCP connection step.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-03-01 13:17:54 -05:00
Yossi
05b568a46d Merge pull request #3074 from alex-mikheev/topic/pml_ucx_bsend_dt_fix
ompi: pml ucx: fix datatype packing error in bsend
2017-03-01 18:56:51 +02:00
Alex Mikheev
152f77df59
ompi: pml ucx: fix datatype packing error in bsend
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-03-01 16:18:19 +02:00
Mike Dubman
abc56cea8f Merge pull request #3071 from yosefe/topic/fix-memhooks-mxm-hcoll
yalla/mtl_mxm/hcoll: open memory component to activate memory hooks.
2017-03-01 12:19:16 +01:00
Yossi Itigin
33471c44ee pml_yalla/mtl_mxm/hcoll: open memory component to activate memory hooks.
Memory hooks are now set-up on demand. pml/yalla, mtl/mxm and
coll/hcoll need the memory hooks, so make sure those are installed.

Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2017-03-01 12:12:20 +02:00
Gilles Gouaillardet
880f2d5431 mpi/c: revamp error handling in MPI_{Pack,Unpack}[_external]
Thanks Alex and the folks at Mellanox for the help.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-03-01 10:03:31 +09:00
Jeff Squyres
d5266aba90 Merge pull request #2955 from jsquyres/pr/hwloc-external-fixes
Fix --with-hwloc=external
2017-02-28 14:57:07 -05:00
Josh Hursey
0006f0d7c5 Merge pull request #2773 from jjhursey/topic/hook-fwk
Add a 'hook' framework
2017-02-28 12:29:50 -06:00
Jeff Squyres
cc23439465 Merge pull request #3055 from jsquyres/pr/README-backwards-compat
README: Add more info about "backwards compatibility"
2017-02-28 13:03:43 -05:00
Ralph Castain
735fbf8f67 Merge pull request #3011 from artpol84/add_proc_fix/master
ompi: Avoid unnecessary PMIx lookups when adding procs.
2017-02-28 08:25:08 -08:00
Jeff Squyres
fec519a793 hwloc: rename opal/mca/hwloc/hwloc.h -> hwloc-internal.h
Per a prior commit, the presence of "hwloc.h" can cause ambiguity when
using --with-hwloc=external (i.e., whether to include
opal/mca/hwloc/hwloc.h or whether to include the system-installed
hwloc.h).

This commit:

1. Renames opal/mca/hwloc/hwloc.h to hwloc-internal.h.
2. Adds opal/mca/hwloc/autogen.options to tell autogen.pl to expect to
   find hwloc-internal.h (instead of hwloc.h) in opal/mca/hwloc.
3. s@opal/mca/hwloc/hwloc.h@opal/mca/hwloc/hwloc-internal.h@g in the
   rest of the code base.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-28 07:48:42 -08:00
Jeff Squyres
a065b9b83b hwloc: allow frameworks to have alternate header filenames
Frameworks are usually required to have a framework/framework.h file.
However, this is sometimes problematic (see the hwloc use case/problem
description, below).

This commit allows frameworks to have an "autogen.options" file (i.e.,
project/mca/framework/autogen.options) that specifies things that
autogen needs to know about the framework.  Currently, the only option
recognized in autogen.options is "framework_header", which allows a
framework to specify that its header file is named something other
than "framework.h" (the framework header file must still be in the
project/mca/framework directory; it simply may be named something
other than framework.h).  More options may be introduced over time.

The use case that motivated this is the hwloc framework
(https://github.com/open-mpi/ompi/issues/2616).

Per MCA framework rules, the hwloc framework is required to have an
opal/mca/hwloc/hwloc.h file.  However, the hwloc library itself *also*
has an hwloc.h file.  This causes a problem when configuring Open MPI
with --with-hwloc=external (meaning: do not use the hwloc embedded
within the Open MPI source code tree -- instead, use an hwloc
installation from outside the Open MPI source code tree).
Specifically, when in the opal/mca/hwloc directory, the presence of
"-I." in DEFAULT_INCLUDES (put there by Automake) causes a confusion
between the hwloc.h in opal/mca/hwloc/hwloc.h and the system-installed
hwloc.h.  Chaos ensues (see the GitHub issue for more detail).

The solution is to rename the opal/mca/hwloc/hwloc.h to something else
(e.g., hwloc-internal.h), and extend autogen.pl to allow frameworks to
have an alternate name for their framework header file.

This commit introduces the autogen.pl mechanism to allow the alternate
header file name.  A follow-on commit will effect this change in the
hwloc framework (and update all the places in the code base to use the
new filename).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-28 07:45:34 -08:00
Jeff Squyres
0cd3b6c235 treematch: do not include <hwloc.h>
Instead, include "opal/mca/hwloc/hwloc.h"

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-28 07:45:23 -08:00
Josh Hursey
b1c4e50500 Merge pull request #2934 from jjhursey/topic/coll-comm-restructure
Move coll structure outside of the communicator
2017-02-28 08:45:18 -06:00
Ralph Castain
c320b3671a Merge pull request #3050 from naughtont3/tjn-fix-orteclean
orte-clean: fix bad username usage, add orte-dvm
2017-02-28 06:01:37 -08:00
Thomas Naughton
74f8c2ae30 orte-clean: fix bad username/uid usage, add orte-dvm
This fixes a mismatch between PS listing that returned
USERNAME but code was pruning based on UID.

This changes the OPAL_PS_FLAVOR_CHECK format to return
'uid' instead of 'user'.  (Note: Avoiding call to
getlogin_r() but assuming UID is uniform on system,
same assumption exists for session dir anyway.)

Note, still maintains behavior from man page for root
running orte-clean on node (kills all orteds).

Adds 'orte-dvm' to list of procnames that will be checked/killed.

Signed-off-by: Thomas Naughton <naughtont@ornl.gov>
2017-02-28 08:00:06 -05:00
Jeff Squyres
9ce3c7f150 Merge pull request #3057 from hjelmn/osc_rdma_atomic
osc/rdma: fix compile warning
2017-02-28 06:15:30 -05:00
Nathan Hjelm
032bcf915a osc/rdma: fix compile warning
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-02-27 16:26:00 -07:00
Jeff Squyres
842f8c1286 README: Add more info about "backwards compatibility"
Add more clarifying statements about our definition of "backwards
compatibility" -- adding an example with static linking and another
with containers.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-27 17:16:48 -05:00
Jeff Squyres
07d8452646 Merge pull request #3052 from jsquyres/pr/update-authors
AUTHORS: update names
2017-02-27 16:39:15 -05:00
Jeff Squyres
e6b3be8e1f AUTHORS: update names
Update the .mailmap and re-run `contrib/dist/make-authors.pl`.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-27 16:37:23 -05:00
Jeff Squyres
45cdf2eb7a Merge pull request #3051 from jsquyres/pr/mailmap-update
.mailmap: Remove accent from Aurelein's name
2017-02-27 16:21:28 -05:00
Jeff Squyres
afc49f3361 .mailmap: Remove accent from Aurelein's name
This was per request of Aurelein.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-27 16:18:13 -05:00
George Bosilca
366d64b7e5 Move the collective structure outside the communicator.
As we changed the ABI (forcing a major release), we can limit
the size of the predefined communicators by moving the collective
structure outside the communicator. This might have a minimal,
but unnoticeable, impact on performance. This approach has been
discussed during the January 2017 devel meeting.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-02-27 11:54:17 -06:00
Joshua Hursey
c10bbfded6 ompi/hook: Add the hook/license framework
* Include a 'demo' component that shows some of the features.
 * Currently has hooks for:
   - MPI_Initialized
     - top, bottom
   - MPI_Init_thread
     - top, bottom
   - MPI_Finalized
     - top, bottom
   - MPI_Init
     - top (pre-opal_init), top (post-opal_init), error, bottom
   - MPI_Finalize
     - top, bottom
 * Other places in ompi can 'register' to hook into any one of these places
   by passing back a component structure filled with function pointers.
 * Add a `MCA_BASE_COMPONENT_FLAG_REQUIRED` flag to the MCA structure that
   is checked by the `hook` framework. If a required, static component has
   been excluded then the `hook` framework will fail to initialize.
   - See note in `opal/mca/mca.h` as to why this is checked in the `hook`
     framework and not in `opal/mca/base/mca_base_component_find.c`

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-02-27 12:05:53 -05:00
Nathan Hjelm
581bff9871 Merge pull request #3034 from hjelmn/osc_rdma_atomic
osc/rdma: make locking code more robust
2017-02-27 08:46:52 -07:00
Ralph Castain
f054261590 Merge pull request #3027 from naughtont3/tjn-envvar-dvmuri
dvm: Add envvar 'ORTE_HNP_DVM_URI' to schizo:ompi
2017-02-27 06:56:44 -08:00
Ralph Castain
feed472ea5 Merge pull request #3043 from rhc54/topic/purge
Skip empty files to avoid infinite loop
2017-02-27 06:03:54 -08:00
Ralph Castain
a774ea73e4 Skip empty files to avoid infinite loop
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-02-27 06:02:54 -08:00
Nathan Hjelm
4707c7c5e0 osc/rdma: make locking code more robust
Under heavy load the locking code could fail if the underlying btl
module started to return OPAL_ERR_OUT_OF_RESOURCE on atomic
operations. This commit updates the code to gracefully handle btl
errors.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-02-27 00:01:26 -07:00
Gilles Gouaillardet
af0b5cffb4 asm: rename the AMD64 into X86_64
in this context, AMD64 really means amd64 or em64t, so let's
rename this into X86_64 in order to avoid any confusion

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-27 15:10:50 +09:00
Gilles Gouaillardet
ab5e86c97d travis: install hwloc packages
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-27 14:40:35 +09:00
Gilles Gouaillardet
2f4013ce33 configury: fix asm atomic detection
there is no need to look for an assembly file when BUILTIN_GCC is used

Fixes open-mpi/ompi#3032
Refs open-mpi/ompi#3036

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-27 10:42:50 +09:00
Alex Mikheev
c9b5b12af4
oshmem: sshmem ucx: use fixed base address
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-02-26 15:16:28 +02:00
Ralph Castain
efc3a98ea6 Merge pull request #3031 from rhc54/topic/ofi
Add CPPFLAGS to build of rml/ofi component.
2017-02-25 11:23:03 -08:00
Ralph Castain
9f8f7f3189 Add CPPFLAGS to build of rml/ofi component.
Fix finalize to ensure we only destruct the msg queue list once.
Update platform file

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-02-25 09:17:41 -08:00
Ralph Castain
0db91889a7 Merge pull request #3018 from naughtont3/tjn-dvmerrmgr-issue2987
debug fix for DVM early quit
2017-02-25 08:09:16 -08:00
Sylvain Jeaugey
f827b6b8dd Fix more typos using the allgather module for allreduce operations, causing a crash when CUDA collectives are enabled.
Signed-off-by: Sylvain Jeaugey <sjeaugey@nvidia.com>
Signed-off-by: Akshay Venkatesh <akvenkatesh@nvidia.com>
2017-02-24 16:35:29 -08:00
Thomas Naughton
006be92df5 dvm: Add envvar 'ORTE_HNP_DVM_URI' to schizo:ompi
Add ability to pass DVM URI purely via environment
to simplify invocation from command-line (e.g., start dvm,
export URI, mpirun w/o needing to add `--hnp` arg).
If user passes both envvar *and* cmdline, the cmdline wins.

Signed-off-by: Thomas Naughton <naughtont@ornl.gov>
2017-02-24 16:55:32 -05:00
Jeff Squyres
d7dd4d769e openmpi-mca-params.conf: Fix comment
Make sure to specify "--level 9" to ompi_info to see all MCA params.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-24 07:09:06 -08:00
Jeff Squyres
99ec16edea Merge pull request #3023 from clementFoyer/patch-1
Fix minor typo
2017-02-23 10:38:46 -05:00