Commit graph

27773 commits

Author SHA1 Message Date
Mike Dubman
4c98e6bde2 Merge pull request #4258 from yosefe/topic/spml-ucx-fix-quiet-typo
spml_ucx: fix typo in shmem_quiet() error message.
2017-09-26 11:10:40 +03:00
Mike Dubman
5ab2d8441e Merge pull request #4269 from karasevb/oshmem/spec
oshmem: introduced the definition `SHMEM_ALLTOALLS_SYNC_SIZE`
2017-09-26 11:09:38 +03:00
Boris Karasev
584ff76dea oshmem: introduced the definition SHMEM_ALLTOALLS_SYNC_SIZE
In accordance with the OSHMEM spec, this definition must be included in
the code.

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2017-09-26 09:12:09 +03:00
Gilles Gouaillardet
b3558f261b opal/util: initialize proc_hostname in the opal_proc_t constructor
Refs open-mpi/ompi#4264

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-26 10:47:26 +09:00
bosilca
f44e674992 Merge pull request #4074 from bosilca/topic/coverity
Fix coverity complaints.
2017-09-25 15:05:31 -04:00
bosilca
f8a02eb649 Merge pull request #3793 from bosilca/topic/monitoring_install_fix
Topic/monitoring install fix
2017-09-25 15:05:11 -04:00
Ralph Castain
702a535c58 Merge pull request #4262 from rhc54/topic/up
Update to track PMIx master (v2.1.0)
2017-09-25 11:30:41 -07:00
Ralph Castain
d5db4ee965 Update to track PMIx master (v2.1.0)
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-25 10:24:13 -07:00
Guillaume Mercier
4e7c130c31
Add correct reordering computation in partially distributed case.
Replaced matching array with k and bcast with scatter.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Guillaume Mercier <mercier@labri.fr>
2017-09-25 13:10:11 -04:00
George Bosilca
3dd1d8cb53
Delay the first check for the HWLOC topology.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-25 13:09:57 -04:00
George Bosilca
28046b37df
Always successfully return.
As the reordering is an optional step, if any operation during the
reorder fails we can return the duplicate of the original communicator
associated with the topology information.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-25 12:53:20 -04:00
George Bosilca
219a96fa69
Prevent memory leaks.
Reorder the code to simplify the memory management.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-25 12:53:20 -04:00
George Bosilca
64bff0e326
Disable monitoring if we compile statically.
Protect all components against compilation on static builds.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-25 12:18:23 -04:00
George Bosilca
458ccc12e1
Move the profiling library in common/monitoring
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-25 12:18:23 -04:00
Clément FOYER
f334607c34
Simplify the communicator's name caching management (#6)
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>
2017-09-25 12:18:23 -04:00
bosilca
a680b3ac6d Merge pull request #3853 from clementFoyer/master
OMPI monitoring: Simplify the communicator's name caching management + misc test changes
2017-09-25 12:14:36 -04:00
Jeff Squyres
60a810c88b Merge pull request #4016 from jsquyres/pr/libnl-you-win-again
opal_check_libnl: it's not a fatal error if libnl check fails
2017-09-25 11:53:38 -04:00
Jeff Squyres
db10da97e3 Merge pull request #4257 from jsquyres/pr/moar-hwloc-cuda-cleanup
hwloc2a/configure.m4: be more careful in with_cuda->enable_cuda
2017-09-25 11:35:33 -04:00
Jeff Squyres
0568f4a820 opal_check_libnl: remove abort on libnl check failure
Per https://github.com/open-mpi/ompi/issues/3995, it should not be a
fatal error if the libnl checks fail.  Instead, just fail the check
and let the upper layer decide what to do.  In this case,
OPAL_CHECK_PACKAGE will mark this library as no good, and then
propagate that upward.

E.g., if libfoo fails the libnl check, and the user had specified
--with-libfoo, this will eventually cause configure to fail (because
the libnl check will fail with libfoo, which will cause
OPAL_CHECK_PACKAGE to fail with libfoo, which will ultimately cause
some upper-level logic to realize "a human asked for libfoo but we
could not provide it -- abort!").

However, if libfoo fails the libnl check and the user did *not*
specify --with-libfoo, then this will cause the upper layer to
silently skip libfoo (because the libnl check will fail libfoo, which
will cause OPAL_CHECK_PACKAGE to fail libfoo, but then the upper-level
logic will realize "oh, we can't use libfoo, but a human didn't ask
for it -- so just skip libfoo support.").

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-09-25 07:49:43 -07:00
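
The decision described in commit 0568f4a820 above can be sketched in C as follows. This is only an illustration of the reasoning in the commit message: the real logic lives in configure macros such as OPAL_CHECK_PACKAGE, and the names `pkg_outcome` and `decide_package` are hypothetical. The sketch also assumes the package is otherwise usable whenever the libnl check passes.

```c
#include <stdbool.h>

/* Hypothetical outcomes for one optional package (e.g. "libfoo"). */
enum pkg_outcome {
    PKG_USE,   /* libnl check passed: build support for the package */
    PKG_SKIP,  /* check failed, nobody asked for it: silently skip */
    PKG_ABORT  /* check failed, but the user passed --with-libfoo */
};

/*
 * The libnl check itself never aborts. It only reports failure, and the
 * upper layer decides based on whether a human requested the package.
 */
static enum pkg_outcome decide_package(bool user_requested, bool libnl_check_ok)
{
    if (libnl_check_ok) {
        return PKG_USE;
    }
    return user_requested ? PKG_ABORT : PKG_SKIP;
}
```

With this shape, `decide_package(true, false)` corresponds to the "a human asked for libfoo but we could not provide it" case, and `decide_package(false, false)` to the silent skip.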
Gilles Gouaillardet
e28cc8c986 configury: reset all flags if libnl v3 cannot be used
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-25 17:21:47 +09:00
Brian Barrett
c768f980e1 reachable: Fix string length Coverity warning
Make sure hostnames are null terminated, even when they were
too long to fit in the hostname buffer.

Fixes: CID 1418232

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-24 19:38:45 -07:00
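
A minimal sketch of the kind of fix commit c768f980e1 above describes: copying a hostname into a fixed-size buffer so that the result is always null terminated, even when the source is too long. The helper name `copy_hostname` is hypothetical and this is not the code from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy src into dst (capacity dst_len bytes), truncating if needed.
 * strncpy() does not write a terminator when src fills the buffer,
 * so the last byte is set explicitly.
 */
static void copy_hostname(char *dst, size_t dst_len, const char *src)
{
    assert(dst != NULL && src != NULL && dst_len > 0);
    strncpy(dst, src, dst_len - 1);
    dst[dst_len - 1] = '\0';
}
```

An over-long name is truncated to `dst_len - 1` characters rather than left unterminated.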
Ralph Castain
fa762973ae Merge pull request #4259 from rhc54/topic/cleanup
Minor cleanups for when using external pmix
2017-09-24 11:12:37 -07:00
Ralph Castain
fcb7a2f29b Minor cleanups for when using external pmix
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-24 09:53:04 -07:00
Yossi Itigin
3081576124 spml_ucx: fix typo in shmem_quiet() error message.
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2017-09-24 19:20:55 +03:00
Jeff Squyres
2ec2a329dc hwloc2a/configure.m4: be more careful in with_cuda->enable_cuda
Be a little more deliberate about converting OMPI's --with-cuda CLI
value to hwloc's --enable-cuda configure option.

Also, unconditionally disable hwloc NVML support (because Open MPI is
not currently using it).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-09-24 05:36:23 -07:00
Ralph Castain
448a8ee953 Merge pull request #4256 from rhc54/topic/scaling
Get the scaling test to properly run a scan across the #nodes
2017-09-22 21:05:32 -07:00
Ralph Castain
1dd45e0f30 Get the scaling test to properly run a scan across the #nodes
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-22 21:04:36 -07:00
Ralph Castain
ae01dcee7b Merge pull request #4218 from rhc54/topic/config
Attempt vpath inside component
2017-09-22 18:02:45 -07:00
Ralph Castain
4f932819aa Update platform file
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-22 16:05:57 -07:00
Ralph Castain
5fed7330e7 Update the configure logic to separate the emitting of a libpmix library from with-devel-headers. Instead, we create a new --enable-install-libpmix expressly for that
purpose. Continue to link the new library back to libopen-pal to resolve the renamed symbols.

Update opal configure logic to set disable_dlopen when disable_mca_dso is given. Fix typos in disable_dlopen when setting variables (incorrect inclusion of quotes)

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-22 16:02:57 -07:00
Ralph Castain
40a25e6077 Merge pull request #4254 from rhc54/topic/fixes
Silence warnings and sync to PMIx master
2017-09-22 14:59:19 -07:00
Matias Cabral
2beeaa46fa Merge pull request #4241 from yburette/topic/fix_provider_include
mtl/ofi: Fix provider selection.
2017-09-22 11:23:43 -07:00
Ralph Castain
3493c43468 Sync to PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-22 10:48:00 -07:00
Ralph Castain
b4ad81da85 Silence warnings about verbose output
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 2c9655bb631742fd7693e00289d1949f4b2fc155)
2017-09-22 09:05:03 -07:00
Ralph Castain
9edea02b46 Merge pull request #4246 from rhc54/topic/spawn
Fully support OMPI spawn options.
2017-09-21 11:23:34 -07:00
Jeff Squyres
9708e9dd21 Merge pull request #4245 from jsquyres/pr/disable-hwloc-cuda
hwloc: do not build hwloc CUDA support if --without-cuda used (and also always disable hwloc GL and OpenCL support)
2017-09-21 13:43:01 -04:00
Ralph Castain
fe9b584c05 Fully support OMPI spawn options. Fix a bug in the round-robin mappers where we weren't adding nodes to the job map node array, and so resources were not released
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
(cherry picked from commit 285d8cfef74ffc899e9c51e1d9c597b7fb2ceb89)
2017-09-21 10:29:27 -07:00
Brice Goglin
84a721d17a hwloc: disable GL and OpenCL in the hwloc component
Open MPI doesn't use GL or OpenCL OS devices, so just disable them in
hwloc.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-09-21 08:25:46 -07:00
Jeff Squyres
f5d51dc2f5 hwloc: do not build hwloc CUDA support if --without-cuda used
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-09-21 08:24:54 -07:00
Gilles Gouaillardet
d704712bad Merge pull request #4242 from ggouaillardet/topic/libnl3
configury: do not use libnl-3 when it is half broken
2017-09-21 16:16:00 +09:00
Gilles Gouaillardet
94747a1d28 configury: do not use libnl-3 when it is half broken
Refs. open-mpi/ompi#4211

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-21 15:27:59 +09:00
yohann
1f8cabc890 mtl/ofi: Fix provider selection.
This allows mtl_ofi_provider_include to work with layered providers as well.
e.g. --mca mtl_ofi_provider_include "providerX;ofi_rxm"

Signed-off-by: yohann <yohann.burette@intel.com>
2017-09-20 16:00:50 -07:00
Jeff Squyres
a182b4fbaa Merge pull request #3883 from jsquyres/pr/readme-pathscale-update
README: Pathscale updates
2017-09-20 10:07:48 -04:00
Gilles Gouaillardet
da2966ace1 Merge pull request #2191 from ggouaillardet/topic/remove_disable_mpi_io
configury: remove the --disable-mpi-io option
2017-09-20 15:55:48 +09:00
Gilles Gouaillardet
b9315edb85 configury: remove the --disable-mpi-io option
Fixes open-mpi/ompi#2185

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-20 14:39:09 +09:00
bosilca
ab68aced23 Merge pull request #3738 from bosilca/topic/tcp_event_count
Fix the TCP performance impact when BTL not used
2017-09-19 23:08:58 -04:00
Brian Barrett
2c59fb7a58 Merge pull request #4221 from AntoineD/master
Fix: Outdated README link #4220
2017-09-19 19:48:13 -07:00
Gabe Saba
c6235a9a0f reachable: add tests
Add test suite for netlink and weighted reachable components.  We
don't have a great way of running components through unit tests
today, so make them stand-alone tests that are run with mpirun
and such.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-19 19:42:54 -07:00
Brian Barrett
ae122c4b17 reachable: Change ownership to Amazon
Amazon is going to use the reachable framework to fix some connection
bugs in the TCP BTL, so claim support ownership of the weighted and
netlink components.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-19 19:42:54 -07:00
Gabe Saba
9e53605a6f reachable: Implement netlink component
Wire up the libnl utilities Jeff and Ralph added previously to
the netlink reachable component so that it actually does work.
The algorithm is a bit simplistic, but should work for our use
cases.  If there's a route, assume the two interfaces can talk.
If there's no gateway, assume the two interfaces are in the
same subnet, and give preference to that connection.  If there's
a gateway, assume there's a route, but the interfaces are not
in the same subnet.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-19 19:42:54 -07:00
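
The heuristic described in commit 9e53605a6f above can be sketched as follows. This is an illustration of the stated rules only, not the component's code: the enum values, their numeric weights, and the function names are assumptions, and the route and gateway lookups (done through libnl in the real component) are reduced to two booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical weights: a higher value means a more preferred connection. */
enum reach_weight {
    REACH_NONE = 0,        /* no route: the interfaces cannot talk */
    REACH_ROUTED = 1,      /* route via a gateway: different subnets */
    REACH_SAME_SUBNET = 2  /* route without a gateway: same subnet */
};

/* Classify one interface pair from the result of a route lookup. */
static enum reach_weight classify_route(bool has_route, bool has_gateway)
{
    if (!has_route) {
        return REACH_NONE;
    }
    return has_gateway ? REACH_ROUTED : REACH_SAME_SUBNET;
}

/* Return the index of the most preferred pair, or -1 if none is reachable. */
static int pick_best(const enum reach_weight *weights, int count)
{
    int best = -1;

    assert(weights != NULL || count == 0);
    for (int i = 0; i < count; i++) {
        if (weights[i] != REACH_NONE &&
            (best < 0 || weights[i] > weights[best])) {
            best = i;
        }
    }
    return best;
}
```

Given candidates classified as routed, same-subnet, and unreachable, `pick_best` selects the same-subnet pair, matching the preference stated in the commit message.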