Commit graph

28704 commits

Author SHA1 Message Date
Jeff Squyres
538528f659
Merge pull request #5326 from jsquyres/pr/tcp-btl-use-opal-hash-map-for-kindex
btl/tcp: use a hash map for kernel IP interface indexes
2018-06-25 10:50:50 -04:00
Jeff Squyres
3767ce27c0 btl/tcp: trivial whitespace clean
No code/logic changes.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-06-23 08:04:12 -07:00
Jeff Squyres
9034717876 btl/tcp: use a hash map for kernel IP interface indexes
The giant size of the TCP proc struct is causing a problem in some
environments (because it is allocated on the stack), and it was too
big, anyway.

Instead, use a hash map.  That way, it starts small and can grow if it
needs to.  It also makes no assumptions about the values of the kernel
interface indexes.

Fixes #5292.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-06-23 08:03:30 -07:00
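The trade-off described in commit 9034717876 can be sketched as below. This is a Python illustration only, not the btl/tcp code: the class name, its methods, and the sample index values are hypothetical, and a plain dict stands in for the hash map. It shows the two properties the message names: the table starts small and grows on demand, and it makes no assumption about how large or sparse the kernel interface indexes are.

```python
class KernelIndexMap:
    """Hypothetical lookup table: kernel interface index -> local slot."""

    def __init__(self):
        # Starts empty and grows on demand, unlike a fixed-size array
        # dimensioned up front for the largest index one expects to see.
        self._slots = {}

    def set(self, kernel_index, slot):
        self._slots[kernel_index] = slot

    def get(self, kernel_index, default=None):
        # Unknown indexes return the default instead of reading past an array.
        return self._slots.get(kernel_index, default)

    def __len__(self):
        return len(self._slots)


kmap = KernelIndexMap()
kmap.set(2, 0)            # a small index
kmap.set(1_000_003, 1)    # a large, sparse index costs one entry, not a huge array
```

With a fixed array the second call would require storage proportional to the index value; here the table holds exactly two entries.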
Ralph Castain
259d9bd4fe
Merge pull request #5325 from jsquyres/pr/compiler-warning-stomps
pmix3/pmix_server.c: minor compiler warning stomp
2018-06-23 07:39:27 -07:00
Jeff Squyres
e3d6c5ce3a pmix3/pmix_server.c: minor compiler warning stomp
Submitted upstream https://github.com/pmix/pmix/pull/776.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-06-23 06:35:09 -07:00
Edgar Gabriel
edfdcb6e82
Merge pull request #5324 from edgargabriel/pr/minor-fixes
Pr/minor fixes
2018-06-22 17:20:02 -05:00
Howard Pritchard
8babaad35c
Merge pull request #4520 from ggouaillardet/refresh/romio321
io/romio321: refresh ROMIO based on latest stable MPICH 3.2.1
2018-06-22 16:58:46 -05:00
Edgar Gabriel
cf5cdad40f fcoll: make vulcan the default component
make vulcan the default component except for Lustre file systems.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-22 14:12:02 -05:00
Edgar Gabriel
fd8c5fba4e common/ompio: fix the fview based grouping options
a bug had crept into the construction of the list of aggregator
processes when using the fileview-based grouping options

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-22 14:01:31 -05:00
Gilles Gouaillardet
45b6e785aa
Merge pull request #5320 from ggouaillardet/topic/ucx_volatile
pml/ucx: silence a warning
2018-06-22 14:00:44 +09:00
Gilles Gouaillardet
edd02b7144 pml/ucx: silence a warning
declare 'fenced' volatile in order to silence CID 1437465

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-22 13:11:42 +09:00
Edgar Gabriel
d5dd008193
Merge pull request #5319 from edgargabriel/pr/ibm-testsuite-fixes2
Pr/ibm testsuite fixes2
2018-06-21 19:46:22 -05:00
Edgar Gabriel
743e0dff5a common/ompio: fix zero size fview issue
handle the situation where the user requests a non-zero amount
of data but has a zero-size fileview. My instinct would have been
to return an error code, but according to the test that I used
it should be MPI_SUCCESS and zero bytes. It is definitely better
than segfaulting :-)

This makes another test from the IBM testsuite pass.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-21 17:02:13 -05:00
Edgar Gabriel
7643ccfbcf sharedfp/sm and sharedfp/lockedfile: fix seek offset calculation
the seek offset calculation did not treat the offset as a multiple
of the etype provided. Fixing this makes some more ibm tests pass.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-21 14:26:36 -05:00
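The fix in commit 7643ccfbcf amounts to scaling the user-supplied offset by the etype size before using it as a byte position. A minimal sketch, assuming a view with zero displacement and no holes in the filetype; the function name, the parameter names, and the three whence constants are hypothetical and are not the sharedfp API:

```python
SEEK_SET, SEEK_CUR, SEEK_END = 0, 1, 2  # hypothetical constants for this sketch


def seek_target_bytes(offset, whence, etype_size, current_bytes=0, end_bytes=0):
    """Return the new byte position for an offset expressed in etype units."""
    delta = offset * etype_size  # the step the commit says was missing
    if whence == SEEK_SET:
        return delta
    if whence == SEEK_CUR:
        return current_bytes + delta
    if whence == SEEK_END:
        return end_bytes + delta
    raise ValueError("unknown whence value")
```

For example, seeking to offset 3 with an 8-byte etype lands on byte 24, not byte 3.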
Mikhail Kurnosov
c500739293 coll/base: Add MPI_Bcast based on a scatter followed by an allgather
Implements MPI_Bcast using a binomial tree scatter followed by
a recursive doubling allgather.

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2018-06-21 11:47:07 -06:00
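The two phases named in commit c500739293 can be simulated in a single process as below. This is a sketch of the general technique, not the coll/base code: it assumes the root is (virtual) rank 0 and that the number of ranks is a power of two, and every function and variable name is hypothetical. A real implementation also has to handle other process counts and arbitrary roots.

```python
def split_blocks(data, nprocs):
    """Split data into nprocs contiguous blocks (trailing blocks may be short or empty)."""
    size = -(-len(data) // nprocs)  # ceiling division
    return [data[i * size:(i + 1) * size] for i in range(nprocs)]


def bcast_scatter_allgather(data, nprocs):
    """Simulate a broadcast from rank 0; return the message as seen by each rank."""
    assert nprocs > 0 and nprocs & (nprocs - 1) == 0, "sketch assumes a power of two"
    blocks = split_blocks(data, nprocs)

    # held[r] maps block index -> block contents currently owned by rank r.
    held = [dict() for _ in range(nprocs)]
    held[0] = dict(enumerate(blocks))  # the root starts with the whole message

    # Phase 1: binomial tree scatter. In each round, every rank that still
    # owns a range of 2*mask blocks hands the upper half to rank src + mask.
    mask = nprocs >> 1
    while mask > 0:
        for src in range(0, nprocs, 2 * mask):
            dst = src + mask
            for b in range(dst, dst + mask):
                held[dst][b] = held[src].pop(b)
        mask >>= 1
    # Here every rank r owns exactly block r.

    # Phase 2: recursive doubling allgather. In each round, rank r swaps
    # everything it owns with rank r XOR mask, doubling what it holds.
    mask = 1
    while mask < nprocs:
        snapshot = [dict(h) for h in held]  # exchanges in a round are simultaneous
        for r in range(nprocs):
            held[r].update(snapshot[r ^ mask])
        mask <<= 1

    # Reassemble the message on every rank, in block order.
    return [[x for b in range(nprocs) for x in held[r][b]] for r in range(nprocs)]


result = bcast_scatter_allgather(list(range(10)), 4)
```

Both phases take log2(P) rounds, and the root sends each block only once during the scatter, which is why this shape is usually considered for large messages.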
Jeff Squyres
e305e80aff
Merge pull request #5317 from jsquyres/pr/update-bind-to-cpulist-option
orterun: use consistent CLI option name for --bind-to
2018-06-21 12:43:18 -04:00
Jeff Squyres
4603852740 orterun: use consistent CLI option name for --bind-to
Since the new binding option is tied to the --cpu-list orterun CLI
option, make the --bind-to option reflect the same name (vs. the
--cpu-set CLI option, which is entirely different).  For example:

    mpirun --bind-to cpu-list:ordered ...

Note that "--bind-to cpulist:ordered" is accepted as a synonym,
because people will be lazy.

Also add some minor updates to the orterun.1in man page for
clarification.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-06-21 08:22:00 -07:00
Edgar Gabriel
fb16d40775
Merge pull request #5196 from edgargabriel/topic/cuda
io/ompio: introduce initial support for cuda buffers in ompio
2018-06-21 10:14:43 -05:00
Edgar Gabriel
7808379a47 common/ompio: incorporate George's comments
incorporate a couple of comments by George as part of the
review on github.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-21 09:29:49 -05:00
Edgar Gabriel
3c10ed4ed1 common/ompio: use allocator to manage temporary buffers
use an allocator to manage temporary buffers when copying
unmanaged data from GPU buffer to host. This is necessary,
since the buffers have to be pinned for better performance,
which is an expensive operation.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-21 09:25:50 -05:00
Edgar Gabriel
ac79e576ef fcoll/base: do not use the two_phase component with CUDA support
the two_phase component does not work with some collective I/O
operations on CUDA buffers due to the data sieving (i.e.
both read and write operations) executed on some buffers, which are
not anticipated in the GPU buffer management of the code.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-21 09:25:50 -05:00
Edgar Gabriel
6a532101aa io/ompio and common/ompio: add initial support for cuda buffers in ompio
this commit adds the initial support for cuda buffers in ompio, for blocking
and non-blocking individual read and write operations.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-21 09:25:50 -05:00
Edgar Gabriel
8c2ea0ef49 opal/datatype: add additional interface to retrieve more details about
cuda buffer

the existing interface in opal_datatype_cuda does not make it possible to distinguish
whether a buffer is a managed or unmanaged cuda buffer. Add an interface that allows
retrieving this information through a convertor, since the information is actually available
in the mca_common_cuda_* routines.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-21 09:25:50 -05:00
Ralph Castain
c54db3bd57
Merge pull request #5316 from rhc54/topic/man
Update man and help output for new binding option
2018-06-21 06:39:46 -07:00
Ralph Castain
d2838139e4 Update man and help output for new binding option
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-21 06:36:11 -07:00
Ralph Castain
f875bfd082
Merge pull request #5311 from rhc54/topic/bind
Define a new binding method and qualifier
2018-06-21 05:40:06 -07:00
Yossi Itigin
db26c08336
Merge pull request #5307 from hoopoepg/topic/async-progress-on-mpi-fin
PML/UCX: fixed hang on MPI_Finalize
2018-06-21 13:44:14 +03:00
Sergey Oblomov
5f03628560 PML/UCX: removed unneeded flush
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-21 12:40:46 +03:00
Sergey Oblomov
2745da7dcc PML/UCX: use non-blocking fence instead of async progress
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-21 09:46:03 +03:00
Ralph Castain
f17d47087a Define a new binding method and qualifier
Allow users to request that procs be bound to a cpu in a given cpu-list based on their corresponding local rank

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-20 21:26:09 -07:00
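The binding rule described in commit f17d47087a (each process takes the entry of the cpu-list that matches its local rank) can be sketched as below. This is an illustration, not the orterun code: both function names are hypothetical, the range syntax ("4-6") is an assumption about how the list may be written, and raising an error for a local rank beyond the end of the list is a choice made for this sketch, not behavior stated in the commit.

```python
def parse_cpu_list(spec):
    """Expand a string such as "0,2,4-6" into [0, 2, 4, 5, 6], preserving order."""
    cpus = []
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cpu_for_local_rank(spec, local_rank):
    """Pick the cpu for a process from the list position equal to its local rank."""
    cpus = parse_cpu_list(spec)
    if not 0 <= local_rank < len(cpus):
        raise ValueError("local rank has no matching entry in the cpu list")
    return cpus[local_rank]
```

Under these assumptions, local rank 3 with the list "0,2,4-6" is bound to cpu 5.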
Ralph Castain
151d13c248
Merge pull request #5310 from rhc54/topic/convert
Cover all the PMIx data types
2018-06-20 10:32:40 -07:00
Ralph Castain
5ac2ce6346 Cover all the PMIx data types
Cover all data types for OPAL-to-PMIx conversion, generating error logs when we hit something we don't support

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-20 09:06:19 -07:00
Edgar Gabriel
7bbeaf30ff
Merge pull request #5306 from edgargabriel/pr/minor-improvements
Pr/minor improvements
2018-06-20 08:43:41 -05:00
Sergey Oblomov
10f2d831ec PML/UCX: fixed hang on MPI_Finalize
- added async UCX progress thread to allow
  pending requests to complete

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-06-20 16:12:05 +03:00
Edgar Gabriel
0757cb11a8 fcoll/all components: minor updates
two minor updates:
 - in all components: use the fh->f_bytes_per_agg value
   (which might have been set by an info object) instead
   of re-reading the mca parameter
 - vulcan and dynamic_gen2: replace one allgather operation
   by an allreduce, since it is used to determine the sum
   of an array.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-06-20 07:47:29 -05:00
Ralph Castain
4bd745940e
Merge pull request #5305 from karasevb/fix_pmix_component
pmix/ext2x: fixed detection PMIx v2.0 by pmix component
2018-06-20 05:41:46 -07:00
Boris Karasev
39c9cb12bb pmix/ext2x: fixed detection PMIx v2.0 by pmix component
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2018-06-20 13:23:51 +03:00
Gilles Gouaillardet
9d7f0e1c95 Replace MPI_Type_extent with MPI_Type_get_extent in ROMIO.
Signed-off-by: Ben Menadue <ben.menadue@nci.org.au>

(back-ported from commit open-mpi/ompi@34ec0bd8ab)

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-20 14:28:17 +09:00
Gilles Gouaillardet
11428e400a Replace MPI_Address with MPI_Get_address in ROMIO.
Signed-off-by: Ben Menadue <ben.menadue@nci.org.au>

(back-ported from commit open-mpi/ompi@756cc67221)

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-20 14:28:16 +09:00
Gilles Gouaillardet
ad8c49053d io/romio321: fix two more MPI-3 compliance issues
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>

(back-ported from commit open-mpi/ompi@ae17908f35)

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-20 14:28:16 +09:00
Gilles Gouaillardet
e5460dcb4a io/romio: do not use removed functions
This commit attempts to update the romio io component to not use
functions removed in MPI-3.0 (2012). This is a first cut and will
probably need to be reviewed for correctness.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>

(back-ported from commit open-mpi/ompi@84765001aa)

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-20 14:28:16 +09:00
Gilles Gouaillardet
c29301da95 io/romio321: fix minmax datatypes
romio assumes that all predefined datatypes are contiguous. Because of
the (terribly named) composed datatypes MPI_SHORT_INT, MPI_DOUBLE_INT,
MPI_LONG_INT, etc this is an incorrect assumption. The simplest way to
fix this is to override the MPI_Type_get_envelope and
MPI_Type_get_contents calls with calls that will work on these
datatypes. Note that not all calls to these MPI functions are
replaced, only the ones used when flattening a non-contiguous
datatype.

References #5009

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>

(back-ported from commit open-mpi/ompi@4d876ec6fe)

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-20 14:28:16 +09:00
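Why a pair type such as MPI_SHORT_INT is not contiguous can be seen from the layout of the corresponding C struct. The sketch below uses Python's ctypes to mirror a struct of a short followed by an int; the class and variable names are hypothetical, and the exact sizes depend on the platform ABI, so the numbers in the comments are typical values rather than guarantees.

```python
import ctypes


class ShortInt(ctypes.Structure):
    # Mirrors a C struct { short value; int index; }, the kind of pair
    # used with MINLOC/MAXLOC style reductions.
    _fields_ = [("value", ctypes.c_short), ("index", ctypes.c_int)]


# Bytes of actual data in one element (commonly 2 + 4 = 6).
payload = ctypes.sizeof(ctypes.c_short) + ctypes.sizeof(ctypes.c_int)

# Bytes one element occupies in memory (commonly 8).
extent = ctypes.sizeof(ShortInt)

# Padding between the two fields, inserted so the int is aligned (commonly 2).
gap = ShortInt.index.offset - ctypes.sizeof(ctypes.c_short)
```

Whenever gap is non-zero, code that treats the type as payload contiguous bytes reads or writes the wrong locations, which is the incorrect assumption the commit message describes.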
Gilles Gouaillardet
4355a67740 ROMIO 3.2.1 refresh: add refresh notes
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-20 14:28:15 +09:00
Gilles Gouaillardet
bf23e843df ROMIO 3.2.1 refresh: remove old romio
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-20 14:28:15 +09:00
Gilles Gouaillardet
2f0db1945c ROMIO 3.2.1 refresh: patch mpich romio for OMPI
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-20 14:28:14 +09:00
Gilles Gouaillardet
2f391a99a7 ROMIO 3.2.1 refresh: import romio from mpich 3.2.1 tarball
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-20 14:28:14 +09:00
Gilles Gouaillardet
4272b57089 ROMIO 3.2.1 refresh: prepare new romio directory ompi/mca/io/romio321
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-06-20 14:28:13 +09:00
Gilles Gouaillardet
6a504a1544
Merge pull request #5304 from rhc54/topic/resync
Sync to updated PMIx v3.0.0rc
2018-06-20 14:21:03 +09:00
Ralph Castain
97d4e2b578
Merge pull request #5303 from rhc54/topic/lock
Prevent thread lock when show_help msgs are emitted
2018-06-19 21:55:12 -07:00
Ralph Castain
08707c9762 Sync to updated PMIx v3.0.0rc
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-06-19 21:25:43 -07:00