1
1
Граф коммитов

9718 Коммитов

Автор SHA1 Сообщение Дата
Ralph Castain
6ea3c8a0bd Update the interlib example to show an alternative method for model declaration. Add a missing range value to the OPAL layer. Make it easier to see OMPI model callbacks
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-10-23 11:27:42 -07:00
Edgar Gabriel
be0de21e6f fs/ufs and fbtl/posix: cleanup lock management
This commit looks large, but its really mostly a cleanup step.
1. introduce proper error handling for the return values of fcntl and the fbtl_posix_lock function
2. rename a parameter to more accurately reflect what it does
3. introduce an mca parameter in the fs/ufs component that allows to control
   what the level of locking the user would like to enforce
4. move the initialization of the fs_block_size parameter from fs/ufs into the
   common/ompio component. An fs component might be allowed to overwrite this
   value, but none of the actual fs components do that.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 14:56:28 -05:00
Edgar Gabriel
e62f9d2e52 fs/ufs: ensure that the never-lock flag is set if not on NFS
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 13:32:40 -05:00
Edgar Gabriel
f66c55f77a fbtl/posix: fixes in the offset calculation and for aio operations
our own internal testsuite passes now correctly. More testing to follow.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 13:32:39 -05:00
Edgar Gabriel
a3c638bc38 fbtl/posix: add support for file locking for the non-blocking operations
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 13:32:38 -05:00
Edgar Gabriel
415e76514d fbtl/posix: make the code compile
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 13:32:37 -05:00
Edgar Gabriel
f5e158c869 fbtl/posix: first cut in adding locking support
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-19 13:32:37 -05:00
Gilles Gouaillardet
9771c575f5 Merge pull request #4352 from edgargabriel/pr/sem_close_fix
sharedfp/sm: close the named semaphore
2017-10-19 17:04:43 +09:00
Edgar Gabriel
4d995bd4eb sharedfp/sm: close the named semaphore
in case a named semaphore is used, it is necessary to close the semaphore to remove
all sm segments. sem_unlink just removes the name references once all proceeses have closed
the sem.

Fixes issue: #4336

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>

sharedfp/sm: unlink only needs to be called by one process

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-10-18 10:37:30 -05:00
Aurelien Bouteiller
3ef23f41a3
Bugfix a crash when a comm cannot be initialized
Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2017-10-18 11:32:37 -04:00
Valentin Petrov
1e311b2619 coll/hcoll: dtype fallback optimization
If hcoll fails to create mpi derived type let's set zero_dte on this dtype.
    This will save cycles on subsequent collective calls with the same derived
    type since we will not try to create hcoll type again.

Signed-off-by: Valentin Petrov <valentinp@mellanox.com>
2017-10-06 10:29:29 +03:00
Valentin Petrov
06ef344630 coll/hcoll: extends dtypes support
Adds support for legacy MPI_UB/LB types (old apps may use it) as
    well as for BOOL/WCHAR.

Signed-off-by: Valentin Petrov <valentinp@mellanox.com>
2017-10-06 10:29:29 +03:00
Geoff Paulsen
be7b0af5d9 Merge pull request #3609 from markalle/pr/single_type_with_LB_UB
single_predefined_type with MPI_LB/UB
2017-10-04 15:13:09 -05:00
Mark Allen
e24d5ccb7e single_predefined_type with MPI_LB/UB
The ompi_datatype_get_single_predefined_type_from_args() recurses down
into a constructed type to identify what base datatype it's built from
if it's built from a single type.  But if the type has MPI_LB/MPI_UB,
for example
    lens[0] = 1;
    lens[1] = 1;
    disps[0] = 0;
    disps[1] = 0;
    types[0] = MPI_LB;
    types[1] = MPI_INT;
    MPI_Type_create_struct(2, lens, disps, types, &mydt);
then this function will see the base type MPI_LB as differing from MPI_INT
and will identify mydt as not being constructed from a single base type, so
the type will be rejected for calls like MPI_Accumulate.

I think those "meta data" types shouldn't result in rejection like that, and
the above mydt should still be identified as having a single base type
of MPI_INT.

Addition: boslica wanted another change discussed here
    https://github.com/open-mpi/ompi/pull/3609
relating to the calculation for "count" after identifying the
predefined_type that was being used.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-10-03 19:08:18 -04:00
George Bosilca
bdbea63a1c
Update the MPI standard reference.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-10-03 16:48:50 -04:00
George Bosilca
a3ac67be0d
Remove double include.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-10-02 21:33:40 -04:00
Gilles Gouaillardet
9492766dbd romio: disable Fortran support
romio314 is a just a component that does not require Fortran bindings,
so simply disable Fortran support to prevent warnings about deprecated flags

Fixes open-mpi/ompi#4281

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-10-02 21:44:33 +09:00
Matias Cabral
d9b2c94d4a Merge pull request #4286 from aravindksg/master
Use opal_show_help to warn about PSM2_CUDA envvar setting
2017-10-01 11:10:19 -07:00
George Bosilca
2a2db13b32
Gracefully deal with a get returning 1 (complete right away).
Kudos to @EmmanuelBRELLE for spotting it.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-10-01 02:24:02 -04:00
Aravind Gopalakrishnan
f8a2b7f6bf Use opal_show_help to warn about PSM2_CUDA envvar setting
If Open MPI is configured with CUDA, then user also should be using a CUDA build of
PSM2 and therefore be setting PSM2_CUDA environment variable to 1 while using
CUDA buffers for transfers. If we detect this setting to be missing, force set
it. If user wants to use this build for regular (Host buffer) transfers, we
allow the option of setting PSM2_CUDA=0, but print a warning
message to user that it is not a recommended usage scenario.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
2017-09-29 17:04:10 -07:00
George Bosilca
92e4ecb618
Fix the offset computation.
Also ensure the marked array is correctly freed in all cases.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-27 11:42:21 -04:00
George Bosilca
0ceed71368
Fix Coverity warnings in treematch topo.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-26 22:24:01 -04:00
Edgar Gabriel
4c0d347412 Merge pull request #4230 from edgargabriel/topic/no-smart-fview
io/ompio: add a new grouping option avoiding communication
2017-09-26 10:56:06 -05:00
bosilca
f44e674992 Merge pull request #4074 from bosilca/topic/coverity
Fix coverity complaints.
2017-09-25 15:05:31 -04:00
Guillaume Mercier
4e7c130c31
Add correct reordering computation in partially distributed case.
Replaced matching array with k and bcast with scatter.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Guillaume Mercier <mercier@labri.fr>
2017-09-25 13:10:11 -04:00
George Bosilca
3dd1d8cb53
Delay the first check for the HWLOC topology.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-25 13:09:57 -04:00
George Bosilca
28046b37df
Always succesfully return.
As the reordering is an optional step, if any operation during the
reorder fails we can return the duplicata of the original communicator
associated with the topology information.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-25 12:53:20 -04:00
George Bosilca
219a96fa69
Prevent memory leaks.
Reorder the code to simplify the memory management.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-25 12:53:20 -04:00
George Bosilca
64bff0e326
Disable monitoring if we compile statically.
Protect all components against compilation on static builds.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-25 12:18:23 -04:00
George Bosilca
458ccc12e1
Move the profiling library in common/monitoring
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-25 12:18:23 -04:00
Clément FOYER
f334607c34
Simplify the communicator's name caching management (#6)
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>
2017-09-25 12:18:23 -04:00
bosilca
a680b3ac6d Merge pull request #3853 from clementFoyer/master
OMPI monitoring: Simplify the communicator's name caching management + misc test changes
2017-09-25 12:14:36 -04:00
yohann
1f8cabc890 mtl/ofi: Fix provider selection.
This allows mtl_ofi_provider_include to work with layered providers as well.
e.g. --mca mtl_ofi_provider_include "providerX;ofi_rxm"

Signed-off-by: yohann <yohann.burette@intel.com>
2017-09-20 16:00:50 -07:00
Gilles Gouaillardet
b9315edb85 configury: remove the --disable-mpi-io option
Fixes open-mpi/ompi#2185

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-20 14:39:09 +09:00
Edgar Gabriel
76a8c67575 io/ompio: add a new grouping option avoiding communication
the new grouping option simple+ performs all calculations used
for the aggregator selection as if the default file view would be used,
thus avoiding communication in file_set_view all together. This mode
is useful for applications that do not set a file view, but use
explicit offset operations on the default file view.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2017-09-18 12:30:34 -05:00
Ralph Castain
ed508010b4 Remove stale tools
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-18 07:30:47 -07:00
Ralph Castain
3c914a7a97 Complete the fix of the ORTE DVM. We will now use "prun" instead of "orterun -hnp foo" to execute jobs. This provides the feature of automatic discovery of the orte-dvm so you don't need to manually enter URI's or contact file locations. All IO is forwarded to prun.
Still in the "needs to be done" category:

* mapping/ranking/binding options aren't correctly supported

* if the DVM encounters some errors (e.g., not enough resources for the job), the resulting error is globally set and impacts any subsequent job submission

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-16 13:13:07 -07:00
Ralph Castain
3f8908871b Since the DVM is now tied to prun, don't build the DVM either unless prun can be built
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-09-13 11:55:10 -07:00
Brian Barrett
637ebf60f9 atomics: Remove requirement of 64 bit atomics
Remove two of the three  instances of components requiring
64 bit atomics, even on 32 bit systems.  The SM OSC component
also uses 64 bit atomics, but is a more complicated fix that
will follow this one.  Currently, no one is testing on
platforms that don't provide 64 bit atomics (even in 32 bit
mode), but with the removal of the non-inline assembly for
IA32, the older compilers on Absoft's test systems now
result in no practical way to call cmpxchg8 in 32 bit mode.
At that point, these failures started popping up.

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-09-11 19:50:10 -07:00
Nathan Hjelm
7cdda24206 osc/sm: do not require 64-bit atomic math
This commit fixes a compile issue on 32-bit systems that do not
support 64-bit atomic math. The active target path was using 64-bit
atomics exclusively to support PSCW. This commit updates the code to
use either 32 or 64-bit atomic math depending on what is available.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-09-11 14:10:38 -10:00
Nathan Hjelm
4bba8774f4 monitoring: fix MPI_T regression
The monitoring code causes MPI_T based tools to segfault when
monitoring is disabled. This happens because the performance
variables remain registered after the common/monitoring
component is dlclosed due to a missing variable registration
flag. This commit adds the necessary flag to all the registered
performance variables.

The issue on github is #4162. Close when applied to master.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-09-06 14:24:35 -06:00
bosilca
dc538e9675 Merge pull request #1177 from bosilca/topic/large_msg
Topic/large msg
2017-09-05 13:30:19 -04:00
Gilles Gouaillardet
ecb6b81a05 mpi: correctly handle MPI_IN_PLACE by memchecker in neighborhood collectives
MPI_IN_PLACE is not a valid send buffer for neighborhood collectives, so do not
invoke memchecker in this case.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-04 11:21:32 +09:00
Gilles Gouaillardet
66c9485e77 MPI_Isend: memchecker do not mark send buffer as unaccessible after pml isend invokation
Today's MPI standard mandates the send buffer remains accessible during the send operation.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-04 11:21:32 +09:00
Gilles Gouaillardet
af8242a121 pml/ob1: have memchecker make recv buffer defined again when mca_pml_ob1_recv completes
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-04 11:18:05 +09:00
Gilles Gouaillardet
6ee9366243 MPI_Wait: correctly handle MPI_STATUS_IGNORE in MEMCHECKER
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-09-04 11:18:05 +09:00
Aravind Gopalakrishnan
2e83cf15ce Add support for GPU buffers for PSM2 MTL
PSM2 enables support for GPU buffers and CUDA managed memory and it can
directly recognize GPU buffers, handle copies between HFIs and GPUs.
Therefore, it is not required for OMPI to handle GPU buffers for pt2pt cases.
In this patch, we allow the PSM2 MTL to specify when
it does not require CUDA convertor support. This allows us to skip CUDA
convertor init phases and lets PSM2 handle the memory transfers.

This translates to improvements in latency.
The patch enables blocking collectives and workloads with GPU contiguous,
GPU non-contiguous memory.

Signed-off-by: Aravind Gopalakrishnan <Aravind.Gopalakrishnan@intel.com>
2017-09-01 16:59:03 -07:00
George Bosilca
866899e836
Always abide to the RDMA pipeline limit.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-01 18:52:48 -04:00
George Bosilca
050bd3b6d7
Make the pipeline depth an int instead of a size_t. While
they are supposed to be unsigned, casting them to a signed
value for all atomic operations is as errorprone as handling
them as signed entities.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-09-01 18:52:48 -04:00
Clement Foyer
9a8fc1b9f1 Simplify the communicator's name caching management
Remove useless over-initialization

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>
2017-08-29 12:52:47 +02:00