1
1

10137 Коммитов

Автор SHA1 Сообщение Дата
Artem Polyakov
77ff99e9ee
Merge pull request #4933 from karasevb/timings_update
timings: added new timing points
2018-03-25 00:10:49 -07:00
Jeff Squyres
871e5c76bc
Merge pull request #4960 from jsquyres/pr/warnings-fixes
Coverity fix + compiler warning fixes
2018-03-23 14:47:56 -05:00
Jeff Squyres
c3adcb05eb Miscellaneous compiler warnings fixes
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-23 11:45:30 -07:00
Nathan Hjelm
5f7ff5307e fcoll/two_phase: do not use removed function (MPI_Address)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-23 08:43:24 -06:00
Edgar Gabriel
36747cca67 io/ompio: disable the fcoll timing by default
somehow the flag indicating to gather performance data
on collective io operations has changed to 1 accidentally.
Should be 0 ( false) by default.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-03-21 11:34:35 -05:00
Edgar Gabriel
aae8c6c6ad remove addproc sharedfp component
never got to move this sharedfp component into anything
usable. Can easily be restored if necessary.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-03-21 11:27:01 -05:00
Edgar Gabriel
e703ac2da8 remove plfs components
plfs components are at this point not utilized by anybody as far as I know.
Easy to bring back if we want to.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-03-21 11:27:01 -05:00
Boris Karasev
3796307a57 timings: added new timing points
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2018-03-21 05:16:25 +02:00
bosilca
bf3dd8af19
Merge pull request #4884 from bosilca/topic/fix_wtime
Improve the range and accuracy of MPI_Wtime.
2018-03-16 14:09:33 +09:00
Nathan Hjelm
7f4872d483 osc/rdma: performance improvments and bug fixes
This commit is a large update to the osc/rdma component. Included in
this commit:

 - Add support for using hardware atomics for fetch-and-op and single
   count accumulate  when using the accumulate lock. This will improve
   the performance of these operations even when not setting the
   single intrinsic info key.

 - Rework how large accumulates are done. They now block on the get
   operation to fix some bugs discovered by an IBM one-sided test. I
   may roll back some of the changes if the underlying bug in the
   original design is discovered. There appear to be no real
   difference (on the hardware this was tested with) in performance so
   its probably a non-issue. References #2530.

 - Add support for an additional lock-all algorithm: on-demand. The
   on-demand algorithm will attempt to acquire the peer lock when
   starting an RMA operation. The lock algorithm default has not
   changed. The algorithm can be selected by setting the
   osc_rdma_locking_mode MCA variable. The valid values are two_level
   and on_demand.

 - Make use of the btl_flush function if available. This can improve
   performance with some btls.

 - When using btl_flush do not keep track of the number of put
   operations. This reduces the number of atomic operations in the
   critical path.

 - Make the window buffers more friendly to multi-threaded
   applications. This was done by dropping support for multiple
   buffers per MPI window. I intend to re-add that support once the
   underlying performance bug under the old buffering scheme is
   fixed.

 - Fix a bug in request completion in the accumulate, get, and put
   paths. This also helps with #2530.

 - General code cleanup and fixes.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-15 14:53:53 -06:00
Benoît Legat
00600c7cbb Fix typo in MPI_Cart_shift doc
Signed-off-by: Benoît Legat <benoit.legat@gmail.com>
2018-03-13 15:25:42 +01:00
Edgar Gabriel
da640f98df fcoll/two_phase: data sieving has to occur at offset 0 as well
data sieving has to occur for any offset provided that is larger
or equal zero for this implementation to work correctly.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-03-10 11:23:09 -06:00
George Bosilca
0f0c27a184
Allow MPI_PROC_NULL as neighbor.
Allowing MPI_PROC_NULL as a neighbor in any topology allows us to add
gaps on the send and recv buffers. This does make the traditional
neighbor collective have a similar behavior as the V version, but in
same time it allows the users to skip the step where they prepare the
counts and the displacement array.

For more info please take a look at issue #4675.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2018-03-09 12:20:26 +09:00
Edgar Gabriel
c83b47c266 io/romio314: mark datatypes of size 0 as contiguous
this commit fixes an issue observed with romio314 and the hdf5 1.10.x testsuite.
The ADIOI_Datatype_iscontig() routine in romio314/src/io_romio314_module.c
will now return for a datatype of size 0 that it is contiguous, even if the extent
of the datatype is non-zero. This avoids a segmentation fault observed in the
ADIOI_Flatten routine, and fixes this particular with the hdf5 1.10.x testsuite in
OpenMPI with romio314.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-03-08 09:10:09 -06:00
George Bosilca
9bced03213
Improve the range and accuracy of MPI_Wtime.
As discussed on https://github.com/mpi-forum/mpi-issues/issues/77#issuecomment-369663119
the conversion to double in the MPI_Wtime decrease the range
and accuracy of the resulting timer. By setting the timer to
0 at the first usage we basically maintain the accuracy for
194 days even for gettimeofday.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2018-03-08 14:26:02 +09:00
Nathan Hjelm
5ed2fc2d48 mca/base: add support for additional variable types
This commit adds long, int32_t, uint32_t, int64_t, and uint64_t as
possible MCA variable types.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-01 20:42:27 -07:00
bosilca
9944d63de1
Merge pull request #4852 from thananon/pr/ob1_oos_fix
pml/ob1: fixed out of sequence bug.
2018-02-28 13:02:03 -05:00
Thananon Patinyasakdikul
09cba8b30b pml/ob1: fixed out of sequence bug.
This commit fixes #4795

- Fixed typo that sometimes causes deadlock in change of protocol.
- Redesigned out of sequence ordering and address the overflow case of
  sequence number from uint16_t.

Signed-off-by: Thananon Patinyasakdikul <tpatinya@utk.edu>
2018-02-27 13:49:40 -05:00
Valentin Petrov
bf4e694a96 coll/hcoll: Fix return codes
Signed-off-by: Valentin Petrov <valentinp@mellanox.com>
2018-02-22 17:48:29 +02:00
Matias Cabral
0a822f8f99
Merge pull request #4821 from nrspruit/OFI_mtl_multi_event_progress
MTL OFI: Added support for reading multiple CQ events in ofi progress
2018-02-20 14:59:47 -08:00
Jeff Squyres
9ef0f3d83a ompi/monitoring: add .sh versionig to common monitoring lib
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-02-20 07:07:23 -08:00
Artem Polyakov
b601dd504a ompi: ompi_mpi_init(): do not export threading level to modex.
For some of our configuration this flag increases per-process contribution
by ~20% while it is not being used currently.

The consumer of this flag was communicator ID calculation logic, but it was
changed in 0bf06de3f1444f469303e47752430ec9b423b33f.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2018-02-18 01:40:15 +07:00
Spruit, Neil R
e7bff501cd MTL OFI: Added support for reading multiple CQ events in ofi progress
-Updated ompi_mtl_ofi_progress to use an array to read CQ events up to a
threshold that can be set by the Open MPI User.

-Users can adjust the number of events that can be handled in the
ompi_mtl_ofi_progress by setting "--mca mtl_ofi_progress_event_cnt #".

-The default value for the the number of CQ events that can be read in a
single call to ofi progress is 100 which is an average
based off workload usecase anaylsis showing 70-128 as the range of
multiple events returned during ofi progress.

Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
2018-02-15 09:41:14 -05:00
Nathan Hjelm
0e83568466 coll/libnbc: do not take lock in progress if there are no requests
This commit fixes a flaw in the progress function for libnbc. The
function was unconditionally taking a lock even if there are no
requests to process. This lock was showing up in vtune traces of
multi-threaded benchmarks.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-02-13 09:51:01 -07:00
Edgar Gabriel
a3a734b6d2 io/ompio: correctly reset the request
after performing the final OBJ_RELEASE on the request,
reset the user level variable to MPI_REQUEST_NULL.
Otherwise the c_2_f translation step in the fortran
interface fails.

Fixes issue #4807

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-02-13 09:18:25 -06:00
Jeff Squyres
e7f91f8068
Merge pull request #4527 from clementFoyer/osc-no-includes
Remove inter-dependencies between OSC modules.
2018-02-09 15:49:56 -05:00
Nathan Hjelm
da9f833f4a pml/ob1: ignore the eager limit of RDMA-only btls
This commit fixes a flaw in the eager limit check in pml/ob1. The
check was incorrectly checking if RDMA-only BTLs (BTLs without the
send flag) has a valid eager limit. This commit fixes the check by
adding an additional check for the send flag on the BTL module.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-02-07 12:42:44 -07:00
Clement Foyer
f5b4fc05f8 Remove inter-dependencies between OSC modules.
The osc monitoring component needed to include other OSC components
header in order to be able tu access communicator through the
component specific ompi_osc_*_module_t structures. This commit remove
the dependency, and resolve the issue #4523.

Extend the common monitoring API.

  * Now it's possible to translate from local rank to world rank from
    both the communicator and the group.
  * Remove useless hashtable as we directly use the w_group contained
    in window structure.

Add automatic generation at config time.

The templates are expanded at configure time. It creates a new header
file that generates all the variables/functions needed. Adding this
during the autogen automagicaly generates for each of the available
modules the proper functions.

Only keep a generated argv-style array.

Following Jeff's advice, the configure.m4 file generate a simple array
of module variables to be iterated over to find the proper module.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>
2018-02-07 11:52:00 +00:00
Ralph Castain
7ddffc627d
Merge pull request #4776 from rhc54/topic/rte
Correct abstraction break and update ignores
2018-01-31 04:34:05 -08:00
Ralph Castain
8e8a9aecc5 Correct abstraction break - direct reference to ORTE
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-30 21:19:14 -08:00
Gilles Gouaillardet
34b45cc879 osc/sm: fix the osc_free callback
If component selection fails, then module->bases might be unallocated
when ompi_osc_sm_free() in invoked, so test it before trying to free()
module->bases[0].

Thanks Martin Binder for the report.

Refs open-mpi/ompi#4770

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-31 11:23:21 +09:00
Sergey Oblomov
7a5811d0a8 request/state: update state for canceled request
- fixed issue in set state for canceled request

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-01-29 18:26:20 +02:00
Edgar Gabriel
bcf26d419f fs/ufs and fs/lustre: remove erroneous return statement
an erroneous return statement has creeped in commit 1885d99
which leads to some processes not resetting stripe_size
and stripe_count correctly. This can lead in 3.0.x to different
fcoll modules being selected. The impact is not that dramatic on
master and 3.1.x, but could lead to problems as well.

Fixes #4745

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-01-24 14:07:21 -06:00
Xin Zhao
72ff2b1135 OMPI/OSC/UCX: adding atomic lock for fetch_and_op and compare_and_swap
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2018-01-18 00:36:22 +02:00
Yossi Itigin
f2851fd502
Merge pull request #4724 from alex-mikheev/topic/ucx_as_default
ompi/oshmem: ucx is selected over yalla/ikrit by default
2018-01-17 17:41:49 +02:00
Alex Mikheev
640e945b9c ompi: pml/ucx: blocking send using ucp_tag_send_nbr
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2018-01-17 15:54:18 +02:00
Alex Mikheev
ae326546f4
ompi/oshmem: ucx is selected over yalla/ikrit by default
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2018-01-17 15:08:04 +02:00
Gilles Gouaillardet
cb5dfbe5b1 man: fix indentation of MPI_Comm_spawn[_multiple]
no code change.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-15 11:29:15 +09:00
Matias Cabral
8049c06a96
Merge pull request #4580 from matcabral/fix_comments_pr_425_osc_rmda
osc/rmda: fix missing opal_argv_free in mtls search.
2018-01-12 09:20:11 -08:00
Matias A Cabral
009ba475e1 osc/rmda: fix missing opal_argv_free in mtls search.
Use asprintf in description message to avoid missing default values
Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2018-01-11 14:29:16 -08:00
Jeff Squyres
1c5664fdec
Merge pull request #4681 from nathanweeks/issue/f08_mpi_errcodes_ignore
Fix type of mpi_f08 MPI_ERRCODES_IGNORE
2018-01-10 12:34:16 -05:00
Gilles Gouaillardet
02f8215b25 ompi: enhance MPI_File_set_view datatype check.
Per MPI 3.1 chapter 13.3 :
"Derived etypes can be constructed by using any of the MPI
datatype constructor routines, provided all resulting typemap
displacements are non-negative and monotonically nondecreasing."
Same restriction applies to ftypes.

add the OMPI_DATATYPE_CHECK_FOR_VIEW() macro that is
check the underlying opal_datatype_t is monotonic, on top
of all checks performed in OMPI_DATATYPE_CHECK_FOR_RECV().

Since checking monotoniciy is expensive, check is only performed
when needed, but the result is cached by ompi_datatype_is_monotonic().

Thanks Wei-keng Liao for the valuable feedback.
Thanks George for the guidance.

Refs. open-mpi/ompi#4682

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-09 18:05:15 +09:00
Nathan T. Weeks
3158d2c5ed Fix type of mpi_f08 MPI_ERRCODES_IGNORE
Signed-off-by: Nathan T. Weeks <weeks@iastate.edu>
2018-01-07 05:36:41 -08:00
KAWASHIMA Takahiro
710080be63
Merge pull request #4667 from kawashima-fj/pr/f08-pmpi
fortran: Fix PMPI interface bugs in mpi_f08 module
2018-01-05 03:45:10 -06:00
bosilca
ef38ca5663
Merge pull request #4644 from bosilca/topic/treematch
Fix treematch topology assert
2018-01-02 21:21:54 -05:00
Gilles Gouaillardet
2dd345465f ompi/communicator: optimize ompi_comm_split()
set grp_local_rank as MPI_UNDEFINED before invoking
ompi_comm_nexcid() in order to benefit from the optimizations
introduced in open-mpi/ompi@68167ec879

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-12-28 15:34:04 +09:00
Yossi Itigin
697a9437e2
Merge pull request #4666 from alex-mikheev/topic/pml_ucx_recv_fix
ompi: pml ucx: improve recv latency
2017-12-26 20:12:18 +02:00
Alex Mikheev
e7bf0617cf
ompi: pml ucx: improve recv latency
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2017-12-26 16:24:16 +02:00
KAWASHIMA Takahiro
bd2fe9c324 fortran: Call PMPI from PMPI_Status_set_cancelled_f08
This is a bug which was forgotten to change in c08f97b0304.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-12-26 15:53:12 +09:00
KAWASHIMA Takahiro
00e3c7a973 fortran: Align indentation
This change makes comparison of `mpi-f08-interfaces.F90` and
`pmpi-f08-interfaces.F90` easier.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-12-26 14:39:09 +09:00