1
1

10013 Коммитов

Автор SHA1 Сообщение Дата
Edgar Gabriel
ef28d941d9
Merge pull request #5002 from raafatfeki/pr/coverty-dynamic_gen2-fixes
fcoll/dynamic_gen2: fix coverty warnings
2018-04-04 09:08:42 -05:00
Gilles Gouaillardet
e85fa469f3 coll/tuned: add recursive doubling algo for [ex]scan
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-04-04 14:56:23 +09:00
Gilles Gouaillardet
393376bbd9 coll/basic: move [ex]scan from coll/basic to coll/base
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-04-04 13:41:01 +09:00
Gilles Gouaillardet
65fa0b59c3 coll/tuned: add Rabenseifner algo for [all]reduce
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-04-04 13:25:41 +09:00
Mikhail Kurnosov
177c6ce51f Move algorithms from coll/spacc to coll/base and remove coll/spacc
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2018-04-04 10:21:06 +07:00
raafatfeki
5d99af29cd fcoll/dynamic_gen2: Formatting fixes
Adjust Coding Style to match the 4 space tab rule.
Signed-off-by: raafatfeki <fekiraafat@gmail.com>
2018-04-02 17:25:00 -05:00
raafatfeki
92822613ea fcoll/dynamic_gen2: fix coverty warnings
fix warnings for coverty CID 1433655 and CID 1433654

Signed-off-by: raafatfeki <fekiraafat@gmail.com>
2018-04-02 16:18:07 -05:00
Mikhail Kurnosov
1d2d43bdf0 Fix compile error with dtype
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2018-04-01 08:27:34 +07:00
Edgar Gabriel
c4879ec29f io/ompio: don't reset amode if MODE_SEQUENTIAL is set
the ompio module resets the amode from WRONLY to RDWR in order
to accoomodate data sieving in the two-phase fcoll componet. This
leads however to an error if MPI_MODE_SEQUENTIAL has been requested
by the user, since MODE_SEQUENTIAL is incompatible with MODE_RDWR.
SInce the change to the amode was done after opening the file for
individual file pointers but before opening the file for shared filepointers,
this lead to an error message in the sharedfp component.

Note, that data sieving is never necessary if MODE_SEQUENTIAL is set,
so this should not be a problem for any scenario.

Fixes #4991

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-03-30 07:56:47 -05:00
Mikhail Kurnosov
50ec214d42 Add recursive doubling algorithm for MPI_Scan and MPI_Exscan to coll/base
Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2018-03-30 10:12:51 +07:00
raafatfeki
100677721d fcoll/dynamic_gen2: use hindexed constructor on the sender side
instead of using a temporary buffer and copy data into the temp buffer before sending, use a derived datatype to describe the data that needs to be sent during a cycle in the collective I/O operation.

Signed-off-by: raafatfeki <fekiraafat@gmail.com>
2018-03-28 14:37:30 -05:00
Mikhail Kurnosov
bd12e2b1c6 Add recursive doubling algorithm for Scan and Exscan
Implements recursive doubling algorithm for MPI_Scan and MPI_Exscan.
The algorithm preserves order of operations so it can be used both
by commutative and non-commutative operations.

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2018-03-28 16:27:11 +07:00
Nathan Hjelm
e79debc320 osc/rdma: fix overflow in offset calculation
This commit fixes a bug is osc/rdma that can occur if the total size
of the shared memory segment gets larger than 4 GiB. The bug was
caused by a typo. The type of my_base_offset should have been size_t
not int.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-27 09:33:44 -06:00
Nathan Hjelm
f7faacca4e osc/rdma: fix 32-bit builds
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-27 09:16:04 -06:00
Jeff Squyres
06af6f1c4c
Merge pull request #4962 from jsquyres/pr/cid-fixes
A bunch of CID fixes
2018-03-26 22:30:31 -04:00
Ralph Castain
f92acd735b
Merge pull request #4965 from rhc54/topic/rank
Fix breakage in ranking system and silence OSC/RDMA warnings
2018-03-26 19:10:36 -05:00
Jeff Squyres
5360035995 topo/treematch: fix CID 1416327
Ensure to free things in the right order so that we don't access
memory after it is freed.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:26:17 -07:00
Jeff Squyres
08ceb66a19 osc/pt2pt: fix (effectively false positive) CID 1402113
This will almost certainly never happen, but be defensive and
guarantee that we never return an uninitialized variable.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:26:17 -07:00
Jeff Squyres
9de750a280 io/ompio: fix CID 1269889
Free some memory upon error conditions.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:21:21 -07:00
Jeff Squyres
dca66b9775 comm_join: fix CID 1323170
Enusre that the port name is always NULL-terminated.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:21:21 -07:00
Jeff Squyres
6319292170 fcoll/static: fix CID 1413066
local_iov_array is unconditionally allocated, so unconditionally
de-allocate it, too.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:21:21 -07:00
Jeff Squyres
2968ffa296 fcoll/static: remove useless/dead code
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:21:21 -07:00
Jeff Squyres
8e925b4f17 fbtl/posix: fix CID 1419954
Ensure to initialized ret_code.  This problem will likely never occur
in practice, but we might as well be defensive about it.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:21:21 -07:00
Jeff Squyres
124208198c osc/rdma: fix CID 1424327
Fix minor memory leak.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-26 14:21:21 -07:00
Nathan Hjelm
1c75aa82fc use-mpi-f08: fix rma function signatures
The various RMA functions need to have the asynchronous property on
all buffers. This property was missing and some buffers were
incorrectly marked as intent(in). This commit fixes the function
signatures.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-26 15:11:07 -06:00
Ralph Castain
3a93b535ec Silence the flood of OSC/RDMA warnings
Fixes #4950

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-03-25 16:12:41 -07:00
Artem Polyakov
77ff99e9ee
Merge pull request #4933 from karasevb/timings_update
timings: added new timing points
2018-03-25 00:10:49 -07:00
Jeff Squyres
871e5c76bc
Merge pull request #4960 from jsquyres/pr/warnings-fixes
Coverity fix + compiler warning fixes
2018-03-23 14:47:56 -05:00
Jeff Squyres
c3adcb05eb Miscellaneous compiler warnings fixes
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-03-23 11:45:30 -07:00
Nathan Hjelm
5f7ff5307e fcoll/two_phase: do not use removed function (MPI_Address)
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-23 08:43:24 -06:00
Edgar Gabriel
36747cca67 io/ompio: disable the fcoll timing by default
somehow the flag indicating to gather performance data
on collective io operations has changed to 1 accidentally.
Should be 0 ( false) by default.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-03-21 11:34:35 -05:00
Edgar Gabriel
aae8c6c6ad remove addproc sharedfp component
never got to move this sharedfp component into anything
usable. Can easily be restored if necessary.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-03-21 11:27:01 -05:00
Edgar Gabriel
e703ac2da8 remove plfs components
plfs components are at this point not utilized by anybody as far as I know.
Easy to bring back if we want to.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-03-21 11:27:01 -05:00
Boris Karasev
3796307a57 timings: added new timing points
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2018-03-21 05:16:25 +02:00
bosilca
bf3dd8af19
Merge pull request #4884 from bosilca/topic/fix_wtime
Improve the range and accuracy of MPI_Wtime.
2018-03-16 14:09:33 +09:00
Nathan Hjelm
7f4872d483 osc/rdma: performance improvments and bug fixes
This commit is a large update to the osc/rdma component. Included in
this commit:

 - Add support for using hardware atomics for fetch-and-op and single
   count accumulate  when using the accumulate lock. This will improve
   the performance of these operations even when not setting the
   single intrinsic info key.

 - Rework how large accumulates are done. They now block on the get
   operation to fix some bugs discovered by an IBM one-sided test. I
   may roll back some of the changes if the underlying bug in the
   original design is discovered. There appear to be no real
   difference (on the hardware this was tested with) in performance so
   its probably a non-issue. References #2530.

 - Add support for an additional lock-all algorithm: on-demand. The
   on-demand algorithm will attempt to acquire the peer lock when
   starting an RMA operation. The lock algorithm default has not
   changed. The algorithm can be selected by setting the
   osc_rdma_locking_mode MCA variable. The valid values are two_level
   and on_demand.

 - Make use of the btl_flush function if available. This can improve
   performance with some btls.

 - When using btl_flush do not keep track of the number of put
   operations. This reduces the number of atomic operations in the
   critical path.

 - Make the window buffers more friendly to multi-threaded
   applications. This was done by dropping support for multiple
   buffers per MPI window. I intend to re-add that support once the
   underlying performance bug under the old buffering scheme is
   fixed.

 - Fix a bug in request completion in the accumulate, get, and put
   paths. This also helps with #2530.

 - General code cleanup and fixes.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-15 14:53:53 -06:00
Benoît Legat
00600c7cbb Fix typo in MPI_Cart_shift doc
Signed-off-by: Benoît Legat <benoit.legat@gmail.com>
2018-03-13 15:25:42 +01:00
Edgar Gabriel
da640f98df fcoll/two_phase: data sieving has to occur at offset 0 as well
data sieving has to occur for any offset provided that is larger
or equal zero for this implementation to work correctly.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-03-10 11:23:09 -06:00
George Bosilca
0f0c27a184
Allow MPI_PROC_NULL as neighbor.
Allowing MPI_PROC_NULL as a neighbor in any topology allows us to add
gaps on the send and recv buffers. This does make the traditional
neighbor collective have a similar behavior as the V version, but in
same time it allows the users to skip the step where they prepare the
counts and the displacement array.

For more info please take a look at issue #4675.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2018-03-09 12:20:26 +09:00
Edgar Gabriel
c83b47c266 io/romio314: mark datatypes of size 0 as contiguous
this commit fixes an issue observed with romio314 and the hdf5 1.10.x testsuite.
The ADIOI_Datatype_iscontig() routine in romio314/src/io_romio314_module.c
will now return for a datatype of size 0 that it is contiguous, even if the extent
of the datatype is non-zero. This avoids a segmentation fault observed in the
ADIOI_Flatten routine, and fixes this particular with the hdf5 1.10.x testsuite in
OpenMPI with romio314.

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-03-08 09:10:09 -06:00
George Bosilca
9bced03213
Improve the range and accuracy of MPI_Wtime.
As discussed on https://github.com/mpi-forum/mpi-issues/issues/77#issuecomment-369663119
the conversion to double in the MPI_Wtime decrease the range
and accuracy of the resulting timer. By setting the timer to
0 at the first usage we basically maintain the accuracy for
194 days even for gettimeofday.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2018-03-08 14:26:02 +09:00
Nathan Hjelm
5ed2fc2d48 mca/base: add support for additional variable types
This commit adds long, int32_t, uint32_t, int64_t, and uint64_t as
possible MCA variable types.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-03-01 20:42:27 -07:00
bosilca
9944d63de1
Merge pull request #4852 from thananon/pr/ob1_oos_fix
pml/ob1: fixed out of sequence bug.
2018-02-28 13:02:03 -05:00
Thananon Patinyasakdikul
09cba8b30b pml/ob1: fixed out of sequence bug.
This commit fixes #4795

- Fixed typo that sometimes causes deadlock in change of protocol.
- Redesigned out of sequence ordering and address the overflow case of
  sequence number from uint16_t.

Signed-off-by: Thananon Patinyasakdikul <tpatinya@utk.edu>
2018-02-27 13:49:40 -05:00
Valentin Petrov
bf4e694a96 coll/hcoll: Fix return codes
Signed-off-by: Valentin Petrov <valentinp@mellanox.com>
2018-02-22 17:48:29 +02:00
Matias Cabral
0a822f8f99
Merge pull request #4821 from nrspruit/OFI_mtl_multi_event_progress
MTL OFI: Added support for reading multiple CQ events in ofi progress
2018-02-20 14:59:47 -08:00
Jeff Squyres
9ef0f3d83a ompi/monitoring: add .sh versionig to common monitoring lib
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-02-20 07:07:23 -08:00
Artem Polyakov
b601dd504a ompi: ompi_mpi_init(): do not export threading level to modex.
For some of our configuration this flag increases per-process contribution
by ~20% while it is not being used currently.

The consumer of this flag was communicator ID calculation logic, but it was
changed in 0bf06de3f1444f469303e47752430ec9b423b33f.

Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2018-02-18 01:40:15 +07:00
Spruit, Neil R
e7bff501cd MTL OFI: Added support for reading multiple CQ events in ofi progress
-Updated ompi_mtl_ofi_progress to use an array to read CQ events up to a
threshold that can be set by the Open MPI User.

-Users can adjust the number of events that can be handled in the
ompi_mtl_ofi_progress by setting "--mca mtl_ofi_progress_event_cnt #".

-The default value for the the number of CQ events that can be read in a
single call to ofi progress is 100 which is an average
based off workload usecase anaylsis showing 70-128 as the range of
multiple events returned during ofi progress.

Signed-off-by: Spruit, Neil R <neil.r.spruit@intel.com>
2018-02-15 09:41:14 -05:00
Nathan Hjelm
0e83568466 coll/libnbc: do not take lock in progress if there are no requests
This commit fixes a flaw in the progress function for libnbc. The
function was unconditionally taking a lock even if there are no
requests to process. This lock was showing up in vtune traces of
multi-threaded benchmarks.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2018-02-13 09:51:01 -07:00