1
1
Граф коммитов

9050 Коммитов

Автор SHA1 Сообщение Дата
Aurélien Bouteiller
7f65c2b18e forgot to update copyright in commits 627a89b 4899c89 2016-05-13 11:34:59 -04:00
George Bosilca
37e03e3e5b Don't update req_bytes_received if no bytes were received. 2016-05-12 23:39:32 -04:00
rhc54
4d026e223c Merge pull request #1661 from matcabral/master
PSM and PSM2 MTLs to detect drivers and link
2016-05-11 17:43:17 -07:00
George Bosilca
f8facb177d atomically update the refcount on the datatype args. 2016-05-11 12:40:18 -04:00
Matias A Cabral
528abff6ae Merge remote-tracking branch 'upstream/master' 2016-05-10 15:42:08 -07:00
Matias A Cabral
d28ee62a96 Update in PSM and PSM2 MTLs to detect entries created by drivers for
Intel TrueScale and Intel OmniPath, and detect a link in ACTIVE state.
This fix addresses the scenario reported in the below OMPI users email,
including formerly named Qlogic IB, now Intel True scale. Given the
nature of the PSM/PSM2 mtls this fix applies to OmniPath:
https://www.open-mpi.org/community/lists/users/2016/04/29018.php
2016-05-09 12:08:44 -07:00
Gilles Gouaillardet
0a19337371 coll/base: return MPI_ERR_UNSUPPORTED_OPERATION when coll_base_*_two_procs algo is used on a communicator that has no two tasks
Thanks Dave Love for the report
2016-05-09 14:18:40 +09:00
Gilles Gouaillardet
b159587325 io/romio: fix filesystem type check on OpenBSD 5.7
check the existence of the f_type field in struct statfs

Thanks Paul Hargrove for the report
2016-05-09 13:54:46 +09:00
Ralph Castain
6b24e2779b Remove stale component - I'm not going to get to it 2016-05-07 04:13:34 -07:00
Edgar Gabriel
def1b95fd7 Merge pull request #1646 from edgargabriel/getview-preallocate-fixes
io/ompio: file_getview and file_preallocate fixes
2016-05-06 11:46:00 -05:00
Edgar Gabriel
e65e189671 io/ompio: fix file size after file_preallocate
Thanks for @dalcini for reporting
Fixes open-mpi/ompi#1633
2016-05-06 08:20:59 -05:00
Edgar Gabriel
d358965134 io/ompio: fix envelope of datatype returned by getview
Thanks for @dalcini for reporting
Fixes open-mpi/ompi#1632
2016-05-06 08:19:48 -05:00
Edgar Gabriel
7c92acaa78 Merge pull request #1637 from edgargabriel/pr/netbsd-compilation-problems
fs/lustre and fs/pvfs2: fix netbsd compilation problems
2016-05-06 08:05:36 -05:00
Jeff Squyres
810db734c4 Merge pull request #1640 from jsquyres/pr/mpir-cleanup
debuggers: remove some useless code
2016-05-05 21:23:30 -04:00
Gilles Gouaillardet
6c9d65c0ca coll/libnbc: fix MPI_Ireduce_scatter_block for one task communicator
Thanks Lisandro Dalcin for the report

Fixes open-mpi/ompi#248
2016-05-06 09:43:29 +09:00
Ralph Castain
08022d7af1 Some minor cleanups of warnings from gcc 6.0.0. Update s1/s2 pmix to get max_procs as required. 2016-05-05 15:28:13 -07:00
Jeff Squyres
83c2d04aa3 debuggers: remove some useless code
MPIR-1.0 specifies that the following symbols are only relevant in the
starter process:

- MPIR_Breakpoint
- MPIR_being_debugged
- MPIR_debug_state
- MPIR_debug_abort_string

I.e., the code filling in values in these various symbols was useless
/ never used.

MPIR-1.1 will define that MPIR_being_debugged *is* relevant in MPI
processes.  That symbol is currently defined in libopen-rte (which is
currently causing a duplicate symbol error for static builds -- this
commit fixes that error), and is therefore still available for MPI
processes.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-05-05 14:22:55 -07:00
Jeff Squyres
f167be1c91 ompio: always return valid info from FILE_GET_INFO
MPI-3.1 says that even if no info keys are set on the file, we need to
return a new, empty info.

Thanks to Lisandro Dalcin for identifying the issue.

Fixes open-mpi/ompi#1630

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-05-05 12:03:29 -07:00
Aurélien Bouteiller
4899c89731 Fix a race condition when multiple threads try to create a bml endpoint simultaneously. 2016-05-05 10:49:30 -04:00
Aurélien Bouteiller
627a89bf71 Fix a race condition when multiple threads do the "first send" to an endpoint simultaneously. 2016-05-05 09:04:10 -04:00
Joshua Ladd
4771c9ece6 Merge pull request #1617 from jladd-mlnx/topic/disable-hcoll-barrier-in-finalize-ompi-trunk
HCOLL: fix hang in hcoll barrier called from finalize for MXM/yalla
2016-05-04 10:12:34 -04:00
Aurélien Bouteiller
8344d00418 use-mpi extensions do not have a .la lib, so the fortran module should not depend on them. 2016-05-03 11:54:35 -04:00
Edgar Gabriel
78fa8bb2c4 remove some unused variables that can cause compilation problems on netbsd 2016-05-03 10:25:15 -05:00
Todd Kordenbrock
3498bed650 Merge pull request #1555 from shawone/check_reduce_ret
coll-portals4: check return value from reduce kary tree functions
2016-05-03 10:17:23 -05:00
Jeff Squyres
33dd8ca81e osc_rdma_peer: properly include ompi_config.h
Thanks to Paul Hargrove for reporting.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-05-03 07:39:55 -07:00
Devendar Bureddy
cafd55f18c HCOLL: fix hang in hcoll barrier called from finalize for MXM/yalla
tear down

HCOLL barrier may not complete if HCOLL progress is not called periodically.
which is the case in HCOLL teardown progress in the finalize.
(cherry picked from commit 793244d75dd94d1d5e0243bcccf6d04318750f3f)
2016-05-03 00:49:57 +03:00
Nathan Hjelm
d3d779f6d9 osc/rdma: clear all_sync object when obtaining a lock
This commit fixes a bad synchronization detection bug that occurs when
mixing MPI_Win_fence() and MPI_Win_lock(). If no communication has
occurred in the fence epoch it is safe to just clear the all_sync
object (it was set up by fence).

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-02 15:28:47 -06:00
Jeff Squyres
265e5b9795 Merge pull request #1552 from kmroz/wip-hostname-len-cleanup-1
ompi/opal/orte/oshmem/test: max hostname length cleanup
2016-05-02 09:44:18 -04:00
Ralph Castain
6ac7929bd0 Extend the schizo framework to allow definition of CLI options by environment. Refactor orterun to mesh with the orted_submit code, thus improving code reuse. Eliminate the orte-submit tool as orterun can now meet that need.
Cleanups per @jjhursey review
2016-05-01 11:30:25 -07:00
George Bosilca
6e6ed62a3c Allow NULL arrays for emoty datatypes.
When building an empty datatype (aka. size = 0) because the count of
included datatypes is 0, be less strict on what the arguments are
(allow NULL pointers).
2016-05-01 12:37:02 -04:00
Nathan Hjelm
ec66a6a1f8 Merge pull request #1605 from hjelmn/rdma_fixes
osc/rdma: fix global index array calculation
2016-04-28 20:41:36 -06:00
Nathan Hjelm
7bda3eb2dc osc/rdma: fix global index array calculation
This commit fixes a bug that occurs when ranks are either not mapped
evenly or by something other than core.

Fixes open-mpi/ompi#1599

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-28 19:11:11 -06:00
Nathan Hjelm
1783d94f91 ompi/group: fix sparse group proc reference counting
This commit fixes a bug when sparse groups are in use. Since sparse
group do not actually increment the reference counts of any procs
(they just retain the parent group) it is wrong to decrement the
reference counts of all procs in the group using
ompi_group_decrement_proc_count(). This commit makes the call to
ompi_group_decrement_proc_count() conditional on the group being
dense.

Fixes open-mpi/ompi#1593

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-27 15:55:13 -06:00
Gilles Gouaillardet
01c90d4e71 fortran/mpif-h: fix *_create_keyval_f
correctly handle out parameter *_keyval when OMPI_SIZEOF_FORTRAN_INTEGER > SIZEOF_INT
2016-04-27 13:34:32 +09:00
Gilles Gouaillardet
178dde6a20 fortran/mpif-h: fix MPI_Win_shared_query
correctly handle out parameter disp_unit when OMPI_SIZEOF_FORTRAN_INTEGER > SIZEOF_INT
2016-04-27 11:22:09 +09:00
Gilles Gouaillardet
7f59d2a8c7 fortran/mpif-h: fix MPI_Win_free_keyval
initialize inout parameter when OMPI_SIZEOF_FORTRAN_INTEGER > SIZEOF_INT
2016-04-27 10:46:14 +09:00
Nathan Hjelm
f0f3383006 Merge pull request #1590 from hjelmn/thread_multiple
osc/pt2pt: do not drop/reacquire the ompi_request_lock
2016-04-26 16:48:37 -06:00
Nathan Hjelm
34ff6293bd osc/pt2pt: do not drop/reacquire the ompi_request_lock
This lock is now recursive so it is safe to call into the pml without
dropping the lock.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-26 14:19:38 -06:00
George Bosilca
bf190671e9 Make the request lock recursive.
If during the request completion callback we post another request that
completes right away (such a small send or a match for an unexpected
short message) we will try to complete the second request while holding
the lock for the completion of the first. For performance reasons
(mainly to avoid unlocking and locking the request mutex several times)
we have made the request lock recursive.
2016-04-26 16:16:07 -04:00
Nathan Hjelm
1e4daa2a0e mpi_init: move opal_set_using_threads() earlier in MPI_Init()
There is a potential race condition in MPI_Init() where an orte even
thread could be in a function that uses OPAL_THREAD_LOCK /
OPAL_THREAD_UNLOCK when ompi_mpi_init calls opal_set_using_threads().

Closes open-mpi/ompi#1586

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-26 13:02:42 -06:00
Nathan Hjelm
c16e639b2f Merge pull request #1563 from hjelmn/ompi_coverity
ompi coverity fixes
2016-04-26 09:17:48 -06:00
Jeff Squyres
8ab88f2051 ompi_mpi_finalize: add/update comments
This is a follow-on to open-mpi/ompi@7373111: add some comments
explaining why the code is the way it is.  Also update a previous
comment.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-04-25 13:42:30 -07:00
Ralph Castain
7373111662 Somehow, the logic for finalize got lost, so restore it here. If pmix.fence_nb is available, then call it and cycle opal_progress until complete. If pmix.fence_nb is not available, then do an MPI_Barrier and call pmix.fence.
Needs to go over to 2.x
2016-04-25 08:04:35 -07:00
Karol Mroz
3322347da9 ompi: fixup hostname max length usage
Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-04-25 07:08:23 +02:00
Nathan Hjelm
ae0ffbb67f Merge pull request #1397 from hjelmn/enable_thread_multiple
ompi: always enable MPI_THREAD_MULTIPLE support
2016-04-23 08:40:22 -06:00
Joshua Ladd
0d5a57d9d3 Merge pull request #1558 from vspetrov/hcoll_complex_dtype_support
Adds mapping to hcoll complex data type
2016-04-20 08:35:33 -04:00
Gilles Gouaillardet
490b538ad6 ompi/datatype: fix MPI_LONG_LONG_INT type name
MPI_LONG_LONG_INT is a named predefined datatype, so its name is now MPI_LONG_LONG_INT
MPI_LONG_LONG is a synonym of MPI_LONG_LONG_INT, and its name is also MPI_LONG_LONG_INT
2016-04-20 09:34:20 +09:00
Nathan Hjelm
1ff3d3b16b pml/ob1: fix coverity issue
Fix CID 1357978 (1 of 1): Logically dead code (DEADCODE):

Remove duplicate check for NULL == endpoint.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-19 14:48:13 -06:00
Nathan Hjelm
70533e6d50 fcoll/static: fix coverity issues
Fix CID 72362: Explicit null dereferenced (FORWARD_NULL)

From what I can tell the code @ fcoll_static_file_read_all.c:649
should be setting bytes_per_process[i] to 0 not bytes_per_process.

Fix CID 72361: Explicit null dereferenced (FORWARD_NULL)

Modified check to check for blocklen_per_process non-NULL before
trying to free blocklen_per_process[l]. This is sufficient because
free (NULL) is safe. Also cleaned up the initialization of this an a
couple other arrays. They were allocated with malloc() then
initialized to 0. Changed to used calloc().

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-19 14:48:13 -06:00
Nathan Hjelm
8871bdb2f8 fcoll/two_phase: fix coverity issues
Fix CID 72296: Resource leak (RESOURCE_LEAK):

Changed code to goto exit instead of returning to ensure memory is
freed.

Fix CID 712589: Out-of-bounds read (OVERRUN):

In this loop i and j are identical and always less than
iov_count. The CID was triggered because i was incremented if i was <
iov_count. This meant that if the loop did go on the next iteration
would access an invalid index.

Fix CID 741363: Uninitialized scalar variable (UNINIT):

Allocate tmp_len with calloc to insure every index is initialized.

Fix CID 741364: Uninitialized pointer read (UNINIT):

Allocate recv_types with calloc to ensure all indices are always
initialized. Also added a check to not loop and destroy if recv_types
is NULL.

Also added a NULL check on the allocation of decoded iov. This is not
the cause of CID 126784 but should be fixed.

Fix CID 712588: Out-of-bounds read (OVERRUN):

Similar to CID 712589. Should silence the issue.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-19 14:47:41 -06:00
Valentin Petrov
21f1c572c0 Adds mapping to hcoll complex dte 2016-04-19 14:14:28 +03:00
Nicolas Chevalier
c86d4035d2 coll-portals4: check return value from reduce kary tree functions 2016-04-18 12:02:30 +00:00
Ralph Castain
7829fbdc29 Per request from Jeff, aggregate all help messages during MPI_Init thru MPI_Finalize as long as the RTE is available 2016-04-15 13:37:22 -07:00
KAWASHIMA Takahiro
e854404570 fortran: Change the line order of #pragma
No code change.
These lines were introduced in my recent commit 17d32ac.
I had a editing mistake and the order is different from
other lines/files.
2016-04-15 12:49:03 +09:00
Nathan Hjelm
3245428e82 Merge pull request #1535 from kawashima-fj/pr/osc-pt2pt-header-fix
osc/pt2pt: Fix a struct name typo
2016-04-14 15:55:25 -06:00
Jeff Squyres
4566286b9a Merge pull request #1538 from kawashima-fj/pr/fortran-binding-fix
fortran: Fix many Fortran binding bugs
2016-04-14 17:18:59 -04:00
Nathan Hjelm
330302c4b4 Merge pull request #1534 from kawashima-fj/pr/parallel-rma-fix
osc/pt2pt: Fix tag conflicts on parallel RMA communications
2016-04-14 15:13:32 -06:00
Jeff Squyres
fdf33674b3 Merge pull request #1532 from kmroz/wip-hindexed-cleanup-1
romio,java: cleanup deprecated hindexed call
2016-04-14 17:07:31 -04:00
Jeff Squyres
2374d8fcf7 Merge pull request #1536 from kawashima-fj/pr/inplace-fix
mpi/c, mpi/fortran: Fix `MPI_IN_PLACE`-related bugs
2016-04-14 15:56:55 -04:00
Nathan Hjelm
b4e5b5c09e Merge pull request #1531 from hjelmn/bml
bml: always enable the bml
2016-04-14 10:22:33 -06:00
Nathan Hjelm
1e6b4f2f55 Merge pull request #1495 from hjelmn/new_hooks
Add new patcher memory hooks
2016-04-13 18:19:23 -06:00
Nathan Hjelm
11e2d7886e opal/memory: update component structure
This commit makes it possible to set relative priorities for
components. Before the addition of the patched component there was
only one component that would run on any system but that is no longer
the case. When determining which component to open each component's
query function is called and the one that returns the highest priority
is opened. The default priority of the patcher component is set
slightly higher than the old ptmalloc2/ummunotify component.

This commit fixes a long-standing break in the abstration of the
memory components. ompi_mpi_init.c was referencing the linux malloc
hook initilize function to ensure the hooks are initialized for
libmpi.so. The abstraction break has been fixed by adding a memory
base function that calls the open memory component's malloc hook init
function if it has one. The code is not yet complete but is intended
to support ptmalloc in 2.0.0. In that case the base function will
always call the ptmalloc hook init if exists.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-13 17:14:51 -06:00
KAWASHIMA Takahiro
4944ba7edc datatype: Fix incorrect predefined datatype names and other datatype bugs (#1537)
* datatype: Fix a incorrect datatype name of `MPI_UNSIGNED`

Name of predefined datatype for C `unsigned int` gotten by
`MPI_TYPE_GET_NAME` should be `MPI_UNSIGNED`, not `MPI_UNSIGNED_INT`.

* datatype: Fix incorrect datatype names of `MPI_C_BOOL` and `MPI_CXX_*`

Names of predefined datatypes gotten by `MPI_TYPE_GET_NAME` are:

after this commit (correct) | before this commit (incorrect)
-----------------------------------------------------------
MPI_C_BOOL                    MPI_BOOL
MPI_CXX_BOOL                  MPI_BOOL
MPI_CXX_FLOAT_COMPLEX         MPI_C_FLOAT_COMPLEX
MPI_CXX_DOUBLE_COMPLEX        MPI_C_DOUBLE_COMPLEX
MPI_CXX_LONG_DOUBLE_COMPLEX   MPI_C_LONG_DOUBLE_COMPLEX

* datatype: Fix a incorrect datatype name of `MPI_2DOUBLE_PRECISION`

Name of the predefined datatype for Fortran two `double precision`
gotten by `MPI_TYPE_GET_NAME` should be `MPI_2DOUBLE_PRECISION`,
not `MPI_2DBLPREC`.

This bug was caused by setting the name to `opal_datatype_t::name`
instead of `ompi_datatype_t::name`.

* datatype: Fix `MPI_UNSIGNED_CHAR` internal flag

`MPI_UNSIGNED_CHAR` is an integer type.

* ompi/cxx: Fix C++ `MPI::LONG_DOUBLE_INT` definition

Just a typo fix. Without this fix, `MPI::MAX_LOC` and `MPI::MIN_LOC`
cannot be used with `MPI::LONG_DOUBLE_INT` in C++ programs.

I know the C++ binding is obsolete, but fixing this is harmless.

* Add FUJITSU copyright
2016-04-12 20:17:46 +02:00
KAWASHIMA Takahiro
17d32acbb6 fortran: Add missing (P)MPI_Alloc_mem_cptr_{f,f08} symbols
This commit adds the following symbols

  MPI_Alloc_mem_cptr_f
  MPI_Alloc_mem_cptr_f08
  PMPI_Alloc_mem_cptr_f
  PMPI_Alloc_mem_cptr_f08

These are implemented in the same way as other `_cptr` routines.
2016-04-12 22:40:58 +09:00
KAWASHIMA Takahiro
d48c8525ed fortran: Fix incorrect weak symbol names 2016-04-12 22:16:32 +09:00
KAWASHIMA Takahiro
5d32a601ff fortran: Add missing interfaces (part 2) 2016-04-12 22:06:35 +09:00
KAWASHIMA Takahiro
6f09d53e34 fortran: Add missing interfaces 2016-04-12 21:44:33 +09:00
KAWASHIMA Takahiro
f3b9a49ad1 fortran: Add missing PMPI interfaces 2016-04-12 20:55:41 +09:00
KAWASHIMA Takahiro
b6cb0bc257 fortran: Fix an incorrect interface name 2016-04-12 20:48:08 +09:00
KAWASHIMA Takahiro
96e93a9c5f fortran: Sort declared subroutins in alphabetical order
And insert necessary empty lines and remove unnecessary empty lines.

No code change.
2016-04-12 20:36:46 +09:00
KAWASHIMA Takahiro
334c63cf0a fortran: Change subroutine declaration order
Same order for `comm`, `type`, and `win`.

No code change.
2016-04-12 20:10:15 +09:00
KAWASHIMA Takahiro
10c11ff5b5 fortran: Add missing MPI_DUP_FN subroutine
Though the `MPI_DUP_FN` subroutine is depricated, it is not yet removed
as of MPI-3.1.
2016-04-12 20:06:50 +09:00
KAWASHIMA Takahiro
35ea9e5c3c Add FUJITSU copyright 2016-04-12 13:47:53 +09:00
KAWASHIMA Takahiro
39bcbe439a osc/pt2pt: Fix a struct name typo
Fortunately the sizes of `ompi_osc_pt2pt_header_put_t` and
`ompi_osc_pt2pt_header_get_t` are same. So this doesn't affect
the behavior.
2016-04-11 20:55:22 +09:00
KAWASHIMA Takahiro
d3d6386578 mpi/forran: Support MPI_IN_PLACE on (I)ALLTOALLW and (I)EXSCAN
`MPI_IN_PLACE` support for `MPI_ALLTOALLW` and `MPI_EXSCAN` was
added in MPI-2.2 but it was missed in OMPI Fortran binding code.
2016-04-11 20:38:28 +09:00
KAWASHIMA Takahiro
28a0577364 osc/pt2pt: Insert breaks in long lines 2016-04-11 19:06:01 +09:00
KAWASHIMA Takahiro
5ac95df9dc osc/pt2pt: use two distinct "namespaces" for tags - revised
Before this commit, a same PML tag may be used for distinct
communications for long messages. For example, consider a condition
where rank A calls ```MPI_PUT``` targeting rank B and rank B calls
```MPI_GET``` targeting rank A simultaneously.
A PML tag for the ```MPI_PUT``` is acquired on rank A and is used
for the long-message communication from rank A to rank B.
A PML tag for the ```MPI_GET``` is acquired on rank B and is used
for the long-message communication from rank A to rank B.
These two tags may become a same value because they are managed
independently on each rank. This will cause a data corruption.

This commit separates the tag used in a single RMA communication
call, one for communication from an origin to a target, and one
for communication from a target to an origin. A "base" tag
is acquired using ```get_tag``` function and PML tag is caluculated
from the base tag by ```tag_to_target``` and ```tag_to_origin```
function.
2016-04-11 19:05:20 +09:00
KAWASHIMA Takahiro
3576ecafa7 Revert "osc/pt2pt: use two distinct "namespaces" for tags"
This reverts commit 06ecdb6aa7
to reimplement the fix completely.
2016-04-11 19:04:11 +09:00
KAWASHIMA Takahiro
eb5c31521b mpi/c: Fix MPI_IALLTOALLW memchecker 2016-04-11 18:47:30 +09:00
KAWASHIMA Takahiro
1ced7f213c mpi/c: Fix IALLTOALL{V|W} + MPI_IN_PLACE param check
`sendcounts`, `sdispls`, and `sendtype(s)` must be ignored
if `MPI_IN_PLACE` is specified for `sendbuf`.
This commit makes the param check code same as the blocking
`ALLTOALL{V|W}` function.
2016-04-11 18:34:11 +09:00
Karol Mroz
f8ecdbd623 java: replace deprecated hindexed call
Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-04-10 19:56:22 +02:00
Karol Mroz
5c54184986 romio: replace deprecated hindexed call
Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-04-10 19:56:22 +02:00
Nathan Hjelm
c6b19818be bml: always enable the bml
This commit ensures the bml is always enabled whether or not it will
be used. This ensures that any available btls communicate their modex
so that they can be used for one-sided communication.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-08 21:14:17 -06:00
George Bosilca
896f857fc4 Thanks @hjelmn for catching up the typo. 2016-04-07 13:56:26 -04:00
Thananon Patinyasakdikul
92290b94e0 Fixed Coverity reports 1358014-1358018 (DEADCODE and CHECK_RETURN) 2016-04-07 12:52:17 -04:00
Ryan Grant
7cdf50533c Merge pull request #1314 from francois-wellenreiter/osc_disable_portals4_evt_send
OSC portals4 : do not generate an EVENT_SEND to avoid to filter it
2016-04-07 10:04:27 -06:00
Gilles Gouaillardet
7b803ac557 MPI_Unpack: fix error code when insize <= 0
this fixes a regression from open-mpi/ompi@f2e33c725f
2016-04-06 09:47:21 +09:00
Karol Mroz
a468c3ba1a opal_info_support: pass component map when handling params
Pass component_map to opal_info_do_params(). It will be needed to output
component versions.

Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-04-02 21:17:44 +02:00
Gilles Gouaillardet
f2e33c725f MPI_Unpack: fix return status
this regression was previously introduced in open-mpi/ompi@221e6e2eab
2016-03-31 09:56:54 +09:00
Gilles Gouaillardet
5932287cef datatype/[un]pack_external[_size]: move subroutines down to ompi/datatype
so it can be directly used by test/datatype/external32
2016-03-30 13:01:33 +09:00
Gilles Gouaillardet
221e6e2eab Add the datatype checks to the pack/unpack functions.
The datatype must satisfy the same constraints as for the
corresponding communication function (send for pack and
recv for unpack).
2016-03-30 11:40:08 +09:00
Gilles Gouaillardet
a89f113507 mpi/c: add missing OPAL_CR_EXIT_LIBRARY() in [un]pack[_external] 2016-03-30 11:25:21 +09:00
George Bosilca
004c0cc05b Fix issues identified by @derbeyn. 2016-03-29 15:50:32 -04:00
Jeff Squyres
91c54d7a07 Merge pull request #1491 from ICLDisco/progress_thread
BTL TCP async progress
2016-03-29 06:26:10 -04:00
George Bosilca
f69eba1bc4 Update the copyright and cleanup the code.
Per @jsquyres suggestion remove all trailing spaces.
Credit to `sed -i.bak 's/ *$//' */[ch]`.
2016-03-28 14:41:01 -04:00
Thananon Patinyasakdikul
92062492b9 Enable Threading in the BTL TCP
Added mca parameter to turn progress thread on/off
Add a flag to check if we have btl progress thread.
Added macro for ob1 matching lock.
Update the AUTHORS file.
2016-03-28 14:41:01 -04:00
Nathan Hjelm
9d5eeecb8a pml/ob1: detect unreachable errors
This commit adds code to detect when procs are unreachable when using
the dynamic add_procs functionality.

Fixes #1501

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-28 10:52:40 -06:00
Gilles Gouaillardet
1baed498b6 win: silence a warning in alloc_window(...) 2016-03-28 14:57:31 +09:00
Nathan Hjelm
d6e90f24b1 Merge pull request #1483 from hjelmn/flag_enum_2
RFC: Add support for flag enumerators for MCA variables
2016-03-26 11:43:33 -06:00
George Bosilca
57eadb0dd6 Fix for Coverity CID 1357152.
Or at least that was the origin of the issue. It turns out
we were freeing the wrong buffer (but as it only happen in the
case of an error we never noticed).
2016-03-24 00:53:30 -04:00
George Bosilca
4b38b6bd0c Fix multiple issues with the collective requests.
This patch addresses most (if not all) @derbeyn concerns
expressed on #1015. I added checks for the requests allocation
in all functions, ompi_coll_base_free_reqs is called with the
right number of requests, I removed the unnecessary basic_module_comm_t
and use the base_module_comm_t instead, I remove all uses of the
COLL_BASE_BCAST_USE_BLOCKING define, and other minor fixes.
2016-03-23 18:35:41 -04:00
Nathan Hjelm
a1420003b6 ompi/comm: clean up includes in comm_request.h
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-22 09:17:38 -06:00
Nathan Hjelm
b15a45088c mca: add support for flag enumerators
This commit adds a new type of enumerator meant to support flag
values. The enumerator parses comma-delimited strings and matches
each string or value to a list of valid flags. Additionally, the
enumerator does some basic checks to see if 1) a flag is valid in the
enumerator, and 2) if any conflicting flags are specified.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-21 15:20:56 -06:00
Todd Kordenbrock
2122a15217 Merge pull request #1443 from francois-wellenreiter/fix_trig_rndv
MTL portals4 : fix around triggered rndv operations
2016-03-21 08:16:33 -05:00
Ralph Castain
c146c4969b Revert part of open-mpi/ompi@c1bbbb5e2f to restore the usock component, thus fixing show_help aggregation.
Fixes #1467

Restore debugger attach operations

Fixes #1225
2016-03-18 21:49:04 -07:00
Nathan Hjelm
075dfa4121 topo/treematch: fix component coverity issues
Fix CID 1315298: Resource leak (RESOURCE_LEAK) :
Fix CID 1315300: Resource leak (RESOURCE_LEAK):
Fix CID 1315299: Resource leak (RESOURCE_LEAK):
Fix CID 1315297 (#1 of 1): Resource leak (RESOURCE_LEAK):

Confirmed leaks in error paths. Added the leaked arrays to the
ERR_EXIT macro to ensure they are freed.

Fix CID 1315296 (#1 of 1): Resource leak (RESOURCE_LEAK):

Confirmed leak in error paths. Both the oversub and reqs arrays are
leaked. Free these arrays on error.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 11:31:11 -06:00
Nathan Hjelm
3540b65f7d bcol: fix coverity issues
Fix CID 1269976 (#1 of 1): Unused value (UNUSED_VALUE):
Fix CID 1269979 (#1 of 1): Unused value (UNUSED_VALUE):

Removed unused variables k_temp1 and k_temp2.

Fix CID 1269981 (#1 of 1): Unused value (UNUSED_VALUE):
Fix CID 1269974 (#1 of 1): Unused value (UNUSED_VALUE):

Removed gotos and use the matched flags to decide whether to return.

Fix CID 715755 (#1 of 1): Dereference null return value (NULL_RETURNS):

This was also a leak. The items on cs->ctl_structures are allocated using OBJ_NEW so they mist be released using OBJ_RELEASE not OBJ_DESTRUCT. Replaced the loop with OPAL_LIST_DESTRUCT().

Fix CID 715776 (#1 of 1): Dereference before null check (REVERSE_INULL):

Rework error path to remove REVERSE_INULL. Also added a free to an error path where it was missing.

Fix CID 1196603 (#1 of 1): Bad bit shift operation (BAD_SHIFT):
Fix CID 1196601 (#1 of 1): Bad bit shift operation (BAD_SHIFT):

Both of these are false positives but it is still worthwhile to fix so they no longer appear. The loop conditional has been updated to use radix_mask_pow instead of radix_mask to quiet these issues.

Fix CID 1269804 (#1 of 1): Argument cannot be negative (NEGATIVE_RETURNS):

In general close (-1) is safe but coverity doesn’t like it. Reworked the error path for open to not try to close (-1).

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 10:59:46 -06:00
Nathan Hjelm
c8b077f232 coll/ml: fix coverity issues
Fix CID 715744 (#1 of 1): Logically dead code (DEADCODE):
Fix CID 715745 (#1 of 1): Logically dead code (DEADCODE):

The free of scratch_num in either place is defensive programming. Instead of removing the free the conditional around the free has been removed to quiet the warning.

Fix CID 715753 (#1 of 1): Dereference after null check (FORWARD_NULL):
Fix CID 715778 (#1 of 1): Dereference before null check (REVERSE_INULL):

Fixed the conditional to check for collective_alg != NULL instead of collective_alg->functions != NULL.

Fix CID 715749 (#1 of 4): Explicit null dereferenced (FORWARD_NULL):

Updated code to ensure that none of the parse functions are reached with a non-NULL value.

Fix CID 715746 (#1 of 1): Logically dead code (DEADCODE):

Removed dead code.

Fix CID 715768 (#1 of 1): Resource leak (RESOURCE_LEAK):
Fix CID 715769 (#2 of 2): Resource leak (RESOURCE_LEAK):
Fix CID 715772 (#1 of 1): Resource leak (RESOURCE_LEAK):

Move free calls to before error checks to cleanup leak in error paths.

Fix CID 741334 (#1 of 1): Explicit null dereferenced (FORWARD_NULL):

Added a check to ensure temp is not dereferenced if it is NULL.

Fix CID 1196605 (#1 of 1): Bad bit shift operation (BAD_SHIFT):

Fixed overflow in calculation by replacing int mask with 1ul.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 10:11:16 -06:00
Nathan Hjelm
2f4e5325aa coll/base: fix coverity issues
Fix CID 1325868 (#1 of 1): Dereference after null check (FORWARD_NULL):
Fix CID 1325869 (#1-2 of 2): Dereference after null check (FORWARD_NULL):

Here reqs can indeed be NULL. Added a check to
ompi_coll_base_free_reqs to prevent dereferencing NULL pointer.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 09:31:43 -06:00
Nathan Hjelm
2ed4501490 osc: fix coverity issues
Fix CID 1324726 (#1 of 1): Free of address-of expression (BAD_FREE):

Indeed, if a lock conflicts with the lock_all we will end up trying to
free an invalid pointer.

Fix CID 1328826 (#1 of 1): Dereference after null check (FORWARD_NULL):

This was intentional but it would be a good idea to check for
module->comm being non_NULL to be safe. Also cleaned out some checks
for NULL before free().

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 09:11:48 -06:00
Nathan Hjelm
b9d100929b man: fix typo in MPI_Win_allocate_shared
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-16 14:47:40 -06:00
Nathan Hjelm
ec9712050b Merge pull request #1118 from hjelmn/mpool_rewrite
mpool/rcache rewrite
2016-03-15 10:46:24 -06:00
Nathan Hjelm
deae9e52bf Merge pull request #1259 from kawashima-fj/pr/osc-sm-align
osc/sm: Fix a bus error on MPI_WIN_{POST,START}.
2016-03-15 09:13:38 -06:00
Francois WELLENREITER
2bc432d95f MTL portals4 : fix around triggered rndv operations 2016-03-15 15:31:04 +01:00
Nathan Hjelm
d4afb16f5a opal: rework mpool and rcache frameworks
This commit rewrites both the mpool and rcache frameworks. Summary of
changes:

 - Before this change a significant portion of the rcache
   functionality lived in mpool components. This meant that it was
   impossible to add a new memory pool to use with rdma networks
   (ugni, openib, etc) without duplicating the functionality of an
   existing mpool component. All the registration functionality has
   been removed from the mpool and placed in the rcache framework.

 - All registration cache mpools components (udreg, grdma, gpusm,
   rgpusm) have been changed to rcache components. rcaches are
   allocated and released in the same way mpool components were.

 - It is now valid to pass NULL as the resources argument when
   creating an rcache. At this time the gpusm and rgpusm components
   support this. All other rcache components require non-NULL
   resources.

 - A new mpool component has been added: hugepage. This component
   supports huge page allocations on linux.

 - Memory pools are now allocated using "hints". Each mpool component
   is queried with the hints and returns a priority. The current hints
   supported are NULL (uses posix_memalign/malloc), page_size=x (huge
   page mpool), and mpool=x.

 - The sm mpool has been moved to common/sm. This reflects that the sm
   mpool is specialized and not meant for any general
   allocations. This mpool may be moved back into the mpool framework
   if there is any objection.

 - The opal_free_list_init arguments have been updated. The unused0
   argument is not used to pass in the registration cache module. The
   mpool registration flags are now rcache registration flags.

 - All components have been updated to make use of the new framework
   interfaces.

As this commit makes significant changes to both the mpool and rcache
frameworks both versions have been bumped to 3.0.0.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-14 10:50:41 -06:00
Gilles Gouaillardet
eb690432e8 fortran: add missing constants for MPI_WIN_CREATE_FLAVOR and MPI_WIN_MODEL
also add valid values used by MPI_WIN_CREATE_FLAVOR :
- MPI_WIN_FLAVOR_CREATE
- MPI_WIN_FLAVOR_ALLOCATE
- MPI_WIN_FLAVOR_DYNAMIC
- MPI_WIN_FLAVOR_SHARED
2016-03-14 10:19:21 +09:00
Nathan Hjelm
60a3eb12ac comm_spawn_multiple_f: fix coverity issue
Fix CID 1327338: Resource leak (RESOURCE_LEAK):

Confirmed that the c_info array was being leaked. Free the array
before returning.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-11 13:07:01 -07:00
Gilles Gouaillardet
fbed6df4a3 coll/base: fix a typo
typo was introduced in open-mpi/ompi@c98e97a46e
2016-03-11 14:18:03 +09:00
Gilles Gouaillardet
0da1374f22 man: fix typo in MPI_File related man pages 2016-03-11 14:16:21 +09:00
Gilles Gouaillardet
d08fb46ec7 ompi/win: use type int* for MPI_WIN_DISP_UNIT, MPI_WIN_CREATE_FLAVOR and MPI_WIN_MODEL
Thanks Alastair McKinstry for the report.
2016-03-11 09:22:25 +09:00
Aurélien Bouteiller
c98e97a46e Do not return MPI_ERR_PENDING from collectives. 2016-03-09 16:13:34 -05:00
Joshua Ladd
4dffae2f88 Fixing MXM Yalla and MTL add procs behavior. MXM cannot support dynamic add procs, so propaget this info to the MTL and PML layers. 2016-03-08 01:46:24 +02:00
Aurélien Bouteiller
892e1ed57e Fix a potential race condition in which a progress matching thread could match a request while we are cancelling it. 2016-03-01 16:43:45 -05:00
Gilles Gouaillardet
8aff67c399 topo/base: correctly support MPI_UNWEIGHTED in mca_topo_base_dist_graph_neighbors()
Thanks Jun Kudo for the bug report.
2016-03-01 10:28:28 +09:00
Jeff Squyres
89d0a033b7 cxx: "rank" is now a function in C++11
Use "myrank" instead (I tried using ::rank, but had varied
success... so I just renamed the variable).
2016-02-25 15:56:08 -06:00
George Bosilca
dbe93b0b19 Use mca_bml_base_get_endpoint
Correctly use mca_bml_base_get_endpoint instead of accessing the
endpoint directly.
2016-02-25 11:00:30 -06:00
Sylvain Jeaugey
5f32f49eb8 pml/ob1: Fix segmentation fault on CUDA path.
Fix segfault due to mca_pml_ob1_cuda_need_buffers not handling the case of the
endpoint not being there. Calling mca_bml_get_endpoint() seems to fix the problem.

Fixes open-mpi/ompi#1402
2016-02-24 21:32:25 -08:00
Gilles Gouaillardet
d8482ce6f4 opal/mca/memory: add a memoryc_set_alignment subroutine to the OPAL memory MCA
this commit also (partially) reverts :
 - open-mpi/ompi@7de01b347c
 - open-mpi/ompi@8b05f308f9
2016-02-24 09:50:12 +09:00
Nathan Hjelm
230d04327e ompi: always enable MPI_THREAD_MULTIPLE support
This commit removes the --with-mpi-thread-multiple option and forces
MPI_THREAD_MULTIPLE support. This cleans up an abstration violation
in opal where OMPI_ENABLE_THREAD_MULTIPLE determines whether the
opal_using_threads is meaningful. To reduce the performance hit on
MPI_THREAD_SINGLE programs an OPAL_UNLIKELY is used for the
check on opal_using_threads in OPAL_THREAD_* macros.

This commit does not clean up the arguments to the various functions
that take whether muti-threading support is enabled. That should be
done at a later time.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-02-23 10:02:14 -07:00
Edgar Gabriel
45003ef78d fix the data size counter for large ops for the static fcoll component 2016-02-23 08:33:50 -06:00
Gilles Gouaillardet
308bbcbad1 ompi/dpm: retrieves OPAL_PMIX_ARCH in heterogeneous mode
also remove code duplication by using ompi_proc_complete_init_single()

Thanks Siegmar Gross for reporting this issue, and Ralph for the guidance.
2016-02-22 11:01:06 +09:00
Gilles Gouaillardet
a4aa4c9571 ompi_proc_complete_init_single: make the subroutine public
and accept a proc from a different job
2016-02-22 11:01:06 +09:00
yohann
59b6d041f8 mtl/ofi: Check allocated pointer. 2016-02-19 16:59:47 -08:00
yohann
bd47062764 mtl/ofi: Fix error handling. 2016-02-19 16:58:41 -08:00
yohann
404987e9b3 mtl/ofi: Fix mismatching types. 2016-02-19 16:57:26 -08:00
yohann
3ad59435ce mtl/ofi: Prevent possible memory leak. 2016-02-19 16:57:02 -08:00
Edgar Gabriel
92d1b99468 optimize the shuffle step:
1. use communicator collectives if possible for performance reasons
 2. combined multiple allgathers into a single one
2016-02-19 11:04:04 -06:00
Edgar Gabriel
e63836c653 clean up the mca parameter handling of the component. Add new parameters for number of sub groups and write chunk size. This will allow to perform a systematic parameter study. 2016-02-19 10:15:28 -06:00
Edgar Gabriel
4f400314e0 add the dynamic_gen2 component into the fcoll selection table. 2016-02-19 09:32:54 -06:00
Edgar Gabriel
268d525053 change the tag to be a positive value. handle 0-byte situations correctly. 2016-02-19 08:28:50 -06:00
Edgar Gabriel
ad79012059 first cut on the version which overlaps the communication/computation of 2 iterations. 2016-02-19 08:28:50 -06:00
Jeff Squyres
7b73c868d5 memchecker.h: fix memchecker no-data case
Thanks to @clintonstimpson for reporting the issue.

Fixes open-mpi/ompi#100

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-02-18 10:48:11 -08:00
Ralph Castain
60a7bc2e50 Enable the PMIx notification callback system. This currently is only supported by the pmix120 component, which is not selected by default. All other components will ignore error registration requests, and thus do not support debugger attach when launched via mpirun. Note that direct launched applications will support such attachment, but may not do so in a scalable fashion.
Fixes ##1225
2016-02-18 09:29:12 -08:00
yohann
7fe395c82a mtl/ofi: cleanup 2016-02-16 09:57:57 -08:00
yohann
22eddfee10 mtl/ofi: update copyright dates. 2016-02-16 09:56:09 -08:00
Gilles Gouaillardet
7de01b347c ompi/init: fix abstraction violation
This fixes open-mpi/ompi@8b05f308f9

libmpi.so cannot be built (unresolved symbols) with configure'd with
--disable-mem-debug --disable-mem-profile --disable-memchecker --without-memory-manager
2016-02-16 16:39:21 +09:00
igor-ivanov
d9eefefa74 Merge pull request #1351 from igor-ivanov/pr/issue-1336
opal/memory: Move Memory Allocation Hooks usage from openib
2016-02-15 14:07:36 +04:00
George Bosilca
56425a5d48 Fix issue identified by Lisandro Dalcin regarding the lack
of support for NULL value in MPI_Type_set_attr.
Provides a fix for issue #1359.
2016-02-14 00:07:08 -05:00
George Bosilca
68c36ea9dc Fix two annoying warnings in our UCX support. 2016-02-14 00:02:16 -05:00
Jeff Squyres
7bc62e8f4c Merge pull request #1356 from hjelmn/get_address
Fix MPI_Get_address (MPI_BOTTOM, ...)
2016-02-13 08:27:18 -05:00