Commit graph

9090 commits

Author SHA1 Message Date
Matias A Cabral
29ab28f4f6 Adding owner.txt file for PSM2 MTL. 2016-06-02 16:26:16 -07:00
Joshua Hursey
a776d78f2d ompi/op: Provide a default value for type/flags
* User defined ops leave the op_type unset which can confuse logic
   in a collective component that is trying to convert the op to the
   appropriate local function.
2016-06-02 13:59:04 -05:00
George Bosilca
d577e12dd0 Fix comment. 2016-06-03 00:57:31 +09:00
George Bosilca
fc5d458249 Consistency in handling OPAL_ENABLE_FT_CR.
I am not sure if we should continue to maintain the request support
for FT_CR, but I tried here to simplify the code while maintaining
the same meaning.
2016-06-03 00:54:24 +09:00
Nathan Hjelm
b001184e63 request: fix warnings (#1742)
Fix warnings introduced by request rework.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-06-02 04:53:16 -04:00
George Bosilca
bfcf145613 Refactor the request test and wait functions. 2016-06-02 11:58:25 +09:00
George Bosilca
2e1b1d34c6 Safety first ! 2016-06-02 11:52:43 +09:00
George Bosilca
50cec456fb ompi_request_complete with signal
Rewrite the ompi_request_complete function to take into account the
with_signal argument. Change the comment to explain the expected
behavior.
Alter all the ompi_request_complete uses to make sure the status of the
request is set before calling ompi_request_complete.

bot:label:enhancement
2016-06-02 11:49:12 +09:00
George Bosilca
223d75595d Give a boost to MPI_Barrier.
Based on current implementation it is faster to use a blocking
send than the non-blocking version. Switch the exchange function
used in the barrier to use the blocking version combined with
the non-blocking version of the receive.
2016-06-02 11:45:25 +09:00
Ralph Castain
2c086e56be Add an experimental ability to skip the RTE barriers at the end of MPI_Init and the beginning of MPI_Finalize 2016-06-01 17:01:15 -07:00
Nathan Hjelm
086ffc1838 pml/ob1: fix race on pml completion of send requests
The request code was setting the request as pml_complete before
calling MCA_PML_OB1_SEND_REQUEST_MPI_COMPLETE. This was causing
MCA_PML_OB1_SEND_REQUEST_RETURN to be called twice in some cases. The
code now mirrors the recvreq code and only sets the request as pml
complete if the request has not already been freed.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-06-01 13:36:06 -06:00
Gilles Gouaillardet
5f565dfec3 configury: clean the flex generated .c files 2016-06-01 11:13:31 +09:00
Gilles Gouaillardet
1bbc5fadee ompi/win: silence another warning 2016-05-31 13:18:39 +09:00
Gilles Gouaillardet
c41321b9e5 ompi/win: silence warning 2016-05-31 13:03:20 +09:00
Jeff Squyres
59f4a765b3 Merge pull request #1656 from hpcraink/pr/make_manpage
In case, we do not build Fortran, Fortran 2008 or CXX, the regexp in …
2016-05-28 11:02:12 -04:00
Nathan Hjelm
d8fd3a411a Merge pull request #1725 from hjelmn/request_fixes
ompi/request: fix bugs in MPI_Wait_some and MPI_Wait_any
2016-05-27 13:47:49 -06:00
Nathan Hjelm
0591139f49 ompi/request: fix bugs in MPI_Wait_some and MPI_Wait_any
This commit fixes two bugs in MPI_Wait_any:

 - If all requests are inactive then the sync wait would hang forever
   because no requests are attached to the sync.

 - The request pointer was pointing to the request before the completed
   request which caused the wrong request to be freed or marked inactive.

MPI_Wait_some had a similar issue if all the requests were pending.

These issues were identified by MTT.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-27 12:36:10 -06:00
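The first MPI_Wait_any bug described in commit 0591139f49 (a hang when every request is inactive) comes down to checking for pending requests before blocking. The sketch below illustrates only that check. The names `req_state`, `pick_wait_action`, and both sentinel values are hypothetical and are not taken from the Open MPI sources.

```c
#include <assert.h>
#include <stddef.h>

enum req_state { REQ_INACTIVE, REQ_PENDING, REQ_COMPLETE };

#define WAIT_UNDEFINED (-1)  /* nothing to wait for: return immediately */
#define WAIT_BLOCK     (-2)  /* at least one pending request: caller must block */

/* Returns the index of the first completed request, WAIT_UNDEFINED when
 * no request is pending or complete, or WAIT_BLOCK when the caller has
 * to wait for a pending request. */
static int pick_wait_action(const enum req_state *reqs, size_t count)
{
    size_t pending = 0;

    assert(reqs != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (reqs[i] == REQ_COMPLETE)
            return (int) i;      /* index of the completed request itself */
        if (reqs[i] == REQ_PENDING)
            ++pending;
    }
    return pending == 0 ? WAIT_UNDEFINED : WAIT_BLOCK;
}
```

Returning early when `pending` is zero is what avoids waiting on a sync object that no request will ever signal.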
Nathan Hjelm
0adfb328e1 win: fix warnings
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-05-27 10:14:02 -06:00
Thananon Patinyasakdikul
60d0fbf683 Removal of ompi_request_lock from pml/ucx. 2016-05-26 12:36:58 -04:00
George Bosilca
90f294096e Remove more references to the request mutex.
Regarding BFO it should be mentioned that this component is currently
unmaintained, and that despite my efforts I could not make it compile
(it would not compile before this patch either).
2016-05-25 23:27:06 -04:00
Nathan Hjelm
9d439664f0 pml/yalla: update for request changes
This commit brings the pml/yalla component up to date with the request
rework changes.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-25 15:42:53 -06:00
Nathan Hjelm
8445c885ce pml/cm: update for request changes
This fixes a hang caused by the request refactor work. The cm pml was
not updated and was hanging in most cases.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-25 15:35:32 -06:00
Nathan Hjelm
ef11ba9394 request: fix compilation error
The request.h header is unfortunately included in files in the C++
bindings. C++ does not allow assigning from void * to another
pointer without a cast. This commit adds the cast. We can clean this
up when the C++ bindings are deleted.

Fixes open-mpi/ompi#1707

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-25 09:52:23 -06:00
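The `void *` rule mentioned in commit ef11ba9394 is easy to reproduce. In the sketch below, `wait_sync` and `sync_from_opaque` are hypothetical names used only for illustration. The explicit cast is optional in C but required when the same header is compiled as C++.

```c
#include <assert.h>
#include <stddef.h>

struct wait_sync { int count; };

static struct wait_sync *sync_from_opaque(void *ptr)
{
    assert(ptr != NULL);
    /* C accepts "return ptr;" here, but a C++ compiler rejects the implicit
     * conversion from void *, so the cast keeps the header usable from both. */
    return (struct wait_sync *) ptr;
}
```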
Valentin Petrov
5ff6372886 coll/hcoll: bugfix: initialize req_type field
If left uninitialized then segfault is possible in MPI_Waitall in
    the case the field by chance equals OMPI_REQUEST_GEN.
2016-05-25 15:38:01 +03:00
George Bosilca
2b868c4952 Fix MPI datatype args.
Compensate for the datatype ID that we add to the array.
2016-05-24 23:36:54 -04:00
bosilca
b90c83840f Refactor the request completion (#1422)
* Remodel the request.
Added the wait sync primitive and integrate it into the PML and MTL
infrastructure. The multi-threaded requests are now significantly
less heavy and less noisy (only the threads associated with completed
requests are signaled).

* Fix the condition to release the request.
2016-05-24 18:20:51 -05:00
Nathan Hjelm
5126da5377 win: add support for accumulate_ordering info key
This commit adds support for the MPI-3.1 accumulate_ordering info
key. The default value is rar,war,raw,waw and is supported using an
MCA variable flag enumerator.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-24 11:13:30 -06:00
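A value such as rar,war,raw,waw maps naturally onto a set of flags. The sketch below shows one way to parse such a string into a bitmask. It is not the MCA flag enumerator the commit refers to; the `ORDER_*` constants and `parse_accumulate_ordering` are hypothetical names, and accepting "none" as an empty set is an assumption based on the MPI-3.1 description of this key.

```c
#include <assert.h>
#include <string.h>

enum {
    ORDER_RAR = 1 << 0,   /* read-after-read   */
    ORDER_WAR = 1 << 1,   /* write-after-read  */
    ORDER_RAW = 1 << 2,   /* read-after-write  */
    ORDER_WAW = 1 << 3    /* write-after-write */
};

/* Parse a comma-separated list such as "rar,war" into a bitmask.
 * Returns -1 on an unknown or empty token; "none" maps to 0. */
static int parse_accumulate_ordering(const char *value)
{
    static const struct { const char *name; int flag; } table[] = {
        { "rar", ORDER_RAR }, { "war", ORDER_WAR },
        { "raw", ORDER_RAW }, { "waw", ORDER_WAW },
    };
    int flags = 0;

    assert(value != NULL);
    if (strcmp(value, "none") == 0)
        return 0;

    while (*value != '\0') {
        size_t len = strcspn(value, ",");   /* length of the current token */
        int matched = 0;

        for (size_t i = 0; i < sizeof table / sizeof table[0]; ++i) {
            if (strlen(table[i].name) == len &&
                strncmp(value, table[i].name, len) == 0) {
                flags |= table[i].flag;
                matched = 1;
                break;
            }
        }
        if (!matched)
            return -1;
        value += len;
        if (*value == ',')
            ++value;                        /* skip the separator */
    }
    return flags;
}
```

With these definitions the default string from the commit message sets all four bits.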
Jeff Squyres
e7d46b96a3 Merge pull request #1680 from yburette/topic/fix_provider_selection
mtl/ofi: Change default provider selection behavior.
2016-05-23 15:06:02 -04:00
Francois WELLENREITER
b2b0fc63e2 MTL portals4 : remove the triggered rendez-vous protocol 2016-05-23 15:50:00 +02:00
Gilles Gouaillardet
bca44592af Merge pull request #1643 from ggouaillardet/topic/romio_openbsd57
io/romio: fix filesystem type check on OpenBSD
2016-05-23 16:33:56 +09:00
George Bosilca
16d9f71d01 Correctly compute the space needed for the args.
Add checks to bail out if our precomputed value is less
than needed (we are already at fault).

bot:milestone:v1.10.3
bot:milestone:v2.0
bot:label:bug
bot:assign: @ggouaillardet
2016-05-21 16:01:16 -04:00
George Bosilca
0641005dab Only check the parameters on valid dimensions. 2016-05-21 15:54:04 -04:00
George Bosilca
6aac0d9c22 Remove useless output stream. 2016-05-21 15:54:04 -04:00
Nathan Hjelm
31bfeede82 bml/r2: always add btl progress function
This commit changes the behavior of bml/r2 from conditionally
registering btl progress functions to always registering progress
functions. Any progress function belonging to a btl that is not yet in
use is registered as low-priority. As soon as a proc is added that
will make use of the btl it is re-registered normally.

This works around an issue with some btls. In order to progress a
first message from an unknown peer both ugni and openib need to have
their progress functions called. If either btl is not in use after the
first call to add_procs the callback was never happening. This commit
ensures the btl progress function is called at some point but the
number of progress callbacks is reduced from normal to ensure lower
overhead when a btl is not used. The current ratio is 1 low priority
progress callback for every 8 calls to opal_progress().

Fixes open-mpi/ompi#1676

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-21 15:54:04 -04:00
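The 1-in-8 ratio described in commit 31bfeede82 can be pictured with a small dispatcher. This is only a sketch of the idea and not the opal_progress() implementation: `progress_state`, `run_progress`, the fixed-size arrays, and the two counting callbacks are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define LOW_PRIORITY_RATIO 8   /* ratio quoted in the commit message */
#define MAX_CALLBACKS      4

typedef int (*progress_cb)(void);   /* returns the number of events handled */

struct progress_state {
    progress_cb normal[MAX_CALLBACKS];
    size_t      normal_count;
    progress_cb low[MAX_CALLBACKS];
    size_t      low_count;
    unsigned    calls;              /* run_progress() invocations so far */
};

/* Demo callbacks that only count how often they were invoked. */
static int normal_hits, low_hits;
static int count_normal(void) { ++normal_hits; return 0; }
static int count_low(void)    { ++low_hits;    return 0; }

/* Run every normal callback on each call; run the low-priority
 * callbacks only on every LOW_PRIORITY_RATIO-th call. */
static int run_progress(struct progress_state *s)
{
    int events = 0;

    assert(s != NULL);
    for (size_t i = 0; i < s->normal_count; ++i)
        events += s->normal[i]();

    if (++s->calls % LOW_PRIORITY_RATIO == 0) {
        for (size_t i = 0; i < s->low_count; ++i)
            events += s->low[i]();
    }
    return events;
}
```

Re-registering a btl "normally" would correspond to moving its callback from the `low` array to the `normal` array once a proc starts using it.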
yohann
2f0cde791a mtl/ofi: Change default provider selection behavior.
As more providers get added to libfabric, the default exclude list would need
to be updated.
Instead, we choose to include only the providers known to work by default.

New default:
  - include: psm,psm2,gni
  - exclude: none
2016-05-19 10:59:25 -07:00
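The include/exclude behavior described in commit 2f0cde791a can be sketched as a simple filter. The function names below are hypothetical, and giving a non-empty include list precedence over the exclude list is an assumption made for this illustration, not something the commit message states.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* True if `name` appears as a whole token in the comma-separated `list`. */
static bool list_contains(const char *list, const char *name)
{
    size_t name_len = strlen(name);

    while (list != NULL && *list != '\0') {
        size_t len = strcspn(list, ",");    /* length of the current token */
        if (len == name_len && strncmp(list, name, len) == 0)
            return true;
        list += len;
        if (*list == ',')
            ++list;                         /* skip the separator */
    }
    return false;
}

/* A non-empty include list decides on its own; otherwise a provider is
 * allowed unless it is named in the exclude list. */
static bool provider_allowed(const char *name,
                             const char *include, const char *exclude)
{
    assert(name != NULL);
    if (include != NULL && *include != '\0')
        return list_contains(include, name);
    return !list_contains(exclude, name);
}
```

With the new default of include psm,psm2,gni and no exclude list, a provider added to libfabric later is rejected until it is explicitly listed.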
Ralph Castain
a35bb8453a Unlock the mutex prior to destructing it.
Thanks to Nicolas Joly for the report
2016-05-19 10:36:58 -07:00
Rainer Keller
0fb0913cd4 In case we do not build Fortran, Fortran 2008 or CXX, the regexp in make_manpage.pl will delete all
lines up to the next ".fi" -- which, for functions that do not implement the corresponding interface
as code, eliminates everything.
Change to delete the man page's content up to the next section header ".SH"

Also in case of make V=1, we'd like to see the command line, too.

Amend OMPI_Affinity_str according to the other man-pages definitions.
2016-05-17 14:21:35 +02:00
rhc54
8b534e9897 Merge pull request #1668 from rhc54/topic/slurm
When direct launching applications, we must allow the MPI layer to pr…
2016-05-16 12:23:19 -07:00
Jeff Squyres
5275e5e2a1 bml_r2: use __func__ to identify function names
There were some old/stale function names in some debugging/verbose
opal_output calls.  Use __func__ instead, so that they won't become
stale in the future.

Thanks to Durga Choudhury for pointing out the issue.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-05-16 11:06:47 -04:00
Ralph Castain
01ba861f2a When direct launching applications, we must allow the MPI layer to progress during RTE-level barriers. Neither SLURM nor Cray provide non-blocking fence functions, so push those calls into a separate event thread (use the OPAL async thread for this purpose so we don't create another one) and let the MPI thread spin in wait_for_completion. This also restores the "lazy" completion during MPI_Finalize to minimize cpu utilization.
Update external as well

Revise the change: we still need the MPI_Barrier in MPI_Finalize when we use a blocking fence, but do use the "lazy" wait for completion. Replace the direct logic in MPI_Init with a cleaner macro
2016-05-14 16:37:00 -07:00
Aurélien Bouteiller
7f65c2b18e forgot to update copyright in commits 627a89b 4899c89 2016-05-13 11:34:59 -04:00
George Bosilca
37e03e3e5b Don't update req_bytes_received if no bytes were received. 2016-05-12 23:39:32 -04:00
rhc54
4d026e223c Merge pull request #1661 from matcabral/master
PSM and PSM2 MTLs to detect drivers and link
2016-05-11 17:43:17 -07:00
George Bosilca
f8facb177d atomically update the refcount on the datatype args. 2016-05-11 12:40:18 -04:00
Matias A Cabral
528abff6ae Merge remote-tracking branch 'upstream/master' 2016-05-10 15:42:08 -07:00
Matias A Cabral
d28ee62a96 Update in PSM and PSM2 MTLs to detect entries created by drivers for
Intel TrueScale and Intel OmniPath, and detect a link in ACTIVE state.
This fix addresses the scenario reported in the below OMPI users email,
including formerly named Qlogic IB, now Intel True scale. Given the
nature of the PSM/PSM2 mtls this fix applies to OmniPath:
https://www.open-mpi.org/community/lists/users/2016/04/29018.php
2016-05-09 12:08:44 -07:00
Gilles Gouaillardet
0a19337371 coll/base: return MPI_ERR_UNSUPPORTED_OPERATION when coll_base_*_two_procs algo is used on a communicator that has no two tasks
Thanks Dave Love for the report
2016-05-09 14:18:40 +09:00
Gilles Gouaillardet
b159587325 io/romio: fix filesystem type check on OpenBSD 5.7
check the existence of the f_type field in struct statfs

Thanks Paul Hargrove for the report
2016-05-09 13:54:46 +09:00
Ralph Castain
6b24e2779b Remove stale component - I'm not going to get to it 2016-05-07 04:13:34 -07:00
Edgar Gabriel
def1b95fd7 Merge pull request #1646 from edgargabriel/getview-preallocate-fixes
io/ompio: file_getview and file_preallocate fixes
2016-05-06 11:46:00 -05:00
Edgar Gabriel
e65e189671 io/ompio: fix file size after file_preallocate
Thanks to @dalcini for reporting
Fixes open-mpi/ompi#1633
2016-05-06 08:20:59 -05:00
Edgar Gabriel
d358965134 io/ompio: fix envelope of datatype returned by getview
Thanks to @dalcini for reporting
Fixes open-mpi/ompi#1632
2016-05-06 08:19:48 -05:00
Edgar Gabriel
7c92acaa78 Merge pull request #1637 from edgargabriel/pr/netbsd-compilation-problems
fs/lustre and fs/pvfs2: fix netbsd compilation problems
2016-05-06 08:05:36 -05:00
Jeff Squyres
810db734c4 Merge pull request #1640 from jsquyres/pr/mpir-cleanup
debuggers: remove some useless code
2016-05-05 21:23:30 -04:00
Gilles Gouaillardet
6c9d65c0ca coll/libnbc: fix MPI_Ireduce_scatter_block for one task communicator
Thanks Lisandro Dalcin for the report

Fixes open-mpi/ompi#248
2016-05-06 09:43:29 +09:00
Ralph Castain
08022d7af1 Some minor cleanups of warnings from gcc 6.0.0. Update s1/s2 pmix to get max_procs as required. 2016-05-05 15:28:13 -07:00
Jeff Squyres
83c2d04aa3 debuggers: remove some useless code
MPIR-1.0 specifies that the following symbols are only relevant in the
starter process:

- MPIR_Breakpoint
- MPIR_being_debugged
- MPIR_debug_state
- MPIR_debug_abort_string

I.e., the code filling in values in these various symbols was useless
/ never used.

MPIR-1.1 will define that MPIR_being_debugged *is* relevant in MPI
processes.  That symbol is currently defined in libopen-rte (which is
currently causing a duplicate symbol error for static builds -- this
commit fixes that error), and is therefore still available for MPI
processes.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-05-05 14:22:55 -07:00
Jeff Squyres
f167be1c91 ompio: always return valid info from FILE_GET_INFO
MPI-3.1 says that even if no info keys are set on the file, we need to
return a new, empty info.

Thanks to Lisandro Dalcin for identifying the issue.

Fixes open-mpi/ompi#1630

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-05-05 12:03:29 -07:00
Aurélien Bouteiller
4899c89731 Fix a race condition when multiple threads try to create a bml endpoint simultaneously. 2016-05-05 10:49:30 -04:00
Aurélien Bouteiller
627a89bf71 Fix a race condition when multiple threads do the "first send" to an endpoint simultaneously. 2016-05-05 09:04:10 -04:00
Joshua Ladd
4771c9ece6 Merge pull request #1617 from jladd-mlnx/topic/disable-hcoll-barrier-in-finalize-ompi-trunk
HCOLL: fix hang in hcoll barrier called from finalize for MXM/yalla
2016-05-04 10:12:34 -04:00
Aurélien Bouteiller
8344d00418 use-mpi extensions do not have a .la lib, so the fortran module should not depend on them. 2016-05-03 11:54:35 -04:00
Edgar Gabriel
78fa8bb2c4 remove some unused variables that can cause compilation problems on netbsd 2016-05-03 10:25:15 -05:00
Todd Kordenbrock
3498bed650 Merge pull request #1555 from shawone/check_reduce_ret
coll-portals4: check return value from reduce kary tree functions
2016-05-03 10:17:23 -05:00
Jeff Squyres
33dd8ca81e osc_rdma_peer: properly include ompi_config.h
Thanks to Paul Hargrove for reporting.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-05-03 07:39:55 -07:00
Devendar Bureddy
cafd55f18c HCOLL: fix hang in hcoll barrier called from finalize for MXM/yalla
tear down

HCOLL barrier may not complete if HCOLL progress is not called periodically,
which is the case in HCOLL teardown progress in the finalize.
(cherry picked from commit 793244d75dd94d1d5e0243bcccf6d04318750f3f)
2016-05-03 00:49:57 +03:00
Nathan Hjelm
d3d779f6d9 osc/rdma: clear all_sync object when obtaining a lock
This commit fixes a bad synchronization detection bug that occurs when
mixing MPI_Win_fence() and MPI_Win_lock(). If no communication has
occurred in the fence epoch it is safe to just clear the all_sync
object (it was set up by fence).

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-02 15:28:47 -06:00
Jeff Squyres
265e5b9795 Merge pull request #1552 from kmroz/wip-hostname-len-cleanup-1
ompi/opal/orte/oshmem/test: max hostname length cleanup
2016-05-02 09:44:18 -04:00
Ralph Castain
6ac7929bd0 Extend the schizo framework to allow definition of CLI options by environment. Refactor orterun to mesh with the orted_submit code, thus improving code reuse. Eliminate the orte-submit tool as orterun can now meet that need.
Cleanups per @jjhursey review
2016-05-01 11:30:25 -07:00
George Bosilca
6e6ed62a3c Allow NULL arrays for empty datatypes.
When building an empty datatype (aka. size = 0) because the count of
included datatypes is 0, be less strict on what the arguments are
(allow NULL pointers).
2016-05-01 12:37:02 -04:00
Nathan Hjelm
ec66a6a1f8 Merge pull request #1605 from hjelmn/rdma_fixes
osc/rdma: fix global index array calculation
2016-04-28 20:41:36 -06:00
Nathan Hjelm
7bda3eb2dc osc/rdma: fix global index array calculation
This commit fixes a bug that occurs when ranks are either not mapped
evenly or by something other than core.

Fixes open-mpi/ompi#1599

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-28 19:11:11 -06:00
Nathan Hjelm
1783d94f91 ompi/group: fix sparse group proc reference counting
This commit fixes a bug when sparse groups are in use. Since sparse
groups do not actually increment the reference counts of any procs
(they just retain the parent group) it is wrong to decrement the
reference counts of all procs in the group using
ompi_group_decrement_proc_count(). This commit makes the call to
ompi_group_decrement_proc_count() conditional on the group being
dense.

Fixes open-mpi/ompi#1593

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-27 15:55:13 -06:00
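The rule from commit 1783d94f91 (only dense groups give back proc references) can be shown in a few lines. The structures and the function name below are simplified, hypothetical stand-ins and not the actual ompi_group_t or ompi_group_decrement_proc_count() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct proc { int refcount; };

struct group {
    bool          dense;    /* dense groups hold their own proc references */
    struct proc **procs;
    size_t        nprocs;
};

/* Drop proc references when a group is destroyed, but only for dense
 * groups: a sparse group never took those references in the first place. */
static void group_release_procs(struct group *g)
{
    assert(g != NULL);
    if (!g->dense)
        return;
    for (size_t i = 0; i < g->nprocs; ++i)
        --g->procs[i]->refcount;
}
```

Without the `dense` check, destroying a sparse group would decrement counts it never incremented.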
Gilles Gouaillardet
01c90d4e71 fortran/mpif-h: fix *_create_keyval_f
correctly handle out parameter *_keyval when OMPI_SIZEOF_FORTRAN_INTEGER > SIZEOF_INT
2016-04-27 13:34:32 +09:00
Gilles Gouaillardet
178dde6a20 fortran/mpif-h: fix MPI_Win_shared_query
correctly handle out parameter disp_unit when OMPI_SIZEOF_FORTRAN_INTEGER > SIZEOF_INT
2016-04-27 11:22:09 +09:00
Gilles Gouaillardet
7f59d2a8c7 fortran/mpif-h: fix MPI_Win_free_keyval
initialize inout parameter when OMPI_SIZEOF_FORTRAN_INTEGER > SIZEOF_INT
2016-04-27 10:46:14 +09:00
Nathan Hjelm
f0f3383006 Merge pull request #1590 from hjelmn/thread_multiple
osc/pt2pt: do not drop/reacquire the ompi_request_lock
2016-04-26 16:48:37 -06:00
Nathan Hjelm
34ff6293bd osc/pt2pt: do not drop/reacquire the ompi_request_lock
This lock is now recursive so it is safe to call into the pml without
dropping the lock.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-26 14:19:38 -06:00
George Bosilca
bf190671e9 Make the request lock recursive.
If during the request completion callback we post another request that
completes right away (such as a small send or a match for an unexpected
short message) we will try to complete the second request while holding
the lock for the completion of the first. For performance reasons
(mainly to avoid unlocking and locking the request mutex several times)
we have made the request lock recursive.
2016-04-26 16:16:07 -04:00
Nathan Hjelm
1e4daa2a0e mpi_init: move opal_set_using_threads() earlier in MPI_Init()
There is a potential race condition in MPI_Init() where an orte event
thread could be in a function that uses OPAL_THREAD_LOCK /
OPAL_THREAD_UNLOCK when ompi_mpi_init calls opal_set_using_threads().

Closes open-mpi/ompi#1586

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-26 13:02:42 -06:00
Nathan Hjelm
c16e639b2f Merge pull request #1563 from hjelmn/ompi_coverity
ompi coverity fixes
2016-04-26 09:17:48 -06:00
Jeff Squyres
8ab88f2051 ompi_mpi_finalize: add/update comments
This is a follow-on to open-mpi/ompi@7373111: add some comments
explaining why the code is the way it is.  Also update a previous
comment.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-04-25 13:42:30 -07:00
Ralph Castain
7373111662 Somehow, the logic for finalize got lost, so restore it here. If pmix.fence_nb is available, then call it and cycle opal_progress until complete. If pmix.fence_nb is not available, then do an MPI_Barrier and call pmix.fence.
Needs to go over to 2.x
2016-04-25 08:04:35 -07:00
Karol Mroz
3322347da9 ompi: fixup hostname max length usage
Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-04-25 07:08:23 +02:00
Nathan Hjelm
ae0ffbb67f Merge pull request #1397 from hjelmn/enable_thread_multiple
ompi: always enable MPI_THREAD_MULTIPLE support
2016-04-23 08:40:22 -06:00
Joshua Ladd
0d5a57d9d3 Merge pull request #1558 from vspetrov/hcoll_complex_dtype_support
Adds mapping to hcoll complex data type
2016-04-20 08:35:33 -04:00
Gilles Gouaillardet
490b538ad6 ompi/datatype: fix MPI_LONG_LONG_INT type name
MPI_LONG_LONG_INT is a named predefined datatype, so its name is now MPI_LONG_LONG_INT
MPI_LONG_LONG is a synonym of MPI_LONG_LONG_INT, and its name is also MPI_LONG_LONG_INT
2016-04-20 09:34:20 +09:00
Nathan Hjelm
1ff3d3b16b pml/ob1: fix coverity issue
Fix CID 1357978 (1 of 1): Logically dead code (DEADCODE):

Remove duplicate check for NULL == endpoint.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-19 14:48:13 -06:00
Nathan Hjelm
70533e6d50 fcoll/static: fix coverity issues
Fix CID 72362: Explicit null dereferenced (FORWARD_NULL)

From what I can tell the code @ fcoll_static_file_read_all.c:649
should be setting bytes_per_process[i] to 0 not bytes_per_process.

Fix CID 72361: Explicit null dereferenced (FORWARD_NULL)

Modified check to check for blocklen_per_process non-NULL before
trying to free blocklen_per_process[l]. This is sufficient because
free (NULL) is safe. Also cleaned up the initialization of this and a
couple of other arrays. They were allocated with malloc() then
initialized to 0. Changed to use calloc().

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-19 14:48:13 -06:00
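Two C idioms carry the second fix in commit 70533e6d50: calloc() in place of malloc() plus manual zeroing, and relying on free(NULL) being a no-op. The sketch below shows both. `alloc_counts` and `free_rows` are hypothetical helpers, not functions from the fcoll/static component.

```c
#include <assert.h>
#include <stdlib.h>

/* calloc() returns zero-filled memory, so no separate loop is needed
 * to initialize the counters to 0. */
static int *alloc_counts(size_t n)
{
    assert(n > 0);
    return (int *) calloc(n, sizeof(int));
}

/* Free an array of per-process arrays. Only the outer pointer needs a
 * NULL check: free(NULL) is defined to do nothing, so rows that were
 * never allocated are harmless. */
static void free_rows(int **rows, size_t n)
{
    if (rows == NULL)
        return;
    for (size_t i = 0; i < n; ++i)
        free(rows[i]);
    free(rows);
}
```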
Nathan Hjelm
8871bdb2f8 fcoll/two_phase: fix coverity issues
Fix CID 72296: Resource leak (RESOURCE_LEAK):

Changed code to goto exit instead of returning to ensure memory is
freed.

Fix CID 712589: Out-of-bounds read (OVERRUN):

In this loop i and j are identical and always less than
iov_count. The CID was triggered because i was incremented if i was <
iov_count. This meant that if the loop did go on the next iteration
would access an invalid index.

Fix CID 741363: Uninitialized scalar variable (UNINIT):

Allocate tmp_len with calloc to ensure every index is initialized.

Fix CID 741364: Uninitialized pointer read (UNINIT):

Allocate recv_types with calloc to ensure all indices are always
initialized. Also added a check to not loop and destroy if recv_types
is NULL.

Also added a NULL check on the allocation of decoded iov. This is not
the cause of CID 126784 but should be fixed.

Fix CID 712588: Out-of-bounds read (OVERRUN):

Similar to CID 712589. Should silence the issue.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-19 14:47:41 -06:00
Valentin Petrov
21f1c572c0 Adds mapping to hcoll complex dte 2016-04-19 14:14:28 +03:00
Nicolas Chevalier
c86d4035d2 coll-portals4: check return value from reduce kary tree functions 2016-04-18 12:02:30 +00:00
Ralph Castain
7829fbdc29 Per request from Jeff, aggregate all help messages during MPI_Init thru MPI_Finalize as long as the RTE is available 2016-04-15 13:37:22 -07:00
KAWASHIMA Takahiro
e854404570 fortran: Change the line order of #pragma
No code change.
These lines were introduced in my recent commit 17d32ac.
I made an editing mistake and the order is different from
other lines/files.
2016-04-15 12:49:03 +09:00
Nathan Hjelm
3245428e82 Merge pull request #1535 from kawashima-fj/pr/osc-pt2pt-header-fix
osc/pt2pt: Fix a struct name typo
2016-04-14 15:55:25 -06:00
Jeff Squyres
4566286b9a Merge pull request #1538 from kawashima-fj/pr/fortran-binding-fix
fortran: Fix many Fortran binding bugs
2016-04-14 17:18:59 -04:00
Nathan Hjelm
330302c4b4 Merge pull request #1534 from kawashima-fj/pr/parallel-rma-fix
osc/pt2pt: Fix tag conflicts on parallel RMA communications
2016-04-14 15:13:32 -06:00
Jeff Squyres
fdf33674b3 Merge pull request #1532 from kmroz/wip-hindexed-cleanup-1
romio,java: cleanup deprecated hindexed call
2016-04-14 17:07:31 -04:00
Jeff Squyres
2374d8fcf7 Merge pull request #1536 from kawashima-fj/pr/inplace-fix
mpi/c, mpi/fortran: Fix `MPI_IN_PLACE`-related bugs
2016-04-14 15:56:55 -04:00
Nathan Hjelm
b4e5b5c09e Merge pull request #1531 from hjelmn/bml
bml: always enable the bml
2016-04-14 10:22:33 -06:00
Nathan Hjelm
1e6b4f2f55 Merge pull request #1495 from hjelmn/new_hooks
Add new patcher memory hooks
2016-04-13 18:19:23 -06:00
Nathan Hjelm
11e2d7886e opal/memory: update component structure
This commit makes it possible to set relative priorities for
components. Before the addition of the patcher component there was
only one component that would run on any system but that is no longer
the case. When determining which component to open each component's
query function is called and the one that returns the highest priority
is opened. The default priority of the patcher component is set
slightly higher than the old ptmalloc2/ummunotify component.

This commit fixes a long-standing break in the abstraction of the
memory components. ompi_mpi_init.c was referencing the linux malloc
hook initialize function to ensure the hooks are initialized for
libmpi.so. The abstraction break has been fixed by adding a memory
base function that calls the open memory component's malloc hook init
function if it has one. The code is not yet complete but is intended
to support ptmalloc in 2.0.0. In that case the base function will
always call the ptmalloc hook init if exists.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-13 17:14:51 -06:00
KAWASHIMA Takahiro
4944ba7edc datatype: Fix incorrect predefined datatype names and other datatype bugs (#1537)
* datatype: Fix an incorrect datatype name of `MPI_UNSIGNED`

Name of predefined datatype for C `unsigned int` gotten by
`MPI_TYPE_GET_NAME` should be `MPI_UNSIGNED`, not `MPI_UNSIGNED_INT`.

* datatype: Fix incorrect datatype names of `MPI_C_BOOL` and `MPI_CXX_*`

Names of predefined datatypes gotten by `MPI_TYPE_GET_NAME` are:

after this commit (correct) | before this commit (incorrect)
-----------------------------------------------------------
MPI_C_BOOL                    MPI_BOOL
MPI_CXX_BOOL                  MPI_BOOL
MPI_CXX_FLOAT_COMPLEX         MPI_C_FLOAT_COMPLEX
MPI_CXX_DOUBLE_COMPLEX        MPI_C_DOUBLE_COMPLEX
MPI_CXX_LONG_DOUBLE_COMPLEX   MPI_C_LONG_DOUBLE_COMPLEX

* datatype: Fix an incorrect datatype name of `MPI_2DOUBLE_PRECISION`

Name of the predefined datatype for Fortran two `double precision`
gotten by `MPI_TYPE_GET_NAME` should be `MPI_2DOUBLE_PRECISION`,
not `MPI_2DBLPREC`.

This bug was caused by setting the name to `opal_datatype_t::name`
instead of `ompi_datatype_t::name`.

* datatype: Fix `MPI_UNSIGNED_CHAR` internal flag

`MPI_UNSIGNED_CHAR` is an integer type.

* ompi/cxx: Fix C++ `MPI::LONG_DOUBLE_INT` definition

Just a typo fix. Without this fix, `MPI::MAX_LOC` and `MPI::MIN_LOC`
cannot be used with `MPI::LONG_DOUBLE_INT` in C++ programs.

I know the C++ binding is obsolete, but fixing this is harmless.

* Add FUJITSU copyright
2016-04-12 20:17:46 +02:00
KAWASHIMA Takahiro
17d32acbb6 fortran: Add missing (P)MPI_Alloc_mem_cptr_{f,f08} symbols
This commit adds the following symbols

  MPI_Alloc_mem_cptr_f
  MPI_Alloc_mem_cptr_f08
  PMPI_Alloc_mem_cptr_f
  PMPI_Alloc_mem_cptr_f08

These are implemented in the same way as other `_cptr` routines.
2016-04-12 22:40:58 +09:00
KAWASHIMA Takahiro
d48c8525ed fortran: Fix incorrect weak symbol names 2016-04-12 22:16:32 +09:00
KAWASHIMA Takahiro
5d32a601ff fortran: Add missing interfaces (part 2) 2016-04-12 22:06:35 +09:00
KAWASHIMA Takahiro
6f09d53e34 fortran: Add missing interfaces 2016-04-12 21:44:33 +09:00
KAWASHIMA Takahiro
f3b9a49ad1 fortran: Add missing PMPI interfaces 2016-04-12 20:55:41 +09:00
KAWASHIMA Takahiro
b6cb0bc257 fortran: Fix an incorrect interface name 2016-04-12 20:48:08 +09:00
KAWASHIMA Takahiro
96e93a9c5f fortran: Sort declared subroutines in alphabetical order
And insert necessary empty lines and remove unnecessary empty lines.

No code change.
2016-04-12 20:36:46 +09:00
KAWASHIMA Takahiro
334c63cf0a fortran: Change subroutine declaration order
Same order for `comm`, `type`, and `win`.

No code change.
2016-04-12 20:10:15 +09:00
KAWASHIMA Takahiro
10c11ff5b5 fortran: Add missing MPI_DUP_FN subroutine
Though the `MPI_DUP_FN` subroutine is deprecated, it is not yet removed
as of MPI-3.1.
2016-04-12 20:06:50 +09:00
KAWASHIMA Takahiro
35ea9e5c3c Add FUJITSU copyright 2016-04-12 13:47:53 +09:00
KAWASHIMA Takahiro
39bcbe439a osc/pt2pt: Fix a struct name typo
Fortunately the sizes of `ompi_osc_pt2pt_header_put_t` and
`ompi_osc_pt2pt_header_get_t` are the same. So this doesn't affect
the behavior.
2016-04-11 20:55:22 +09:00
KAWASHIMA Takahiro
d3d6386578 mpi/fortran: Support MPI_IN_PLACE on (I)ALLTOALLW and (I)EXSCAN
`MPI_IN_PLACE` support for `MPI_ALLTOALLW` and `MPI_EXSCAN` was
added in MPI-2.2 but it was missed in OMPI Fortran binding code.
2016-04-11 20:38:28 +09:00
KAWASHIMA Takahiro
28a0577364 osc/pt2pt: Insert breaks in long lines 2016-04-11 19:06:01 +09:00
KAWASHIMA Takahiro
5ac95df9dc osc/pt2pt: use two distinct "namespaces" for tags - revised
Before this commit, the same PML tag could be used for distinct
communications for long messages. For example, consider a condition
where rank A calls `MPI_PUT` targeting rank B and rank B calls
`MPI_GET` targeting rank A simultaneously.
A PML tag for the `MPI_PUT` is acquired on rank A and is used
for the long-message communication from rank A to rank B.
A PML tag for the `MPI_GET` is acquired on rank B and is used
for the long-message communication from rank A to rank B.
These two tags may end up with the same value because they are managed
independently on each rank. This will cause data corruption.

This commit splits the tag used in a single RMA communication
call into two: one for communication from an origin to a target, and one
for communication from a target to an origin. A "base" tag
is acquired using the `get_tag` function, and the PML tag is calculated
from the base tag by the `tag_to_target` and `tag_to_origin`
functions.
2016-04-11 19:05:20 +09:00
KAWASHIMA Takahiro
3576ecafa7 Revert "osc/pt2pt: use two distinct "namespaces" for tags"
This reverts commit 06ecdb6aa7
to reimplement the fix completely.
2016-04-11 19:04:11 +09:00
KAWASHIMA Takahiro
eb5c31521b mpi/c: Fix MPI_IALLTOALLW memchecker 2016-04-11 18:47:30 +09:00
KAWASHIMA Takahiro
1ced7f213c mpi/c: Fix IALLTOALL{V|W} + MPI_IN_PLACE param check
`sendcounts`, `sdispls`, and `sendtype(s)` must be ignored
if `MPI_IN_PLACE` is specified for `sendbuf`.
This commit makes the param check code the same as in the blocking
`ALLTOALL{V|W}` functions.
2016-04-11 18:34:11 +09:00
Karol Mroz
f8ecdbd623 java: replace deprecated hindexed call
Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-04-10 19:56:22 +02:00
Karol Mroz
5c54184986 romio: replace deprecated hindexed call
Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-04-10 19:56:22 +02:00
Nathan Hjelm
c6b19818be bml: always enable the bml
This commit ensures the bml is always enabled whether or not it will
be used. This ensures that any available btls communicate their modex
so that they can be used for one-sided communication.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-08 21:14:17 -06:00
George Bosilca
896f857fc4 Thanks @hjelmn for catching up the typo. 2016-04-07 13:56:26 -04:00
Thananon Patinyasakdikul
92290b94e0 Fixed Coverity reports 1358014-1358018 (DEADCODE and CHECK_RETURN) 2016-04-07 12:52:17 -04:00
Ryan Grant
7cdf50533c Merge pull request #1314 from francois-wellenreiter/osc_disable_portals4_evt_send
OSC portals4 : do not generate an EVENT_SEND to avoid to filter it
2016-04-07 10:04:27 -06:00
Gilles Gouaillardet
7b803ac557 MPI_Unpack: fix error code when insize <= 0
this fixes a regression from open-mpi/ompi@f2e33c725f
2016-04-06 09:47:21 +09:00
Karol Mroz
a468c3ba1a opal_info_support: pass component map when handling params
Pass component_map to opal_info_do_params(). It will be needed to output
component versions.

Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-04-02 21:17:44 +02:00
Gilles Gouaillardet
f2e33c725f MPI_Unpack: fix return status
this regression was previously introduced in open-mpi/ompi@221e6e2eab
2016-03-31 09:56:54 +09:00
Gilles Gouaillardet
5932287cef datatype/[un]pack_external[_size]: move subroutines down to ompi/datatype
so it can be directly used by test/datatype/external32
2016-03-30 13:01:33 +09:00
Gilles Gouaillardet
221e6e2eab Add the datatype checks to the pack/unpack functions.
The datatype must satisfy the same constraints as for the
corresponding communication function (send for pack and
recv for unpack).
2016-03-30 11:40:08 +09:00
Gilles Gouaillardet
a89f113507 mpi/c: add missing OPAL_CR_EXIT_LIBRARY() in [un]pack[_external] 2016-03-30 11:25:21 +09:00
George Bosilca
004c0cc05b Fix issues identified by @derbeyn. 2016-03-29 15:50:32 -04:00
Jeff Squyres
91c54d7a07 Merge pull request #1491 from ICLDisco/progress_thread
BTL TCP async progress
2016-03-29 06:26:10 -04:00
George Bosilca
f69eba1bc4 Update the copyright and cleanup the code.
Per @jsquyres's suggestion, remove all trailing spaces.
Credit to `sed -i.bak 's/ *$//' */[ch]`.
2016-03-28 14:41:01 -04:00
Thananon Patinyasakdikul
92062492b9 Enable Threading in the BTL TCP
Added an MCA parameter to turn the progress thread on/off.
Added a flag to check if we have a btl progress thread.
Added a macro for the ob1 matching lock.
Updated the AUTHORS file.
2016-03-28 14:41:01 -04:00
Nathan Hjelm
9d5eeecb8a pml/ob1: detect unreachable errors
This commit adds code to detect when procs are unreachable when using
the dynamic add_procs functionality.

Fixes #1501

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-28 10:52:40 -06:00
Gilles Gouaillardet
1baed498b6 win: silence a warning in alloc_window(...) 2016-03-28 14:57:31 +09:00
Nathan Hjelm
d6e90f24b1 Merge pull request #1483 from hjelmn/flag_enum_2
RFC: Add support for flag enumerators for MCA variables
2016-03-26 11:43:33 -06:00
George Bosilca
57eadb0dd6 Fix for Coverity CID 1357152.
Or at least that was the origin of the issue. It turns out
we were freeing the wrong buffer (but as it only happens in the
case of an error, we never noticed).
2016-03-24 00:53:30 -04:00
George Bosilca
4b38b6bd0c Fix multiple issues with the collective requests.
This patch addresses most (if not all) of @derbeyn's concerns
expressed on #1015. I added checks for the requests allocation
in all functions, ompi_coll_base_free_reqs is now called with the
right number of requests, I removed the unnecessary basic_module_comm_t
and use the base_module_comm_t instead, I removed all uses of the
COLL_BASE_BCAST_USE_BLOCKING define, and made other minor fixes.
2016-03-23 18:35:41 -04:00
Nathan Hjelm
a1420003b6 ompi/comm: clean up includes in comm_request.h
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-22 09:17:38 -06:00
Nathan Hjelm
b15a45088c mca: add support for flag enumerators
This commit adds a new type of enumerator meant to support flag
values. The enumerator parses comma-delimited strings and matches
each string or value to a list of valid flags. Additionally, the
enumerator does some basic checks to see if 1) a flag is valid in the
enumerator, and 2) if any conflicting flags are specified.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-21 15:20:56 -06:00
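The flag enumerator described in commit b15a45088c can be illustrated with a small stand-alone sketch. It is not the MCA implementation: `struct flag_entry`, `parse_flags`, and the return codes are hypothetical names chosen for illustration, and the sketch matches flag names only (the commit also mentions matching numeric values). It shows the two checks the message lists: every token must be a valid flag, and conflicting flags must not be combined.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag table entry; not the Open MPI MCA data structure. */
struct flag_entry {
    const char *name;     /* string accepted in the comma-delimited list */
    unsigned value;       /* bit set when the name matches */
    unsigned conflicts;   /* bits that may not be combined with this flag */
};

/* Parse a comma-delimited list such as "read,write" into a bit mask.
 * Returns 0 on success, -1 for an unknown (or empty) token,
 * and -2 when conflicting flags are specified. */
static int parse_flags(const char *list, const struct flag_entry *table,
                       size_t table_len, unsigned *out)
{
    unsigned mask = 0, forbidden = 0;
    const char *p = list;

    assert(list != NULL && table != NULL && out != NULL);

    while (*p != '\0') {
        size_t len = strcspn(p, ",");   /* length of the current token */
        size_t i;

        /* look the token up in the table of valid flags */
        for (i = 0; i < table_len; i++) {
            if (strlen(table[i].name) == len &&
                strncmp(table[i].name, p, len) == 0) {
                break;
            }
        }
        if (i == table_len) {
            return -1;                  /* check 1: token is not a valid flag */
        }

        mask |= table[i].value;
        forbidden |= table[i].conflicts;

        p += len;
        if (*p == ',') {
            p++;                        /* skip the delimiter */
        }
    }

    if (mask & forbidden) {
        return -2;                      /* check 2: conflicting flags */
    }

    *out = mask;
    return 0;
}
```

With a table in which a hypothetical `none` flag conflicts with `read` and `write`, the string `"read,write"` parses to the combined mask, while `"none,read"` is rejected.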
Todd Kordenbrock
2122a15217 Merge pull request #1443 from francois-wellenreiter/fix_trig_rndv
MTL portals4 : fix around triggered rndv operations
2016-03-21 08:16:33 -05:00
Ralph Castain
c146c4969b Revert part of open-mpi/ompi@c1bbbb5e2f to restore the usock component, thus fixing show_help aggregation.
Fixes #1467

Restore debugger attach operations

Fixes #1225
2016-03-18 21:49:04 -07:00
Nathan Hjelm
075dfa4121 topo/treematch: fix component coverity issues
Fix CID 1315298: Resource leak (RESOURCE_LEAK):
Fix CID 1315300: Resource leak (RESOURCE_LEAK):
Fix CID 1315299: Resource leak (RESOURCE_LEAK):
Fix CID 1315297 (#1 of 1): Resource leak (RESOURCE_LEAK):

Confirmed leaks in error paths. Added the leaked arrays to the
ERR_EXIT macro to ensure they are freed.

Fix CID 1315296 (#1 of 1): Resource leak (RESOURCE_LEAK):

Confirmed leak in error paths. Both the oversub and reqs arrays are
leaked. Free these arrays on error.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 11:31:11 -06:00
Nathan Hjelm
3540b65f7d bcol: fix coverity issues
Fix CID 1269976 (#1 of 1): Unused value (UNUSED_VALUE):
Fix CID 1269979 (#1 of 1): Unused value (UNUSED_VALUE):

Removed unused variables k_temp1 and k_temp2.

Fix CID 1269981 (#1 of 1): Unused value (UNUSED_VALUE):
Fix CID 1269974 (#1 of 1): Unused value (UNUSED_VALUE):

Removed gotos and use the matched flags to decide whether to return.

Fix CID 715755 (#1 of 1): Dereference null return value (NULL_RETURNS):

This was also a leak. The items on cs->ctl_structures are allocated using OBJ_NEW, so they must be released using OBJ_RELEASE, not OBJ_DESTRUCT. Replaced the loop with OPAL_LIST_DESTRUCT().

Fix CID 715776 (#1 of 1): Dereference before null check (REVERSE_INULL):

Rework error path to remove REVERSE_INULL. Also added a free to an error path where it was missing.

Fix CID 1196603 (#1 of 1): Bad bit shift operation (BAD_SHIFT):
Fix CID 1196601 (#1 of 1): Bad bit shift operation (BAD_SHIFT):

Both of these are false positives, but it is still worthwhile to fix them so they no longer appear. The loop conditional has been updated to use radix_mask_pow instead of radix_mask to quiet these issues.

Fix CID 1269804 (#1 of 1): Argument cannot be negative (NEGATIVE_RETURNS):

In general close (-1) is safe but coverity doesn’t like it. Reworked the error path for open to not try to close (-1).

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 10:59:46 -06:00
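The NEGATIVE_RETURNS item in commit 3540b65f7d concerns an error path that could call close (-1) after a failed open. A minimal sketch of the reworked pattern, assuming a POSIX system, is shown here; `read_first_byte` is a hypothetical helper and is not taken from the bcol code.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: read the first byte of a file.
 * Returns 0 on success and -1 on any failure. */
static int read_first_byte(const char *path, char *out)
{
    ssize_t n;
    int fd;

    assert(path != NULL && out != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        /* open failed: there is no descriptor, so do not call close(-1) */
        return -1;
    }

    n = read(fd, out, 1);
    close(fd);                 /* only reached with a valid descriptor */
    return (1 == n) ? 0 : -1;
}
```

Returning early keeps the failed-open path separate from the cleanup path, so the static analyzer never sees a negative value flow into close().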
Nathan Hjelm
c8b077f232 coll/ml: fix coverity issues
Fix CID 715744 (#1 of 1): Logically dead code (DEADCODE):
Fix CID 715745 (#1 of 1): Logically dead code (DEADCODE):

The free of scratch_num in either place is defensive programming. Instead of removing the free, the conditional around the free has been removed to quiet the warning.

Fix CID 715753 (#1 of 1): Dereference after null check (FORWARD_NULL):
Fix CID 715778 (#1 of 1): Dereference before null check (REVERSE_INULL):

Fixed the conditional to check for collective_alg != NULL instead of collective_alg->functions != NULL.

Fix CID 715749 (#1 of 4): Explicit null dereferenced (FORWARD_NULL):

Updated code to ensure that none of the parse functions are reached with a non-NULL value.

Fix CID 715746 (#1 of 1): Logically dead code (DEADCODE):

Removed dead code.

Fix CID 715768 (#1 of 1): Resource leak (RESOURCE_LEAK):
Fix CID 715769 (#2 of 2): Resource leak (RESOURCE_LEAK):
Fix CID 715772 (#1 of 1): Resource leak (RESOURCE_LEAK):

Move free calls to before error checks to cleanup leak in error paths.

Fix CID 741334 (#1 of 1): Explicit null dereferenced (FORWARD_NULL):

Added a check to ensure temp is not dereferenced if it is NULL.

Fix CID 1196605 (#1 of 1): Bad bit shift operation (BAD_SHIFT):

Fixed overflow in calculation by replacing int mask with 1ul.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 10:11:16 -06:00
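The BAD_SHIFT fix in commit c8b077f232 replaces an int mask with 1ul. The sketch below shows the general idea rather than the coll/ml code: `bit_mask` is a hypothetical function. Shifting the int constant 1 left by 31 or more bits is undefined behavior when int is 32 bits wide, whereas an unsigned long constant is shifted at the width of unsigned long.

```c
#include <assert.h>
#include <limits.h>

/* Build a mask with bit `shift` set. The constant is unsigned long, so the
 * shift is evaluated at the width of unsigned long rather than int. */
static unsigned long bit_mask(unsigned shift)
{
    /* the shift count must stay below the width of the type */
    assert(shift < sizeof(unsigned long) * CHAR_BIT);
    return 1ul << shift;
}
```

For example, `bit_mask(31)` yields 2147483648, a value that does not fit in a 32-bit signed int.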
Nathan Hjelm
2f4e5325aa coll/base: fix coverity issues
Fix CID 1325868 (#1 of 1): Dereference after null check (FORWARD_NULL):
Fix CID 1325869 (#1-2 of 2): Dereference after null check (FORWARD_NULL):

Here reqs can indeed be NULL. Added a check to
ompi_coll_base_free_reqs to prevent dereferencing NULL pointer.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 09:31:43 -06:00
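The FORWARD_NULL fix in commit 2f4e5325aa adds a NULL check to ompi_coll_base_free_reqs. The following sketch shows that kind of guard on a stand-in function; `request_t` and `free_reqs` are hypothetical and do not reproduce the Open MPI request type or the real function's signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a request object. */
typedef struct {
    int active;
} request_t;

/* Release up to `count` requests stored in `reqs`.
 * Returns the number of requests that were actually released. */
static int free_reqs(request_t **reqs, int count)
{
    int i, released = 0;

    assert(count >= 0);

    if (NULL == reqs) {
        return 0;              /* the array was never allocated */
    }

    for (i = 0; i < count; i++) {
        if (NULL != reqs[i]) {
            free(reqs[i]);
            reqs[i] = NULL;    /* guard against a second release */
            released++;
        }
    }
    return released;
}
```

Placing the guard inside the cleanup function means every error path can call it unconditionally, including paths taken before the request array was allocated.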
Nathan Hjelm
2ed4501490 osc: fix coverity issues
Fix CID 1324726 (#1 of 1): Free of address-of expression (BAD_FREE):

Indeed, if a lock conflicts with the lock_all we will end up trying to
free an invalid pointer.

Fix CID 1328826 (#1 of 1): Dereference after null check (FORWARD_NULL):

This was intentional, but it would be a good idea to check for
module->comm being non-NULL to be safe. Also cleaned out some checks
for NULL before free().

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 09:11:48 -06:00