Commit Graph

24659 Commits

Author SHA1 Message Date
Nathan Hjelm
645bd9d9dd btl/openib: update rdmacm for dynamic add_procs
This commit adds the data necessary for supporting dynamic add_procs
to the rdma message (opal_process_name_t). The endpoint lookup
function has been updated to match the code in udcm.

Closes open-mpi/ompi#1468.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-21 10:00:41 -06:00
Todd Kordenbrock
2122a15217 Merge pull request #1443 from francois-wellenreiter/fix_trig_rndv
MTL portals4 : fix around triggered rndv operations
2016-03-21 08:16:33 -05:00
igor-ivanov
4315435963 Merge pull request #1476 from igor-ivanov/pr/fix-scoll-basic-barrier
oshmem/scoll: Fix bug in basic/barrier algorithm
2016-03-21 13:04:42 +03:00
Igor Ivanov
e690521cdd oshmem/scoll: Fix bug in basic/barrier algorithm 2016-03-21 10:34:55 +02:00
rhc54
10de9c7919 Merge pull request #1480 from rhc54/topic/usock
Fix debugger operations and show_help aggregation
2016-03-19 02:33:17 -07:00
Ralph Castain
c146c4969b Revert part of open-mpi/ompi@c1bbbb5e2f to restore the usock component, thus fixing show_help aggregation.
Fixes #1467

Restore debugger attach operations

Fixes #1225
2016-03-18 21:49:04 -07:00
Nathan Hjelm
e020566924 Merge pull request #1479 from hjelmn/opal_coverity
opal: fix coverity issues
2016-03-18 17:00:53 -06:00
Nathan Hjelm
4d4fa28f75 opal: fix coverity issues
Fix CID 1345825 (1 of 1): Dereference before null check (REVERSE_INULL):

ib_proc should not be NULL in this case. Removed the check and added a
check for NULL after OBJ_NEW.

CID 1269821 (1 of 1): Dereference null return value (NULL_RETURNS):

I labeled this one as a false positive (which it is) but the code in
question could stand to be cleaned up.

Fix CID 1356424 (1 of 1): Argument cannot be negative (NEGATIVE_RETURNS):

While trying to silence another Coverity issue another was
flagged. Protect the close of fd with if (fd >= 0).

CID 70772 (1 of 1): Dereference null return value (NULL_RETURNS):
CID 70773 (1 of 1): Dereference null return value (NULL_RETURNS):
CID 70774 (1 of 1): Dereference null return value (NULL_RETURNS):

None of these are errors and are intentional but now that we have a
list release function use that to make these go away. The cleanup is
similar to CID 1269821.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 15:56:08 -06:00
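Note on 4d4fa28f75: the `if (fd >= 0)` guard mentioned for CID 1356424 can be sketched roughly as below. This is a minimal illustration, not the Open MPI code; `probe_file` is a hypothetical helper invented for the example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper (not from Open MPI): returns 0 if `path` can be
 * opened for reading and -1 otherwise. */
static int probe_file(const char *path)
{
    int fd = open(path, O_RDONLY);
    int rc = (fd >= 0) ? 0 : -1;

    /* Only close a descriptor that open() actually returned. Calling
     * close(-1) merely fails with EBADF, but static analyzers flag it
     * as NEGATIVE_RETURNS. */
    if (fd >= 0) {
        close(fd);
    }

    return rc;
}
```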
Nathan Hjelm
018e3ebeb6 Merge pull request #1471 from hjelmn/lanl_platform
contrib/lanl: update platform files for TOSS2
2016-03-18 15:04:34 -06:00
Nathan Hjelm
81085dd89d Merge pull request #1477 from hjelmn/ompi_coverity
Fix some issues in OMPI identified by Coverity
2016-03-18 14:55:54 -06:00
Ralph Castain
972026b9c1 Add the option to not make the greek tarball, only making the non-greek one 2016-03-18 12:25:20 -07:00
Nathan Hjelm
075dfa4121 topo/treematch: fix component coverity issues
Fix CID 1315298: Resource leak (RESOURCE_LEAK):
Fix CID 1315300: Resource leak (RESOURCE_LEAK):
Fix CID 1315299: Resource leak (RESOURCE_LEAK):
Fix CID 1315297 (#1 of 1): Resource leak (RESOURCE_LEAK):

Confirmed leaks in error paths. Added the leaked arrays to the
ERR_EXIT macro to ensure they are freed.

Fix CID 1315296 (#1 of 1): Resource leak (RESOURCE_LEAK):

Confirmed leak in error paths. Both the oversub and reqs arrays are
leaked. Free these arrays on error.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 11:31:11 -06:00
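Note on 075dfa4121: the idea of routing every error exit through a macro that also frees the temporary arrays might look like the sketch below. The macro `FREE_AND_RETURN` and the function `build_tables` are hypothetical names for illustration; the actual ERR_EXIT macro in topo/treematch is not reproduced here.

```c
#include <stdlib.h>

/* Hypothetical cleanup macro: free both arrays, then return rc.
 * free(NULL) is a no-op, so it is safe before both are allocated. */
#define FREE_AND_RETURN(rc, a, b) \
    do {                          \
        free(a);                  \
        free(b);                  \
        return (rc);              \
    } while (0)

/* Hypothetical function with several exit paths. `simulate_error`
 * stands in for a failure that happens after both allocations. */
static int build_tables(size_t n, int simulate_error)
{
    int *oversub = NULL;
    int *reqs = NULL;

    oversub = malloc(n * sizeof(*oversub));
    if (NULL == oversub) {
        FREE_AND_RETURN(-1, oversub, reqs);
    }

    reqs = malloc(n * sizeof(*reqs));
    if (NULL == reqs) {
        FREE_AND_RETURN(-1, oversub, reqs);
    }

    if (simulate_error) {
        /* Without the frees in the macro, both arrays would leak here. */
        FREE_AND_RETURN(-1, oversub, reqs);
    }

    FREE_AND_RETURN(0, oversub, reqs);
}
```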
Nathan Hjelm
3540b65f7d bcol: fix coverity issues
Fix CID 1269976 (#1 of 1): Unused value (UNUSED_VALUE):
Fix CID 1269979 (#1 of 1): Unused value (UNUSED_VALUE):

Removed unused variables k_temp1 and k_temp2.

Fix CID 1269981 (#1 of 1): Unused value (UNUSED_VALUE):
Fix CID 1269974 (#1 of 1): Unused value (UNUSED_VALUE):

Removed gotos and use the matched flags to decide whether to return.

Fix CID 715755 (#1 of 1): Dereference null return value (NULL_RETURNS):

This was also a leak. The items on cs->ctl_structures are allocated using OBJ_NEW so they must be released using OBJ_RELEASE not OBJ_DESTRUCT. Replaced the loop with OPAL_LIST_DESTRUCT().

Fix CID 715776 (#1 of 1): Dereference before null check (REVERSE_INULL):

Rework error path to remove REVERSE_INULL. Also added a free to an error path where it was missing.

Fix CID 1196603 (#1 of 1): Bad bit shift operation (BAD_SHIFT):
Fix CID 1196601 (#1 of 1): Bad bit shift operation (BAD_SHIFT):

Both of these are false positives but it is still worthwhile to fix so they no longer appear. The loop conditional has been updated to use radix_mask_pow instead of radix_mask to quiet these issues.

Fix CID 1269804 (#1 of 1): Argument cannot be negative (NEGATIVE_RETURNS):

In general close (-1) is safe but coverity doesn’t like it. Reworked the error path for open to not try to close (-1).

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 10:59:46 -06:00
Nathan Hjelm
c8b077f232 coll/ml: fix coverity issues
Fix CID 715744 (#1 of 1): Logically dead code (DEADCODE):
Fix CID 715745 (#1 of 1): Logically dead code (DEADCODE):

The free of scratch_num in either place is defensive programming. Instead of removing the free the conditional around the free has been removed to quiet the warning.

Fix CID 715753 (#1 of 1): Dereference after null check (FORWARD_NULL):
Fix CID 715778 (#1 of 1): Dereference before null check (REVERSE_INULL):

Fixed the conditional to check for collective_alg != NULL instead of collective_alg->functions != NULL.

Fix CID 715749 (#1 of 4): Explicit null dereferenced (FORWARD_NULL):

Updated code to ensure that none of the parse functions are reached with a NULL value.

Fix CID 715746 (#1 of 1): Logically dead code (DEADCODE):

Removed dead code.

Fix CID 715768 (#1 of 1): Resource leak (RESOURCE_LEAK):
Fix CID 715769 (#2 of 2): Resource leak (RESOURCE_LEAK):
Fix CID 715772 (#1 of 1): Resource leak (RESOURCE_LEAK):

Move free calls to before error checks to cleanup leak in error paths.

Fix CID 741334 (#1 of 1): Explicit null dereferenced (FORWARD_NULL):

Added a check to ensure temp is not dereferenced if it is NULL.

Fix CID 1196605 (#1 of 1): Bad bit shift operation (BAD_SHIFT):

Fixed overflow in calculation by replacing int mask with 1ul.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 10:11:16 -06:00
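Note on c8b077f232: the BAD_SHIFT fix (CID 1196605) replaces an int mask with `1ul`. A minimal sketch of why that matters follows; `bit_mask` is a hypothetical function, and the range check is an addition for the example rather than something the commit describes.

```c
#include <limits.h>

/* Hypothetical helper: returns a mask with only `bit` set, or 0 if
 * `bit` is out of range for unsigned long. */
static unsigned long bit_mask(unsigned int bit)
{
    /* Shifting by the full width of the type or more is undefined. */
    if (bit >= sizeof(unsigned long) * CHAR_BIT) {
        return 0ul;
    }

    /* With a plain int constant, `1 << bit` is undefined once the
     * result no longer fits in int (bit 31 for a 32-bit int). Using
     * 1ul performs the shift in unsigned long instead. */
    return 1ul << bit;
}
```

On LP64 platforms unsigned long is 64 bits wide, so this extends the usable range to bit 63; on platforms where unsigned long is 32 bits a wider type would be needed.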
Nathan Hjelm
2f4e5325aa coll/base: fix coverity issues
Fix CID 1325868 (#1 of 1): Dereference after null check (FORWARD_NULL):
Fix CID 1325869 (#1-2 of 2): Dereference after null check (FORWARD_NULL):

Here reqs can indeed be NULL. Added a check to
ompi_coll_base_free_reqs to prevent dereferencing NULL pointer.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 09:31:43 -06:00
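Note on 2f4e5325aa: a NULL guard of the kind added to ompi_coll_base_free_reqs can be sketched as below. `request_t` and `free_reqs` are hypothetical stand-ins; the real function operates on MPI request objects and is not reproduced here.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a request handle. */
typedef struct {
    int active;
} request_t;

/* Releases up to `count` requests and returns how many were freed.
 * A NULL array is treated as "nothing to free" instead of being
 * dereferenced. */
static int free_reqs(request_t **reqs, int count)
{
    int released = 0;

    if (NULL == reqs) {
        return 0;
    }

    for (int i = 0; i < count; ++i) {
        if (NULL != reqs[i]) {
            free(reqs[i]);
            reqs[i] = NULL;
            ++released;
        }
    }

    return released;
}
```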
rhc54
d30911f186 Merge pull request #1474 from rhc54/topic/revert
Revert problematic change to app context handling
2016-03-18 08:14:20 -07:00
Nathan Hjelm
2ed4501490 osc: fix coverity issues
Fix CID 1324726 (#1 of 1): Free of address-of expression (BAD_FREE):

Indeed, if a lock conflicts with the lock_all we will end up trying to
free an invalid pointer.

Fix CID 1328826 (#1 of 1): Dereference after null check (FORWARD_NULL):

This was intentional but it would be a good idea to check for
module->comm being non-NULL to be safe. Also cleaned out some checks
for NULL before free().

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 09:11:48 -06:00
Ralph Castain
8f410d7897 Revert one part of open-mpi/ompi@4d0cc27eb7 2016-03-18 07:23:30 -07:00
Ralph Castain
2970becd6b Revert "Merge pull request #1451 from ggouaillardet/topic/orte_fork_wrapper_fullname"
This reverts commit efafd62d38, reversing
changes made to a93b849f13.
2016-03-18 07:18:36 -07:00
Gilles Gouaillardet
013aec894b opal/class/opal_lifo: rename a local variable initially called new
this file is now indirectly included from C++, and new is a reserved C++ keyword
2016-03-18 22:15:44 +09:00
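Note on 013aec894b: `new` is an ordinary identifier in C but a keyword in C++, so a header that uses it as a variable name stops compiling once a C++ translation unit includes it. The sketch below shows the renamed form; `struct node` and `push` are hypothetical and are not the opal_lifo code.

```c
#include <stddef.h>

/* Hypothetical singly linked node. */
struct node {
    struct node *next;
};

/* Pushes `item` in front of `head` and returns the new head.
 * In C the parameter could legally be named `new`, but a C++ compiler
 * would reject that, so a neutral name such as `item` is used. */
static struct node *push(struct node *head, struct node *item)
{
    item->next = head;
    return item;
}
```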
Nathan Hjelm
c749d7d977 Merge pull request #1470 from hjelmn/fifo_fix
opal/fifo: use atomics to set fifo head in opal_fifo_push
2016-03-17 22:30:31 -06:00
Jeff Squyres
361f931967 HACKING: update language about developer builds 2016-03-17 17:30:13 -04:00
Jeff Squyres
cb1837e595 Merge pull request #1417 from jsquyres/rfc/mpirun-warn-if-debug-build
RFC: change default build to always be optimized (even for developers)
2016-03-17 17:03:07 -04:00
Nathan Hjelm
147e780fa5 contrib/lanl: update platform files for TOSS2
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-17 14:30:50 -06:00
Nathan Hjelm
f6069c5600 Merge pull request #1469 from hjelmn/mpool_coverity
rcache/grdma: do not OBJ_RELEASE vma tree too early
2016-03-17 13:24:54 -06:00
Nathan Hjelm
dc000213ea opal/fifo: use atomics to set fifo head in opal_fifo_push
This commit changes the opal_fifo_push code to use
opal_update_counted_pointer to set the head. This fixes a data race
that occurs because the read of the fifo head in opal_fifo_pop
requires two instructions. This combined with the non-atomic update in
opal_fifo_push can lead to an ABA issue that puts the fifo in an
inconsistent state.

There are other ways this problem could be fixed. One way would be to
introduce an opal_atomic_read_128 implementation. On x86_64 this would
have to use the cmpxchg16b instruction. Since this instruction would
have to be in the pop path (and always executed) it would be slower
than the fix in this commit.

Closes open-mpi/ompi#1460.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-17 13:21:27 -06:00
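Note on dc000213ea: the counted-pointer idea behind the fix can be illustrated with C11 atomics. This is only a sketch under simplifying assumptions: it packs a 32-bit item index and a 32-bit update counter into one 64-bit word, whereas the commit refers to opal_update_counted_pointer, whose implementation is not shown here. All names below are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counted reference: low 32 bits hold an item index,
 * high 32 bits hold a counter that is bumped on every update. */
typedef _Atomic uint64_t counted_ref_t;

static uint64_t pack(uint32_t counter, uint32_t index)
{
    return ((uint64_t) counter << 32) | index;
}

static uint32_t ref_index(uint64_t value)
{
    return (uint32_t) value;
}

static uint32_t ref_counter(uint64_t value)
{
    return (uint32_t) (value >> 32);
}

/* Atomically installs `new_index` and increments the counter, but only
 * if *ref still equals the snapshot `expected`. Because the counter
 * changes on every update, a stale snapshot fails even when the index
 * has meanwhile returned to its old value (the ABA case). */
static bool counted_ref_update(counted_ref_t *ref, uint64_t expected,
                               uint32_t new_index)
{
    uint64_t desired = pack(ref_counter(expected) + 1, new_index);
    return atomic_compare_exchange_strong(ref, &expected, desired);
}
```

The design point is that every writer goes through the compare-and-swap, so a reader holding an old snapshot can detect that the value was modified in between, which a plain non-atomic store would hide.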
Nathan Hjelm
676a33bfff rcache/grdma: do not OBJ_RELEASE vma tree too early
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-17 11:31:41 -06:00
Nathan Hjelm
40dbd7fe2d Merge pull request #1466 from hjelmn/mpool_coverity
opal: fix various coverity errors
2016-03-17 11:26:00 -06:00
Nathan Hjelm
852cc8cfbc opal: fix various coverity errors
Fix CID 1356358:  Null pointer dereferences  (REVERSE_INULL):

flist->fl_mpool can no longer be NULL. Removed the conditional.

Fix CID 1356357:  Resource leaks  (RESOURCE_LEAK):

Added the call to free the hints array.

Fix CID 1356356:  Resource leaks  (RESOURCE_LEAK):

This is a false error but it is safe to call close (-1) so just always
call close.

Fix CID 1356354:  Control flow issues  (MISSING_BREAK):
Fix CID 1356353:  Control flow issues  (MISSING_BREAK):

Add comments that indicate the fall-through is intentional.

Fix CID 1356351:  Null pointer dereferences  (FORWARD_NULL):

Fix potential SEGV if the page_size key is malformed.

Fix CID 1356350:  Error handling issues  (CHECKED_RETURN):

Add (void) to indicate that we do not care about the return code of
sscanf in this case.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-17 10:05:57 -06:00
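Note on 852cc8cfbc: marking an intentional fall-through with a comment, as done for the MISSING_BREAK reports, can look like the sketch below. `suffix_multiplier` is a hypothetical function written for this example and is not taken from the Open MPI sources.

```c
/* Hypothetical helper: converts a size suffix into a byte multiplier.
 * Each larger unit deliberately falls through to the smaller ones so
 * the factor of 1024 is applied once per level. */
static unsigned long suffix_multiplier(char suffix)
{
    unsigned long multiplier = 1;

    switch (suffix) {
    case 'G':
        multiplier *= 1024;
        /* fall through */
    case 'M':
        multiplier *= 1024;
        /* fall through */
    case 'K':
        multiplier *= 1024;
        break;
    default:
        /* unknown suffix: treat the value as plain bytes */
        break;
    }

    return multiplier;
}
```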
Nathan Hjelm
f0e374096a Merge pull request #1465 from hjelmn/re_enable_builtin_atomics
configure: re-enable built-in atomic support
2016-03-17 08:21:59 -06:00
igor-ivanov
d81993945e Merge pull request #1461 from igor-ivanov/pr/opal-config
config: Fix wrong variable name
2016-03-17 11:29:33 +03:00
Mike Dubman
8f1838df4b Merge pull request #1459 from alinask/topic/openib_diff_subnets
btl/openib: enable connecting processes from different subnets.
2016-03-17 08:40:08 +02:00
Nathan Hjelm
664ecc8f84 configure: re-enable built-in atomic support
This commit removes an erroneous else statement from the OSX built-in
atomics check. The else branch sets the built-in atomics support to
BUILTIN_NO if either opal_cv_asm_builtin is not BUILTIN_NO or OSX
atomics support is disabled.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-03-16 20:46:09 -06:00
Nathan Hjelm
9b83e66794 Merge pull request #1464 from hjelmn/grdma_fix
rcache/grdma: fix typo
2016-03-16 19:35:26 -06:00
Nathan Hjelm
cbce085b12 rcache/grdma: fix typo
This typo was originally fixed on the mpool_rewrite branch but the change
was lost.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-16 18:30:44 -06:00
Nathan Hjelm
01162cece2 Merge pull request #1463 from hjelmn/win_shared_man_update
man: fix typo in MPI_Win_allocate_shared
2016-03-16 17:27:21 -06:00
Jeff Squyres
d44804f0c9 usnic: use version 1 of the API, not the current version 2016-03-16 16:03:51 -07:00
Jeff Squyres
e7ef711455 usnic: allow mpool_hints to be empty
Follow on to open-mpi/ompi@eac0b11
2016-03-16 15:04:39 -07:00
Nathan Hjelm
b9d100929b man: fix typo in MPI_Win_allocate_shared
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-16 14:47:40 -06:00
Igor Ivanov
7bd12dc1a4 config: Fix wrong variable name
Replaced $project_shmem with $project_oshmem
2016-03-16 17:48:51 +02:00
Ralph Castain
a67ff065ae Silence coverity warnings 2016-03-16 08:43:16 -07:00
Alina Sklarevich
bbcbe3cacd btl/openib: enable connecting processes from different subnets.
+ Added an mca parameter to allow connecting processes from different
subnets. Its current default value is 'false' - don't allow, to keep the
current flow the way it is now.

+ rdmacm: when calling ibv_query_gid, use the gid index from
btl_openib_gid_index.
2016-03-16 10:52:06 +02:00
Gilles Gouaillardet
99809162b0 rcache: initialize common symbol mca_rcache_base_used_mem_hooks 2016-03-16 09:27:33 +09:00
Jeff Squyres
54687d0155 opal_configure_options.m4: clarify some help messages
Make the help messages for --enable-mem-debug and --enable-mem-profile
the same as other help messages.
2016-03-15 19:50:19 -04:00
Jeff Squyres
7c29ceb911 opal_configure_options: disable debug-by-default builds for devs
After 11 years, it's probably ok to say that we're no longer in "early
development" -- disable the "build a debug version of Open MPI by
default if we find a .git directory" behavior.

However, we are keeping the "use compiler picky flags if we find a
.git directory" behavior.  That's useful behavior for developers, and
has no effect on performance.
2016-03-15 19:50:14 -04:00
Ralph Castain
beecf1b6eb Add missing include, remove unused variable 2016-03-15 13:45:27 -07:00
Nathan Hjelm
ec9712050b Merge pull request #1118 from hjelmn/mpool_rewrite
mpool/rcache rewrite
2016-03-15 10:46:24 -06:00
Nathan Hjelm
deae9e52bf Merge pull request #1259 from kawashima-fj/pr/osc-sm-align
osc/sm: Fix a bus error on MPI_WIN_{POST,START}.
2016-03-15 09:13:38 -06:00
Francois WELLENREITER
2bc432d95f MTL portals4 : fix around triggered rndv operations 2016-03-15 15:31:04 +01:00
Nysal Jan K A
1b5433da30 Merge pull request #1454 from nysal/orte-ps
Fix memory corruption in orte-ps
2016-03-15 19:53:15 +05:30