Commit graph

24638 commits

Author SHA1 Message Date
Gilles Gouaillardet
013aec894b opal/class/opal_lifo: rename a local variable initially called new
this file is now indirectly included from C++, and new is a reserved C++ keyword
2016-03-18 22:15:44 +09:00
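The rename in 013aec894b can be illustrated with a minimal sketch. The `item` type and `push` function below are hypothetical stand-ins, not the actual opal_lifo code. A parameter named `new` is legal C, but a C++ translation unit that includes the header would fail to compile.

```c
#include <stddef.h>

/* Hypothetical list node; not the real opal_lifo item type. */
struct item {
    struct item *next;
};

/*
 * This signature is valid C but breaks when the header is pulled into C++,
 * because `new` is a C++ keyword:
 *
 *     static inline void push(struct item **head, struct item *new);
 *
 * Renaming the parameter keeps the header usable from both languages.
 */
static inline void push(struct item **head, struct item *new_item)
{
    new_item->next = *head;
    *head = new_item;
}
```

Calling `push(&head, &node)` on an empty list leaves `head` pointing at `node` with `node.next` set to NULL.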
Nathan Hjelm
c749d7d977 Merge pull request #1470 from hjelmn/fifo_fix
opal/fifo: use atomics to set fifo head in opal_fifo_push
2016-03-17 22:30:31 -06:00
Jeff Squyres
361f931967 HACKING: update language about developer builds 2016-03-17 17:30:13 -04:00
Jeff Squyres
cb1837e595 Merge pull request #1417 from jsquyres/rfc/mpirun-warn-if-debug-build
RFC: change default build to always be optimized (even for developers)
2016-03-17 17:03:07 -04:00
Nathan Hjelm
f6069c5600 Merge pull request #1469 from hjelmn/mpool_coverity
rcache/grdma: do not OBJ_RELEASE vma tree too early
2016-03-17 13:24:54 -06:00
Nathan Hjelm
dc000213ea opal/fifo: use atomics to set fifo head in opal_fifo_push
This commit changes the opal_fifo_push code to use
opal_update_counted_pointer to set the head. This fixes a data race
that occurs because the read of the fifo head in opal_fifo_pop
requires two instructions. This combined with the non-atomic update in
opal_fifo_push can lead to an ABA issue that puts the fifo in an
inconsistent state.

There are other ways this problem could be fixed. One way would be to
introduce an opal_atomic_read_128 implementation. On x86_64 this would
have to use the cmpxchg16b instruction. Since this instruction would
have to be in the pop path (and always executed) it would be slower
than the fix in this commit.

Closes open-mpi/ompi#1460.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-17 13:21:27 -06:00
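The idea behind a counted pointer, as used in dc000213ea, can be sketched in a simplified form. This is not the Open MPI implementation of opal_update_counted_pointer. It assumes a 64-bit word holding a 32-bit version counter and a 32-bit node index, so the word can be read with one atomic load, and all names below are hypothetical. Every successful update bumps the counter, so a compare-and-swap against a stale snapshot fails even if the index has cycled back to its old value (the ABA case).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical counted "pointer": high 32 bits hold a version counter,
 * low 32 bits hold an index into some node array. */
typedef _Atomic uint64_t counted_index_t;

static inline uint64_t pack(uint32_t counter, uint32_t index)
{
    return ((uint64_t) counter << 32) | index;
}

static inline uint32_t counter_of(uint64_t value)
{
    return (uint32_t) (value >> 32);
}

static inline uint32_t index_of(uint64_t value)
{
    return (uint32_t) value;
}

/*
 * Store new_index only if the whole (counter, index) pair still equals
 * *expected, and increment the counter while doing so. On failure,
 * *expected is refreshed with the current value so the caller can retry.
 */
static inline bool update_counted_index(counted_index_t *addr,
                                        uint64_t *expected,
                                        uint32_t new_index)
{
    uint64_t desired = pack(counter_of(*expected) + 1, new_index);
    return atomic_compare_exchange_strong(addr, expected, desired);
}
```

If the head moves from index 7 to 9 and back to 7 between a reader's snapshot and its update, the reader's compare-and-swap fails because the counter no longer matches.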
Nathan Hjelm
676a33bfff rcache/grdma: do not OBJ_RELEASE vma tree too early
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-17 11:31:41 -06:00
Nathan Hjelm
40dbd7fe2d Merge pull request #1466 from hjelmn/mpool_coverity
opal: fix various coverity errors
2016-03-17 11:26:00 -06:00
Nathan Hjelm
852cc8cfbc opal: fix various coverity errors
Fix CID 1356358:  Null pointer dereferences  (REVERSE_INULL):

flist->fl_mpool can no longer be NULL. Removed the conditional.

Fix CID 1356357:  Resource leaks  (RESOURCE_LEAK):

Added the call to free the hints array.

Fix CID 1356356:  Resource leaks  (RESOURCE_LEAK):

This is a false error but it is safe to call close (-1) so just always
call close.

Fix CID 1356354:  Control flow issues  (MISSING_BREAK):
Fix CID 1356353:  Control flow issues  (MISSING_BREAK):

Add comments that indicate the fall-through is intentional.

Fix CID 1356351:  Null pointer dereferences  (FORWARD_NULL):

Fix potential SEGV if the page_size key is malformed.

Fix CID 1356350:  Error handling issues  (CHECKED_RETURN):

Add (void) to indicate that we do not care about the return code of
sscanf in this case.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-17 10:05:57 -06:00
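Two of the patterns named in 852cc8cfbc, the explicit `(void)` cast on sscanf and the fall-through comments, can be sketched as follows. The functions here are hypothetical illustrations and are not taken from the Open MPI sources.

```c
#include <stdio.h>

/* Hypothetical helper: parse "major.minor". Fields that cannot be parsed
 * keep their default of 0, so the sscanf result is deliberately ignored. */
static void parse_version(const char *text, int *major, int *minor)
{
    *major = 0;
    *minor = 0;
    /* The (void) cast tells readers and static analyzers that ignoring
     * the return value is intentional. */
    (void) sscanf(text, "%d.%d", major, minor);
}

/* Hypothetical helper: count how many stages run when starting at `level`.
 * Each case intentionally continues into the next one. */
static int stages_to_run(int level)
{
    int count = 0;

    switch (level) {
    case 2:
        count++;
        /* fall through */
    case 1:
        count++;
        /* fall through */
    case 0:
        count++;
        break;
    default:
        break;
    }

    return count;
}
```

The fall-through comments do not change behavior. They only record that the missing `break` is deliberate.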
Nathan Hjelm
f0e374096a Merge pull request #1465 from hjelmn/re_enable_builtin_atomics
configure: re-enable built-in atomic support
2016-03-17 08:21:59 -06:00
igor-ivanov
d81993945e Merge pull request #1461 from igor-ivanov/pr/opal-config
config: Fix wrong variable name
2016-03-17 11:29:33 +03:00
Mike Dubman
8f1838df4b Merge pull request #1459 from alinask/topic/openib_diff_subnets
btl/openib: enable connecting processes from different subnets.
2016-03-17 08:40:08 +02:00
Nathan Hjelm
664ecc8f84 configure: re-enable built-in atomic support
This commit removes an erroneous else statement from the OSX built-in
atomics check. The else branch sets the built-in atomics support to
BUILTIN_NO if either opal_cv_asm_builtin is not BUILTIN_NO or OSX
atomics support is disabled.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-03-16 20:46:09 -06:00
Nathan Hjelm
9b83e66794 Merge pull request #1464 from hjelmn/grdma_fix
rcache/grdma: fix typo
2016-03-16 19:35:26 -06:00
Nathan Hjelm
cbce085b12 rcache/grdma: fix typo
This typo was originally fixed on the mpool_rewrite branch but the change
was lost.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-16 18:30:44 -06:00
Nathan Hjelm
01162cece2 Merge pull request #1463 from hjelmn/win_shared_man_update
man: fix typo in MPI_Win_allocate_shared
2016-03-16 17:27:21 -06:00
Jeff Squyres
d44804f0c9 usnic: use version 1 of the API, not the current version 2016-03-16 16:03:51 -07:00
Jeff Squyres
e7ef711455 usnic: allow mpool_hints to be empty
Follow on to open-mpi/ompi@eac0b11
2016-03-16 15:04:39 -07:00
Nathan Hjelm
b9d100929b man: fix typo in MPI_Win_allocate_shared
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-16 14:47:40 -06:00
Igor Ivanov
7bd12dc1a4 config: Fix wrong variable name
Replaced $project_shmem with $project_oshmem
2016-03-16 17:48:51 +02:00
Ralph Castain
a67ff065ae Silence coverity warnings 2016-03-16 08:43:16 -07:00
Alina Sklarevich
bbcbe3cacd btl/openib: enable connecting processes from different subnets.
+ Added an mca parameter to allow connecting processes from different
subnets. Its default value is 'false' (do not allow), which keeps the
current behavior unchanged.

+ rdmacm: when calling ibv_query_gid, use the gid index from
btl_openib_gid_index.
2016-03-16 10:52:06 +02:00
Gilles Gouaillardet
99809162b0 rcache: initialize common symbol mca_rcache_base_used_mem_hooks 2016-03-16 09:27:33 +09:00
Jeff Squyres
54687d0155 opal_configure_options.m4: clarify some help messages
Make the help messages for --enable-mem-debug and --enable-mem-profile
the same as other help messages.
2016-03-15 19:50:19 -04:00
Jeff Squyres
7c29ceb911 opal_configure_options: disable debug-by-default builds for devs
After 11 years, it's probably ok to say that we're no longer in "early
development" -- disable the "build a debug version of Open MPI by
default if we find a .git directory" behavior.

However, we are keeping the "use compiler picky flags if we find a
.git directory" behavior.  That's useful behavior for developers, and
has no effect on performance.
2016-03-15 19:50:14 -04:00
Ralph Castain
beecf1b6eb Add missing include, remove unused variable 2016-03-15 13:45:27 -07:00
Nathan Hjelm
ec9712050b Merge pull request #1118 from hjelmn/mpool_rewrite
mpool/rcache rewrite
2016-03-15 10:46:24 -06:00
Nathan Hjelm
deae9e52bf Merge pull request #1259 from kawashima-fj/pr/osc-sm-align
osc/sm: Fix a bus error on MPI_WIN_{POST,START}.
2016-03-15 09:13:38 -06:00
Nysal Jan K A
1b5433da30 Merge pull request #1454 from nysal/orte-ps
Fix memory corruption in orte-ps
2016-03-15 19:53:15 +05:30
Nysal Jan K.A
f6e932c864 Fix memory corruption in orte-ps
orte-ps ends up free'ing the same pointer multiple times
2016-03-15 16:03:31 +05:30
Nathan Hjelm
eac0b110b8 btl/usnic: update for mpool/rcache rewrite
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-14 10:50:41 -06:00
Nathan Hjelm
522c2f2b82 rcache: add major/minor/release version macros
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-14 10:50:41 -06:00
Nathan Hjelm
69d9266497 Update memkind mpool for new mpool interface
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-14 10:50:41 -06:00
Vishwanath Venkatesan
8024142f46 The latest version of memkind uses jemalloc as a submodule.
This means we need not check for jemalloc in the configure script for
this component. Removing this.

On some machines, having the TLS option enabled can cause errors when
opening this component. Pass --disable-tls when configuring jemalloc.
For instructions on installing jemalloc as a static library linked
directly into memkind, see the CONTRIBUTING file at
github.com/memkind/memkindw
2016-03-14 10:50:41 -06:00
Vishwanath Venkatesan
3d98a1a01e Adding memkind component to use MPI_Alloc_mem through memkind 2016-03-14 10:50:41 -06:00
Nathan Hjelm
d4afb16f5a opal: rework mpool and rcache frameworks
This commit rewrites both the mpool and rcache frameworks. Summary of
changes:

 - Before this change a significant portion of the rcache
   functionality lived in mpool components. This meant that it was
   impossible to add a new memory pool to use with rdma networks
   (ugni, openib, etc) without duplicating the functionality of an
   existing mpool component. All the registration functionality has
   been removed from the mpool and placed in the rcache framework.

 - All registration cache mpools components (udreg, grdma, gpusm,
   rgpusm) have been changed to rcache components. rcaches are
   allocated and released in the same way mpool components were.

 - It is now valid to pass NULL as the resources argument when
   creating an rcache. At this time the gpusm and rgpusm components
   support this. All other rcache components require non-NULL
   resources.

 - A new mpool component has been added: hugepage. This component
   supports huge page allocations on linux.

 - Memory pools are now allocated using "hints". Each mpool component
   is queried with the hints and returns a priority. The current hints
   supported are NULL (uses posix_memalign/malloc), page_size=x (huge
   page mpool), and mpool=x.

 - The sm mpool has been moved to common/sm. This reflects that the sm
   mpool is specialized and not meant for any general
   allocations. This mpool may be moved back into the mpool framework
   if there is any objection.

 - The opal_free_list_init arguments have been updated. The unused0
   argument is now used to pass in the registration cache module. The
   mpool registration flags are now rcache registration flags.

 - All components have been updated to make use of the new framework
   interfaces.

As this commit makes significant changes to both the mpool and rcache
frameworks both versions have been bumped to 3.0.0.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-14 10:50:41 -06:00
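The hint-based selection described in d4afb16f5a (each component is queried with the hints and returns a priority) can be sketched as below. The types, function names, and priority values are hypothetical and do not reproduce the mpool framework interface. The sketch only assumes that a negative priority means "cannot serve this request" and that the highest priority wins.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical component descriptor: a name plus a query callback that
 * returns a priority, or a negative value if the hints cannot be served. */
typedef struct {
    const char *name;
    int (*query)(const char *hints);
} pool_component_t;

/* Hypothetical default pool: usable for any hints (including NULL),
 * but with a low priority. */
static int default_query(const char *hints)
{
    (void) hints;
    return 1;
}

/* Hypothetical huge page pool: only volunteers when a page_size=x hint
 * is present. */
static int hugepage_query(const char *hints)
{
    if (NULL != hints && NULL != strstr(hints, "page_size=")) {
        return 50;
    }
    return -1;
}

/* Query every component and return the one with the highest priority,
 * or NULL if none can serve the hints. */
static const pool_component_t *select_pool(const pool_component_t *components,
                                           size_t count, const char *hints)
{
    const pool_component_t *best = NULL;
    int best_priority = -1;

    for (size_t i = 0; i < count; ++i) {
        int priority = components[i].query(hints);
        if (priority > best_priority) {
            best_priority = priority;
            best = &components[i];
        }
    }

    return best;
}
```

With these two hypothetical components, NULL hints select the default pool and a `page_size=2M` hint selects the huge page pool.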
Ralph Castain
6d7ada9675 Silence Coverity warning 2016-03-14 09:42:43 -07:00
rhc54
efafd62d38 Merge pull request #1451 from ggouaillardet/topic/orte_fork_wrapper_fullname
odls/base: use the full app name when using an orte fork agent
2016-03-14 09:40:21 -07:00
Gilles Gouaillardet
a93b849f13 configury: UCX uses CPPFLAGS (instead of CFLAGS) 2016-03-14 11:42:16 +09:00
Gilles Gouaillardet
589924c4aa odls/base: use the full app name when using an orte fork agent 2016-03-14 11:18:21 +09:00
Gilles Gouaillardet
eb690432e8 fortran: add missing constants for MPI_WIN_CREATE_FLAVOR and MPI_WIN_MODEL
also add valid values used by MPI_WIN_CREATE_FLAVOR :
- MPI_WIN_FLAVOR_CREATE
- MPI_WIN_FLAVOR_ALLOCATE
- MPI_WIN_FLAVOR_DYNAMIC
- MPI_WIN_FLAVOR_SHARED
2016-03-14 10:19:21 +09:00
Gilles Gouaillardet
fbed6df4a3 coll/base: fix a typo
typo was introduced in open-mpi/ompi@c98e97a46e
2016-03-11 14:18:03 +09:00
Gilles Gouaillardet
0da1374f22 man: fix typo in MPI_File related man pages 2016-03-11 14:16:21 +09:00
Gilles Gouaillardet
d08fb46ec7 ompi/win: use type int* for MPI_WIN_DISP_UNIT, MPI_WIN_CREATE_FLAVOR and MPI_WIN_MODEL
Thanks Alastair McKinstry for the report.
2016-03-11 09:22:25 +09:00
Jeff Squyres
6f17b46a3c Merge pull request #1447 from jsquyres/pr/updates-to-config-summary
configury: minor updates to config summary output
2016-03-10 17:29:45 -05:00
Jeff Squyres
48c650c47a configury: minor updates to config summary output 2016-03-10 13:02:52 -08:00
Jeff Squyres
97714716ec usnic: add some cagent verification checks
Add primitive magic number and version checking in the connectivity
checker protocol.  These checks don't *guarantee* that we won't get
false PINGs and ACKs, but they do significantly reduce the possibility
of interpreting random incoming fragments as PINGs or ACKs.
2016-03-09 13:25:00 -08:00
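A magic number and version check of the kind described in 97714716ec might look like the sketch below. The header layout, constant values, and function name are assumptions made for illustration and are not the usnic connectivity checker's wire format.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants; the real protocol values are not shown here. */
#define MSG_MAGIC   0x434e4e43u
#define MSG_VERSION 1u

/* Hypothetical message header placed at the start of every PING or ACK. */
typedef struct {
    uint32_t magic;
    uint32_t version;
    uint32_t type;
} msg_header_t;

/* Accept a fragment only if it is long enough to hold a header and that
 * header carries the expected magic number and protocol version. */
static bool header_is_valid(const void *buffer, size_t length)
{
    msg_header_t header;

    if (length < sizeof(header)) {
        return false;
    }

    /* Copy out the header so an unaligned receive buffer is handled safely. */
    memcpy(&header, buffer, sizeof(header));

    return MSG_MAGIC == header.magic && MSG_VERSION == header.version;
}
```

As the commit message notes, such a check lowers the chance of misreading random data but cannot rule it out, since a stray fragment could still happen to contain matching values.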
Aurélien Bouteiller
c98e97a46e Do not return MPI_ERR_PENDING from collectives. 2016-03-09 16:13:34 -05:00
Nathan Hjelm
f8469de832 Merge pull request #1415 from hjelmn/configure_summary
configure: add a summary section at the end of configure output
2016-03-09 12:25:39 -07:00
Nathan Hjelm
fdebebc4c0 Merge pull request #1439 from hjelmn/btl_openib_send_size
btl/openib: fix inconsistency in the default settings
2016-03-09 09:31:18 -07:00