25125 Commits

Author SHA1 Message Date
rhc54
e5ee7adbe0 Merge pull request #1722 from rhc54/topic/pmixext
Enable PMIx external support for both 1.1.4 and 2.0 versions
2016-05-27 08:59:09 -07:00
Ralph Castain
55923eacd3 Stealing some pieces of Josh Hursey's PR #1583 and modifying a bit, allow the opal/pmix external component to handle both PMIx 1.1.4 and PMIx 2.0 versions. Automatically detect the version of the target external library and adjust the only two APIs that changed (PMIx_Init and PMIx_Finalize)
Rename temp vars in .m4 to avoid conflict with Travis
2016-05-27 08:06:31 -07:00
Nathan Hjelm
d25b846c01 Merge pull request #1704 from hpcraink/pr/configure_framework
Fix configure for FreePGI on OSX
2016-05-26 17:01:08 -06:00
Nathan Hjelm
8c9292d5d1 Merge pull request #1721 from hjelmn/xrc_fix
btl/openib: fix XRC WQE calculation
2016-05-26 17:00:31 -06:00
Nathan Hjelm
56bdcd0888 btl/openib: fix XRC WQE calculation
Before dynamic add_procs support was committed to master we called
add_procs with every proc in the job. The XRC code in the openib btl
was taking advantage of this and setting the number of work queue
entries (WQE) based on all the procs on a remote node. Since that is
no longer the case we cannot simply increment the sd_wqe field on the
queue pair. To fix the issue a new field has been added to the xrc
queue pair structure to keep track of how many wqes there are total on
the queue pair. If a new endpoint is added that increases the number
of wqes and the xrc queue pair is already connected the code will
attempt to modify the number of wqes on the queue pair. A failure is
ignored because all that will happen is the number of active send work
requests on an XRC queue pair will be more limited.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-26 15:58:31 -06:00
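The bookkeeping described in this commit message can be illustrated with a minimal sketch. Everything below is hypothetical: the struct, the field names, and the resize callback are stand-ins invented for illustration and are not the openib btl's actual data structures or verbs calls. The sketch only shows the idea of tracking a running total and treating a failed resize as non-fatal.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an XRC queue pair; not the openib btl type. */
typedef struct {
    int total_wqe;   /* wqes requested by all endpoints sharing the qp */
    int active_wqe;  /* wqes the connected qp currently provides */
    int connected;   /* non-zero once the qp has been connected */
} sketch_xrc_qp_t;

/* Stand-in for a "modify the qp" request; returns 0 on success. */
typedef int (*sketch_xrc_resize_fn)(sketch_xrc_qp_t *qp, int new_wqe);

int sketch_xrc_resize_ok(sketch_xrc_qp_t *qp, int new_wqe)
{
    (void) qp; (void) new_wqe;
    return 0;
}

int sketch_xrc_resize_fail(sketch_xrc_qp_t *qp, int new_wqe)
{
    (void) qp; (void) new_wqe;
    return -1;
}

void sketch_xrc_add_endpoint(sketch_xrc_qp_t *qp, int endpoint_wqe,
                             sketch_xrc_resize_fn resize)
{
    assert(qp != NULL && resize != NULL);

    /* Always record the new total, connected or not. */
    qp->total_wqe += endpoint_wqe;

    if (!qp->connected) {
        return;  /* the total is available when the qp is connected later */
    }

    /* Already connected: try to grow the qp. A failure is ignored on
     * purpose; the qp simply keeps its smaller number of send wqes. */
    if (resize(qp, qp->total_wqe) == 0) {
        qp->active_wqe = qp->total_wqe;
    }
}
```

If the resize fails, `total_wqe` and `active_wqe` diverge, which corresponds to the "more limited" send capacity the message mentions.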
Aurelien Bouteiller
49bd28d0ac Merge pull request #1714 from hjelmn/scif_exclusivity
btl/scif: reduce default exclusivity
2016-05-26 17:53:11 -04:00
Nathan Hjelm
f19c647f21 Merge pull request #1718 from hjelmn/config_fix
config: fix typo in mxm configury
2016-05-26 13:19:23 -06:00
Joshua Ladd
1a5fd6bf83 Merge pull request #1719 from ICLDisco/ucx_request_fix
Removal of ompi_request_lock from pml/ucx.
2016-05-26 15:09:57 -04:00
Thananon Patinyasakdikul
60d0fbf683 Removal of ompi_request_lock from pml/ucx. 2016-05-26 12:36:58 -04:00
Nathan Hjelm
8c2086995d config: fix typo in mxm configury
A 1 was missing when setting $1_LDFLAGS leading to erroneous items in
the wrapper cflags.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-26 10:28:07 -06:00
Nathan Hjelm
87ea9be863 Merge pull request #1715 from hjelmn/ugni_overhead
btl/ugni: reduce overhead of progress function
2016-05-26 10:17:00 -06:00
Gilles Gouaillardet
46710ba151 travis: fix a typo and create bogus directories to avoid compiler warnings 2016-05-26 15:28:10 +09:00
George Bosilca
90f294096e Remove more references to the request mutex.
Regarding BFO it should be mentioned that this component is currently
unmaintained, and that despite my efforts I could not make it compile
(it would not compile before this patch either).
2016-05-25 23:27:06 -04:00
Nathan Hjelm
5d322170a0 Merge pull request #1716 from hjelmn/request_fixes
Request fixes
2016-05-25 18:14:03 -06:00
Nathan Hjelm
9d439664f0 pml/yalla: update for request changes
This commit brings the pml/yalla component up to date with the request
rework changes.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-25 15:42:53 -06:00
Nathan Hjelm
8445c885ce pml/cm: update for request changes
This fixes a hang caused by the request refactor work. The cm pml was
not updated and was hanging in most cases.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-25 15:35:32 -06:00
Nathan Hjelm
dbfab94ede atomic/mxm: rename symbol that is a duplicate of one in atomic/ucx
This fixes an error when building with --enable-static.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-25 15:34:40 -06:00
Nathan Hjelm
99627319f0 btl/ugni: reduce overhead of progress function
This commit reduces the overhead of calling the ugni progress
function. It does the following:

 - Check for new connections once every eight calls.

 - Do not call remote smsg progress unless we are connected to at
   least one remote peer.

 - Do not call rdma progress unless at least one rdma fragment is
   outstanding.

 - Check endpoint wait list size before obtaining a lock.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-25 14:27:34 -06:00
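The four optimizations listed in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the btl/ugni code: the module struct, its fields, and the counters are all hypothetical, and the real polling work is replaced by counter increments so the effect of each guard is visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical module state; these names are not taken from btl/ugni. */
typedef struct {
    uint32_t progress_calls;  /* how many times progress has been entered */
    int connected_peers;      /* peers with an established connection */
    int outstanding_rdma;     /* rdma fragments still in flight */
    int wait_list_len;        /* endpoints waiting for resources */
    /* Counters that make the sketch observable. */
    int connection_checks;
    int smsg_polls;
    int rdma_polls;
    int lock_acquisitions;
} sketch_ugni_module_t;

void sketch_ugni_progress(sketch_ugni_module_t *m)
{
    assert(m != NULL);

    /* Look for new connections only on every eighth call. */
    if ((m->progress_calls++ & 0x7) == 0) {
        m->connection_checks++;
    }

    /* Skip remote smsg progress until at least one peer is connected. */
    if (m->connected_peers > 0) {
        m->smsg_polls++;
    }

    /* Skip rdma progress unless a fragment is outstanding. */
    if (m->outstanding_rdma > 0) {
        m->rdma_polls++;
    }

    /* Peek at the wait list size first; take the lock only if needed. */
    if (m->wait_list_len > 0) {
        m->lock_acquisitions++;
    }
}
```

In each case a cheap test guards a more expensive operation, so an idle call to the progress function does almost no work.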
Nathan Hjelm
5caf12cd9b btl/scif: reduce default exclusivity
This commit reduces the default exclusivity so that btl/scif is not
used for send/recv over other shared memory transports.

Fixes open-mpi/ompi#1712

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-25 14:25:07 -06:00
Nathan Hjelm
8e1d59aea8 Merge pull request #1708 from hjelmn/c__fix
request: fix compilation error
2016-05-25 10:48:02 -06:00
Nathan Hjelm
ef11ba9394 request: fix compilation error
The request.h header is unfortunately included by files in the C++
bindings. C++ does not allow assigning from void * to another
pointer without a cast. This commit adds the cast. We can clean this
up when the C++ bindings are deleted.

Fixes open-mpi/ompi#1707

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-25 09:52:23 -06:00
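The language difference behind this fix can be shown with a small sketch. The types and the helper below are hypothetical and are not the contents of request.h. They only illustrate that an inline function in a header shared by C and C++ needs an explicit cast when reading a `void *` field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types; not the real request or sync structures. */
typedef struct {
    int signaled;
} sketch_sync_t;

typedef struct {
    void *req_complete;  /* generic pointer, as in many C APIs */
} sketch_request_t;

/* Inline helper of the kind that might live in a shared header. */
static inline int sketch_request_signaled(const sketch_request_t *req)
{
    assert(req != NULL);

    /* C would also accept `const sketch_sync_t *sync = req->req_complete;`
     * but C++ rejects the implicit conversion from void *, so the cast
     * keeps the header usable from both languages. */
    const sketch_sync_t *sync = (const sketch_sync_t *) req->req_complete;

    return sync != NULL && sync->signaled;
}
```

The cast has no effect in C, so adding it costs nothing for C callers.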
Joshua Ladd
ce783a9ebf Merge pull request #1706 from vspetrov/coll_hcoll_req_type_bugfix
coll/hcoll: bugfix: initialize req_type field
2016-05-25 10:56:33 -04:00
Valentin Petrov
5ff6372886 coll/hcoll: bugfix: initialize req_type field
If the field is left uninitialized, a segfault is possible in MPI_Waitall
when the field happens to equal OMPI_REQUEST_GEN.
2016-05-25 15:38:01 +03:00
Rainer Keller
3727cba9bb Fix compilation for FreePGI on OSX
Our checks and those of libevent are somewhat flawed.
If multiple "-framework" options are added to CXXFLAGS or CFLAGS, we
strip the keyword from the command line, which is not good.
libevent, however, assumes plain gcc without properly testing
that the compiler supports -Wno-deprecated-declarations.
2016-05-25 09:12:39 +02:00
George Bosilca
2b868c4952 Fix MPI datatype args.
Compensate for the datatype ID that we add to the array.
2016-05-24 23:36:54 -04:00
Jeff Squyres
dd9a819a1c odls_default: do not opal_output() while creating a process!
It is verboten to use opal_output() after the fork() but before the
exec()!  It results in all manner of undefined behavior.  For example,
on some OS X systems, if you run a trivial "hello world" MPI program
with a high level of ODLS verbosity:

```sh
$ mpirun -np 3 --mca odls_base_verbose 100 ./hello_c
```

You will see a bunch of output from the mpirun ODLS base, but then it
*may* hang in odls_default_module.c:do_child() -- after the fork() but
before the exec() -- while trying to opal_output() some debugging
statements.

The solution is to remove these extraneous opal_output() statements.
Indeed, the ODLS base is already outputting the same information that
these opal_output() statements are trying to emit, anyway.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-05-24 21:28:57 -04:00
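The rule in this commit message matches the general POSIX guidance that a child forked from a multi-threaded process should call only async-signal-safe functions until it calls exec. A minimal sketch of a launch path that follows the rule is shown below. `sketch_spawn` is a hypothetical helper, not the ODLS code. It assumes a POSIX system and an absolute path in `argv[0]`.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run argv[0] (an absolute path) and return its
 * exit status, or -1 if it could not be forked or did not exit normally. */
int sketch_spawn(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);

    pid_t pid = fork();
    if (pid < 0) {
        return -1;
    }

    if (pid == 0) {
        /* Child: no stdio, no logging layer, no allocation. Go straight
         * to exec. */
        execv(argv[0], argv);

        /* exec failed. write(2) is async-signal-safe, unlike printf(). */
        static const char msg[] = "exec failed\n";
        if (write(STDERR_FILENO, msg, sizeof(msg) - 1) < 0) {
            /* nothing safe left to do about a failed write */
        }
        _exit(127);
    }

    /* Parent: any verbose diagnostics belong here, not in the child. */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Keeping diagnostics on the parent side is consistent with the message's observation that the ODLS base already reports the same information.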
Nathan Hjelm
461ca1203b Merge pull request #1703 from hjelmn/grdma_cuda_fix
rcache/grdma: fix typo in cuda code
2016-05-24 18:51:22 -06:00
bosilca
b90c83840f Refactor the request completion (#1422)
* Remodel the request.
Added the wait sync primitive and integrate it into the PML and MTL
infrastructure. The multi-threaded requests are now significantly
less heavy and less noisy (only the threads associated with completed
requests are signaled).

* Fix the condition to release the request.
2016-05-24 18:20:51 -05:00
Nathan Hjelm
af52dad8f8 rcache/grdma: fix typo in cuda code
Fixes open-mpi/ompi#1702

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-24 15:56:39 -06:00
Nathan Hjelm
1d3110471c Merge pull request #1697 from hjelmn/acc_order
win: add support for accumulate_ordering info key
2016-05-24 14:34:05 -06:00
Nathan Hjelm
5126da5377 win: add support for accumulate_ordering info key
This commit adds support for the MPI-3.1 accumulate_ordering info
key. The default value is rar,war,raw,waw and is supported using an
MCA variable flag enumerator.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-24 11:13:30 -06:00
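As an illustration of how a comma-separated ordering value such as `rar,war,raw,waw` can be turned into bit flags, here is a small stand-alone parser. It is a sketch only: the flag constants and the function are hypothetical and do not use the MCA variable flag enumerator the commit refers to. The special value `none` is treated as "no ordering", on the assumption that it follows the MPI-3.1 description of this info key.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flag values; not the constants used by Open MPI. */
enum {
    SKETCH_ORDER_RAR = 1 << 0,
    SKETCH_ORDER_WAR = 1 << 1,
    SKETCH_ORDER_RAW = 1 << 2,
    SKETCH_ORDER_WAW = 1 << 3
};

/* Parse a comma-separated list of ordering rules into a bit mask.
 * Returns -1 if any token is not recognized. */
int sketch_parse_ordering(const char *value)
{
    static const struct { const char *name; int flag; } table[] = {
        { "rar", SKETCH_ORDER_RAR },
        { "war", SKETCH_ORDER_WAR },
        { "raw", SKETCH_ORDER_RAW },
        { "waw", SKETCH_ORDER_WAW },
    };
    const size_t n_rules = sizeof(table) / sizeof(table[0]);

    assert(value != NULL);
    if (strcmp(value, "none") == 0) {
        return 0;  /* no ordering requested */
    }

    int flags = 0;
    const char *p = value;
    while (*p != '\0') {
        size_t len = strcspn(p, ",");  /* length of the current token */
        int found = 0;
        for (size_t i = 0; i < n_rules; ++i) {
            if (strlen(table[i].name) == len &&
                strncmp(p, table[i].name, len) == 0) {
                flags |= table[i].flag;
                found = 1;
                break;
            }
        }
        if (!found) {
            return -1;
        }
        p += len;
        if (*p == ',') {
            ++p;  /* step over the separator */
        }
    }
    return flags;
}
```

With this encoding the default value quoted in the commit message sets all four bits.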
rhc54
b7928c2607 Merge pull request #1693 from rhc54/topic/eval2
Fix the dist mapper option
2016-05-24 05:32:12 -07:00
Ralph Castain
30aaf785a8 Fix the dist mapper option 2016-05-23 23:20:33 -07:00
rhc54
927d3f4c3c Merge pull request #1692 from rhc54/topic/eval2
Fix the --tune problem by searching the argv for MCA params in advance of opal_init_util
2016-05-23 22:19:09 -07:00
rhc54
8d2d5ef1fe Merge pull request #1691 from rhc54/topic/java
Fix command line usage when Java user provides the -Djava.library.path=foo options
2016-05-23 21:12:49 -07:00
Ralph Castain
80f4e3b872 Fix the --tune problem by searching the argv for MCA params in advance of opal_init_util. Only search the first app_context as we historically have done - we can debate whether or not to search all app_contexts 2016-05-23 21:09:44 -07:00
Ralph Castain
2da0210de3 Fix command line usage when Java user provides the -Djava.library.path=foo options 2016-05-23 15:29:36 -07:00
Nathan Hjelm
a651f26701 Merge pull request #1690 from hjelmn/flag_enum
mca/base: fix typo in flag enumeration
2016-05-23 14:14:36 -06:00
Jeff Squyres
e7d46b96a3 Merge pull request #1680 from yburette/topic/fix_provider_selection
mtl/ofi: Change default provider selection behavior.
2016-05-23 15:06:02 -04:00
Nathan Hjelm
37e9e2c660 mca/base: fix typo in flag enumeration
This commit fixes a typo in flag enumeration that can cause the parser
to miss valid flags or crash.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-23 12:21:34 -06:00
Gilles Gouaillardet
bca44592af Merge pull request #1643 from ggouaillardet/topic/romio_openbsd57
io/romio: fix filesystem type check on OpenBSD
2016-05-23 16:33:56 +09:00
Gilles Gouaillardet
d5a2ac6f2f btl/openib: fix #if vs #ifdef 2016-05-23 14:27:33 +09:00
Gilles Gouaillardet
5a8cbe5a8f btl/openib: remove obsolete reference to MEMORY_LINUX_MALLOC_ALIGN_ENABLED macro 2016-05-23 14:12:21 +09:00
Gilles Gouaillardet
8466a3daf3 pmix: update .gitignore
git ignore opal/mca/pmix/pmix114/pmix/include/pmix/autogen/config.h.in
git rm opal/mca/pmix/pmix114/pmix/include/pmix/autogen/config.h.in
git ignore opal/mca/pmix/pmix*/...
2016-05-23 11:58:07 +09:00
George Bosilca
5be5c40f93 Merge branch 'master' of github.com:open-mpi/ompi 2016-05-21 16:02:45 -04:00
George Bosilca
16d9f71d01 Correctly compute the space needed for the args.
Add checks to bail out if our precomputed value is less
than needed (we are already at fault).

bot:milestone:v1.10.3
bot:milestone:v2.0
bot:label:bug
bot:assign: @ggouaillardet
2016-05-21 16:01:16 -04:00
George Bosilca
0641005dab Only check the parameters on valid dimensions. 2016-05-21 15:54:04 -04:00
George Bosilca
6aac0d9c22 Remove useless output stream. 2016-05-21 15:54:04 -04:00
Nathan Hjelm
31bfeede82 bml/r2: always add btl progress function
This commit changes the behavior of bml/r2 from conditionally
registering btl progress functions to always registering progress
functions. Any progress function belonging to a btl that is not yet in
use is registered as low-priority. As soon as a proc is added that
will make use of the btl it is re-registered normally.

This works around an issue with some btls. In order to progress a
first message from an unknown peer both ugni and openib need to have
their progress functions called. If either btl is not in use after the
first call to add_procs the callback was never happening. This commit
ensures the btl progress function is called at some point but the
number of progress callbacks is reduced from normal to ensure lower
overhead when a btl is not used. The current ratio is 1 low priority
progress callback for every 8 calls to opal_progress().

Fixes open-mpi/ompi#1676

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-21 15:54:04 -04:00
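The two-tier registration described in this commit message can be sketched as a small progress engine. The names below are hypothetical and this is not the opal_progress implementation. It only shows normal callbacks running on every call, low-priority callbacks running on every eighth call, and a callback being promoted once its btl is in use.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_R2_MAX_CALLBACKS 8
#define SKETCH_R2_LOW_PRIORITY_RATIO 8

typedef int (*sketch_r2_cb_t)(void);

/* Hypothetical progress engine state. */
typedef struct {
    sketch_r2_cb_t normal[SKETCH_R2_MAX_CALLBACKS];
    size_t n_normal;
    sketch_r2_cb_t low[SKETCH_R2_MAX_CALLBACKS];
    size_t n_low;
    unsigned calls;
} sketch_r2_progress_t;

/* Demo callback that counts how often it is invoked. */
int sketch_r2_demo_hits = 0;
int sketch_r2_demo_cb(void)
{
    sketch_r2_demo_hits++;
    return 1;
}

/* Register a callback for a btl that is not yet in use. */
void sketch_r2_register_low(sketch_r2_progress_t *p, sketch_r2_cb_t cb)
{
    assert(p != NULL && p->n_low < SKETCH_R2_MAX_CALLBACKS);
    p->low[p->n_low++] = cb;
}

/* Re-register a callback normally once a proc uses its btl. */
void sketch_r2_promote(sketch_r2_progress_t *p, sketch_r2_cb_t cb)
{
    assert(p != NULL && p->n_normal < SKETCH_R2_MAX_CALLBACKS);
    for (size_t i = 0; i < p->n_low; ++i) {
        if (p->low[i] == cb) {
            p->low[i] = p->low[--p->n_low];  /* swap-remove */
            break;
        }
    }
    p->normal[p->n_normal++] = cb;
}

/* Run one progress pass and return the number of reported events. */
int sketch_r2_progress(sketch_r2_progress_t *p)
{
    int events = 0;
    assert(p != NULL);

    for (size_t i = 0; i < p->n_normal; ++i) {
        events += p->normal[i]();
    }
    /* Low-priority callbacks run on one call out of every eight. */
    if (++p->calls % SKETCH_R2_LOW_PRIORITY_RATIO == 0) {
        for (size_t i = 0; i < p->n_low; ++i) {
            events += p->low[i]();
        }
    }
    return events;
}
```

An unused btl still gets polled, so a first message from an unknown peer is eventually noticed, while the common path pays for it only one call in eight.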
Nathan Hjelm
6195ec0727 Merge pull request #1677 from hjelmn/add_procs_lockup
bml/r2: always add btl progress function
2016-05-21 11:23:28 -06:00