Jeff Squyres
0c4b5f1ea9
Merge pull request #1426 from jsquyres/pr/fix-unescaped-braces-in-autogen
...
autogen: fix deprecated construct
2016-03-08 11:53:29 -05:00
Ralph Castain
bac6290b22
Ensure the process name is positive when using direct launch
...
Fixes #1425
2016-03-08 08:31:05 -08:00
Nathan Hjelm
f89cc3c2f1
Merge pull request #1433 from hjelmn/keyval_parse_fix
...
opal/util: fix bug in key value parser
2016-03-08 09:12:56 -07:00
Joshua Ladd
fc5a201030
Merge pull request #1434 from jladd-mlnx/topic/mxm_add_procs_fix
...
Fixing MXM Yalla and MTL add procs behavior. MXM cannot support dynam…
2016-03-07 21:32:19 -05:00
Joshua Ladd
4dffae2f88
Fixing MXM Yalla and MTL add procs behavior. MXM cannot support dynamic add procs, so propagate this info to the MTL and PML layers.
2016-03-08 01:46:24 +02:00
Nathan Hjelm
63bac9a4e0
opal/util: fix bug in key value parser
...
This commit fixes a bug in the opal key value parser that might cause
the filename parser to go past the beginning of the string.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-07 14:51:29 -07:00
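The class of bug described in 63bac9a4e0 (a backward scan that walks past the start of a string) can be illustrated with a minimal sketch. This is not the OPAL key value parser; `last_component` is a hypothetical helper, and the only point is the `p > path` guard that keeps the loop from reading `path[-1]`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the actual OPAL code: return a pointer to the
 * last '/'-separated component of `path`. */
static const char *last_component(const char *path)
{
    /* Start at the terminating NUL and walk backward. */
    const char *p = path + strlen(path);

    /* The `p > path` test is the guard: without it, a string that contains
     * no '/' would make the loop read before the beginning of the buffer. */
    while (p > path && p[-1] != '/') {
        --p;
    }
    return p;
}
```

For a name with no directory part, the helper returns the string itself rather than running off its front.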
rhc54
8ffde8d020
Merge pull request #1431 from rhc54/topic/orted
...
Do not push child processes into separate process groups so that any …
2016-03-06 20:08:17 -08:00
Ralph Castain
d72c1c72ff
Do not push child processes into separate process groups so that any host RM can still "see" them, and ensure that any signal sent to the orteds themselves will be provided to all child processes. Forward all signals from mpirun to the child processes, removing the old MCA parameter required to turn that behavior "on".
2016-03-06 17:55:09 -08:00
rhc54
36a6a3b691
Merge pull request #1430 from rhc54/topic/sing
...
Update singularity support
2016-03-06 07:35:17 -08:00
Ralph Castain
4d0cc27eb7
Update the singularity support to match that of the latest singularity master. Remove the restriction on shared memory components by instructing singularity to not isolate the PID space. Add a new schizo API to allow setting up the original app_context. Ensure the container is installed prior to execution.
2016-03-05 21:47:42 -08:00
Aurelien Bouteiller
f55a06da00
Merge pull request #1416 from abouteiller/bugfix/recvcancelwiththreads
...
Fix a potential race condition in recv cancel
2016-03-05 02:14:32 -05:00
Ralph Castain
ce0a05d7d1
Minor cleanup - Singularity now has an internal check for installed, so we no longer need to do so.
2016-03-04 19:07:53 -08:00
Jeff Squyres
9c9bad3db5
autogen: fix deprecated construct
...
Newer versions of Perl warn that unescaped left braces in regexps are
deprecated.
2016-03-04 08:42:45 -05:00
Ralph Castain
b57a191ccc
Update the external client to the new PMIx init/finalize signatures
2016-03-03 20:50:20 -08:00
igor-ivanov
5e9fdabdbb
Merge pull request #1420 from ggouaillardet/topic/memory_linux_memalign_enum
...
memory/linux: make memory_linux_memalign an enum
2016-03-03 21:21:55 +04:00
Gilles Gouaillardet
80bdbfd9e7
add missing include file
2016-03-03 13:46:28 +09:00
rhc54
d38e2e6655
Merge pull request #1423 from rhc54/topic/suicide
...
Fix registration of error handlers thru the pmix120 component.
2016-03-02 17:43:06 -08:00
Gilles Gouaillardet
5c685e2332
memory/linux: make memory_linux_memalign an enum
...
Thanks Igor Ivanov for the review.
2016-03-03 08:38:46 +09:00
Ralph Castain
4a55fba414
Fix registration of error handlers thru the pmix120 component. A thread-shift operation was hanging on the sync_event_base, which made it dependent on someone calling opal_progress. Unfortunately, a process in "sleep" or spinning outside the MPI library won't do that, and so we never complete errhandler registration.
2016-03-02 15:01:01 -08:00
Nathan Hjelm
5a85a039fa
Merge pull request #1421 from hjelmn/btl_vader_threads
...
btl/vader: various threading fixes
2016-03-02 16:53:57 -06:00
Nathan Hjelm
2a0b3a5700
btl/vader: various threading fixes
...
This commit fixes several threading bugs:
- Add an additional lock to the btl_base_endpoint_t structure to lock
the list of pending frags. This allows the progress function to
attempt to send pending frags without needing to drop/reacquire the
lock. This should provide a small improvement in performance and
fixes a potential race between adding and removing items from the
pending list.
- Ensure fast boxes are only set up once by updating the send count
using atomics when needed and do not set the fast box buffer
pointer until the fast box is set up.
Closes open-mpi/ompi#1408
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-02 10:50:59 -07:00
Ralph Castain
f0680008d1
Add test file for singularity
2016-03-02 05:40:41 -08:00
Ralph Castain
06e811c5a6
Properly use the OPAL_MCA_PREFIX in orte_submit
2016-03-01 18:16:40 -08:00
Ralph Castain
1b81d90eaa
Minor cleanups required for orte-dvm operation
2016-03-01 18:12:53 -08:00
Aurélien Bouteiller
892e1ed57e
Fix a potential race condition in which a progress matching thread could match a request while we are cancelling it.
2016-03-01 16:43:45 -05:00
Ralph Castain
c9f7bb6751
Add the include file to all the schizo components
2016-03-01 13:18:23 -08:00
Ralph Castain
625083fe18
Add include file
2016-03-01 13:04:20 -08:00
rhc54
9df73568f4
Merge pull request #1411 from rhc54/topic/pmixdefault
...
Fix a number of issues, some of which have lingered for a long time
2016-03-01 11:08:28 -08:00
Ralph Castain
011403c04a
Fix a number of issues, some of which have lingered for a long time:
...
* provide a more reliable way of determining that a process is a singleton by leveraging the schizo framework. Add new components for slurm, alps, and orte to detect when we are in a managed environment, and if we have been launched by mpirun or a native launcher. Set the correct envars to control ess and pmix selection in each case.
* change the relative priority of the pmix120 and pmix112 components to make pmix120 the default
* fix singleton comm-spawn by correctly setting the num_apps field of the orte_job_t created by the daemon - this fixes a segfault in register_nspace on newly created daemons
* ensure orterun doesn't propagate any ess or pmix directives in its environment
* Clean up a few valgrind issues and memory leaks
* Fix a race condition that prevented the client from completing notification registrations (missing thread shift)
* Ensure the schizo/alps component detects launch by mpirun
2016-03-01 06:53:00 -08:00
Gilles Gouaillardet
67e45028df
Merge pull request #1414 from jsquyres/pr/egrep-for-examples-makefile
...
examples: update ompi_info bindings checks
2016-03-01 11:55:49 +09:00
Gilles Gouaillardet
8aff67c399
topo/base: correctly support MPI_UNWEIGHTED in mca_topo_base_dist_graph_neighbors()
...
Thanks Jun Kudo for the bug report.
2016-03-01 10:28:28 +09:00
Gilles Gouaillardet
e5d6b97db4
opal: fix pragma for GCC 6 and later
...
GCC 6 and later should ignore -Wpedantic instead of -pedantic
2016-02-29 13:56:22 +09:00
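As a rough sketch of what e5d6b97db4 refers to: with GCC 6 and later, the diagnostic to silence is named `-Wpedantic`. The version test and the `demo_msg` struct below are illustrative assumptions, not the actual OPAL macros or types.

```c
#include <stddef.h>

/* Assumption: GCC 6 or later. Clang reports an older __GNUC__ value, so it
 * skips this branch and compiles the struct without the pragma. */
#if defined(__GNUC__) && (__GNUC__ >= 6)
#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wpedantic"
#endif

/* A zero-length array is a GNU extension that a pedantic build warns about;
 * the pragma above suppresses that warning for this declaration only. */
struct demo_msg {
    int len;
    int payload[0];
};

#if defined(__GNUC__) && (__GNUC__ >= 6)
#pragma GCC diagnostic pop
#endif

/* Size of the fixed header; the zero-length tail adds no storage. */
static size_t demo_msg_header_size(void)
{
    return sizeof(struct demo_msg);
}
```

The push/pop pair keeps the suppression local, so pedantic warnings stay active for the rest of the translation unit.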
Jeff Squyres
677a31bc9f
examples: update ompi_info bindings checks
...
Use "-q" option to grep/egrep to suppress output (we only need the
exit status). Also, use egrep for the "use mpi" check, because some
versions of ompi_info say 'bindings:use_mpi:yes' and others say
'bindings:use_mpi:"yes' (i.e., with the double quote). This regexp
will work with both versions.
2016-02-28 17:19:54 -08:00
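The matching described in 677a31bc9f can be sketched with a POSIX extended regular expression that makes the double quote optional. The pattern below is an assumption chosen to match both forms quoted in the commit message; the exact expression used in examples/Makefile is not shown in this log. `has_use_mpi_bindings` is a hypothetical name.

```c
#include <stddef.h>
#include <regex.h>

/* Hypothetical helper: return 1 if `line` reports the Fortran "use mpi"
 * bindings as available, with or without a leading double quote. */
static int has_use_mpi_bindings(const char *line)
{
    regex_t re;
    int matched;

    /* `"?` makes the quote optional, so both
     *   bindings:use_mpi:yes   and   bindings:use_mpi:"yes
     * match. REG_NOSUB: only the match/no-match status is needed,
     * mirroring the use of `grep -q` for the exit status alone. */
    if (regcomp(&re, "bindings:use_mpi:\"?yes", REG_EXTENDED | REG_NOSUB) != 0) {
        return 0;
    }
    matched = (regexec(&re, line, 0, NULL, 0) == 0);
    regfree(&re);
    return matched;
}
```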
Jeff Squyres
20fade1345
examples: fix check for Fortran "use mpi" bindings
...
The output from "ompi_info --parsable" for the Fortran "use mpi"
bindings apparently has changed over time. It is now:
"yes (full: ignore TKR)"
or "yes (limited: overloading)"
(including the quotes)
So update the test in examples/Makefile to also look for the quote.
2016-02-28 16:30:09 -08:00
rhc54
78a1fd5d54
Merge pull request #1413 from rhc54/topic/iof
...
Fix a segfault that can occur when very short-lived, non-ORTE procs are run
2016-02-28 13:55:15 -08:00
Ralph Castain
263b0c95a8
Fix a segfault that can occur when very short-lived, non-ORTE procs are run
2016-02-28 12:30:20 -08:00
Jeff Squyres
89f225ea7f
Merge pull request #1410 from jsquyres/pr/cxx11-has-jumped-the-shark
...
cxx: "rank" is now a function in C++11
2016-02-27 09:37:13 -05:00
Jeff Squyres
89d0a033b7
cxx: "rank" is now a function in C++11
...
Use "myrank" instead (I tried using ::rank, but had varied
success... so I just renamed the variable).
2016-02-25 15:56:08 -06:00
rhc54
6ae75f007a
Merge pull request #1409 from rhc54/topic/singleton
...
Provide an option to allow isolated singletons
2016-02-25 15:36:43 -06:00
Ralph Castain
cdb494566d
Provide an option to allow isolated singletons
2016-02-25 11:33:26 -06:00
George Bosilca
dbe93b0b19
Use mca_bml_base_get_endpoint
...
Correctly use mca_bml_base_get_endpoint instead of accessing the
endpoint directly.
2016-02-25 11:00:30 -06:00
Sylvain Jeaugey
5f32f49eb8
pml/ob1: Fix segmentation fault on CUDA path.
...
Fix segfault due to mca_pml_ob1_cuda_need_buffers not handling the case of the
endpoint not being there. Calling mca_bml_get_endpoint() seems to fix the problem.
Fixes open-mpi/ompi#1402
2016-02-24 21:32:25 -08:00
rhc54
026cb37c4e
Merge pull request #1400 from rhc54/topic/config
...
Adjust the pmix external component configure error messages
2016-02-24 14:22:59 -06:00
Ralph Castain
d28d3ee901
Make the error message on external pmix library a little clearer by separating out the libevent from the libhwloc checks
2016-02-24 11:20:25 -06:00
Ralph Castain
e8d347d7bd
Add missing includes
2016-02-24 08:56:02 -06:00
Jeff Squyres
be22971c3a
Merge pull request #1398 from jsquyres/pr/test-lib-name-cleanups
...
tests: fix library name
2016-02-23 21:31:39 -06:00
Gilles Gouaillardet
477991b5aa
btl/openib: fix abstraction violation and use opal_memory->memoryc_set_alignment
2016-02-24 09:50:13 +09:00
Gilles Gouaillardet
d8482ce6f4
opal/mca/memory: add a memoryc_set_alignment subroutine to the OPAL memory MCA
...
This commit also (partially) reverts:
- open-mpi/ompi@7de01b347c
- open-mpi/ompi@8b05f308f9
2016-02-24 09:50:12 +09:00
Jeff Squyres
1340d51ddd
tests: fix library name
...
Use @OPAL_LIB_PREFIX@ as appropriate in the library that we link against.
2016-02-23 16:22:59 -08:00
Edgar Gabriel
2c4b93f72b
Merge pull request #1395 from edgargabriel/pr/fcoll-static-large-ops-fix
...
fix the data size counter for large ops for the static fcoll component
2016-02-23 10:26:16 -06:00