Gilles Gouaillardet
8aff67c399
topo/base: correctly support MPI_UNWEIGHTED in mca_topo_base_dist_graph_neighbors()
...
Thanks Jun Kudo for the bug report.
2016-03-01 10:28:28 +09:00
Gilles Gouaillardet
e5d6b97db4
opal: fix pragma for GCC 6 and later
...
GCC 6 and later should ignore -Wpedantic instead of -pedantic
2016-02-29 13:56:22 +09:00
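The change in e5d6b97db4 could look roughly like the following. This is a minimal sketch, not the actual OPAL header: the macro and function names are hypothetical, and only the two option spellings come from the commit message. It assumes that GCC releases before 6 accept "-pedantic" in a diagnostic pragma, as the commit message implies. Clang is excluded only to keep the version test simple.

```c
/* Hypothetical names; only "-Wpedantic" and "-pedantic" come from the commit. */
#if defined(__GNUC__) && !defined(__clang__)
#  if __GNUC__ >= 6
     /* GCC 6 and later: the option to ignore is spelled -Wpedantic. */
#    define SKETCH_PEDANTIC_OPTION "-Wpedantic"
#    pragma GCC diagnostic ignored "-Wpedantic"
#  else
     /* Older GCC: keep the historical spelling (assumed to be accepted). */
#    define SKETCH_PEDANTIC_OPTION "-pedantic"
#    pragma GCC diagnostic ignored "-pedantic"
#  endif
#else
   /* Other compilers: no pragma is emitted in this sketch. */
#  define SKETCH_PEDANTIC_OPTION ""
#endif

/* Reports which spelling was selected at compile time. */
const char *sketch_pedantic_option(void)
{
    return SKETCH_PEDANTIC_OPTION;
}
```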
Jeff Squyres
20fade1345
examples: fix check for Fortran "use mpi" bindings
...
The output from "ompi_info --parsable" for the Fortran "use mpi"
bindings apparently has changed over time. It is now:
"yes (full: ignore TKR)"
or "yes (limited: overloading)"
(including the quotes)
So update the test in examples/Makefile to also look for the quote.
2016-02-28 16:30:09 -08:00
rhc54
78a1fd5d54
Merge pull request #1413 from rhc54/topic/iof
...
Fix a segfault that can occur when very short-lived, non-ORTE procs are run
2016-02-28 13:55:15 -08:00
Ralph Castain
263b0c95a8
Fix a segfault that can occur when very short-lived, non-ORTE procs are run
2016-02-28 12:30:20 -08:00
Jeff Squyres
89f225ea7f
Merge pull request #1410 from jsquyres/pr/cxx11-has-jumped-the-shark
...
cxx: "rank" is now a function in C++11
2016-02-27 09:37:13 -05:00
Jeff Squyres
89d0a033b7
cxx: "rank" is now a function in C++11
...
Use "myrank" instead (I tried using ::rank, but had varied
success... so I just renamed the variable).
2016-02-25 15:56:08 -06:00
rhc54
6ae75f007a
Merge pull request #1409 from rhc54/topic/singleton
...
Provide an option to allow isolated singletons
2016-02-25 15:36:43 -06:00
Ralph Castain
cdb494566d
Provide an option to allow isolated singletons
2016-02-25 11:33:26 -06:00
George Bosilca
dbe93b0b19
Use mca_bml_base_get_endpoint
...
Correctly use mca_bml_base_get_endpoint instead of accessing the
endpoint directly.
2016-02-25 11:00:30 -06:00
Sylvain Jeaugey
5f32f49eb8
pml/ob1: Fix segmentation fault on CUDA path.
...
Fix segfault due to mca_pml_ob1_cuda_need_buffers not handling the case of the
endpoint not being there. Calling mca_bml_get_endpoint() seems to fix the problem.
Fixes open-mpi/ompi#1402
2016-02-24 21:32:25 -08:00
rhc54
026cb37c4e
Merge pull request #1400 from rhc54/topic/config
...
Adjust the pmix external component configure error messages
2016-02-24 14:22:59 -06:00
Ralph Castain
d28d3ee901
Make the error message on external pmix library a little clearer by separating out the libevent from the libhwloc checks
2016-02-24 11:20:25 -06:00
Ralph Castain
e8d347d7bd
Add missing includes
2016-02-24 08:56:02 -06:00
Jeff Squyres
be22971c3a
Merge pull request #1398 from jsquyres/pr/test-lib-name-cleanups
...
tests: fix library name
2016-02-23 21:31:39 -06:00
Gilles Gouaillardet
477991b5aa
btl/openib: fix abstraction violation and use opal_memory->memoryc_set_alignment
2016-02-24 09:50:13 +09:00
Gilles Gouaillardet
d8482ce6f4
opal/mca/memory: add a memoryc_set_alignment subroutine to the OPAL memory MCA
...
This commit also (partially) reverts:
- open-mpi/ompi@7de01b347c
- open-mpi/ompi@8b05f308f9
2016-02-24 09:50:12 +09:00
Jeff Squyres
1340d51ddd
tests: fix library name
...
Use @OPAL_LIB_PREFIX@ as appropriate in the library that we link against.
2016-02-23 16:22:59 -08:00
Edgar Gabriel
2c4b93f72b
Merge pull request #1395 from edgargabriel/pr/fcoll-static-large-ops-fix
...
fix the data size counter for large ops for the static fcoll component
2016-02-23 10:26:16 -06:00
Edgar Gabriel
45003ef78d
fix the data size counter for large ops for the static fcoll component
2016-02-23 08:33:50 -06:00
George Bosilca
d6fb56af29
Use the correct printf conversion specifier.
2016-02-23 01:26:27 -06:00
Gilles Gouaillardet
308bbcbad1
ompi/dpm: retrieve OPAL_PMIX_ARCH in heterogeneous mode
...
also remove code duplication by using ompi_proc_complete_init_single()
Thanks Siegmar Gross for reporting this issue, and Ralph for the guidance.
2016-02-22 11:01:06 +09:00
Gilles Gouaillardet
a4aa4c9571
ompi_proc_complete_init_single: make the subroutine public
...
and accept a proc from a different job
2016-02-22 11:01:06 +09:00
rhc54
1df4457af2
Merge pull request #1392 from rhc54/topic/dvm
...
Tools don't create the orte_job_data table, so don't remove jobs from it
2016-02-21 17:42:48 -08:00
Ralph Castain
77f800b7e8
Tools don't create the orte_job_data table, so don't remove jobs from it
2016-02-21 16:29:00 -08:00
Ralph Castain
64b7728f33
Fix typo - do not look at daemon job when considering completion of launch
2016-02-21 14:44:51 -08:00
rhc54
b499d4ba2a
Merge pull request #1391 from rhc54/topic/dvm
...
Convert the orte_job_data pointer array to a hash table so it doesn't…
2016-02-21 13:07:59 -08:00
Ralph Castain
d653cf2847
Convert the orte_job_data pointer array to a hash table so it doesn't grow forever as we run lots and lots of jobs in the persistent DVM.
2016-02-21 11:55:49 -08:00
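The idea behind d653cf2847 can be sketched as follows, assuming jobs are keyed by an ever-increasing integer job ID. A table indexed directly by job ID keeps growing even after old jobs finish, whereas a hash table only holds the jobs that are currently registered. All names below are hypothetical, and this is a plain chained hash table, not OPAL's hash table class.

```c
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_NBUCKETS 64

/* One registered job; entries in the same bucket form a linked list. */
typedef struct sketch_job {
    uint32_t jobid;
    struct sketch_job *next;
} sketch_job_t;

typedef struct {
    sketch_job_t *buckets[SKETCH_NBUCKETS];
    size_t count;               /* number of jobs currently stored */
} sketch_job_table_t;

static size_t sketch_bucket(uint32_t jobid)
{
    return jobid % SKETCH_NBUCKETS;
}

/* Adds a job; returns 0 on success, -1 if allocation fails. */
int sketch_job_add(sketch_job_table_t *t, uint32_t jobid)
{
    sketch_job_t *j = malloc(sizeof(*j));
    if (NULL == j) {
        return -1;
    }
    size_t b = sketch_bucket(jobid);
    j->jobid = jobid;
    j->next = t->buckets[b];
    t->buckets[b] = j;
    t->count++;
    return 0;
}

/* Returns the entry for jobid, or NULL if it is not registered. */
sketch_job_t *sketch_job_find(const sketch_job_table_t *t, uint32_t jobid)
{
    for (sketch_job_t *j = t->buckets[sketch_bucket(jobid)]; j; j = j->next) {
        if (j->jobid == jobid) {
            return j;
        }
    }
    return NULL;
}

/* Removes a finished job and frees its storage; returns -1 if not found. */
int sketch_job_remove(sketch_job_table_t *t, uint32_t jobid)
{
    sketch_job_t **pp = &t->buckets[sketch_bucket(jobid)];
    for (; *pp; pp = &(*pp)->next) {
        if ((*pp)->jobid == jobid) {
            sketch_job_t *dead = *pp;
            *pp = dead->next;
            free(dead);
            t->count--;
            return 0;
        }
    }
    return -1;
}
```

With this layout, running many short jobs one after another leaves `count` at zero once each job is removed, regardless of how large the job IDs have become.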
Ralph Castain
309e23ab3a
Fix minor typo
2016-02-20 01:33:10 -08:00
yohann
59b6d041f8
mtl/ofi: Check allocated pointer.
2016-02-19 16:59:47 -08:00
yohann
bd47062764
mtl/ofi: Fix error handling.
2016-02-19 16:58:41 -08:00
yohann
404987e9b3
mtl/ofi: Fix mismatching types.
2016-02-19 16:57:26 -08:00
yohann
3ad59435ce
mtl/ofi: Prevent possible memory leak.
2016-02-19 16:57:02 -08:00
Ralph Castain
8c92a179c0
Minor memory leak
2016-02-19 15:05:39 -08:00
rhc54
1f7e2d7d41
Merge pull request #1388 from rhc54/topic/iof
...
Cleanup the output-filename options so they work as expected.
2016-02-19 13:56:05 -08:00
Ralph Castain
0c72ba89b9
Cleanup the output-filename options so they work as expected. Have the remote nodes output locally to the files instead of sending it all back to the HNP.
...
Fix Solaris issues by renaming struct field
2016-02-19 12:41:46 -08:00
Edgar Gabriel
b33db517c1
Merge pull request #1387 from edgargabriel/dynamic_gen2-overlap
...
Updates to the dynamic_gen2 component
2016-02-19 12:00:29 -06:00
Edgar Gabriel
92d1b99468
optimize the shuffle step:
...
1. use communicator collectives if possible for performance reasons
2. combine multiple allgathers into a single one
2016-02-19 11:04:04 -06:00
Edgar Gabriel
e63836c653
Clean up the MCA parameter handling of the component. Add new parameters for the number of subgroups and the write chunk size, which allows a systematic parameter study to be performed.
2016-02-19 10:15:28 -06:00
Edgar Gabriel
4f400314e0
add the dynamic_gen2 component into the fcoll selection table.
2016-02-19 09:32:54 -06:00
Edgar Gabriel
268d525053
change the tag to be a positive value. handle 0-byte situations correctly.
2016-02-19 08:28:50 -06:00
Edgar Gabriel
ad79012059
first cut on the version which overlaps the communication/computation of 2 iterations.
2016-02-19 08:28:50 -06:00
Nathan Hjelm
e57ce1e1ef
Merge pull request #1384 from hjelmn/xrc_get_fix
...
btl/openib: XRC save SRQ#s on the loopback endpoint
2016-02-18 21:48:45 -07:00
Nathan Hjelm
2031bb6f01
btl/openib: XRC save SRQ#s on the loopback endpoint
...
This commit fixes a bug that can occur when communicating via XRC to
peers on the same node. UDCM was not saving the SRQ numbers on the
loopback endpoint (which shares its ib_addr info with all local peers)
so any messages to local peers use an invalid SRQ number.
Fixes open-mpi/ompi#1383
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-02-18 20:59:11 -07:00
rhc54
bfd4254a7b
Merge pull request #1382 from rhc54/topic/cleanup
...
Cleanup some valgrind complaints about jumps with uninitialized values.
2016-02-18 17:29:37 -08:00
Nathan Hjelm
27e7b6e466
Merge pull request #1381 from hjelmn/ddt_colon_fix
...
orterun: allow DDT if options contain :'s
2016-02-18 17:48:21 -07:00
Nathan Hjelm
820b178384
Merge pull request #1380 from hjelmn/xrc_get_fix
...
btl/openib: XRC fix bug that could cause an invalid SRQ# to be used
2016-02-18 17:33:09 -07:00
Ralph Castain
6e68d758b9
Cleanup some valgrind complaints about jumps with uninitialized values. Fix a few IOF issues reported by Mark Santcroos when submitting jobs from tools. Add the ability to pass directives to the --output-filename option that tell ORTE to (a) not include the jobid in the path to the output files, and (b) not to copy the output to the tool (i.e., just store it in the files).
...
ck
Remove stale debug
Fix a segfault if no subscribers are present
2016-02-18 16:30:37 -08:00
Nathan Hjelm
69de442136
orterun: allow DDT if options contain :'s
...
There is a bug in MPMD detection that disables totalview if a : is
found anywhere on the command line. This includes inside an argument
option or MCA variable value. This commit changes the check to look
for the string " : " instead of the character : which should eliminate
the issue in most cases.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-02-18 16:56:08 -07:00
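The difference between the two checks described in 69de442136 can be sketched as follows. The function names and the sample command lines are hypothetical; the actual orterun code is not shown in this log and may assemble or scan the command line differently.

```c
#include <stdbool.h>
#include <string.h>

/* Old-style check: a ':' anywhere on the command line is treated as
 * an MPMD separator, even inside an option or MCA variable value. */
bool sketch_has_colon_char(const char *cmdline)
{
    return strchr(cmdline, ':') != NULL;
}

/* New-style check: only a colon with a space on each side counts. */
bool sketch_has_mpmd_separator(const char *cmdline)
{
    return strstr(cmdline, " : ") != NULL;
}
```

For a line such as `mpirun --mca some_param host:1234 ./a.out` the first check reports a separator and the second does not, while `mpirun -np 1 ./a : -np 2 ./b` is detected by both. An argument that itself contains " : " would still be misdetected, which matches the commit's "in most cases" caveat.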
Nathan Hjelm
371df45bf8
btl/openib: fix locking bugs with XRC ib_addr lock
...
This commit fixes two issues with the ib_addr lock:
- The ib_addr lock must always be obtained regardless of
opal_using_threads() as the CPC is run in a separate thread.
- The ib_addr lock is held in mca_btl_openib_endpoint_connected when
calling back into the CPC start_connect on any pending
connections. This will attempt to obtain the ib_addr lock
again. Since this is not a performance-critical part of the code
the lock has been changed to be recursive.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-02-18 15:55:34 -07:00
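The second issue in 371df45bf8 is a re-entrant acquisition: a function that holds the lock calls back into code that takes the same lock. A minimal sketch of why a recursive lock resolves this, using POSIX threads rather than OPAL's own mutex types, is shown below. The function names are hypothetical stand-ins for the endpoint-connected and start_connect paths.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for PTHREAD_MUTEX_RECURSIVE */
#endif
#include <pthread.h>

/* Initializes lock as a recursive mutex; returns 0 on success. */
int sketch_recursive_lock_init(pthread_mutex_t *lock)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (0 != rc) {
        return rc;
    }
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (0 == rc) {
        rc = pthread_mutex_init(lock, &attr);
    }
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Stand-in for the start_connect callback: takes the lock itself. */
int sketch_start_connect(pthread_mutex_t *lock, int *pending)
{
    int rc = pthread_mutex_lock(lock);
    if (0 != rc) {
        return rc;
    }
    (*pending)--;               /* one pending connection handled */
    return pthread_mutex_unlock(lock);
}

/* Stand-in for the endpoint-connected path: holds the lock while
 * calling back into sketch_start_connect for each pending connection. */
int sketch_endpoint_connected(pthread_mutex_t *lock, int *pending)
{
    int rc = pthread_mutex_lock(lock);
    if (0 != rc) {
        return rc;
    }
    while (0 == rc && *pending > 0) {
        rc = sketch_start_connect(lock, pending);
    }
    pthread_mutex_unlock(lock);
    return rc;
}
```

With a default (non-recursive) mutex, the inner `pthread_mutex_lock` call from the same thread would typically deadlock. The recursive type lets the owning thread re-acquire the lock at the cost of some extra bookkeeping, which the commit considers acceptable because this path is not performance-critical.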