Ralph Castain
6a607d42a6
Prevent a segfault on tools if a connection attempt fails - tools don't open the opal/pmix framework and thus have no way of looking up a proc hostname
2015-11-10 09:11:34 -08:00
yosefe
d66b01d380
pml_ucx: implement cancel, and add small optimizations.
2015-11-10 17:40:06 +02:00
Mark Santcroos
8c255452cf
Merge branch 'master' into fix/alpsinfov3
2015-11-10 04:17:42 -05:00
Gilles Gouaillardet
d6ff25b9a2
pml/monitoring: initialize common symbols
2015-11-10 13:58:54 +09:00
Nathan Hjelm
6ae82ff090
Merge pull request #1115 from hjelmn/flist_fix
...
opal_free_list: fix strange size check
2015-11-09 20:55:46 -07:00
Nathan Hjelm
2c02294389
opal_free_list: fix strange size check
...
OPAL free lists can be initialized with a fragment size that differs
from the size of objects from a class. This allows the free list code
to support OPAL objects that have flexible array members.
Unfortunately the free list code will throw out the desired length in
some cases. The code in question was committed in
open-mpi/ompi@90fb58de . The side effects of this are varied and can
cause segmentation faults, assert failures, hangs, etc. This commit
adds a check to ensure the requested size is at least as large as the
class size and makes opal_free_list allocations always honor the
requested fragment size (as long as it is larger than the class
size).
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-11-09 19:47:55 -07:00
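The size rule described in the opal_free_list commit above can be sketched roughly as follows. This is only an illustration: `sketch_effective_frag_size` is a hypothetical name, not a function in the OPAL sources, and falling back to the class size when the request is too small is an assumption rather than the commit's confirmed behavior.

```c
#include <stddef.h>

/* Hypothetical helper, not part of opal_free_list.
 * Returns the per-item allocation size: the requested fragment size is
 * honored whenever it is at least as large as the class size; otherwise
 * the class size is used (assumed fallback). */
size_t sketch_effective_frag_size(size_t class_size, size_t requested_size)
{
    if (requested_size < class_size) {
        /* Request too small to hold the object itself. */
        return class_size;
    }
    /* Extra room, e.g. for a flexible array member, is kept. */
    return requested_size;
}
```

For example, a class of 64 bytes with a requested fragment size of 256 bytes would keep the full 256 bytes, which is what objects with flexible array members rely on.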
Gilles Gouaillardet
c415ecf39e
test/monitoring: build monitoring_prof lib only if dynamic libs are built
...
Thanks Mark Santcroos for reporting this
2015-11-10 11:33:12 +09:00
Mark Santcroos
5ec2b4d98c
Fix some messages in the process.
2015-11-09 18:03:26 -05:00
Ralph Castain
52ea538bc1
Per fix from Nysal: set the listener_active flag before starting the progress thread, and declare the flag to be volatile
2015-11-09 09:00:59 -08:00
Mark Santcroos
8ec89001b3
Merge branch 'master' into fix/alpsinfov3
2015-11-09 02:45:23 -05:00
Ralph Castain
9b0cdc0de2
Add support for -pernode and -npernode options to orte-submit
2015-11-08 11:34:18 -08:00
Ralph Castain
187fa9b131
Extend the scaling test script to support multiple starters, including mpirun, orterun (if mpirun not present), orte-dvm, and srun. Auto-detect which are present and allow the user to run all of them. Auto-detect the number of nodes in the allocation.
2015-11-08 11:34:06 -08:00
Ralph Castain
f2805fb0f9
Provide a mechanism for renaming symbols in the opposite direction - i.e., #define prefix_foo[suffix] foo.
2015-11-07 18:07:09 -08:00
rhc54
c788d7bf88
Merge pull request #1110 from rhc54/topic/renames
...
Update the libevent renaming file to ensure that all public symbols a…
2015-11-07 17:22:56 -07:00
Ralph Castain
73c8c30c5d
Update the scaling.pl test script to support orte-dvm and srun
2015-11-07 13:13:36 -08:00
Ralph Castain
ee9aa67483
Update the libevent renaming file to ensure that all public symbols are covered
2015-11-07 12:52:31 -08:00
Ralph Castain
1f44fef4d6
Add ability to provide a suffix to the symbol renames
2015-11-07 12:37:14 -08:00
Ralph Castain
6864a9b68a
Add a new script for creating symbol hiding "rename" files
2015-11-07 12:11:54 -08:00
rhc54
66b1ef24de
Merge pull request #1108 from rhc54/topic/exitcode
...
Need to delay registration of the waitpid callback until after the fo…
2015-11-07 08:23:13 -07:00
Ralph Castain
18c5cb48ff
Update the scaling test script
2015-11-06 21:51:40 -08:00
Ralph Castain
f1483eb2dc
Need to delay registration of the waitpid callback until after the fork/exec of the child process. Fix the bit testing of process type so that the proper state component gets selected for HNP.
2015-11-06 21:35:24 -08:00
rhc54
7a9b9325a8
Merge pull request #1107 from rhc54/topic/pmix
...
Work on cleaning up memory leaks that are causing orte-dvm to eventua…
2015-11-06 17:16:49 -07:00
Ralph Castain
fed28e4cfc
Add missing file that was previously ignored
2015-11-06 14:37:09 -08:00
Ralph Castain
5f446570d8
Work on cleaning up memory leaks that are causing orte-dvm to eventually run out of memory. Still don't have everything plugged, but getting better. Sync to the PMIx master that includes removal of the pmix_common.h.in file that really didn't need to be generated, and update to the PMIx_server_init API.
2015-11-06 14:15:30 -08:00
Jeff Squyres
e89ecac83c
bml r2: fix exclusivity comparison
...
Fixes open-mpi/ompi#1106
2015-11-06 13:26:32 -08:00
Jeff Squyres
b35b708979
tcp BTL: fix inconsistent whitespace problems
...
No code/logic changes.
2015-11-06 12:41:13 -08:00
Jeff Squyres
300cff2b89
usnic: fix/update the usnic stats
...
1. Fix: old v1.6-era code reset the stats-emitting event to fire twice
for each time period.
2. Add the usNIC device name to the output for differentiating the
output in multi-rail scenarios.
2015-11-06 12:05:34 -08:00
rhc54
d63c448de9
Merge pull request #1102 from rhc54/topic/dvm
...
Ensure that we completely register an nspace prior to launching local…
2015-11-06 09:00:47 -07:00
rhc54
e964608b90
Merge pull request #1103 from rhc54/topic/intercomm
...
Fix intercomm_create
2015-11-06 08:19:02 -07:00
Francois WELLENREITER
b301b49a40
MTL portals4 : remove useless PtlMDBind PtlMDRelease calls for rendez-vous messages
2015-11-06 15:55:44 +01:00
Mark Santcroos
a40b4eb2ee
Support ALPS_APPINFO_VERSION 3.
2015-11-06 09:53:41 -05:00
Ralph Castain
bfdf08ae86
Fix intercomm_create by ensuring that both sides know how to translate jobid to/from nspace
...
Return something just to ensure that pack is happy
2015-11-06 02:19:45 -08:00
Ralph Castain
ec0cc4bf21
Ensure that we completely register an nspace prior to launching local procs as otherwise we may attempt to send it down before it is registered, leading to data corruption
2015-11-05 20:51:56 -08:00
Nathan Hjelm
fda5daf453
Merge pull request #1096 from kawashima-fj/pr/fortran-var-type-fix
...
Fix Fortran variable types
2015-11-05 14:27:40 -07:00
Nathan Hjelm
e0a291812d
Merge pull request #1101 from hjelmn/ib_fix
...
btl/openib: fix access flags
2015-11-05 09:01:28 -07:00
Ralph Castain
68996d6858
Move the argv_free back to the correct place - I blame Jeff for suggesting it was wrong to begin with
2015-11-05 07:57:54 -08:00
Nathan Hjelm
4ddbdad772
btl/openib: fix access flags
...
Per the spec for ibv_reg_mr, if remote write or remote atomic access is
requested, local write must also be specified.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-11-04 15:23:11 -07:00
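The access-flag rule from the btl/openib commit above can be sketched as follows. The `SKETCH_ACCESS_*` constants and `sketch_fix_access_flags` are hypothetical stand-ins so that the example compiles without libibverbs; real code would include `<infiniband/verbs.h>` and use the `IBV_ACCESS_*` flags. This is not the actual openib BTL change.

```c
/* Stand-in constants modeled on the libibverbs access flag names.
 * They are assumptions for illustration only. */
enum {
    SKETCH_ACCESS_LOCAL_WRITE   = 1 << 0,
    SKETCH_ACCESS_REMOTE_WRITE  = 1 << 1,
    SKETCH_ACCESS_REMOTE_READ   = 1 << 2,
    SKETCH_ACCESS_REMOTE_ATOMIC = 1 << 3
};

/* If remote write or remote atomic access is requested, also request
 * local write; all other flag combinations are returned unchanged. */
int sketch_fix_access_flags(int flags)
{
    if (flags & (SKETCH_ACCESS_REMOTE_WRITE | SKETCH_ACCESS_REMOTE_ATOMIC)) {
        flags |= SKETCH_ACCESS_LOCAL_WRITE;
    }
    return flags;
}
```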
Nathan Hjelm
acf3cb9b9b
Merge pull request #1095 from kawashima-fj/pr/trivial-fixes
...
Some trivial fixes
2015-11-04 09:45:59 -07:00
Mike Dubman
f4316b20bb
Merge pull request #1099 from yosefe/topic/ucx-fix-request-destruct
...
pml_ucx: fix request construct/destruct.
2015-11-04 12:48:37 +02:00
yosefe
45c3d04857
pml_ucx: fix request construct/destruct.
...
We should invoke OBJ_CONSTRUCT/OBJ_DESTRUCT only on regular requests
(which are embedded inside UCX requests) and for the completed request.
Persistent requests are already constructed/destructed by the free list.
This fixes an assertion in ompi_request_destruct.
2015-11-04 11:03:46 +02:00
Ralph Castain
169c44258d
Fix missing check
2015-11-03 19:00:28 -08:00
KAWASHIMA Takahiro
d4bdf405bd
opal/threads: Correct nsec -> usec conversion.
2015-11-04 11:28:43 +09:00
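For reference, the unit conversion named in the opal/threads commit above is a division by 1000, since one microsecond is 1000 nanoseconds. The sketch below uses a hypothetical helper name, `sketch_nsec_to_usec`, and does not reproduce the actual code that was corrected.

```c
/* Hypothetical helper: converts nanoseconds to whole microseconds.
 * 1 usec == 1000 nsec; any sub-microsecond remainder is truncated. */
long long sketch_nsec_to_usec(long long nsec)
{
    return nsec / 1000LL;
}
```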
KAWASHIMA Takahiro
60546c6418
opal/datatype: Fix a macro value typo for heterogeneous.
...
This affects behavior only in a heterogeneous environment.
2015-11-04 11:28:43 +09:00
KAWASHIMA Takahiro
c09f9f05d3
mpi/tool: Fix an incorrect type cast.
...
This bug caused an invalid result value on `MPI_T_cvar_read`
on big-endian machines or for large (>=2Gi) cvar values.
2015-11-04 11:28:43 +09:00
KAWASHIMA Takahiro
2dcb2d711b
Makefile: Move fd.c to SOURCES from headers.
...
And reorder fd.h and few.h in alphabetical order.
2015-11-04 11:28:43 +09:00
KAWASHIMA Takahiro
384f4b51d1
fortran: Fix: missing dimension(*) in (I)NEIGHBOR_ALLTOALLW.
2015-11-04 10:38:25 +09:00
KAWASHIMA Takahiro
1092eabfab
fortran: Update comment.
...
The structure was changed in commit 9c77c6b.
2015-11-04 10:38:25 +09:00
KAWASHIMA Takahiro
107c0073dd
fortran: Fix: MPI_UNWEIGHTED and MPI_WEIGHTS_EMPTY should be arrays.
...
Without this modification, gfortran throws the following error
if these variables are used for `MPI_DIST_GRAPH_CREATE_ADJACENT` or
`MPI_DIST_GRAPH_CREATE_ADJACENT`.
Error: There is no specific subroutine for the generic
'mpi_dist_graph_create_adjacent' at (1)
2015-11-04 10:38:25 +09:00
KAWASHIMA Takahiro
d5e1f40a1e
fortran: Fix: info should be an integer parameter.
2015-11-04 10:38:24 +09:00
KAWASHIMA Takahiro
9bf93810d7
fortran: Fix: array dimension of MPI_ARGVS_NULL.
...
`MPI_ARGVS_NULL` should be a two-dimensional array.
Without this modification, gfortran throws the following error
if `MPI_ARGVS_NULL` is used for `MPI_COMM_SPAWN_MULTIPLE`.
Error: There is no specific subroutine for the generic
'mpi_comm_spawn_multiple' at (1)
2015-11-04 10:38:24 +09:00