Jithin Jose
9c937d44ae
Inline MTL-OFI
...
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
Conflicts:
ompi/mca/mtl/ofi/mtl_ofi_recv.c
2015-04-03 15:19:30 -07:00
Jithin Jose
50304dfe05
Inline mtl-datatype pack/unpack
...
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-04-03 15:19:21 -07:00
Jithin Jose
c09582a3ff
- CM blocking send/recv optimizations
...
This patch tries to do as little as possible in the PML CM blocking
send/receive routines. Basically, avoid creating and filling in an
entire request object. An OMPI-level request is still needed, but we
can create that on the stack instead of going to a free list.
Signed-off-by: Andrew Friedley <andrew.friedley@intel.com>
Signed-off-by: Jithin Jose <jithin.jose@intel.com>
2015-04-03 15:19:08 -07:00
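The stack-allocated request idea described in the commit above can be sketched roughly as follows. This is an illustration only: `demo_request_t`, `demo_transport_send`, and `demo_blocking_send` are hypothetical names, not Open MPI's actual types or functions, and the transport here completes immediately.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified request; NOT Open MPI's real request type. */
typedef struct demo_request {
    bool        complete;
    int         tag;
    const void *buf;
    size_t      len;
} demo_request_t;

/* Hypothetical transport hook: this stand-in completes the request at once. */
static int demo_transport_send(demo_request_t *req)
{
    assert(req != NULL);
    req->complete = true;
    return 0;
}

/* Blocking send: the request lives on the stack for the duration of the
 * call, so nothing is taken from (or returned to) a free list. */
static int demo_blocking_send(const void *buf, size_t len, int tag)
{
    demo_request_t req = { .complete = false, .tag = tag, .buf = buf, .len = len };

    int rc = demo_transport_send(&req);
    if (rc != 0) {
        return rc;
    }
    while (!req.complete) {
        /* a real implementation would drive network progress here */
    }
    return 0;
}
```

Because the call does not return until the request completes, the request never has to outlive the stack frame, which is what makes this safe for the blocking case only.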
Jeff Squyres
5f19436cd2
Merge pull request #508 from jsquyres/pr/usnic-libfabric-eagain-fix
...
usnic: fix the CQ-reading logic for -FI_EAGAIN
2015-04-02 19:08:22 -04:00
Jeff Squyres
d825ec7cc7
usnic: fix the CQ-reading logic for -FI_EAGAIN
2015-04-02 15:56:50 -07:00
Jeff Squyres
5aabee2644
libfabric: a few fixes since 1.0rc3
...
Including a critical atomic initialization fix for the usnic provider.
2015-04-02 15:54:01 -07:00
Howard Pritchard
db680058e3
Merge pull request #507 from hppritcha/topic/coverity_fixes
...
fcoll/static: coverity fixes
2015-04-02 16:12:52 -06:00
Howard Pritchard
05324e32ff
fcoll/static: coverity fixes
...
Fix CIDs 72138, 72139, 72143
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-04-02 14:51:44 -06:00
Ralph Castain
0c043dbdc9
Fix typo in var name
2015-04-02 02:32:42 -07:00
rhc54
6408c87aa0
Merge pull request #506 from rhc54/topic/retry
...
Support attempts to connect async processes
2015-04-02 01:33:06 -07:00
Ralph Castain
a4b466efc4
Support attempts to connect async processes by allowing the oob/tcp connection to retry the attempt to connect to a peer. Off by default, operates if someone specifies how long to wait between retry attempts.
2015-04-01 20:21:23 -07:00
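A minimal sketch of the "off by default, enabled by specifying a delay" retry behavior described above. All names are hypothetical, the delay unit (seconds) and the `max_retries` cap are assumptions made for the example, and this is not the oob/tcp code.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical connect callback: returns 0 on success, nonzero on failure. */
typedef int (*demo_connect_fn)(void *ctx);

/* Retry is disabled when retry_delay_secs == 0 (the default in this sketch).
 * max_retries is an assumption added so the loop is bounded. */
static int demo_connect_with_retry(demo_connect_fn try_connect, void *ctx,
                                   unsigned retry_delay_secs,
                                   unsigned max_retries)
{
    assert(try_connect != NULL);

    int rc = try_connect(ctx);
    unsigned retries = 0;
    while (rc != 0 && retry_delay_secs > 0 && retries < max_retries) {
        sleep(retry_delay_secs);   /* wait before the next attempt */
        rc = try_connect(ctx);
        retries++;
    }
    return rc;
}

/* Demo callback: fails while the counter in ctx is positive, then succeeds. */
static int demo_fail_until_zero(void *ctx)
{
    int *remaining = ctx;
    if (*remaining > 0) {
        (*remaining)--;
        return -1;
    }
    return 0;
}
```

With a delay of 0 the function makes exactly one attempt, which mirrors the "off by default" behavior; a nonzero delay lets a peer that starts late still be reached.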
Ralph Castain
9f8ae59162
Properly enclose the different && clauses
2015-04-01 18:48:25 -07:00
Ralph Castain
57c21d5209
Ensure the DVM flows thru the "daemons reported" state
2015-04-01 16:47:34 -07:00
Jeff Squyres
4ad102bb4d
README: whitespace cleanup -- no content change
2015-04-01 15:40:42 -07:00
Jeff Squyres
3b998781df
make_dist_tarball: bump up the minimum versions
2015-04-01 15:05:48 -07:00
Jeff Squyres
d6d8ab01e5
libfabric: the fi_log.h file moved
2015-04-01 14:43:07 -07:00
Jeff Squyres
99754afd25
orterun.c: re-justify the output message text
...
The type-A personality / English lit major in me compels me to
re-justify the text. :-)
2015-04-01 10:57:23 -07:00
Devendar Bureddy
6ddc7ac35c
HCOLL: Fix assertion
...
hcoll context may not be destroyed if it is cached.
2015-04-01 20:33:28 +03:00
Jeff Squyres
26b3c48ccb
usnic: update to API change in libfabric
2015-04-01 06:43:08 -07:00
Jeff Squyres
5e47eb81bf
libfabric: update component configury for new libfabric test
2015-04-01 06:43:08 -07:00
Jeff Squyres
a89a5872c2
libfabric: update to official 1.0.0rc3 release
...
One change was made to the 1.0.0rc3 tarball: remove an errant
debugging printf that accidentally made its way into the tarball (but
isn't in git).
2015-04-01 06:43:08 -07:00
Mike Dubman
8914a9c070
Merge pull request #494 from elenash/modifiers
...
changed mindist mapping policy specifier
2015-04-01 16:31:46 +03:00
Mike Dubman
af63c1815b
Merge pull request #505 from nkogteva/master
...
grpcomm rcd: remove unnecessary malloc warning when number of daemons == 1
2015-04-01 15:41:56 +03:00
Elena
1e913c76c4
changed mindist mapping policy specifier from --map-by dist:device,modifiers to --map-by dist:modifiers -mca rmaps_dist_device device
2015-04-01 15:07:35 +03:00
Nadezhda Kogteva
2d49d9bd45
grpcomm rcd: remove unnecessary malloc warning for case when number of daemons == 1
2015-04-01 11:07:44 +03:00
Mike Dubman
58d002098b
Merge pull request #474 from elenash/master
...
Introduce -tune command line option to set env vars and mca params from ...
2015-04-01 08:23:34 +03:00
Ralph Castain
b468f6a503
Okay, Jeff - use opal_setenv
2015-03-31 20:34:02 -07:00
Ralph Castain
6f9140a341
Add a little more debug to launch
2015-03-31 20:10:21 -07:00
Ralph Castain
e5d96417e7
Update warnings for run-as-root
2015-03-31 17:55:28 -07:00
Ralph Castain
41dd65d6cd
Per Jeff's request, tone down the comments and "standardize" the warning
2015-03-31 17:54:54 -07:00
Ralph Castain
f04eb6a9c0
Extend the root-user protection to some more ORTE tools
2015-03-31 10:34:35 -07:00
Ralph Castain
f863147b05
Per the telecon and chat with Jeff, let root only do the version option without warning. Otherwise, require that the user specifically indicate allow-use-as-root
2015-03-31 10:34:35 -07:00
Nathan Hjelm
b6043ec459
Merge pull request #503 from hjelmn/vader_32bit_fix
...
btl/vader: fix fast box support for 32-bit architectures
2015-03-31 09:12:22 -06:00
Ralph Castain
b209c9efa5
Move the "dvm ready" message to stdout so it is easier to trap
2015-03-30 20:12:56 -07:00
Ralph Castain
6d205a3c80
Ensure that singletons pickup the oob/tcp component
2015-03-30 18:10:08 -07:00
Ralph Castain
2fa56fb329
Ensure that orte-submit picks the correct ess module as it is -never- allowed to be used as a distributed tool
...
Thanks to Mark Santcroos for diagnosing this one.
2015-03-30 18:08:34 -07:00
Nathan Hjelm
17b80a987e
btl/vader: fix fast box support for 32-bit architectures
...
On 32-bit architectures, loads/stores of fast box headers may take
multiple instructions. This can cause a data race between the
sender and receiver when reading/writing the sequence number, so the
receiver could process incomplete data. To fix the issue, this commit
re-orders the fast box header to put the sequence number and the tag
in the same 32 bits, ensuring they are always loaded/stored together.
Fixes #473
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-03-30 16:28:16 -06:00
rhc54
bc016617a0
Merge pull request #501 from rhc54/topic/sec2
...
Support authentication across security domains
2015-03-30 09:59:43 -07:00
Ralph Castain
79b90a54b6
Remove stale and unused component
2015-03-30 09:56:06 -07:00
Howard Pritchard
0c553c2693
Merge pull request #502 from nkogteva/master
...
sm dstore: set pmix segment size to proper value
2015-03-30 09:05:35 -06:00
Nadezhda Kogteva
a828eada98
sm dstore: set pmix segment size to proper value
2015-03-30 13:34:25 +03:00
Ralph Castain
d07dc362d5
Ensure we can authenticate when crossing security domains by including all available credentials, and letting the receiver use the highest priority one they have in common.
2015-03-28 20:34:26 -07:00
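A rough sketch of the selection rule described above: the sender offers every credential it has, and the receiver walks its own list in priority order and takes the first one the sender also offered. The function name and the string representation of credential types are hypothetical, not the actual security framework interface.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* local_by_priority: receiver's supported methods, highest priority first.
 * offered: every method the sender included (any order).
 * Returns the highest-priority method both sides share, or NULL if none. */
static const char *demo_pick_credential(const char *const local_by_priority[],
                                        size_t n_local,
                                        const char *const offered[],
                                        size_t n_offered)
{
    assert(local_by_priority != NULL && offered != NULL);

    for (size_t i = 0; i < n_local; i++) {
        for (size_t j = 0; j < n_offered; j++) {
            if (strcmp(local_by_priority[i], offered[j]) == 0) {
                return local_by_priority[i];
            }
        }
    }
    return NULL;  /* no method in common: authentication cannot proceed */
}
```

Letting the receiver's priority order decide means two domains with different preferred methods can still authenticate, as long as they share at least one.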
Ralph Castain
b67b3619fc
If we are using the default bindings, and one or more nodes are not set up to support binding, then don't error out - just don't bind.
...
Thanks to Annu Desari for pointing out the problem.
2015-03-28 08:20:24 -07:00
Ralph Castain
2f365720b0
Allow root to request the version and help from mpirun without having to override the run-as-root protection.
...
Thanks to Robert McLay for pointing this out
2015-03-28 08:17:44 -07:00
Ralph Castain
d2d02a1642
ckpt
2015-03-28 07:59:20 -07:00
Jeff Squyres
89e14f5ad6
usnic: fix comment typos
2015-03-27 17:21:53 -07:00
Howard Pritchard
28046cdca7
Merge pull request #499 from hppritcha/topic/alps_configury
...
configury/alps: reduce alps verbosity
2015-03-27 13:19:04 -06:00
Jeff Squyres
2672d8d26b
usnic: update comments explaining fi_av_insert() reaping
...
Asynchronous fi_av_insert()s are somewhat tricky; update the comments
explaining what is going on (and fix some old/stale comments).
2015-03-27 12:01:59 -07:00
Jeff Squyres
6da77ee940
usnic: only warn about each unreachable endpoint once
...
fi_av_insert() is invoked with a context containing each endpoint
USNIC_NUM_CHANNELS times. If the address on that endpoint fails to
resolve / is unreachable / has some error, we'll therefore get
USNIC_NUM_CHANNELS error completions with that same endpoint. We
therefore only want to warn about the unreachability of (and
OBJ_RELEASE) that endpoint the *first* time.
Fixes CSCut46822.
2015-03-27 12:01:59 -07:00
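The "warn and release only the first time" logic described above might look roughly like this. The struct, the flag name, and the integer refcount standing in for OBJ_RELEASE are all hypothetical simplifications, not the usnic BTL's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical endpoint; refcount stands in for Open MPI object refcounting. */
typedef struct demo_endpoint {
    const char *name;
    bool        unreachable_reported;
    int         refcount;
} demo_endpoint_t;

/* Called once per error completion. The same endpoint may appear in several
 * error completions (one per channel), so act only on the first.
 * Returns true if this call emitted the warning and released the endpoint. */
static bool demo_handle_error_completion(demo_endpoint_t *ep)
{
    assert(ep != NULL);

    if (ep->unreachable_reported) {
        return false;              /* already warned and released */
    }
    ep->unreachable_reported = true;
    fprintf(stderr, "warning: endpoint %s is unreachable\n", ep->name);
    ep->refcount--;                /* stand-in for OBJ_RELEASE(endpoint) */
    return true;
}
```

Guarding the release with the same flag as the warning matters: releasing once per error completion would drop the reference count more than once for a single endpoint.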
Jeff Squyres
506431d1b6
usnic: count fi_av_insert() completions properly
...
num_left represents the number of times we called fi_av_insert(), and
therefore it indicates the number of non-error completions that we
will receive.
2015-03-27 12:01:59 -07:00