Ralph Castain
64ec498a20
Add a declspec
2015-03-05 19:48:27 -08:00
Ralph Castain
eaa666bd57
Instantiate debug output variable
2015-03-05 12:25:49 -08:00
Ralph Castain
7ce0a9931c
Updates to the notifier interfaces to support system events
2015-03-05 10:39:25 -08:00
Gilles Gouaillardet
7de3f35b90
pml/rsh: fix misc memory leaks
...
as reported by Coverity with CIDs 71091, 71230, 71231, 72274, 72389,
1196718 and 1196719
2015-03-05 20:03:37 +09:00
Gilles Gouaillardet
33352e9506
schizo: fix misc memory leak
...
as reported by Coverity with CID 1196722
2015-03-05 14:06:18 +09:00
Gilles Gouaillardet
d1b2f043ff
fix misc memory leaks
...
as already reported by Coverity with CIDs
71818, 71819, 72250, 715767, 1196749 and 1274002
2015-03-05 13:58:05 +09:00
Gilles Gouaillardet
42f5a36ee3
rmaps/seq: fix misc memory leaks
...
as reported by Coverity with CIDs 1269886 and 1269887
2015-03-02 15:31:11 +09:00
Gilles Gouaillardet
0c7a2846d1
rmaps/rank_file: fix misc memory leaks
...
as reported by Coverity with CIDs 72250 and 1196774
2015-03-02 15:31:11 +09:00
Gilles Gouaillardet
c15b919635
rmaps/lama: fix misc memory leaks
...
as reported by Coverity with CIDs 719263, 719264, 1196712 and 1269842
2015-03-02 15:31:11 +09:00
Gilles Gouaillardet
456baeb71b
rmaps/base: fix misc memory leaks
...
as reported by Coverity with CIDs 1196751, 1196754, 1196755 and 1269866
2015-03-02 15:31:11 +09:00
Gilles Gouaillardet
d8f3b378b3
orte/oob: fix misc memory leaks
...
as reported by Coverity with CIDs 1196748, 1196749 and 1269895
2015-03-02 15:31:11 +09:00
Mike Dubman
dbc15009b6
Merge pull request #415 from alinask/topic/fix_fork_support_flow
...
Fix the calls to ibv_fork_init and remove btl_openib_want_fork_support.
2015-02-26 21:50:11 +02:00
Nathan Hjelm
883d09376f
Fix coverity #1271536
2015-02-25 11:35:45 -07:00
rhc54
efbb57430b
Merge pull request #419 from nkogteva/master
...
grpcomm brks: fix copy-paste bug which affects performance
2015-02-25 07:39:55 -08:00
Alina Sklarevich
e4c4e7df5e
Fix the calls to ibv_fork_init and remove btl_openib_want_fork_support.
...
To have any effect, ibv_fork_init should be called at the beginning
of the verbs initialization flow, before the calls to the
ibv_create_qp and ibv_create_cq verbs.
These functions are called from the oob/ud code and by the time the
other verbs components (btl openib, pml yalla, ...) call ibv_fork_init,
it's too late. This commit forces the call to ibv_fork_init (if it's
requested) right at the beginning of all the components that are using
verbs.
(ibv_fork_init() can be safely called multiple times)
This commit also removes the btl_openib_want_fork_support mca parameter
and adds a new mca parameter instead - opal_verbs_want_fork_support.
Through this new parameter, fork support may be requested for ALL
components.
The default value for this parameter is set to 1.
Before this commit, setting btl_openib_want_fork_support to 1 did not
actually provide fork support for the openib btl: by the time openib
called ibv_fork_init, oob/ud had already made its calls to
ibv_create_*, and therefore ibv_fork_init failed.
2015-02-25 10:58:50 +02:00
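A minimal C sketch of the ordering rule described in the fork-support
commit above. It is not the actual Open MPI change. stub_fork_init,
stub_create_cq, component_init and model_reset are hypothetical
stand-ins so the sketch builds without libibverbs; real code would call
ibv_fork_init() and the ibv_create_* verbs. The stub's failure mode
(rejecting a request made after resources exist) follows the behavior
the commit message reports.

```c
#include <stdbool.h>

/* Hypothetical model of the verbs library state (assumption: real code
 * would use ibv_fork_init(), ibv_create_cq() and ibv_create_qp()). */
static bool resources_created = false;  /* a CQ/QP already exists */
static bool fork_safe = false;          /* fork support is active  */

/* Models ibv_fork_init(): repeated calls are harmless, but the first
 * call only succeeds if no verbs resources have been created yet. */
static int stub_fork_init(void)
{
    if (fork_safe) {
        return 0;
    }
    if (resources_created) {
        return -1;  /* too late, as described in the commit message */
    }
    fork_safe = true;
    return 0;
}

/* Models ibv_create_cq()/ibv_create_qp(). */
static void stub_create_cq(void)
{
    resources_created = true;
}

/* Hypothetical component init: request fork support FIRST, then
 * create verbs resources. Every verbs component does the same. */
static int component_init(bool want_fork_support)
{
    if (want_fork_support && 0 != stub_fork_init()) {
        return -1;
    }
    stub_create_cq();
    return 0;
}

/* Resets the model so a different ordering can be tried. */
static void model_reset(void)
{
    resources_created = false;
    fork_safe = false;
}
```

If every component calls component_init(true), the first caller enables
fork support and later callers succeed trivially. If the first component
skips the request and a later one makes it, the later request fails,
which is the situation the commit describes for the old openib-only
parameter.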
Jeff Squyres
a85a392896
Merge pull request #422 from jsquyres/topic/coverity-fixes
...
Some Coverity fixes
2015-02-24 17:00:10 -05:00
Jeff Squyres
05f00aface
plm base: ensure mca_base_var_get_value() and mca_base_var_find() succeed
...
This was CID 993712
2015-02-24 15:48:50 -05:00
Ralph Castain
451bd16a10
Remove dead code
2015-02-24 12:41:12 -08:00
Jeff Squyres
398ae15533
rmaps_base_frame: remove dead code
...
This was CID 1196641
2015-02-24 15:24:11 -05:00
Jeff Squyres
71ae0ad5ec
oob_tcp_component: add #if OPAL_ENABLE_IPV6 around IPv6-specific code
...
This was CID 1196629
2015-02-24 15:24:11 -05:00
Jeff Squyres
0bd2783b91
oob_usock: don't try to close the socket if it didn't open
...
This was CID 1196663
2015-02-24 15:24:09 -05:00
Jeff Squyres
e2223cd9bf
plm_rsh: ensure cwd array is \0-terminated
...
This was CID 72257
2015-02-24 15:24:08 -05:00
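The plm_rsh commit above does not show its diff. As a hedged
illustration of the general technique only, the following C sketch
copies a path into a fixed-size array and guarantees \0-termination.
copy_terminated is a hypothetical helper, not a function from the Open
MPI tree.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst (capacity dst_len bytes) and
 * guarantee the result is \0-terminated, truncating if necessary. */
static void copy_terminated(char *dst, size_t dst_len, const char *src)
{
    if (0 == dst_len) {
        return;                      /* no room even for the terminator */
    }
    /* strncpy does not write a terminator when src fills the buffer,
     * so copy at most dst_len - 1 bytes and terminate explicitly. */
    strncpy(dst, src, dst_len - 1);
    dst[dst_len - 1] = '\0';
}
```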
Nathan Hjelm
ed78553512
Update opal_free_list_t usage to reflect new class interface.
...
Please verify your components have been updated correctly. Keep in
mind that in terms of threading:
OPAL_FREE_LIST_GET -> opal_free_list_get_st
OPAL_FREE_LIST_RETURN -> opal_free_list_return_st
I used the opal_using_threads() variant anytime it appeared multiple
threads could be operating on the free list. If this is not the case
update to _st. If multiple threads are always in use change to _mt.
2015-02-24 10:05:44 -07:00
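The _st / _mt distinction in the free-list commit above can be sketched
as follows. This is an illustration under stated assumptions, not the
opal_free_list_t implementation: free_list_t, item_t and the list_*
functions are hypothetical, the using_threads flag stands in for the
opal_using_threads() check, and a simple spin lock is used here only to
keep the example self-contained.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct item { struct item *next; } item_t;

typedef struct {
    item_t *head;       /* LIFO stack of free items */
    atomic_flag lock;   /* spin lock taken only by the _mt variants */
} free_list_t;

/* Single-threaded variants: no locking at all. */
static item_t *list_get_st(free_list_t *fl)
{
    item_t *it = fl->head;
    if (NULL != it) {
        fl->head = it->next;
    }
    return it;
}

static void list_return_st(free_list_t *fl, item_t *it)
{
    it->next = fl->head;
    fl->head = it;
}

/* Multi-threaded variants: same operation under the lock. */
static item_t *list_get_mt(free_list_t *fl)
{
    item_t *it;
    while (atomic_flag_test_and_set(&fl->lock)) { /* spin */ }
    it = list_get_st(fl);
    atomic_flag_clear(&fl->lock);
    return it;
}

static void list_return_mt(free_list_t *fl, item_t *it)
{
    while (atomic_flag_test_and_set(&fl->lock)) { /* spin */ }
    list_return_st(fl, it);
    atomic_flag_clear(&fl->lock);
}

/* Run-time choice, for lists that may or may not be shared. */
static item_t *list_get(free_list_t *fl, bool using_threads)
{
    return using_threads ? list_get_mt(fl) : list_get_st(fl);
}
```

A list touched by only one thread can call the _st functions directly
and skip the lock; a list that is always shared can call _mt directly
and skip the run-time check.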
Nadezhda Kogteva
c4d6ca6468
grpcomm brks: fix copy-paste bug which affects performance
2015-02-24 17:06:39 +02:00
Jeff Squyres
226a814c9d
grpcomm_brks: fix minor compiler warning (rc used before set)
...
Also check for OBJ_NEW returning NULL.
2015-02-23 09:04:45 -08:00
Jeff Squyres
600858609e
grpcomm_rcd: fix minor compiler warning (rc used before set)
...
Also check for OBJ_NEW returning NULL.
2015-02-23 09:03:07 -08:00
Howard Pritchard
bf89131f9e
add owner files to opal/ompi/orte mca directories
...
This commit adds an owner file in each of the component directories
for each framework. This allows for a simple script to parse
the contents of the files and generate, among other things, tables
to be used on the project's wiki page. Currently there are two
"fields" in the file, an owner and a status. A tool to parse
the files and generate tables for the wiki page will be added
in a subsequent commit.
2015-02-22 15:10:23 -07:00
Jeff Squyres
ec62766a71
notifier base: remove unused variables
2015-02-20 07:06:13 -08:00
Elena
48eae25b8f
Fix an issue in the grpcomm rcd and brks algorithms that hurt performance: during fence, data for only some of the processes was unpacked and stored locally, so clients were forced to ask the daemons for data directly on each get request
2015-02-20 16:41:25 +02:00
Ralph Castain
852fbca020
Shut coverity up
2015-02-17 21:17:23 -08:00
Ralph Castain
207cc74f87
Correct name of help file
2015-02-17 16:03:20 -08:00
Ralph Castain
78245e8a33
Continue massaging of the notifier framework. Convert it to an event-driven interface. Add the ability to report job state if requested. Cleanup object declarations.
2015-02-17 12:51:11 -08:00
Ralph Castain
22f1d29b82
Re-introduce the ORTE notifier framework for logging errors that would otherwise result in abort for persistent systems. Thanks to L. Rajeshnarayanan of Intel for the contribution
...
Subsequent commits will integrate this capability with the state and errmgr frameworks.
2015-02-16 12:46:58 -08:00
Ralph Castain
116fcaff2c
Start adding support for cmd line options to orte-submit
2015-02-10 12:13:21 -08:00
Ralph Castain
063e4c9989
Clean up the pretty-print of odls cmds as some were missing. Add a new cmd to terminate the DVM, which the HNP will use to turn around and issue an xcast to the DVM.
2015-02-10 08:27:13 -08:00
Ralph Castain
3ae3b96c17
Fix master compilation - a buried header dependency must have been removed.
2015-02-10 07:22:10 -08:00
Howard Pritchard
b62d9c2c70
ess/alps: fix compile issue for pgi
...
remove -fi-noident cflag option. Wasn't helping anyway
and caused pgi compiles to break.
2015-02-09 20:49:04 -08:00
Ralph Castain
a3275aa867
Once again, fix the blasted singleton comm_spawn
2015-02-05 17:34:25 -08:00
Ralph Castain
f28238af59
Fix a race condition seen by Absoft during finalize. Stop the orte progress thread without cleaning it up, thus allowing the frameworks to still cancel their posted recv's. Then cleanup the memory footprint afterwards.
2015-02-05 11:41:37 -08:00
Jeff Squyres
938b8e1dad
schizo: fix free of uninitialized value
...
The "param" value is not assigned before this free() statement. So
remove it.
(yay clang compiler warnings)
2015-02-04 15:50:24 -05:00
Ralph Castain
251084a2da
When a tool requests the spawn of a new job, then exclusively forward output to that tool - the DVM should not output its own copy as well.
2015-02-04 07:59:47 -08:00
Ralph Castain
2b0b012460
Continue refinement of the DVM operations. Send the spawn request to the right place (it helps) as it isn't a comm_spawn request and has to be treated a little differently. Ensure IO gets forwarded back to the tool. Ensure the tool outputs show_help locally as there is no place to send it.
2015-02-04 06:21:54 -08:00
Ralph Castain
7299cc3ab9
Cleanup the communications handshake so that orte-submit properly terminates upon job completion, and properly sends the terminate command to orte-dvm
2015-02-03 07:25:43 -08:00
Ralph Castain
ec5ccb76cf
Enable persistent ORTE DVM so users can execute multiple OMPI jobs within an allocation without restarting the DVM every time.
2015-01-30 11:00:43 -08:00
Ralph Castain
b838df9eb8
Get slurm to stay out of the way on singletons
2015-01-27 09:29:43 -06:00
Ralph Castain
294ebc907a
Fix singleton operations so they can work inside a slurm environment
2015-01-27 09:29:42 -06:00
Ralph Castain
3eca55caec
Continue fixing singletons in slurm environments
2015-01-27 09:29:42 -06:00
Ralph Castain
88c38f87d2
Get the orteds to use schizo as well
2015-01-27 09:29:42 -06:00
Ralph Castain
028b00154d
Complete implementation of the schizo framework to support OMPI component
2015-01-27 09:29:42 -06:00
Ralph Castain
11c92eefe6
ckpt
2015-01-27 09:29:42 -06:00