A bunch of fixes from the /tmp/iof-fixes branch that fix up ''some''
(but not ''all'') of the problems that we have seen with iof:
* Reading very large files via stdin redirected to orteun (Sun saw
this)
* Reading a little bit of a large file redirected to orterun's stdin
and then either closing stdin or exiting the process
The Big Change was to make the proxy iof (the one running in non-HNP
orteds) send back a "I'm closing the stream" ACK back to the service
iof. This tells the HNP that there will be nothing more coming from
that peer, and therefore the iof forward should be removed.
Many other minor cleanups/fixes, terminology changes, and
documentation additions are included in this commit as well. However,
there are still some pretty big outstanding issues with IOF that are
not addressed either by #967 or this commit. A few examples:
* IOF was designed to allow multiple subscribers to a single stream.
We're not entirely sure that this works (for one thing, there is
nothing in the ORTE/OMPI code base that uses this functionality).
* There are also resources leaked when processes/jobs exit (per
Ralph's first comment on this ticket).
* There is no feedback to close orterun's stdin when all subscribers
to the corresponding stream have closed stdin.
This commit was SVN r14967.
The following Trac tickets were found above:
Ticket 967 --> https://svn.open-mpi.org/trac/ompi/ticket/967
Accordingly, there are new APIs to the name service to support the ability to get a job's parent, root, immediate children, and all its descendants. In addition, the terminate_job, terminate_orted, and signal_job APIs for the PLS have been modified to accept attributes that define the extent of their actions. For example, doing a "terminate_job" with an attribute of ORTE_NS_INCLUDE_DESCENDANTS will terminate the given jobid AND all jobs that descended from it.
I have tested this capability on a MacBook under rsh, Odin under SLURM, and LANL's Flash (bproc). It worked successfully on non-MPI jobs (both simple and including a spawn), and MPI jobs (again, both simple and with a spawn).
This commit was SVN r12597.
- move files out of toplevel include/ and etc/, moving it into the
sub-projects
- rather than including config headers with <project>/include,
have them as <project>
- require all headers to be included with a project prefix, with
the exception of the config headers ({opal,orte,ompi}_config.h
mpi.h, and mpif.h)
This commit was SVN r8985.
the svc component so that it can disable the rml exception callback, fixing
a race condition in the shutdown mechanism of orte.
This should probably go to the v1.0 branch.
This commit was SVN r8893.