This completes the minor changes required to the PLS components: a small change to the parameter list of the orted cmd functions. I also made the change for xcpu and poe, in addition to the components listed in my email, so I think that leaves only xgrid unconverted.
The orted fail-to-start mods will also make changes in the PLS components, but those can be localized so they come in one at a time.
This commit was SVN r14499.
Test for system limits (where known) prior to doing things like fork and pipe since some systems aren't very nice about it when we try to exceed such limits.
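A minimal sketch of this kind of pre-check, assuming POSIX getrlimit() and sysconf(). The helper names fd_limit_allows and child_limit are hypothetical and are not taken from the ORTE code; an unknown limit is treated as "no check".

```c
#include <stdbool.h>
#include <sys/resource.h>
#include <unistd.h>

/* Hypothetical helper: returns true if opening `needed` more file
 * descriptors, on top of `in_use` already open, stays within the soft
 * RLIMIT_NOFILE limit. An unknown limit does not block the caller. */
bool fd_limit_allows(long in_use, long needed)
{
    struct rlimit rl;

    if (in_use < 0 || needed < 0) {
        return false;               /* nonsensical request */
    }
    if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
        return true;                /* limit not known on this system */
    }
    if (rl.rlim_cur == RLIM_INFINITY) {
        return true;                /* no limit configured */
    }
    return (rlim_t)in_use + (rlim_t)needed <= rl.rlim_cur;
}

/* Hypothetical helper: per-user limit on simultaneous child processes,
 * or -1 if the system does not report one. */
long child_limit(void)
{
    return sysconf(_SC_CHILD_MAX);
}
```

A caller would check fd_limit_allows(open_fds, 2) before each pipe() and compare its own count of live children against child_limit() before each fork(), backing off instead of making the call when the check fails.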
This commit was SVN r14494.
There is a binomial algorithm in the code (i.e., the HNP would send to a subset of the orteds, which then relay it on according to the typical log-2 algo), but that has a bug in it so the code won't let you select it even if you tried (and the mca param doesn't show, so you'd *really* have to try).
This also involved a slight change to the oob.xcast API, so propagated that as required.
Note: this has *only* been tested on rsh, SLURM, and Bproc environments (now that it has been transferred to the OMPI trunk, I'll need to re-test it [only done rsh so far]). It should work fine on any environment that uses the ORTE daemons - anywhere else, you are on your own... :-)
Also, correct a mistake where the orte_debug_flag was declared an int, but the mca param was set as a bool. Move the storage for that flag to the orte/runtime/params.c and orte/runtime/params.h files appropriately.
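For readers unfamiliar with the binomial relay mentioned above, here is a textbook sketch of how each node can compute which peers it relays to. It is not the ORTE implementation (which, as noted, has a bug and cannot be selected). It assumes rank 0 is the HNP and ranks 1..n-1 are the orteds; the function names are hypothetical.

```c
/* Fill `children` with the ranks that `rank` relays to in a binomial
 * tree of `n` nodes rooted at rank 0, and return how many there are.
 * `children` needs room for one entry per bit of n (32 is plenty for
 * an int). Assumes n is small enough that `bit` does not overflow. */
int binomial_children(int rank, int n, int *children)
{
    int count = 0;

    for (int bit = 1; bit < n; bit <<= 1) {
        /* A node forwards only on bits above its own highest set bit. */
        if (bit > rank && rank + bit < n) {
            children[count++] = rank + bit;
        }
    }
    return count;
}

/* Return the rank that `rank` receives from, or -1 for the root. */
int binomial_parent(int rank)
{
    int high = 0;

    if (rank <= 0) {
        return -1;
    }
    for (int bit = 1; bit <= rank; bit <<= 1) {
        high = bit;                 /* highest power of two <= rank */
    }
    return rank - high;
}
```

With n = 8, rank 0 sends to 1, 2 and 4; rank 1 relays to 3 and 5; rank 2 to 6; rank 3 to 7. Every node receives the message exactly once, after at most three hops.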
This commit was SVN r14475.
This bug(?) became apparent after the installdirs commit: these tools
were not finding the proper libraries because the paths were wonky.
It all looks good now. :)
This commit was SVN r14461.
finally brings in functionality that is already on the 1.2 branch, and
was developed and tested in the v1.2ofed branch (and other places).
Short version of new features:
* Support for ibv_fork_init()
* Automatically fill in the openib BTL bandwidth value by
querying the HCA port
* Installdirs functionality
* Fixes to always use -I in the Fortran wrapper compilers (#924)
* Gleb's mpool updates
* Remove some kruft in btl/openib/configure.m4, therefore
fixing the harmless warnings noted in #665
* Bunches of updates to the Linux RPM spec file
I.e., effectively the same thing that r14411 brought to the v1.2
branch.
Also effectively brought in r14432 and r14433 (some fixes on top of
the original r14411 commit to v1.2). Still need to bring in the moral
equivalent of r14445 after this commit (fixes to installdirs).
This commit was SVN r14449.
The following SVN revision numbers were found above:
r14411 --> open-mpi/ompi@83b31314ae
r14432 --> open-mpi/ompi@a48f160595
r14433 --> open-mpi/ompi@68f346d2bc
r14445 --> open-mpi/ompi@13d366b827
Create a sentinel value in the metadata file to clearly indicate
that the sequence number is complete (versus in progress). This
way we do not try to restart from an invalid sequence number
which can lead to badness.
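The idea can be sketched as follows. The record format ("# SEQ n") and the sentinel string ("# DONE") are hypothetical, chosen only for illustration; the actual metadata file layout is not shown in this message.

```c
#include <stdio.h>
#include <string.h>

#define SEQ_DONE_SENTINEL "# DONE"  /* hypothetical completion marker */

/* Record that sequence number `seq` has been started. */
int seq_begin(FILE *meta, int seq)
{
    return fprintf(meta, "# SEQ %d\n", seq) < 0 ? -1 : 0;
}

/* Write the sentinel once everything for the current sequence number
 * is in place, and flush so it reaches the file. */
int seq_finish(FILE *meta)
{
    if (fprintf(meta, "%s\n", SEQ_DONE_SENTINEL) < 0) {
        return -1;
    }
    return fflush(meta) == 0 ? 0 : -1;
}

/* Return the last sequence number that is followed by the sentinel,
 * or -1 if none is complete. In-progress entries are ignored. */
int last_complete_seq(FILE *meta)
{
    char line[256];
    int pending = -1, complete = -1, seq;

    rewind(meta);
    while (fgets(line, sizeof line, meta) != NULL) {
        if (sscanf(line, "# SEQ %d", &seq) == 1) {
            pending = seq;
        } else if (pending >= 0 &&
                   strncmp(line, SEQ_DONE_SENTINEL,
                           strlen(SEQ_DONE_SENTINEL)) == 0) {
            complete = pending;
            pending = -1;
        }
    }
    return complete;
}
```

On restart, a reader that only trusts last_complete_seq() never picks up a sequence number whose data was still being written when the process died.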
This commit was SVN r14423.
the program it tries to spawn is missing.
Description of the problem: When the rsh pls tries to spawn a local
process whose executable is missing (such as a removed orted), orterun
deadlocks.
Description of the fix: The forked child is responsible for finding the
program to be executed. If it fails to find it, then instead of
calling exit (as a forked child is expected to do) it
continues executing along a path it was never
expected to use (back into orterun and then main). Bad things
happen, as expected. Forcing the child to call exit when it fails
to find the orted (and forcing the child to use exit everywhere
instead of return) corrects the logic of the rsh pls and makes it
behave as expected.
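The pattern described above, as a minimal standalone sketch (not the rsh pls code itself). The function name spawn_and_wait and the exit code 127 are illustrative choices. The sketch uses _exit() rather than exit() so that the child does not run the parent's atexit handlers or flush stdio buffers inherited across fork().

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork, exec argv[0] in the child, and wait for it in the parent.
 * Returns the child's exit status, or -1 on fork/wait failure or if
 * the child was killed by a signal. */
int spawn_and_wait(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        /* Child: execvp only returns if the program could not be run. */
        execvp(argv[0], argv);
        /* Never return from here: returning would send the child back
         * up the parent's call stack. Terminate instead. */
        _exit(127);
    }

    /* Parent: collect the child's status. */
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

When argv[0] names a program that does not exist, the parent sees status 127 and can report the failure, instead of ending up with two processes both running the parent's code.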
This commit was SVN r14377.
Per discussions with Brian and Ralph, make a slight correction in
where components are installed. Use $pkglibdir, not $libdir/openmpi,
so that when compiled in the orte trunk, components are installed to
the right directory (because the component search path is checking
$pkglibdir).
This commit was SVN r14345.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r14289
follow the same behavior as before; the changes just make sure the check is done
in linear time and the memory usage is kept to a minimum.
This commit was SVN r14331.
as the IOF components lack the required finalize function. Instead, rely on the
module finalize. See the comment for more information.
This commit was SVN r14323.
Collect the base 'orted' command line into a base function since most of the
PLS components were duplicating this code. Add AMCA parameter command line
component to the base set.
Add Aggregate MCA parameter support to the following PLS components:
- gridengine
- process
- slurm
- poe
- tm
Improve support for 'rsh' component.
Did/could not support the following components:
- bproc
- proxy
- xcpu
- cnos
- xgrid
The above components had peculiar needs that made it non-trivial to add an
option. The authors of these components need to help in supporting this
new option.
I was only able to test the SLURM and RSH components due to system availability.
The others should work without problem.
This commit was SVN r14284.
The following Trac tickets were found above:
Ticket 976 --> https://svn.open-mpi.org/trac/ompi/ticket/976