Allow the POE RAS to be compiled for Linux as well as AIX.
The POE RAS is really a LoadLeveler RAS, and IU now has
a cluster that uses LoadLeveler in a Linux environment (BigRed).
This seems to be the only thing we need to do so far to run
Open MPI on BigRed. Yay :)
This commit was SVN r11600.
set to 1 or 0 instead of the user defined number or default (128).
This caused the PLS to deadlock when using '--debug-daemons' with
more than 2 processes. :(
svn blame says that it was broken in r11347
It is *not* a problem on v1.1 or v1.2 branches.
Bug spotted by Tim Mattox and myself.
This commit was SVN r11575.
The following SVN revision numbers were found above:
r11347 --> open-mpi/ompi@f52c10d18e
create a process component which uses CreateProcess to spawn the child.
Special care should be taken in order to correctly redirect the stdin,
stdout and stderr of the child process.
This commit was SVN r11405.
- Remove extra NULL argument from rsh module.
This commit was SVN r11377.
The following SVN revision numbers were found above:
r11347 --> open-mpi/ompi@f52c10d18e
- use the OPAL functions for PATH and environment variables
- make all headers C++ friendly
- no unnamed structures
- no implicit casts.
Plus a full implementation for the orte_wait functions.
This commit was SVN r11347.
different macros, one for each project. Therefore, now we have OPAL_DECLSPEC,
ORTE_DECLSPEC and OMPI_DECLSPEC. Please use them based on the sub-project.
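
A rough sketch of why one macro per project helps: on Windows a symbol is
typically exported while its own library is being built and imported
everywhere else, so a single shared macro cannot serve a header from one
sub-project that is included by another. Everything named EXAMPLE_* or
example_* below is a hypothetical placeholder, not taken from the tree, and
the real definitions may well differ.

```c
/* Hypothetical sketch: all EXAMPLE_* and example_* names are assumptions. */

/* Normally the build system would define each of these only while
 * compiling the matching library; they are set here so the file
 * stands alone. */
#define EXAMPLE_BUILDING_OPAL 1
#define EXAMPLE_BUILDING_ORTE 1
#define EXAMPLE_BUILDING_OMPI 1

#if defined(_WIN32)
#  define EXAMPLE_EXPORT __declspec(dllexport)
#  define EXAMPLE_IMPORT __declspec(dllimport)
#else
#  define EXAMPLE_EXPORT
#  define EXAMPLE_IMPORT
#endif

/* One macro per sub-project, chosen independently of the others. */
#if defined(EXAMPLE_BUILDING_OPAL)
#  define OPAL_DECLSPEC EXAMPLE_EXPORT
#else
#  define OPAL_DECLSPEC EXAMPLE_IMPORT
#endif

#if defined(EXAMPLE_BUILDING_ORTE)
#  define ORTE_DECLSPEC EXAMPLE_EXPORT
#else
#  define ORTE_DECLSPEC EXAMPLE_IMPORT
#endif

#if defined(EXAMPLE_BUILDING_OMPI)
#  define OMPI_DECLSPEC EXAMPLE_EXPORT
#else
#  define OMPI_DECLSPEC EXAMPLE_IMPORT
#endif

/* Each declaration uses the macro of the sub-project that owns it. */
OPAL_DECLSPEC int example_opal_layer(void);
ORTE_DECLSPEC int example_orte_layer(void);
OMPI_DECLSPEC int example_ompi_layer(void);

int example_opal_layer(void) { return 1; }
int example_orte_layer(void) { return 2; }
int example_ompi_layer(void) { return 3; }
```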
This commit was SVN r11270.
Other changes:
1. Remove the old xcpu components as they are not functional.
2. Fix a "bug" in orterun whereby we called dump_aborted_procs even when we terminated normally. There is still some kind of bug in this procedure, however, as we appear to be calling the orterun job_state_callback function every time a process terminates (instead of only once when they have all terminated). I'll continue digging into that one.
This will require an autogen/configure, I'm afraid.
This commit was SVN r11228.
Clean up the remainder of the size_t references in the runtime itself. Convert to orte_std_cntr_t wherever it makes sense (only avoid those places where the actual memory size is referenced).
Remove the obsolete oob barrier function (we actually obsoleted it a long time ago - just never bothered to clean it up).
I have done my best to go through all the components and catch everything, even if I couldn't test compile them since I wasn't on that type of system. Still, I cannot guarantee that problems won't show up when you test this on specific systems. Usually, these will just show as "warning: comparison between signed and unsigned" notes which are easily fixed (just change a size_t to orte_std_cntr_t).
In some places, people didn't use size_t, but instead used some other variant (e.g., I found several places with uint32_t). I tried to catch all of them, but...
Once we get all the instances caught and fixed, this should once and for all resolve many of the heterogeneity problems.
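
A minimal sketch of the kind of change described above. It assumes
orte_std_cntr_t is a fixed-width signed 32-bit integer (an assumption made
here for illustration); the two helper functions are hypothetical and are
not part of the runtime.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: a fixed-width signed counter standing in for orte_std_cntr_t. */
typedef int32_t orte_std_cntr_t;

/* Counters and indices use orte_std_cntr_t. With "size_t i" here, the
 * comparison "i < n" would mix signed and unsigned operands and draw the
 * "comparison between signed and unsigned" warning. */
orte_std_cntr_t example_count_matches(const int *values,
                                      orte_std_cntr_t n, int key)
{
    orte_std_cntr_t i;
    orte_std_cntr_t hits = 0;

    for (i = 0; i < n; i++) {
        if (values[i] == key) {
            hits++;
        }
    }
    return hits;
}

/* Actual memory sizes stay size_t, since that is what sizeof and the
 * allocation functions work with. */
size_t example_buffer_bytes(orte_std_cntr_t n)
{
    return (size_t)n * sizeof(int);
}
```

Because size_t can differ in width between 32-bit and 64-bit systems while a
fixed-width type does not, counters declared this way keep the same size on
every system.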
This commit was SVN r11204.
Fixed a few very minor compiler complaints in the pls_gridengine_module.c file. ISO C is less forgiving about where variables get declared.
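
For illustration only: a common cause of this kind of complaint is a
declaration placed after a statement, which C90 does not allow. Whether that
was the exact issue in pls_gridengine_module.c is an assumption, and the
function below is a hypothetical example, not code from that file.

```c
/* C90 style: every declaration in a block comes before its first statement. */
int example_sum_to(int n)
{
    int i;          /* declared up front ... */
    int total = 0;  /* ... rather than at the point of first use */

    for (i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}
```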
This commit was SVN r11156.
- indent / whitespace cleanup
- don't set --daemon-debug when pls debug is given, as it seems to make
the daemons abort.
This commit was SVN r11113.
The following SVN revision numbers were found above:
r11109 --> open-mpi/ompi@da7df6d257
correctly with MPI_Comm_spawn.
The problem with MPI_Comm_spawn was that the 'parent' process was
rmgr.create'ing and then rmgr.launch'ing the children via the rmgr proxy
component. The HNP saw these commands and processed them normally, but
since we never went through the HNP's rmgr (urm component) spawn()
logic the triggers and key/value pairs were never created. So the
children were launched correctly, but since the HNP did not
have any triggers set up, it never triggered the xcast for the
children to finish orte_init().
This fix puts the trigger and key/value pair initialization in
rmgr_urm_spawn() for the 'mpirun a.out' case, *and* in the
rmgr_base_unpack routine that deals with the creation of the
job for the child as requested by the proxy component. This
will allow the triggers to be registered for the proxy's request,
which only happens during MPI_Comm_spawn*.
Small change for a lot of debugging. Notice that this reverts r11037
to its previous version, and adds a newline to handle the spawn
cases.
This commit was SVN r11046.
The following SVN revision numbers were found above:
r11037 --> open-mpi/ompi@5813fb7d2a
Reverting this file (changeset from commit r10708) to its previous
version fixes the problem.
This should be moved to the v1.1 branch where it is also broken.
This commit was SVN r11037.
The following SVN revision numbers were found above:
r10708 --> open-mpi/ompi@febc143d8c