1
1

15 Коммитов

Автор SHA1 Сообщение Дата
Sven Stork
a86deb460e - export required symbols
This commit was SVN r13810.
2007-02-27 09:43:32 +00:00
Rich Graham
f6c99d0207 set orte_odls_base.components_available to false if no odls components are
available.  Startup now works if no odls components are availble.

This commit was SVN r13339.
2007-01-27 15:37:13 +00:00
Ralph Castain
8ca415a0c5 Remove duplicate orte_odls declaration
This commit was SVN r12776.
2006-12-06 13:44:41 +00:00
Ralph Castain
897744cdeb Two major changes to the runtime:
1. implement and enable the non-described buffer operations. I will send out a more detailed explanation separately. However, this mode of operation (which is now the default) significantly reduces message size during startup. If you want the described buffers, set the mca param "-mca dss_describe_buffer 1".

2. revise the xcast system to support both linear and binomial tree broadcast methods. Since we are seeing scenarios where the binomiall tree can cause problems, I have made the linear method the default. To run with the binomial tree, set the mca param "-mca oob_xcast_mode binomial".

3. add some detailed timing reports to the xcast operation. These are enabled via "-mca oob_xcast_timing 1".

4. add some more unit tests for the dss and gpr (focused on support for the non-described buffer)

This commit was SVN r12722.
2006-12-01 22:30:39 +00:00
Ralph Castain
ca5b4358fa Need to revise the display-map-at-launch option so it is active not only for the initial launch, but applies to any subsequent comm_spawn events too.
Add placeholders for the new orte tools. These don't actually do anything yet - in fact, I have set the .ompi_ignore so that you won't compile them (I have set a .ompi_unignore for me). Please let me know if you encounter any trouble with this - the ompi_ignore's should protect everyone.

This commit was SVN r12616.
2006-11-17 02:58:46 +00:00
Ralph Castain
f7fc19a2ca Create the ability to re-use existing daemons. Included in the commit:
1. new functionality in the pls base to check for reusable daemons and launch upon them

2. an extension of the odls API to allow each odls component to build a notify message with the "correct" data in it for adding processes to the local daemon. This means that the odls now opens components on the HNP as well as on daemons - but that's the price of allowing so much flexibility. Only the default odls has this functionality enabled - the others just return NOT_IMPLEMENTED

3. addition of a new command line option "--reuse-daemons" to orterun. The default, for now, is to NOT reuse daemons. Once we have more time to test this capability, we may choose to reverse the default. For one thing, we probably want to investigate the tradeoffs in start time for comm_spawn'd processes that reuse daemons versus launch their own. On some systems, though, having another daemon show up can cause problems - so they may want to set the default as "reuse".

This is ONLY enabled for rsh launch, at the moment. The code needing to be added to each launcher is about three lines long, so I'll be doing that as I get access to machines I can test it on.

This commit was SVN r12608.
2006-11-15 21:12:27 +00:00
Ralph Castain
4636125e2d Modify the RMGR components to allow job setup with a given jobid, and add another attribute so that we can setup triggers without launching.
Add some debugging output to the ODLS default module, and the orted.

Remove the nodename data from the ODLS info report - that info is already stored in the registry by the RMAPS framework upon completing the mapping procedure.

Add another test program that does an ORTE-only dynamic spawn (gasp!). Looks just like comm_spawn - just no MPI involved.

Modify the ODLS to release the processor when we "kill" local procs in a more scalable fashion. It previously had a sleep in it that Jeff's prior commit removed. However, he introduced some Windows code into the non-Windows component (protected by "if"s, but unnecessary). This is a more general solution he proposed - included here so I could get things to compile properly.

This commit was SVN r12579.
2006-11-13 18:51:18 +00:00
Ralph Castain
4e50cdae52 This commit accomplishes two things:
1. Fix the "hang" condition when an application isn't found. It turned out that the ODLS had some difficulty with the process actually not having been started - hence, it never called the waitpid callback. As a result, the "terminated" trigger didn't fire, and so mpirun didn't wake up. With this change, the HNP's errmgr forces the issue by causing the trigger to fire itself when an abort condition occurs.

2. Shift the recording of the pid and the nodename from mpi_init to the orted launcher. This allows programs such as Eclipse PTP to get the pids even for non-MPI applications. In the case of bproc, the pls handles this chore since we don't use orteds in that system.

This commit was SVN r12558.
2006-11-11 04:03:45 +00:00
Ralph Castain
c07d4e2510 Cleaner rendition now extended to other environments. Remove MCA params for backend procs that can cause trouble. Specifically, any directives on the selection of components for RDS, RAS, RMAPS, PLS, and RMGR can be bad mojo on the backend.
This patch will cause a problem for cnos, however, as there we want to specifically tell the backends to be "null". I'm working on that issue.

This commit was SVN r12225.
2006-10-20 16:50:13 +00:00
George Bosilca
7982a23bde ORTE_DECLSPEC should be ...
This commit was SVN r12206.
2006-10-20 02:23:54 +00:00
Ralph Castain
ab196c3121 Okay, this fixes the problem of MCA params spreading too far. Sorry for the multiple corrections.
This commit was SVN r12201.
2006-10-19 22:51:02 +00:00
George Bosilca
017af37291 Keep only the useful _DECLSPEC and _DECLSPEC some globals.
This commit was SVN r12042.
2006-10-06 07:24:34 +00:00
George Bosilca
b7579b09c7 Correctly handle the ORTE_DECLSPEC attribute.
This commit was SVN r12041.
2006-10-06 07:08:17 +00:00
George Bosilca
fd76e56279 One protection against C++ compilers is more than enough.
This commit was SVN r11998.
2006-10-05 05:24:43 +00:00
Ralph Castain
37dfdb76eb Here is the major MAD-cure commit. I have written plenty about it, so I refer you here to those messages for a description of everything that was done.
This commit was SVN r11661.
2006-09-14 21:29:51 +00:00