George Bosilca
ed3caa9fc7
mca_base_param_reg_int requires a pointer to a component, not a name.
...
This commit was SVN r12281.
2006-10-24 16:41:51 +00:00
George Bosilca
7dec7731ce
Instead of size_t use orte_std_cntr_t. Remove all warnings.
...
This commit was SVN r12280.
2006-10-24 16:40:49 +00:00
Ralph Castain
8636ac6a4d
Fix ticket 353 - print out a nice message that the combination of debug-daemons and num_concurrent in the pls rsh launcher will cause deadlock, and then exit
...
This commit was SVN r12279.
2006-10-24 15:59:02 +00:00
Tim Prins
cb622db7c9
Fixes trac:352
...
Only close off stdout/stderr from the daemons if we are not debugging the slurm pls and --debug-daemons was not passed.
This commit was SVN r12276.
The following Trac tickets were found above:
Ticket 352 --> https://svn.open-mpi.org/trac/ompi/ticket/352
2006-10-24 13:05:13 +00:00
Tim Prins
93d61d01fb
Fix for a problem on SLURM we have been having since r12243 where mpirun would hang after the process had finished. It turns out that we were always reporting the name of the daemon wrong, but we simply never noticed as we never used it, until r12243. This makes it so we report the name of the daemon correctly.
...
This commit was SVN r12274.
The following SVN revision numbers were found above:
r12243 --> open-mpi/ompi@153e38ffc9
2006-10-24 01:41:28 +00:00
George Bosilca
2a863df0a5
Newline is required by some compilers at the end of a file.
...
This commit was SVN r12244.
2006-10-21 05:56:04 +00:00
Ralph Castain
153e38ffc9
Lesson to be learned: if you send an ack to a recv'd command, better not send it to the same tag it came from - at least, not if there is a persistent recv on that tag!
...
Fix the persistent daemon problem where it was exiting when a job completed. Problem was that the persistent daemon would order the job daemons to exit. They would then send an 'ack' back to the persistent daemon - but the ack consisted of an echo of the "exit" command, which was recv'd by the wrong listener who treated it as a properly sent cmd....and exited.
This commit was SVN r12243.
2006-10-21 02:53:19 +00:00
Ralph Castain
c07d4e2510
Cleaner rendition now extended to other environments. Remove MCA params for backend procs that can cause trouble. Specifically, any directives on the selection of components for RDS, RAS, RMAPS, PLS, and RMGR can be bad mojo on the backend.
...
This patch will cause a problem for cnos, however, as there we want to specifically tell the backends to be "null". I'm working on that issue.
This commit was SVN r12225.
2006-10-20 16:50:13 +00:00
Ralph Castain
02efd07b60
Fix the MCA param passing issue, at least for rsh at the moment. I will clean this up and move it to the other environments once I shift back to a local computer.
...
This commit was SVN r12224.
2006-10-20 15:27:29 +00:00
Ralph Castain
b07a6b1d7a
Fix a major typo that caused remote launch to crash - had something inside the wrong brace
...
This commit was SVN r12221.
2006-10-20 14:30:23 +00:00
George Bosilca
2aa3e51223
Nothing relevant. Only a set of castings to have a clean compile on
...
Windows. The cl.exe compiler is pretty good at complaining about
any kind of non-explicit cast.
This commit was SVN r12207.
2006-10-20 02:25:50 +00:00
Ralph Castain
3f55d6897a
Remove the memory debugging options. Fix what appears to be a typo in a help file.
...
This commit was SVN r12107.
2006-10-12 00:44:48 +00:00
Brian Barrett
29c91cf2f3
* Fix issue in odls_bproc where we were using vpid instead of the number of
...
processes launched locally for the stdio file names. This was causing
the expected files to not exist and bproc_vexecmove_io to fail.
* Clean up a bunch of debugging output in the bproc pls
This commit was SVN r12102.
2006-10-11 20:34:12 +00:00
Ralph Castain
f91a95b3fe
Fix the bug that caused mpirun to hang when a remote executable wasn't found using the rsh launcher. Will now test on a remote node
...
This commit was SVN r12095.
2006-10-11 18:43:13 +00:00
Ralph Castain
2da8245be0
Correctly propagate no-daemonize
...
This commit was SVN r12093.
2006-10-11 17:53:17 +00:00
Ralph Castain
e5877cc459
Add the proper valgrind params
...
This commit was SVN r12092.
2006-10-11 17:48:41 +00:00
George Bosilca
b56636c855
orte_pls belongs to the PLS framework, therefore it should only be defined
...
in pls.h.
This commit was SVN r12089.
2006-10-11 17:12:22 +00:00
Ralph Castain
27e305347c
Add a couple of options to orterun that support debugging of daemons for memory corruption.
...
Ensure that the environment provided to local application processes isn't "polluted" by the orteds
This commit was SVN r12087.
2006-10-11 15:18:57 +00:00
Brian Barrett
f5b8f1f2f0
Work around Automake not knowing how to properly configure libtool to build
...
Objective C libraries
Refs trac:483
This commit was SVN r12080.
The following Trac tickets were found above:
Ticket 483 --> https://svn.open-mpi.org/trac/ompi/ticket/483
2006-10-10 20:14:26 +00:00
Ralph Castain
0e9dc590b7
Fix typo that didn't make it over from testing on vogon
...
This commit was SVN r12068.
2006-10-09 20:37:39 +00:00
Ralph Castain
cebdc51762
Remove a debugging output
...
This commit was SVN r12066.
2006-10-09 02:10:52 +00:00
Ralph Castain
2e09128337
Many thanks to Jeff for tracking down the typo causing the orte_job_map_t destructor to fail!!
...
Restore the OBJ_RELEASE calls to cleanup map objects.
This commit was SVN r12064.
2006-10-07 22:44:00 +00:00
Jeff Squyres
efe28d62e9
Fix some compiler errors. I have *not* checked this for correctness;
...
but it does now compile.
This commit was SVN r12062.
2006-10-07 19:10:56 +00:00
Ralph Castain
ae79894bad
Bring the map fixes into the main trunk. This should fix several problems, including the multiple app_context issue.
...
I have tested on rsh, slurm, bproc, and tm. Bproc continues to have a problem (will be asking for help there).
Gridengine compiles but I cannot test (believe it likely will run).
Poe and xgrid compile to the extent they can without the proper include files.
This commit was SVN r12059.
2006-10-07 15:45:24 +00:00
George Bosilca
cda46efd2a
Some missing DECLSPEC
...
This commit was SVN r12047.
2006-10-06 15:21:52 +00:00
George Bosilca
c79c436c8d
Cleanups. Remove all __WINDOWS__ checks as this module will never
...
get compiled on Windows.
This commit was SVN r12011.
2006-10-05 06:17:30 +00:00
George Bosilca
dbe7f8ac32
Always return bool.
...
This commit was SVN r12009.
2006-10-05 05:45:18 +00:00
George Bosilca
03083cc1f6
Don't release values[0] before it gets initialized.
...
This commit was SVN r11999.
2006-10-05 05:25:18 +00:00
Ralph Castain
cd7d87aa7b
Define the map data types for dss compatibility. Setup to debug bproc
...
This commit was SVN r11955.
2006-10-03 17:40:00 +00:00
Ralph Castain
9eb14425b7
The last of the debug messages that keep hiding. My apologies.
...
This commit was SVN r11937.
2006-10-02 18:43:32 +00:00
Ralph Castain
3fd67a038f
Bring comm_spawn and persistent operations online. Still some minor problems, though - so don't use them yet, please.
...
This won't affect anything except those two scenarios.
This commit was SVN r11936.
2006-10-02 18:29:15 +00:00
Ralph Castain
12328395ae
Missed a couple of debug statements
...
This commit was SVN r11935.
2006-10-02 15:46:41 +00:00
Ralph Castain
7494a7a83f
Clean out some debugging statements that were inadvertently left in the commit
...
This commit was SVN r11933.
2006-10-02 15:03:18 +00:00
Ralph Castain
559b9b0ae8
Continue beating on comm_spawn. Setup to debug bproc.
...
This commit was SVN r11932.
2006-10-02 14:58:22 +00:00
Ralph Castain
65593cd67e
Fix a few Cyrador warnings
...
This commit was SVN r11930.
2006-10-02 13:00:32 +00:00
Ralph Castain
121f834776
Continue bringing comm_spawn back online. Ensure all RM frameworks post their HNP receives. Fix the rmgr proxy component.
...
Still need some work on the proxy component, and on job termination for persistent daemon case.
This commit was SVN r11928.
2006-10-02 00:46:31 +00:00
Brian Barrett
e464adcd51
Need to tell the daemon how many procs it will start (always 1, because of
...
the way we do the fake mapping thing).
This commit was SVN r11924.
2006-10-01 23:25:22 +00:00
Brian Barrett
95ba51fbd4
* Clean up debugging output so that it's useful
...
* Error message in an NSError object is localizedDescription, not
localizedErrorReason. The latter is a description of how the error
can occur, which is usually nothing in XGrid frameworks.
* Clean up silly error in finding the Kerberos Service Principal
when using Kerberos authentication
* Print useful error message when a connection unexpectedly closes,
as this is usually authentication related...
This commit was SVN r11923.
2006-10-01 22:43:17 +00:00
Brian Barrett
9807a38458
Always initialize the base output stream, but only set verbose if requested.
...
Otherwise, the PLS components have to pay more attention to debugging streams
than the rest of the OMPI source code
This commit was SVN r11922.
2006-10-01 22:37:30 +00:00
Ralph Castain
db6a93fa63
Fix a couple of reported issues:
...
1. PLS finalize was not being called. Now ensure that happens during orte_finalize.
2. Errmgr proxies were sending their messages to the wrong tag - typical cut/paste error.
This commit was SVN r11891.
2006-09-29 12:45:50 +00:00
Brian Barrett
8943f583bf
quiet some debugging output
...
This commit was SVN r11813.
2006-09-26 04:10:07 +00:00
Brian Barrett
d8d55a760f
First attempt at Kerberos support for the XGrid process starter
...
refs trac:345
This commit was SVN r11812.
The following Trac tickets were found above:
Ticket 345 --> https://svn.open-mpi.org/trac/ompi/ticket/345
2006-09-26 03:54:38 +00:00
Brian Barrett
9733c8e3bd
Update XGrid RAS and PLS to the new infrastructure. Not yet super well
...
tested, but starting to get there...
This commit was SVN r11810.
2006-09-26 03:26:45 +00:00
Ralph Castain
f906af983a
Forgot to change the silly Makefile.am names - sorry Cyrador!
...
This commit was SVN r11670.
2006-09-15 04:52:20 +00:00
Ralph Castain
37dfdb76eb
Here is the major MAD-cure commit. I have written plenty about it, so I refer you here to those messages for a description of everything that was done.
...
This commit was SVN r11661.
2006-09-14 21:29:51 +00:00
George Bosilca
17afe7dc9f
Do it the correct way as this is normally compiled as a module.
...
This commit was SVN r11660.
2006-09-14 21:22:41 +00:00
George Bosilca
01c5a115b2
Don't export the POE module. Only the component has to be exported (visible).
...
This commit was SVN r11659.
2006-09-14 21:20:31 +00:00
Josh Hursey
908f31fe9f
Fix a code clarity issue in the POE PLS.
...
Allow the POE RAS to be compiled for Linux as well as AIX.
The POE RAS is really a Loadleveler RAS, and IU now has
a cluster that uses Loadleveler in a Linux environment (BigRed).
This seems to be the only thing we need to do so far to run
Open MPI on BigRed. Yay :)
This commit was SVN r11600.
2006-09-09 05:13:15 +00:00
Josh Hursey
160120b4c5
Fix a cut-n-paste error that causes the 'num_concurrent' to be
...
set to 1 or 0 instead of the user defined number or default (128).
This caused the PLS to deadlock when using '--debug-daemons' with
more than 2 processes. :(
svn blame says that it was broken in r11347
It is *not* a problem on v1.1 or v1.2 branches.
Bug spotted by Tim Mattox and myself.
This commit was SVN r11575.
The following SVN revision numbers were found above:
r11347 --> open-mpi/ompi@f52c10d18e
2006-09-08 15:17:17 +00:00
Jeff Squyres
0f11584a6c
* Update svn:ignore
...
* Remove svn:executable from non-executable files
This commit was SVN r11555.
2006-09-07 17:17:40 +00:00