Ralph Castain
0ba845fed2
Continue development of regular expression support by implementing it for slurm launches. Works for both initial (cmd line and non-cmd line) and comm_spawn launch.
...
Additional work required to fully enable static port support when using cmd line regular expression launch system.
This commit was SVN r21502.
2009-06-23 20:25:38 +00:00
George Bosilca
bca8015b94
Add the proc_get_daemon capability to the bproc launcher.
...
This commit was SVN r21501.
2009-06-23 20:21:55 +00:00
George Bosilca
7339530061
Remove the prototype of a nonexistent function.
...
This commit was SVN r21500.
2009-06-23 19:50:23 +00:00
George Bosilca
225f2b01c9
Don't release uninitialized objects.
...
This commit was SVN r21499.
2009-06-23 19:47:58 +00:00
Jeff Squyres
ecaa00ba73
Patch from Nadia/Bull from the opal-sos HG branch:
...
orte_session_dir_finalize doesn't clean the right directories.
Neither does orte_session_dir_cleanup.
This patch fixes several issues:
1. orte_session_dir_cleanup():
   1. when jobid is not a wildcard, jobid is used to build the job
      session dir (instead of ORTE_LOCAL_JOBID).
   2. ORTE_SUCCESS is unconditionally returned (instead of rc that
      might have been previously set to another value).
2. orte_session_dir_finalize():
   1. convert_jobid_to_string is not the right call to get the job
      session dir.
   2. in some places orte_process_info.top_session_dir is directly
      used, without being prefixed with the base directory.
Factorized the code sections that build the job_session_dir into a
single orte_build_job_session_dir() function that is now called by
both orte_session_dir_finalize() and orte_session_dir_cleanup().
Signed-off-by: Nadia Derbey <Nadia.Derbey@bull.net>
This commit was SVN r21498.
2009-06-23 16:07:41 +00:00
Nysal Jan
938599cb2d
Fix build failure with latest IBM XL C/C++ v10.1 compiler. Also this seems like cleaner code.
...
This commit was SVN r21497.
2009-06-23 14:08:04 +00:00
Ralph Castain
7a802a9d3a
Move the plm designation from the env to the argv to support those systems not set up to pass env via rsh.
...
This commit was SVN r21495.
2009-06-22 18:08:45 +00:00
George Bosilca
7f24b41051
This function doesn't have to be globally visible.
...
This commit was SVN r21494.
2009-06-22 17:14:54 +00:00
Jeff Squyres
246caafe06
Correct the logic of the check for the env variable
...
OMPI_MCA_memory_ptmalloc2_disable and also add an explicit check for
FAKEROOTKEY (see http://bugs.debian.org/531522 ).
This commit was SVN r21489.
2009-06-20 11:22:06 +00:00
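The check described in this commit can be sketched roughly as follows. This is only an illustration in Python, not the component's actual C code: the function name `ptmalloc2_disabled` is hypothetical, and the rule that the MCA variable counts as set only for a nonzero integer value is an assumption. The commit itself only states that both OMPI_MCA_memory_ptmalloc2_disable and FAKEROOTKEY are checked.

```python
import os


def ptmalloc2_disabled(env=None):
    """Return True if this environment asks for ptmalloc2 to be disabled.

    Hypothetical helper; the real check lives in the C component.
    """
    env = os.environ if env is None else env

    # Assumption: the mere presence of FAKEROOTKEY (set by fakeroot) is enough.
    if "FAKEROOTKEY" in env:
        return True

    # Assumption: the MCA variable disables the component only when it
    # parses as a nonzero integer; anything unparsable is treated as "off".
    value = env.get("OMPI_MCA_memory_ptmalloc2_disable", "0")
    try:
        return int(value) != 0
    except ValueError:
        return False
```

Passing a plain dict instead of the real environment keeps the rule easy to exercise, for example `ptmalloc2_disabled({"FAKEROOTKEY": "1234"})`.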
Ralph Castain
c199dbb241
Revert r21480 - we already did open/select the PLM on non-HNP daemons. This commit broke slave launches on Torque and SLURM as it caused the PLM to be open/selected twice.
...
The open/select of the PLM is done in orte/mca/ess/base/ess_base_std_orted.c. It is done only when the PLM MCA param is set, directing that a specific PLM be selected. The function
orte_plm_base_orted_append_basic_args
clears the params passed to the daemon of any PLM selection passed to the HNP. Each PLM then adds a PLM directive if-and-only-if backend PLM support is desired. At present, Torque, SLURM, and rsh all specify this support and direct that the backend orted open the "rsh" PLM.
This commit was SVN r21488.
The following SVN revision numbers were found above:
r21480 --> open-mpi/ompi@ed585bce8a
2009-06-20 03:58:00 +00:00
Camille Coti
ed585bce8a
Initialize the PLM if we are a non-HNP daemon.
...
If we do not initialize the PLM, non-HNP daemons will not be able to use its functions. For example, RSH needs it when the tree_spawn mode is
enabled: daemons call the orte_plm.remote_spawn() function to spawn their children in the deployment tree.
This commit was SVN r21480.
2009-06-19 18:50:06 +00:00
Jeff Squyres
f42727707b
Per http://bugs.debian.org/531522 , add an MCA param/environment
...
variable to allow the disabling of the ptmalloc2 component at init
time.
This commit was SVN r21479.
2009-06-19 10:50:23 +00:00
Ralph Castain
110c95fb1c
Make the match for the "env" option to mpi_show_mca_params less strict - match it if the string at least starts with "env", so "enviro" and "environ" both match.
...
This commit was SVN r21478.
2009-06-19 03:43:53 +00:00
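The relaxed match amounts to a prefix test. A minimal sketch in Python, assuming a hypothetical helper name `matches_env_option` and assuming the comparison ignores case and surrounding whitespace (the commit only says the string must start with "env"):

```python
def matches_env_option(token):
    """Return True if token selects the "env" option by prefix.

    Hypothetical helper; case-insensitivity and whitespace stripping
    are assumptions, not taken from the commit.
    """
    return token.strip().lower().startswith("env")
```

With this rule "env", "enviro", and "environ" are all accepted, while a shorter string such as "en" is not.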
Matthias Jurenz
edce15d08f
Changes in VT documentation:
...
- added note for BFD license issues
- minor cleanups
This commit was SVN r21472.
2009-06-18 14:33:04 +00:00
Jeff Squyres
c39998db17
Also show the "you might not have enough registered memory" warning
...
message earlier in the openib BTL startup sequence
This commit was SVN r21469.
2009-06-18 12:24:39 +00:00
Ralph Castain
771ce035a5
Complete implementation of regular expression generator and parser - now handles leading zeros and suffixes in node names.
...
This commit was SVN r21468.
2009-06-18 04:36:00 +00:00
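To illustrate what handling leading zeros and suffixes involves, here is a small Python sketch of a generator/parser pair. The bracketed format `prefix[001-003,007]suffix` and the names `compress` and `expand` are assumptions made for this example; the commit does not show the encoding ORTE actually uses.

```python
import re

# Split a node name into prefix, last run of digits, and non-digit suffix.
_NODE = re.compile(r"^(.*?)(\d+)(\D*)$")
# Split an encoded pattern into prefix, bracketed body, and suffix.
_PATTERN = re.compile(r"^(.*?)\[([^\]]*)\](.*)$")


def compress(names):
    """Encode node names as range patterns (hypothetical format)."""
    groups = {}
    for name in names:
        m = _NODE.match(name)
        if m is None:
            raise ValueError(f"no numeric field in {name!r}")
        prefix, digits, suffix = m.groups()
        # Keying on the digit width keeps leading zeros intact.
        groups.setdefault((prefix, len(digits), suffix), []).append(int(digits))

    patterns = []
    for (prefix, width, suffix), nums in groups.items():
        nums = sorted(set(nums))
        ranges, start, prev = [], nums[0], nums[0]
        for n in nums[1:]:
            if n != prev + 1:  # gap: close the current run
                ranges.append((start, prev))
                start = n
            prev = n
        ranges.append((start, prev))
        body = ",".join(
            f"{a:0{width}d}" if a == b else f"{a:0{width}d}-{b:0{width}d}"
            for a, b in ranges
        )
        patterns.append(f"{prefix}[{body}]{suffix}")
    return patterns


def expand(pattern):
    """Decode one pattern produced by compress() back into node names."""
    m = _PATTERN.match(pattern)
    if m is None:
        raise ValueError(f"not a range pattern: {pattern!r}")
    prefix, body, suffix = m.groups()
    names = []
    for item in body.split(","):
        lo, _, hi = item.partition("-")
        hi = hi or lo
        width = len(lo)  # zero-padding width is taken from the range start
        for n in range(int(lo), int(hi) + 1):
            names.append(f"{prefix}{n:0{width}d}{suffix}")
    return names
```

In this sketch, names whose numeric fields have different widths (such as node9 and node10) end up in separate patterns, which is a simplification chosen to keep the round trip exact.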
Jeff Squyres
2a5813ac2d
Silence a compiler warning.
...
This commit was SVN r21459.
2009-06-17 12:26:38 +00:00
Matthias Jurenz
d7aa9abc4e
Added configure option '--with[out]-bfd' to control usage of BFD library to get symbol information for GNU, Intel, and Pathscale compiler instrumentation
...
This commit was SVN r21458.
2009-06-17 08:51:26 +00:00
Ralph Castain
8db7a9f9a7
Add regular expression generator to encode complete nid/pid maps - decoder to come.
...
This commit was SVN r21455.
2009-06-17 02:54:20 +00:00
Ralph Castain
85e55c5087
Make the IOF macros match for debug vs optimized builds
...
This commit was SVN r21453.
2009-06-16 22:30:53 +00:00
Rolf vandeVaart
36a560506c
Fix error message to match code default.
...
This commit was SVN r21452.
2009-06-16 20:59:53 +00:00
Rolf vandeVaart
633b996a0f
Add sys/wait.h so we can compile on Solaris.
...
This commit was SVN r21451.
2009-06-16 19:48:43 +00:00
Ralph Castain
e9fc0a74fb
Silence compiler warnings
...
This commit was SVN r21445.
2009-06-16 13:34:31 +00:00
Ralph Castain
74bd80afd9
Do not preload binaries or files if the app isn't being executed on this node
...
This commit was SVN r21444.
2009-06-16 03:12:30 +00:00
Rainer Keller
5e6061af02
- Few fixes and comments
...
This commit was SVN r21443.
2009-06-15 21:12:04 +00:00
Ralph Castain
d1dd8c2653
Ensure we accurately count the number of new daemons to be launched, especially if we are restarting processes.
...
Have the resilient mapper also set up for new daemons in case the PLM needs them.
This commit was SVN r21437.
2009-06-15 13:55:01 +00:00
Ralph Castain
49d6cefe07
Fix a typo in the lanl platform files
...
This commit was SVN r21435.
2009-06-15 13:41:43 +00:00
Matthias Jurenz
24afd66352
Reverting previous commit and going back to r21178
...
This commit was SVN r21433.
The following SVN revision numbers were found above:
r21178 --> open-mpi/ompi@fb3bbb8021
2009-06-15 11:41:33 +00:00
Abhishek Kulkarni
63845591a0
This fix adds a missing topic (cant-open-logfile) to help-orte-top.txt
...
and changes another incorrectly titled topic from pid-required
to no-contact-given.
This commit was SVN r21431.
2009-06-13 23:05:44 +00:00
Ralph Castain
b087336c6c
Refinements to platform file
...
This commit was SVN r21429.
2009-06-12 19:48:30 +00:00
Ralph Castain
c0c56e30c9
Add a missing function to the resilient mapper so it defines daemons in case they are needed
...
This commit was SVN r21428.
2009-06-12 19:48:13 +00:00
Ralph Castain
44bb265a52
Add a new MPI_Info key to preposition OMPI libraries - implementation underway, but this just defines and passes the new key
...
This commit was SVN r21425.
2009-06-12 17:53:13 +00:00
Ralph Castain
170327e575
Reorg the rmaps components to collect shared code for byslot and bynode mapping in the base so we quit duplicating it in every mapper
...
This commit was SVN r21424.
2009-06-12 17:52:17 +00:00
Ralph Castain
1ee4acb247
Clean up how we handle pointer arrays in the odls base fns to avoid potential segfaults
...
This commit was SVN r21423.
2009-06-12 17:51:23 +00:00
Ralph Castain
89b7e20a8b
Add some Cisco-related platform files
...
This commit was SVN r21422.
2009-06-12 17:50:14 +00:00
Jeff Squyres
814a8f5e0f
* Fix #1916 : endian problems in iwarp wireup on big endian machines
...
(now works on both big and little endian machines)
* Be a little more flexible when looking for active devices in
btl_openib_component.c
* Add device name and port number to lots of verbose and help
messages
* Add a bunch of verbose messages to give insight into what is
occurring during all the CPC wireups
This commit was SVN r21418.
2009-06-11 17:30:30 +00:00
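As a general illustration of the class of bug fixed for #1916 (not the actual openib BTL change, which is not shown in this log), wire-up data exchanged between machines needs a fixed byte order so that big and little endian hosts agree. A minimal Python sketch, with hypothetical helper names `pack_port` and `unpack_port`:

```python
import struct


def pack_port(port):
    """Serialize a 16-bit port number in network (big-endian) byte order."""
    # "!" selects network byte order regardless of the host's endianness;
    # "H" is an unsigned 16-bit integer.
    return struct.pack("!H", port)


def unpack_port(data):
    """Parse a 16-bit port number that was sent in network byte order."""
    (port,) = struct.unpack("!H", data)
    return port
```

Because both sides use the same explicit byte order, the value round-trips unchanged whichever kind of machine packed it.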
Ralph Castain
4881cd0df3
Revert the prior change out from the individual .h files - the problem was in the Makefile.am's, causing the make dist to fail.
...
This commit was SVN r21414.
2009-06-11 03:15:47 +00:00
Ralph Castain
91ab2b3e4f
Specify complete path to included header files so it compiles in all environments
...
This commit was SVN r21412.
2009-06-11 02:46:30 +00:00
Jeff Squyres
6777f01380
Also look for /dev/ipath
...
This commit was SVN r21410.
2009-06-11 00:35:21 +00:00
Avneesh Pant
295c0c36bc
Add new QLogic adapter to device parameter file.
...
This commit was SVN r21409.
2009-06-10 23:37:38 +00:00
Ralph Castain
70b8c89b44
Fix slave spawn, which was hanging because the local daemon never saw the slave job report - it doesn't do it in the normal way, and so the slave launch system itself has to "fake it".
...
Also complete implementation to print out app_context objects so we see all the fields.
This commit was SVN r21408.
2009-06-10 19:01:08 +00:00
Ralph Castain
f24cefe3d2
Shift the check for adequate file descriptors to after we check if this proc is part of the job to be launched - no point in doing the check more often than absolutely required
...
This commit was SVN r21406.
2009-06-10 15:18:48 +00:00
Ralph Castain
87d7d693f0
Add a notifier call when the oob retries are exceeded so sys admins are aware of the problem
...
This commit was SVN r21405.
2009-06-10 15:17:16 +00:00
Terry Dontje
fa9f356c8c
This commit fixes trac:1931 by separating out structures and defines used by
...
the debugger plugins into files suffixed by _dbg.h.
This commit was SVN r21404.
The following Trac tickets were found above:
Ticket 1931 --> https://svn.open-mpi.org/trac/ompi/ticket/1931
2009-06-10 14:57:28 +00:00
Ralph Castain
f966d9f972
Fix visibility issues with opal_graph functions.
...
Fix the carto test so it can compile - need to update input file so it can run
This commit was SVN r21403.
2009-06-09 15:02:57 +00:00
Ralph Castain
c3c1ab1337
Correct a comment in paffinity.h about what paffinity_get returns - it was inaccurate.
...
Revamp the affinity detection/set procedure in mpi_init to correctly detect when we have already been bound to processors, given the revised understanding of paffinity_get. Add a new paffinity macro to make checking for already bound a little nicer.
This commit was SVN r21402.
2009-06-09 14:33:35 +00:00
George Bosilca
f9a510fd8a
There is no need for an atomic read if we are not in a threaded case.
...
This commit was SVN r21394.
2009-06-08 23:55:52 +00:00
George Bosilca
77a6f27d44
Update the call to orte_plm_base_create_jobid based on the new interface.
...
This commit was SVN r21393.
2009-06-08 23:53:53 +00:00
Ralph Castain
86d55d7ebf
Fix tight loops over comm_spawn by checking to see if the system has enough child procs and file descriptors available before attempting to launch. If not, introduce a 1-second delay and then test again. This provides a chance for the orted to complete processing of proc terminations from other children, hopefully creating room for the new proc(s).
...
Update the loop_spawn test to remove a sleep so that it runs at max speed, letting the new code catch when we overrun ourselves and wait for room to be cleared for the next comm_spawn.
This commit was SVN r21390.
2009-06-08 18:28:26 +00:00
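The check-then-delay behaviour described in this commit can be sketched as a simple polling loop. This Python version is only an illustration: `wait_for_room`, the `has_room` callback, and the optional `max_tries` limit are hypothetical names and assumptions, since the commit mentions only the 1-second delay and the retest.

```python
import time


def wait_for_room(has_room, delay=1.0, max_tries=None, sleep=time.sleep):
    """Block until has_room() reports capacity for another launch.

    has_room: callable returning True when enough child process slots
              and file descriptors are available (hypothetical check).
    max_tries: assumption, not in the commit; None means retry forever.
    Returns True once there is room, False if max_tries was exhausted.
    """
    tries = 0
    while not has_room():
        tries += 1
        if max_tries is not None and tries >= max_tries:
            return False
        # Give the daemon time to process terminations of other children.
        sleep(delay)
    return True
```

Injecting `sleep` makes the loop testable without real delays; a caller would normally leave it at the default.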
Shiqing Fan
5a90b3068e
Two type casts.
...
This commit was SVN r21388.
2009-06-07 12:51:46 +00:00