Shiqing Fan
5d0f4dc88d
- Clean up the unreferenced variables.
...
- Change the arguments for launch failed function according to changeset r18611.
This commit was SVN r18795.
The following SVN revision numbers were found above:
r18611 --> open-mpi/ompi@7bee71aa59
2008-07-03 10:11:08 +00:00
George Bosilca
07cb54995b
Reactivate the daemon spin from the command line.
...
This commit was SVN r18794.
2008-07-02 01:46:58 +00:00
Shiqing Fan
a3e1718126
Missing one argument for calling this function.
...
This commit was SVN r18793.
2008-07-01 18:01:22 +00:00
Lenny Verkhovsky
c143c95ff9
Partial rankfile slots allocation fix
...
This commit was SVN r18787.
2008-07-01 08:54:20 +00:00
Ralph Castain
6f85e34d66
Detect homo/hetero scenarios in the nidmap, setup to take appropriate actions in the basic grpcomm module.
...
NOT for inclusion in v1.3
This commit was SVN r18786.
2008-07-01 02:44:57 +00:00
Ralph Castain
bbaf000db2
Singletons need to construct their own nidmap and cannot use the std function in the base
...
This commit was SVN r18777.
2008-06-30 13:28:56 +00:00
Ralph Castain
158040cf3b
First step: be kind to Jeff's disk space - let's abort without dumping core files all over the place
...
This commit was SVN r18751.
2008-06-26 16:10:03 +00:00
Ralph Castain
9cebe0ca96
Ckpt the bproc support. All compiles now except for PLM module
...
This commit was SVN r18744.
2008-06-26 03:48:22 +00:00
Brian Barrett
cbd6749c22
Move the lock initialization back to orte_init so that the finalize lock
is properly initialized and available in all cases (like ompi_info, where
the ess is never actually initialized). Fixes trac:1364.
This commit was SVN r18733.
The following Trac tickets were found above:
Ticket 1364 --> https://svn.open-mpi.org/trac/ompi/ticket/1364
2008-06-25 03:18:37 +00:00
Ralph Castain
578d1c15c6
Allow the ESS to return the hostname and arch for a specified daemon instead of just for application procs. Uses the same API - just need to detect that the specified proc is a daemon and look up its corresponding node in the nidmap.
...
This commit was SVN r18722.
2008-06-24 17:53:10 +00:00
Ralph Castain
b118779c08
It is okay for us to init the ORTE mca params multiple times. Indeed, it is absolutely required by orterun as the first time has to be done prior to parsing the command line, which means that the mca values haven't been parsed yet!
...
Add ability for sys admins to prohibit putting session directories under specified locations. Thus, they can now protect parallel file systems from foolish user mistakes.
This commit was SVN r18721.
2008-06-24 17:50:56 +00:00
Ralph Castain
17fcd72b5d
Restore bproc code - if someone wants to maintain it, then more power to them...but it would definitely be easier if the old code is in the trunk. This is all .ompi_ignore'd except for me so I can play with making it compile again in my copious free time.
...
This commit was SVN r18716.
2008-06-24 01:27:22 +00:00
Ralph Castain
3e61a3f92e
Sandbox for next-gen launch
...
This commit was SVN r18715.
2008-06-24 01:25:51 +00:00
Ralph Castain
f799ea225f
Orterun creates a "clean" copy of its environment for use in launching procs. This includes properly setting LD_LIBRARY_PATH and PATH, among other things. Unfortunately, our PLM modules were using the local environ instead of the saved copy, thus missing a number of things that really should have been included. From what I see, we got away with the error because the PLM's were duplicating all that setup logic themselves - I'll clean this up over the next few days.
...
Meantime, correct the PLM's so they use the correct environ for launching.
This commit was SVN r18713.
2008-06-23 22:39:36 +00:00
Ralph Castain
0fa9d88009
Set $PWD for the application proc to match the cwd. If the user specifies a working dir via -wdir, this ensures that the environment variable matches what they get from getcwd. Note that any subsequent calls to chdir in the user's program will break that equivalence - we can only ensure it starts out matching!
...
This commit was SVN r18709.
2008-06-23 18:25:41 +00:00
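As a rough illustration of the $PWD change in r18709 above, here is a minimal Python sketch (not the ORTE code, which is written in C). The helper name `child_env_for` is hypothetical; it only shows the idea of copying the launcher's environment and setting PWD to the requested working directory, so that the variable agrees with the directory the child starts in.

```python
import os


def child_env_for(wdir, base_env=None):
    """Return a copy of the environment with PWD set to the working directory.

    Hypothetical helper for illustration; not part of Open MPI.
    """
    # Copy so the caller's (or the launcher's own) environment is untouched.
    env = dict(os.environ if base_env is None else base_env)
    # Make PWD agree with the directory the child will start in.
    env["PWD"] = os.path.abspath(wdir)
    return env
```

A launcher could pass the result as the child's environment while also starting the child in `wdir`. As the commit notes, the match only holds at startup: a later chdir in the child does not update PWD.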
Ralph Castain
acbcbb81b5
Add some debugging output to the modex set_proc_attr function to see what is being added to the modex
...
This commit was SVN r18708.
2008-06-23 18:24:08 +00:00
Jeff Squyres
24c3aa1d77
Really fix "make dist". Really.
...
This commit was SVN r18704.
2008-06-21 18:04:38 +00:00
Jeff Squyres
930667ac73
Ensure that orte-checkpoint and orte-restart man pages are always
included in the distribution tarball. This ''appears'' to be an
Automake bug -- I have submitted a bug report to the bug-automake list:
http://lists.gnu.org/archive/html/bug-automake/2008-06/msg00019.html
This commit was SVN r18696.
2008-06-20 18:19:01 +00:00
Ralph Castain
c693d3a5d1
I hadn't honestly considered before that an MPI process might attempt to call functions in the routed framework intended solely for daemons and HNPs. By design, MPI processes are not allowed to route RML/OOB messages, and hence the routed module in an MPI process has no knowledge whatsoever of how a message will reach its destination (except in the direct module). Thus, it has no way to return a valid routing tree, update a routing tree, or get wireup info.
...
This commit ensures that attempts to access information that is unknowable or undefined return appropriate invalid or not_supported values to avoid unexpected behavior and/or segfaults.
This commit was SVN r18692.
2008-06-20 03:26:13 +00:00
Ralph Castain
5ebe10ebf1
Fix a bad typo - need to look at the node array as the arch array hasn't been built yet
...
This commit was SVN r18689.
2008-06-19 21:34:39 +00:00
Ralph Castain
174b9f1482
Ensure this module works in heterogeneous environments.
...
Note: this module is under development, which is why it is not set as the default. Use at your own risk!
This commit was SVN r18688.
2008-06-19 19:40:47 +00:00
Ralph Castain
26c9ad5799
Clean up the DSS API to remove two functions that are supposed to be used solely internally to the DSS. These were likely exposed because we need to call them when packing/unpacking declared types, but this means that developers may accidentally use the wrong functions, causing the DSS buffer to get confused. Instead, return the system to the way it used to work and hide those functions.
...
This commit was SVN r18684.
2008-06-19 18:46:25 +00:00
Josh Hursey
b78ae13bf3
Add back a missing header taken away in r18664
...
This commit was SVN r18682.
The following SVN revision numbers were found above:
r18664 --> open-mpi/ompi@0532d799d6
2008-06-19 16:08:27 +00:00
Jeff Squyres
e4172a3c44
Shift the AM "if" logic down from orte/tools/Makefile.am to the
individual orte/tools/*/Makefile.am files. This causes "make" to
traverse into every directory, even if it's not going to build anything
in that directory (which is a good thing). It also helps cleanup and
dist issues.
This also affects orte-checkpoint and orte-restart, but I couldn't get
--with-ft to compile properly; I'll pass along a heads-up to Josh to
ensure that I didn't break anything.
This commit was SVN r18680.
2008-06-19 14:46:10 +00:00
Ralph Castain
571f483c39
Ensure that we don't breakpoint the debugger until -after- all procs have reported their contact info so we can successfully send the release message
...
This commit was SVN r18678.
2008-06-19 14:37:46 +00:00
Ralph Castain
3b5e80fa61
Shift responsibility for preconnecting the oob to the orte routed framework, which is the only place that knows what needs to be done. Only the direct module will actually do anything - it uses the same algo as the original preconnect function.
...
This commit was SVN r18677.
2008-06-19 13:48:26 +00:00
Jeff Squyres
7e45b24001
MPIR_being_debugged is an int, not a bool.
...
This commit was SVN r18676.
2008-06-19 13:31:34 +00:00
Ralph Castain
b56f8ced4f
Ensure params are registered prior to parsing global cmd line options in orterun so that debugger options are properly captured and acted upon.
...
Ensure that routes to remote procs are set on the HNP before completing launch so that the debugger message can be sent. Solves a race condition that can exist in those environments where the HNP does not have local procs.
This commit was SVN r18674.
2008-06-19 02:58:14 +00:00
Ralph Castain
955d117f5e
Add a new grpcomm module that mimics the old 1.2 behavior - it -always- does a modex because it always includes the architecture. Hence, we called it "blind-and-dumb" since it doesn't look to see if this is required - moniker of "bad". :-)
...
Update the ESS API so we can update the stored arch's should the modex include that info. Update ompi/proc to check/set the arch for remote procs, and add that function call to mpi_init right after the modex is done.
Setup to allow other grpcomm modules to decide whether or not to add the arch to the modex, and to detect if other entries have been made. If not, then the modex can just fall through. Begin setting up some logic in the "basic" module to handle different arch situations.
For now, default to the "bad" module so we will work in all situations, even though we may be sending around more info than we really require.
This fixes ticket #1340
This commit was SVN r18673.
2008-06-18 22:17:53 +00:00
Ralph Castain
282a220e7e
Update the debugger interface per email thread with Jeff and Brian. Handoff to them for final test and validation
...
This commit was SVN r18670.
2008-06-18 15:28:46 +00:00
George Bosilca
8e7c35e76c
These symbols are only available via the module/component structure, so they
don't have to be globally visible.
This commit was SVN r18666.
2008-06-18 08:20:02 +00:00
George Bosilca
0f9b9c0aff
Remove a warning and add a required header (otherwise we cannot compile when
--disable-debug is specified).
This commit was SVN r18665.
2008-06-18 08:10:02 +00:00
Ralph Castain
0532d799d6
Complete implementation of the --without-rte-support configure option. Working with Brian, this has been tested on RedStorm.
...
Some minor changes to help facilitate debugger support so that both mpirun and yod can operate with it. Still to be completed.
This commit was SVN r18664.
2008-06-18 03:15:56 +00:00
Brian Barrett
7712b07ac4
Add perl based wrapper compilers for cross-compile environments. The default
is still to use the C based wrapper compilers (which have many more features
and are more well tested). The Perl compilers are enabled with the option
--enable-script-wrapper-compilers, which also ignores the option
--disable-binaries (i.e., --enable-script-wrapper-compilers --disable-binaries
will result in perl-based wrapper compilers being installed, but no other
binaries being installed).
This commit was SVN r18655.
2008-06-13 22:52:25 +00:00
Ralph Castain
a87aa442e3
Remove last remaining reference to iof_flush - it was #if'd out anyway. The existing flush code appears to have several critical problems. Given the impending rework of the IOF subsystem, there is no point in trying to fix it here.
...
This commit was SVN r18649.
2008-06-11 16:25:46 +00:00
Ralph Castain
1f41069ac9
Fix CID 752 - if we can't find the daemon job object, we have to ensure we exit without attempting to dereference it
...
This commit was SVN r18647.
2008-06-11 14:49:58 +00:00
Ralph Castain
13ea4e4673
Be consistent - since we don't strdup the other values for param, don't strdup this one.
...
This allows r18645 to fix the memory corruption issue, but also allows us to resolve the memory leaks cited by CID 1039
This commit was SVN r18646.
The following SVN revision numbers were found above:
r18645 --> open-mpi/ompi@53d83ba1c5
2008-06-11 14:42:47 +00:00
Pak Lui
53d83ba1c5
Take out a couple of free's.
...
This commit fixes trac:1343
This commit was SVN r18645.
The following Trac tickets were found above:
Ticket 1343 --> https://svn.open-mpi.org/trac/ompi/ticket/1343
2008-06-11 14:02:49 +00:00
Ralph Castain
d61fe87d04
Use the opal_show_help system if orte_show_help has not been initialized
...
This fixes ticket #1342
This commit was SVN r18644.
2008-06-11 12:50:40 +00:00
Ralph Castain
f9d809748c
Glad someone found that last error - caused me to review the code and find a couple of other cleanups! Nothing major, but just ensure that things flow smoothly since we had a "shadowed" variable.
...
This commit was SVN r18643.
2008-06-10 19:15:59 +00:00
Camille Coti
67cd1849f7
*map was still NULL in the else statement, inducing a segmentation fault when a field of the structure was accessed.
...
This commit was SVN r18642.
2008-06-10 19:00:57 +00:00
Ralph Castain
1a422995ae
Fix two Coverity complaints CID 813 (value defined and not used) and 1039 (resource leak). While doing so, found and fixed another less obvious memory leak.
...
This commit was SVN r18641.
2008-06-10 17:53:28 +00:00
Brian Barrett
4127bd0dcc
fix two other mistakes in the cnos ess
...
This commit was SVN r18632.
2008-06-09 22:28:26 +00:00
George Bosilca
f72ab90b16
Allow xgrid to compile again.
...
This commit was SVN r18631.
2008-06-09 21:51:41 +00:00
Brian Barrett
11cd3a7cba
Fix problem where local rank always had different architecture than remote
ranks on Red Storm
This commit was SVN r18630.
2008-06-09 21:46:03 +00:00
Ralph Castain
8d9ff44134
Add visibility required for some environments and configs
...
This commit was SVN r18629.
2008-06-09 21:28:19 +00:00
Ralph Castain
03ab4f5c64
Make the ifdef name mirror the change in filename
...
This commit was SVN r18626.
2008-06-09 20:36:55 +00:00
Ralph Castain
c13cadc3c7
Refs trac:1255
...
This commit repairs the debugger initialization procedure. I am not closing the ticket, however, pending Jeff's review of how it interfaces to the ompi_debugger code he implemented. There were duplicate symbols being created in that code, but not used anywhere. I replaced them with the ORTE-created symbols instead. However, since they aren't used anywhere, I have no way of checking to ensure I didn't break something.
So the ticket can be checked by Jeff when he returns from vacation... :-)
This commit was SVN r18625.
The following Trac tickets were found above:
Ticket 1255 --> https://svn.open-mpi.org/trac/ompi/ticket/1255
2008-06-09 20:34:14 +00:00
Ralph Castain
2cc8b2c51f
Add yet another test, this one for proper error behavior when someone calls an MPI function after calling MPI_Finalize.
...
Add a minor debug that outputs the orterun exit status to stderr when orte_debug is set.
This commit was SVN r18622.
2008-06-09 19:21:20 +00:00
Ralph Castain
bf5c34d10a
The rsh launcher is one place where multi-word MCA params would have to be passed via the orted cmd line. In such a case, we have to explicitly include quote marks about the param value. Add that capability here.
...
This commit fixes trac:1200
This commit was SVN r18621.
The following Trac tickets were found above:
Ticket 1200 --> https://svn.open-mpi.org/trac/ompi/ticket/1200
2008-06-09 19:07:19 +00:00
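The quoting problem described in r18621 above can be sketched in a few lines of Python. This is an illustration only, not the rsh launcher's C code: the helper `orted_mca_args` and the parameter name `some_param` are hypothetical, and `shlex.quote` stands in for whatever quoting the launcher actually applies. It shows why a multi-word value must be quoted before it is placed on a remote command line.

```python
import shlex


def orted_mca_args(params):
    """Build '-mca name value' arguments, quoting values as needed.

    Hypothetical helper for illustration; not part of Open MPI.
    """
    args = []
    for name, value in params.items():
        # shlex.quote leaves simple words alone and wraps anything with
        # whitespace or shell metacharacters in single quotes.
        args.extend(["-mca", name, shlex.quote(value)])
    return " ".join(args)


# "some_param" is a made-up parameter name used only for this example.
cmd = orted_mca_args({"btl": "tcp", "some_param": "two words"})
```

Without the quoting step, the remote shell would split "two words" into two separate arguments, and the second word would be misread as the start of the next option.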