Commit graph

114 commits

Author SHA1 Message Date
Jeff Squyres
88b3e6f8bd - Fix bug in orterun where --prefix didn't show up in the help output
(reported by Cisco)
- While in orterun, add a feature that multiple users have asked for:
  if you specify an absolute pathname to orterun, such as
  "/path/to/bin/orterun ...", it's equivalent to "orterun --path
  /path/to ..."

This commit was SVN r9181.
2006-02-28 11:52:12 +00:00
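The absolute-pathname shortcut described in this commit can be sketched as follows. This is a minimal illustration, not the orterun source: the function name `implied_prefix` is hypothetical, and it assumes the launcher lives in a `bin` directory directly under the installation directory, as in the `/path/to/bin/orterun` example from the message.

```python
import os.path


def implied_prefix(argv0):
    """Return the directory implied by an absolute launcher path, or None.

    Hypothetical helper: assumes a <dir>/bin/<launcher> layout.
    """
    if not os.path.isabs(argv0):
        # A bare or relative command name implies nothing.
        return None
    bindir = os.path.dirname(argv0)   # e.g. /path/to/bin
    return os.path.dirname(bindir)    # e.g. /path/to
```

With this sketch, `implied_prefix("/path/to/bin/orterun")` yields `/path/to`, while a bare `orterun` yields `None` and leaves the behavior unchanged.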
Josh Hursey
93e00415d5 A bunch of edits for clarity and precision.
Still needs some work, but getting closer

This commit was SVN r9098.
2006-02-21 04:17:56 +00:00
Josh Hursey
a3712f7a65 A cleanup checkpoint:
- Explained <program> and made a consistency change in the Quick Start section.
 - Change references to 'app schema' to Open MPI 'app context'
 - Audit the command line arguments for --foo, -foo stuff.

This commit was SVN r9097.
2006-02-21 00:48:31 +00:00
Jeff Squyres
186704a23b A few updates
This commit was SVN r9089.
2006-02-18 04:17:18 +00:00
Josh Hursey
02c999776b Removed all of the LAM stuff.
This needs to be gone over a few more times before it is allowed to see
daylight, but has come a long way.  Some sections may be off more than a little,
but the general idea is there.

Need to audit to make sure we don't call the ORTE VHNP's daemons :)

This commit was SVN r9078.
2006-02-17 03:47:52 +00:00
Josh Hursey
2938545220 Checkpoint.
Finished adding and pruning all the options.

Cleaned up a bunch of man syntax, so it should be 'more' readable (making the
assumption that man source is ever readable :p).

I am moving on to the "description" and "see also" sections next.

This commit was SVN r9077.
2006-02-16 23:38:03 +00:00
Jeff Squyres
c2c2daa966 Change the behavior of orterun (mpirun, mpiexec) to search for
argv[0] and the cwd on the target node (i.e., the node where the
executable will be running in all systems except BProc, where the
searches are run on the node where orterun is invoked).
- fork pls now does cwd and argv[0] search in orted
- bproc pls does cwd and argv[0] search in orterun
- cwd behavior slightly different:
  - if user specifies a -wdir to orterun, we chdir() to there; if we
    can't for some reason, abort
  - if user does not specify a -wdir, try to chdir() to the dir where
    orterun was invoked.  If we can't for some reason (e.g., it
    doesn't exist on the target node), then try to chdir($HOME).  If
    we can't do that, then just live with whatever default directory
    we were put in.

This commit was SVN r9068.
2006-02-16 20:40:23 +00:00
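The working-directory rules listed in this commit can be sketched as a small fallback chain. This is an illustration under stated assumptions, not the orted or orterun code: `choose_workdir` and its parameter names are hypothetical, and "abort" is modeled by letting the `OSError` propagate to the caller.

```python
import os


def choose_workdir(wdir, launch_dir, home):
    """Change directory following the rules in the commit message.

    Returns the directory that was entered, or None if every fallback
    failed and the process stays in its default directory.
    """
    if wdir is not None:
        # User gave -wdir: failure is fatal, so let OSError propagate.
        os.chdir(wdir)
        return wdir
    # No -wdir: try the directory where the launcher was invoked,
    # then the home directory.
    for candidate in (launch_dir, home):
        try:
            os.chdir(candidate)
            return candidate
        except OSError:
            continue
    # Neither worked: live with whatever directory we were put in.
    return None
```

The asymmetry is the point of the design: an explicit request that cannot be honored is an error, while an implicit default degrades quietly.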
Tim Woodall
039fe0ad29 change process group only in bproc case, as this is really
a workaround for a bproc4 bug

This commit was SVN r9064.
2006-02-16 16:19:37 +00:00
Jeff Squyres
d741b7f37f We're adding some specific and complex functionality to orterun, so it
really needs to be documented (in part so that users stop asking us
how to do it!).  

This is a first cut at an orterun.1 man page.  It is 95% copied from
LAM's mpirun.1 man page -- I just edited the very top and am handing
this off to Josh to finish the first cut.  Then we'll add specific
docs about the behavior of some of the finer details.  This is not
listed in the Makefile.am yet because it's so incomplete/incorrect
(w.r.t. OMPI), so I don't want it included in the tarball or installed
[yet].

This commit was SVN r9058.
2006-02-16 13:29:37 +00:00
Tim Woodall
fc751171cd bproc cleanup from release branch
This commit was SVN r9054.
2006-02-16 00:16:22 +00:00
David Daniel
e82c470b32 - Change the exit status set by mpirun when an application process is
killed by a signal.  The exit status is now set to signo + 128, which
  conforms with the behavior of (almost) all shells.

This commit was SVN r9050.
2006-02-15 22:41:29 +00:00
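The signo + 128 convention from this commit can be illustrated with the standard POSIX wait-status helpers. This is a sketch of the convention only, not the mpirun implementation; the function name `launcher_exit_status` is hypothetical.

```python
import os


def launcher_exit_status(wait_status):
    """Map a raw wait status (as returned by os.waitpid) to an exit code."""
    if os.WIFSIGNALED(wait_status):
        # Killed by a signal: report signo + 128, as most shells do.
        return 128 + os.WTERMSIG(wait_status)
    if os.WIFEXITED(wait_status):
        # Normal termination: pass the exit code through unchanged.
        return os.WEXITSTATUS(wait_status)
    raise ValueError("process has not terminated")
```

Under this convention a process killed by SIGKILL (signal 9) is reported as exit status 137, which is what a shell user would see in `$?` for the same process.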
Brian Barrett
566a050c23 Next step in the project split, mainly source code re-arranging
- move files out of toplevel include/ and etc/, moving it into the
    sub-projects
  - rather than including config headers with <project>/include, 
    have them as <project>
  - require all headers to be included with a project prefix, with
    the exception of the config headers ({opal,orte,ompi}_config.h
    mpi.h, and mpif.h)

This commit was SVN r8985.
2006-02-12 01:33:29 +00:00
Ralph Castain
892b396d70 Ensure that standard triggers are defined for all job/process states so that users can subscribe to those they want to use. Modify the way that is done to avoid over-burdening the standard launch sequence since it doesn't need alerts from all those triggers.
This commit was SVN r8938.
2006-02-08 17:40:11 +00:00
Ralph Castain
4b9f015c0b Merge in the new data support subsystem for ORTE. MPI folks should not notice a difference. Longer explanation will be sent to developers mailing list.
This commit was SVN r8912.
2006-02-07 03:32:36 +00:00
Jeff Squyres
ed0fa9720d Incorporate fix suggested by Chris Gottbratch.
This commit was SVN r8750.
2006-01-19 15:21:53 +00:00
George Bosilca
992daf7522 Remove all unused defines from the Makefile.
This commit was SVN r8734.
2006-01-18 21:21:29 +00:00
Brian Barrett
c96f870674 * Merge of wrapper compiler updates from the bwb-wrapper-fix branch (r8690 -
r8698), with changes below:

  - Split wrapper flags into those required for each of the three projects,
    and cleaned up some cruft (including the LIBMPI_EXTRA_*FLAGS) through-
    out the build system
  - Added opal_init_util and opal_finalize_util to allow init / cleanup
    of all the opal code that doesn't require the MCA system
  - Create standalone key=value file parser, based on the one that used
    to be in the mca param parser, so that it can be shared in multiple
    places
  - Add wrapper datafiles for opal, orte, and ompi wrappers, and add
    wrapper compiler with support for all the old features

This commit was SVN r8699.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r8690
  r8698
2006-01-16 01:48:03 +00:00
George Bosilca
d91650ea85 Do not use explicitly "ln -s" as on some systems it does not work properly ...
(windows). Instead use the LN_S variable exported by the Makefile (set to
"ln -s" on all Unixes and to "cp -p" on windows).

When we remove an executable use the correct extension for its name
(add $(EXEEXT) to the name).

This commit was SVN r8616.
2005-12-31 12:33:44 +00:00
George Bosilca
f9b07f1912 Protect the includes.
This commit was SVN r8532.
2005-12-17 22:05:10 +00:00
Jeff Squyres
e184fd6801 Make sure that what we find is executable
This commit was SVN r8513.
2005-12-15 20:31:20 +00:00
Brian Barrett
389e378054 * use opal_init / opal_finalize in orteprobe so that ordering doesn't get out of
sync with opal....

This commit was SVN r8341.
2005-11-30 21:40:11 +00:00
Brian Barrett
79bf8843d2 * update memory hooks interface to allow for callbacks on both allocations
and deallocations, per request from Galen and Tim

This commit was SVN r8303.
2005-11-29 04:46:14 +00:00
Brian Barrett
fee6409708 fix compiler warning and compiler error in totalview code...
This commit was SVN r8207.
2005-11-20 18:41:45 +00:00
Jeff Squyres
8d96c21311 Good weekend brainless activity -- implement the orterun command line
debugger scheme described in
http://www.open-mpi.org/community/lists/users/2005/10/0214.php.  This
makes our user-level debugger scheme much more vendor-independent
(although the "-tv" option will still work for backwards compatibility
-- it'll just be a synonym of "--debug").

This commit was SVN r8206.
2005-11-20 16:06:53 +00:00
Brian Barrett
878676218e Rename opal/memory to opal/memoryhooks because XLC++ on Mac OS X is broken.
When compiling C++ code that includes something that looks for the C++
header file "memory" (stupid C++ headers not having .h extensions), it
goes through the header file search path, which includes $(topsrcdir)/opal,
so it finds the directory $(topsrcdir)/opal/memory/ and tries to load
that as the memory header file and all goes downhill.

This commit was SVN r8111.
2005-11-11 00:26:27 +00:00
Brian Barrett
86e2adc43a * it appears that including event.h without calling opal_init annoys XLC on
OS X (you get an undefined symbol opal_event_lock).  Since the code is
  all #if 0'ed out, #if 0 out the header for now as well.

  I believe console and openmpi are to be removed from OMPI before 1.0
  release, so this doesn't need to go to the 1.0 branch

This commit was SVN r8089.
2005-11-10 15:24:57 +00:00
Jeff Squyres
42ec26e640 Update the copyright notices for IU and UTK.
This commit was SVN r7999.
2005-11-05 19:57:48 +00:00
Josh Hursey
e7d5ecf016 Comment out the C/N notation parsing. Interior comment has more details.
This commit was SVN r7980.
2005-11-03 18:15:47 +00:00
Tim Woodall
60754acae8 - modified rmaps data structures to point directly to ras node
- modified rsh to NOT query for each nodes mapping, as all data is
  already available in the rmaps structures

This commit was SVN r7894.
2005-10-27 17:04:10 +00:00
Jeff Squyres
89931ac05f - Correct typo in comment
- Add DIST_SUBDIRS to ompi/tools/Makefile.am

This commit was SVN r7780.
2005-10-17 11:55:55 +00:00
Brian Barrett
1302cb4072 The next in a long line of crazed build system changes from Brian. This was
originally suggested by Ralf Wildenhues, to try to speed autogen, configure,
and make (and possibly even make install).  Use automake's include directive
to drastically reduce the number of Makefile files (although the number of
Makefile.am files is the same - most are just included in a top-level
Makefile.am).  Also use an Automake SUBDIRs feature to eliminate the
dynamic-mca tree, which was no longer really needed.  This makes adding
a framework easier (since you don't have to remember the dynamic-mca
tree) and makes building faster (as make doesn't have to recurse through
the dynamic-mca tree)

This commit was SVN r7777.
2005-10-17 00:21:10 +00:00
Jeff Squyres
0629cdc2d7 Bring back the changes from /tmp/jjhursey-rmaps. Specific merge
command:

svn merge -r 7567:7663 https://svn.open-mpi.org/svn/ompi/tmp/jjhursey-rmaps .

(where "." is a trunk checkout)

The logs from this branch are much more descriptive than I will put
here (including a *really* long description from last night).  Here's
the short version:

- fixed some broken implementations in ras and rmaps
- "orterun --host ..." now works and has clearly defined semantics
  (this was the impetus for the branch and all these fixes -- LANL had
  a requirement for --host to work for 1.0)
- there is still a little bit of cleanup left to do post-1.0 (we got
  correct functionality for 1.0 -- we did not fix bad implementations
  that still "work")
  - rds/hostfile and ras/hostfile handshaking
  - singleton node segment assignments in stage1
  - remove the default hostfile (no need for it anymore with the
    localhost ras component)
  - clean up pls components to avoid duplicate ras mapping queries
  - [possible] -bynode/-byslot being specific to a single app context 

This commit was SVN r7664.
2005-10-07 22:24:52 +00:00
Jeff Squyres
65f1adfedc Add "-tv" option to orterun:
orterun -tv -np 4 foo

which will turn around and re-exec:

      totalview orterun -a -np 4 foo

This commit was SVN r7636.
2005-10-05 10:24:34 +00:00
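The re-exec described in this commit amounts to rewriting the argument vector. The sketch below only builds the new argument list from the example in the message; `totalview_argv` is a hypothetical name, and the real orterun logic may handle more cases.

```python
def totalview_argv(argv):
    """Build the debugger command line for an invocation that used -tv.

    Only the first "-tv" is dropped, so a "-tv" that belongs to the
    application's own arguments is left alone.
    """
    args = list(argv[1:])
    if "-tv" not in args:
        return list(argv)  # nothing to do
    args.remove("-tv")
    # totalview <launcher> -a <remaining launcher arguments>
    return ["totalview", argv[0], "-a"] + args
```

A caller would pass the result to an exec-style call such as `os.execvp(new_argv[0], new_argv)` so that the debugger replaces the current process.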
Josh Hursey
50e128ab83 Take out the --map command line argument, since it is not handled properly
at the moment.

Also remove all references to --map, and (C, N) command line options in the 
help file. These references will be put back in when these options are 
implemented.

This commit was SVN r7574.
2005-10-01 15:51:20 +00:00
Jeff Squyres
fcef1774d5 Per advice from Ralf W., change the pkgdata declarations in
Makefile.am's to be a *slightly* more correct (and, more importantly,
less error-prone) construct.

This commit was SVN r7554.
2005-09-30 13:32:39 +00:00
Brian Barrett
e0c3775551 * remove some duplicate dependencies that were making Solaris mad
This commit was SVN r7549.
2005-09-30 04:13:26 +00:00
Josh Hursey
d39841174d Must release the lock before entering the non-blocking recv: if the
message has already arrived, the callback can be called before
recv_buffer_nb() returns. This causes deadlock, as we try to acquire
the lock while already holding it.

This was causing orterun and orteds to stall in certain situations.
Became evident when stress testing dynamics with remote nodes.

This commit was SVN r7543.
2005-09-29 14:24:11 +00:00
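The deadlock described in this commit can be reproduced in miniature. Everything in this sketch is a hypothetical stand-in: `recv_nb` imitates a non-blocking receive whose message has already arrived, so the callback runs before the call returns, and the lock timeout exists only so the buggy ordering fails visibly instead of hanging.

```python
import threading

lock = threading.Lock()   # non-reentrant, like a plain mutex
received = []


def recv_nb(callback):
    """Stand-in for a non-blocking receive with a message already queued."""
    callback("payload")   # callback fires before recv_nb returns


def on_message(msg):
    # The callback needs the same lock as the code that posts the receive.
    if not lock.acquire(timeout=0.1):
        raise RuntimeError("callback cannot take the lock: caller holds it")
    try:
        received.append(msg)
    finally:
        lock.release()


def post_receive_buggy():
    with lock:
        recv_nb(on_message)   # still holding the lock: callback blocks


def post_receive_fixed():
    with lock:
        pass                  # update shared state under the lock
    recv_nb(on_message)       # lock released before posting the receive
```

Calling `post_receive_fixed()` delivers the message, while `post_receive_buggy()` raises `RuntimeError` here and would block forever with an untimed acquire.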
Josh Hursey
a23370c007 Converted some MCA parameters from the old version to the new.
Have the ras_base_schedule_policy MCA parameter working once again. Before, it
would only do slot-based allocation, even if the MCA parameter was set properly.

Currently you can specify to orterun a node allocation by either:
-mca ras_base_schedule_policy node
-bynode

and slot allocation (which is the default) by:
-mca ras_base_schedule_policy slot
-byslot

This commit was SVN r7513.
2005-09-27 02:54:15 +00:00
Tim Woodall
4a813c1d38 support --host option (in addition to -host or -H)
This commit was SVN r7483.
2005-09-22 16:08:40 +00:00
Ralph Castain
5686e8119e Move the error name macro to the errmgr framework. Add a second level of tracing. Remove an obsolete file.
This commit was SVN r7445.
2005-09-20 17:09:11 +00:00
Tim Woodall
c25ffb343a restore host option
This commit was SVN r7443.
2005-09-20 13:36:16 +00:00
Tim Woodall
f0cec8ac0c Both -H and -host options are allowed to specify hostlist (now supported for bproc -
will look at rsh)

This commit was SVN r7440.
2005-09-20 13:31:13 +00:00
Jeff Squyres
41ba191e9a Temporarily comment out the -arch and -host options since we do not
yet have an rmapper that can handle that information.

This commit was SVN r7438.
2005-09-20 08:56:02 +00:00
Ralph Castain
bfef5928a1 Add a second trace option to pass an argument
This commit was SVN r7433.
2005-09-19 20:22:22 +00:00
Ralph Castain
86a43b1d29 Add trace to the daemons and orterun so we can tell when their callbacks are being exercised.
This commit was SVN r7432.
2005-09-19 17:20:01 +00:00
George Bosilca
60f9edf17c Create the mutex and the condition only once.
This commit was SVN r7430.
2005-09-19 16:01:29 +00:00
Tim Woodall
41b6fc166e setup callback before actually launching - otherwise this is
a definite race condition

This commit was SVN r7413.
2005-09-16 20:45:25 +00:00
Josh Hursey
8bf587475b Added a flag to orte_rmgr_base_proc_stage_gate_subscribe() allowing the
caller to specify a subset of the state variables that it can subscribe to.
This is specified with one of three special flags defined in rmgr/rmgr_types.h

This is useful when we only care about a subset of the state changes, such as
in orted which only needs to know when a job has terminated or aborted.

This commit was SVN r7356.
2005-09-13 21:14:34 +00:00
Josh Hursey
8dfcc41efd Bootproxy daemons should persist after their local children have exited,
waiting instead for the SOH to indicate that the jobid has terminated.

Consider a scheduled environment where your program has a section of MPI
code followed by a section of computation that some processes execute while
other processes terminate normally. This patch keeps the scheduler from
terminating all of the processes and the allocation if all of the processes
on an allocated node exit well before other processes on other nodes.

This commit was SVN r7333.
2005-09-13 02:37:34 +00:00
Josh Hursey
7d403cf1b4 whoops forgot a bit of debug. ;(
This commit was SVN r7313.
2005-09-12 16:58:10 +00:00