1
1
Граф коммитов

143 Коммитов

Автор SHA1 Сообщение Дата
Ralph Castain
a90f8feb35 Need to initialize the buffer in the contact_info command.
This commit was SVN r10563.
2006-06-29 14:57:10 +00:00
Josh Hursey
793bbc667a bringing over orte-clean from tmp/jjhursey-ft-cr branch
per a request.

Currently it is not working well. That will soon change
as it just needs a bit of attention and testing to
make it lots-mo-betta.

This commit was SVN r10556.
2006-06-28 22:33:54 +00:00
Josh Hursey
9c0a279522 Moved the 'orte-ps' command from the tmp/jjhursey-ft-cr branch
per a request for its functionality into the main trunk.

This command provides basic information about a running job. It
needs a bit of attention, but works fine in its current iteration.

Please play with it, and lets try to work out all the left over bugs.

Pending action for this tool:
It has been requested that the tool be changed slightly to allow
it to be called via a function call from internal libraries
(e.g. orteconsole).

This commit was SVN r10554.
2006-06-28 22:06:13 +00:00
Brian Barrett
b6663c64c7 * fix for bug #161 - add man page info for recently added features
This commit was SVN r10514.
2006-06-26 22:16:39 +00:00
Brian Barrett
86861bc1c3 * add --quiet option, and surpress a couple of the status messages in
orterun if it is actually enabled.  For ticket #129.

This commit was SVN r10497.
2006-06-26 18:21:45 +00:00
Brian Barrett
4e8abb943b * fix up signal handling code so that one function handles SIGUSR1 and
SIGUSR2.  This can be extended later if needed to include other
  signals we should forward to the user processes (TSTP and CONT,
  perhaps?)
* Since the signal handlers don't actually run in signal context, we
  can use malloc/fprintf/etc.  So clean up some of the signal handler
  code so that we don't keep message buffers around for the life of
  the process

This commit was SVN r10496.
2006-06-26 15:12:52 +00:00
Brian Barrett
9766c01e50 * Per discussion at quarterly meeting and bug #91, print out the bug
contact point when printing version and help strings

This commit was SVN r10484.
2006-06-22 19:48:27 +00:00
Brian Barrett
5c89dc6946 Fix for ticket #91
mpirun/orterun now has an option to print the version number.  If -V/--version
is given, it will print the version number.  If it's the only option, we
exit cleanly.  Otherwise, we continue on as if --version wasn't given
(except we've printed the version number).
--This line, and th se below, will be ignored--

M    orte/tools/orterun/orterun.c
M    orte/tools/orterun/help-orterun.txt

This commit was SVN r10276.
2006-06-09 17:21:23 +00:00
Ralph Castain
ee5a626d25 Add ability to trap and propagate SIGUSR1/2 to remote processes. There are a number of small changes that hit a bunch of files:
1. Changed the RMGR and PLS APIs to add "signal_job" and "signal_proc" entry points. Only the "signal_job" entries are implemented - none of the components have implementations for "signal_proc" at this time. Thus, you can signal all of the procs in a job, but cannot currently signal only one specific proc.

2. Implemented those new API functions in all components except xgrid (Brian will do so very soon). Only the rsh/ssh and fork modules have been tested, however, and only under OS-X.

3. Added signal traps and callback functions for SIGUSR1/2 to orterun/mpirun that catch those signals and call the appropriate commands to propagate them out to all processes in the job.

4. Added a new test directory under the orte branch to (eventually) hold unit and system level tests for just the run-time. Since our test branch of the repository is under restricted access, people working on the RTE were continually developing their own system-level tests - thus making it hard to help diagnose problems. I have moved the more commonly-used functions here, and added one specifically for testing the SIGUSR1/2 functionality.

I will be contacting people directly to seek help with testing the changes on more environments. Other than compile issues, you should see absolutely no change in behavior on any of your systems - this additional functionality is transparent to anyone who does not issue a SIGUSR1/2 to mpirun.

Ralph

This commit was SVN r10258.
2006-06-08 18:27:17 +00:00
Jeff Squyres
1d6902296c Additions to the tm, slurm, and rsh pls modules to handle the --prefix
option as discussed on the devel-core mailing list.  The Big
Difference is that instead of hard-coding the strings "/lib" and
"/bin" in to append to the prefix, we append the basename of the local
libdir and bindir.  Hence, if your libdir is $prefix/lib64, we'll
append /lib64 to construct the remote node's LD_LIBRARY_PATH (etc.).

Also appended the orterun.1 man page to include a description of
--prefix, how it is constructed, what it handles / what it does not,
etc.

This commit was SVN r9930.
2006-05-16 14:14:12 +00:00
Brian Barrett
52369307f8 Add a feature to the build system that Terry from Sun and I talked about
in San Jose.  Allow the configure option --disable-binaries to build OMPI,
but not build or install the support binaries (so basically, just build
the libraries).

This commit was SVN r9777.
2006-04-29 02:16:41 +00:00
Brian Barrett
ce72140633 Remove dependency libraries from these Makefile.ams - the libraries will
automagically bring in the libraries through the top-level library (so
liborte automatically brings in libopal, etc.).  Otherwise, we get some
warnings on Solaris

This should go to the v1.1 branch

This commit was SVN r9666.
2006-04-20 17:53:43 +00:00
Brian Barrett
62afa63ded Initialize length to 0 instead of -1 (size_t might be unsigned and therefore
-1 is an issue).

This should go to the v1.1 branch...

This commit was SVN r9665.
2006-04-20 15:42:36 +00:00
Ralph Castain
c79c1714de Okaaayyy....let's see if this restores the "prefix" command line option. No idea what the problem was with the other option, but it isn't critical right now, so I'll figure it out later.
This commit was SVN r9542.
2006-04-06 07:53:38 +00:00
Ralph Castain
0ba8851a47 Fix the univ_exist option
This commit was SVN r9535.
2006-04-05 17:18:06 +00:00
Ralph Castain
b9bdb2125e Fix and upgrade the console to support better debugging. Activate "dump" commands to display registry content. Remove the blasted opal_output default prefix that made the dump output illegible. Properly connect to existing daemons and/or start new ones.
This commit was SVN r9528.
2006-04-04 11:05:52 +00:00
Brian Barrett
99e4c89183 * some typo fixes for orterun manpage
* Install orterun manpage as mpirun.1 and mpiexec.1 as well as orterun.1

This commit was SVN r9444.
2006-03-29 01:04:43 +00:00
Jeff Squyres
07b0e559f2 Fix copyright
This commit was SVN r9443.
2006-03-29 00:53:11 +00:00
Josh Hursey
35eb1a2970 Added a section on "Specifying Hosts" to the man page.
This commit was SVN r9432.
2006-03-27 23:46:38 +00:00
Jeff Squyres
bc96040e1c - Add Cisco copyright
- Add comment explaining why we used INT_MAX
- Update NEWS

This commit was SVN r9415.
2006-03-24 15:39:09 +00:00
Jeff Squyres
a843ce4c23 Clean up a minor memory leak
This commit was SVN r9413.
2006-03-24 15:28:42 +00:00
Ralph Castain
08db67cdf8 Fix the app_context problem for app_files too....
Again, this should be checked by Jeff.

This commit was SVN r9393.
2006-03-23 17:55:25 +00:00
Ralph Castain
2a18ebd9e1 Fix the app_context problem.
NOTE: JEFF SHOULD CHECK THIS!

I found that orterun was not tracking the index number of the app_contexts it was creating. Hence, the app_context->idx field was always sitting at zero. This index is used by the mapper to decide which app_context to use for each process - thus, with the value of each index being zero, the mapper only used the first app_context that was created. All others were ignored.

Not sure when this might have gotten changed. Could be it was a problem that always existed, but didn't get exposed until something else was changed.

Anyway, it seems to work now - could stand further testing.

This commit was SVN r9389.
2006-03-23 16:53:11 +00:00
Josh Hursey
22bac7ae95 a test commit. one more try
This commit was SVN r9350.
2006-03-21 00:39:29 +00:00
Josh Hursey
d64aab529f a test commit. no real changes here. Removing added char.
This commit was SVN r9349.
2006-03-21 00:37:13 +00:00
Josh Hursey
c8f9108c18 a test commit. no real changes here
This commit was SVN r9348.
2006-03-21 00:33:20 +00:00
Josh Hursey
66edc64be0 Minor comment change
This commit was SVN r9316.
2006-03-16 19:00:03 +00:00
Josh Hursey
7fcfd87cd5 Minor date change
This commit was SVN r9315.
2006-03-16 18:59:13 +00:00
Jeff Squyres
80bc1850bf Ensure that --prefix takes precedence over /path/to/orterun
This commit was SVN r9183.
2006-02-28 14:44:40 +00:00
Jeff Squyres
88b3e6f8bd - Fix bug in orterun where --prefix didn't show up in the help output
(reported by Cisco)
- While in orterun, add a feature that multiple users have asked for:
  if you specify an absolute pathname to orterun, such as
  "/path/to/bin/orterun ...", it's equivalent to "orterun --path
  /path/to ..."

This commit was SVN r9181.
2006-02-28 11:52:12 +00:00
Josh Hursey
93e00415d5 A bunch of edits for clarity and precision.
Still needs some work, but getting closer

This commit was SVN r9098.
2006-02-21 04:17:56 +00:00
Josh Hursey
a3712f7a65 A cleanup checkpoint:
- Explained <program> and made a consistancy change in the Quick Start section.
 - Change references to 'app schema' to Open MPI 'app context'
 - Audit the command line arguments for --foo, -foo stuff.

This commit was SVN r9097.
2006-02-21 00:48:31 +00:00
Jeff Squyres
186704a23b A few updates
This commit was SVN r9089.
2006-02-18 04:17:18 +00:00
Josh Hursey
02c999776b Removed all of the LAM stuff.
This needs to be gone over a few more times before it is allowed to see
daylight, but has come a long way.  Some sections may be off more than a little,
but the general idea is there.

Need to audit to make sure we don't call the ORTE VHNP's daemons :)

This commit was SVN r9078.
2006-02-17 03:47:52 +00:00
Josh Hursey
2938545220 Checkpoint.
Finished adding and pruning all the the Options.

Cleaned up a bunch of man syntax, so it should be 'more' readable (making the
assumption that man source is ever readable :p).

I am moving on to the "description" and "see also" sections next.

This commit was SVN r9077.
2006-02-16 23:38:03 +00:00
Jeff Squyres
c2c2daa966 Change the behavior of orterun (mpirun, mpirexec) to search for
argv[0] and the cwd on the target node (i.e., the node where the
executable will be running in all systems except BProc, where the
searches are run on the node where orterun is invoked).
- fork pls now does cwd and argv[0] search in orted
- bproc pls does cwd and argv[0] search in orterun
- cwd behavior slightly different:
  - if user specifies a -wdir to orterun, we chdir() to there; if we
    can't for some reason, abort
  - if user does not specify a -wdir, try to chdir() to the dir where
    orterun was invoked.  If we can't for some reason (e.g., it
    doesn't exist on the target node), then try to chdir($HOME).  If
    we can't do that, then just live with whatever default directory
    we were put in.

This commit was SVN r9068.
2006-02-16 20:40:23 +00:00
Tim Woodall
039fe0ad29 change process group only in bproc case, as this is really
a workaround for a bproc4 bug

This commit was SVN r9064.
2006-02-16 16:19:37 +00:00
Jeff Squyres
d741b7f37f We're adding some specific and complex functionality to orteun, so it
really needs to be documented (in part so that users stop asking us
how to do it!).  

This is a first cut at an orterun.1 man page.  It is 95% copied from
LAM's mpirun.1 lam page -- I just edited the very top and am handing
this off to Josh to finish the first cut.  Then we'll add specific
docs about the behavior of some of the finer details.  This is not
listed in the Makefile.am yet because it's so incomplete/incorrect
(w.r.t. OMPI), so I don't want it included in the tarball or installed
[yet].

This commit was SVN r9058.
2006-02-16 13:29:37 +00:00
Tim Woodall
fc751171cd bproc cleanup from release branch
This commit was SVN r9054.
2006-02-16 00:16:22 +00:00
David Daniel
e82c470b32 - Change the exit status set by mpirun when an application process is
killed by a signal.  The exit status is now set to signo + 128, which
  conforms with the behavior of (almost) all shells.

This commit was SVN r9050.
2006-02-15 22:41:29 +00:00
Brian Barrett
566a050c23 Next step in the project split, mainly source code re-arranging
- move files out of toplevel include/ and etc/, moving it into the
    sub-projects
  - rather than including config headers with <project>/include, 
    have them as <project>
  - require all headers to be included with a project prefix, with
    the exception of the config headers ({opal,orte,ompi}_config.h
    mpi.h, and mpif.h)

This commit was SVN r8985.
2006-02-12 01:33:29 +00:00
Ralph Castain
892b396d70 Ensure that standard triggers are defined for all job/process states so that user's can subscribe to those they want to use. Modify the way that is done to avoid over-burdening the standard launch sequence since it doesn't need alerts from all those triggers.
This commit was SVN r8938.
2006-02-08 17:40:11 +00:00
Ralph Castain
4b9f015c0b Merge in the new data support subsystem for ORTE. MPI folks should not notice a difference. Longer explanation will be sent to developers mailing list.
This commit was SVN r8912.
2006-02-07 03:32:36 +00:00
Jeff Squyres
ed0fa9720d Incorporate fix suggested by Chris Gottbratch.
This commit was SVN r8750.
2006-01-19 15:21:53 +00:00
George Bosilca
992daf7522 Remove all unused defines from the Makefile.
This commit was SVN r8734.
2006-01-18 21:21:29 +00:00
Brian Barrett
c96f870674 * Merge of wrapper compiler updates from the bwb-wrapper-fix branch (r8690 -
r8698), with changes below:

  - Split wrapper flags into those required for each of the three projects,
    and cleaned up some cruft (including the LIBMPI_EXTRA_*FLAGS) through-
    out the build system
  - Added opal_init_util and opal_finalize_util to allow init / cleanup
    of all the opal code that doesn't require the MCA system
  - Create standalone key=value file parser, based on the one that used
    to be in the mca param parser, so that it can be shared in multiple
    places
  - Add wrapper datafiles for opal, orte, and ompi wrappers, and add
    wrapper compiler with support for all the old features

This commit was SVN r8699.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r8690
  r8698
2006-01-16 01:48:03 +00:00
George Bosilca
d91650ea85 Do not use explicitly "ln -s" as on some systems it does not work properly ...
(windows). Instead use the LN_S variable exported by the Makefile (set to
"ln -s" on all Unixes and to "cp -p" on windows).

When we remove an executable use the correct extension for its name
(add $(EXEEXT) to the name).

This commit was SVN r8616.
2005-12-31 12:33:44 +00:00
George Bosilca
f9b07f1912 Protect the includes.
This commit was SVN r8532.
2005-12-17 22:05:10 +00:00
Jeff Squyres
e184fd6801 Make sure that what we find is executable
This commit was SVN r8513.
2005-12-15 20:31:20 +00:00
Brian Barrett
389e378054 * use opal_init / opal_finalize in orteprobe so that ordering doesn't get out of
sync with opal....

This commit was SVN r8341.
2005-11-30 21:40:11 +00:00