Commit graph

249 commits

Author SHA1 Message Date
George Bosilca
f13690f16e The prototype of ompi_help has been changed.
This commit was SVN r7218.
2005-09-07 17:15:00 +00:00
Brian Barrett
ed56e743b7 * update configure.ac to use the modern version of AC_INIT and
AM_INIT_AUTOMAKE, instead of the deprecated version.
* Work around dumbness in modern AC_INIT that requires the version
  number to be set at autoconf time (instead of at configure time, as
  it was before).  Set the version number, minus the subversion r number,
  at autoconf time.  Override the internal variables to include the r
  number (if needed) at configure time.  Basically, the right thing
  should always happen.  The only place it might not is the version
  reported as part of configure --help will not have an r number.
* Since AM_INIT_AUTOMAKE takes a list of options, no need to specify
  them in all the Makefile.am files.
* Adds support for subdir-objects, meaning that object files are put
  in the directory containing source files, even if the Makefile.am is
  in another directory.  This should start making it feasible to
  reduce the number of Makefile.am files we have in the tree, which
  will greatly reduce the time to run autogen and configure.

This commit was SVN r7211.
2005-09-07 05:54:53 +00:00
Josh Hursey
a5e5924217 Added a custom arguments MCA param for Slurm PLS.
This allows the user to specify certain options to srun when an application
is launched with this PLS.

A useful example is the need to set the time to wait between when the first
process completes and when Slurm kills the remaining processes:

  pls_slurm_args=--wait=1200

This commit was SVN r7206.
2005-09-06 21:52:28 +00:00
Jeff Squyres
c5dc8762a2 Remove useless SUBDIRS line
This commit was SVN r7203.
2005-09-06 21:31:50 +00:00
Jeff Squyres
383d9f58e7 Be [slightly] more descriptive. :-)
This commit was SVN r7198.
2005-09-06 16:57:11 +00:00
Ralph Castain
47bf2574e1 Ensure that subscriptions for a specific requestor/subscription return id only get registered once. It appears that sometimes the system registers a subscription for the same return location multiple times. This prevents getting multiple callbacks when that happens. Still need to track down why it is happening at all.
This commit was SVN r7197.
2005-09-06 16:33:41 +00:00
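The register-once rule described in the commit above can be sketched roughly as follows. This is a minimal illustration, not the ORTE registry code: `subscription_t`, `subscription_table_t`, `subscribe_once`, and the fixed table size are all hypothetical names and choices made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SUBSCRIPTIONS 64

/* Hypothetical record; not the actual ORTE registry structure. */
typedef struct {
    int requestor_id;  /* process that asked for the callback */
    int return_id;     /* location the callback is delivered to */
} subscription_t;

typedef struct {
    subscription_t entries[MAX_SUBSCRIPTIONS];
    size_t count;
} subscription_table_t;

/* Register (requestor_id, return_id) only if the pair is not already present.
 * Returns true when a new entry was added, false for a duplicate or a full table. */
bool subscribe_once(subscription_table_t *table, int requestor_id, int return_id)
{
    assert(table != NULL);
    for (size_t i = 0; i < table->count; ++i) {
        if (table->entries[i].requestor_id == requestor_id &&
            table->entries[i].return_id == return_id) {
            return false;  /* duplicate: registering again would cause a second callback */
        }
    }
    if (table->count == MAX_SUBSCRIPTIONS) {
        return false;  /* no room left */
    }
    table->entries[table->count].requestor_id = requestor_id;
    table->entries[table->count].return_id = return_id;
    table->count++;
    return true;
}
```

A linear scan is enough for a sketch; a real registry would presumably key the lookup differently.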
Rainer Keller
a13d513bbe - Rename of unused define, sitting in tree.
This commit was SVN r7196.
2005-09-06 16:17:08 +00:00
Rainer Keller
a36347d728 - Support -prefix specification on mpirun/orterun cmd-line per
app_context:
  mpirun -np 2 -prefix /path/to/ompi/on/machineA ./exec1 : \
         -np 2 -prefix /path/to/ompi/on/machineB ./exec2

- With -mca pls_rsh_assume_same_shell 0, allow checking the
  SHELL variable on the actual node (currently the 1st node).
  Sets the prefix, PATH and LD_LIBRARY_PATH for bash/ksh and 
  csh/tcsh.

This commit was SVN r7195.
2005-09-06 16:10:05 +00:00
Ralph Castain
7fc67f57a5 Little logic cleanup and handle thread locking correctly.
This commit was SVN r7192.
2005-09-06 14:04:43 +00:00
George Bosilca
648ef2ae5c One of the latest gcc versions barks about a variable being used uninitialized. It was kind of right, because the
variable was protected by another one ... But with a few modifications I got rid of this warning.

This commit was SVN r7189.
2005-09-06 03:13:03 +00:00
Brian Barrett
4e20c83204 * fix memory leak in ORTE startup
This commit was SVN r7186.
2005-09-05 21:03:02 +00:00
Jeff Squyres
3645816aef - Add copyrights
- Minor style fixes

This commit was SVN r7184.
2005-09-05 18:51:59 +00:00
Rainer Keller
588a62cb90 - Missed file in last commit
This commit was SVN r7179.
2005-09-04 20:55:27 +00:00
Rainer Keller
192625d2a1 - Once again: uninteresting cleanup to get diff smaller.
This commit was SVN r7178.
2005-09-04 20:54:19 +00:00
Ralph Castain
12daecb826 More cleanup
This commit was SVN r7167.
2005-09-03 01:22:11 +00:00
Ralph Castain
4b5b3b4164 Properly handle the argv array and clean it up when done.
This commit was SVN r7166.
2005-09-03 00:15:21 +00:00
Josh Hursey
78da530fd2 Fix a bug that Tim highlighted in which orted coredumps when an orterun is
CTRL-C'd. 
We were calling orte_finalize recursively which caused a segv when it tried to 
use a freed framework (orte_rmgr in this case).

I added a status flag to orte_universe_info to indicate where we are in the code.
This was needed to determine if we should call orte_abort or not when shutting
down in the tcp oob.

This commit was SVN r7160.
2005-09-02 21:07:21 +00:00
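The commit above describes a status flag that records how far shutdown has progressed. One general way such a flag can stop a recursive finalize is sketched below. It is an assumption-laden illustration, not the actual change to orte_universe_info: `runtime_t`, `run_state_t`, `runtime_finalize`, and `teardown_frameworks` are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical shutdown states; the real flag's values are not shown in the log. */
typedef enum {
    RUN_STATE_ACTIVE,
    RUN_STATE_FINALIZING,
    RUN_STATE_FINALIZED
} run_state_t;

typedef struct {
    run_state_t state;
    int teardown_count;  /* how many times the frameworks were released */
} runtime_t;

bool runtime_finalize(runtime_t *rt);

/* Stand-in for releasing frameworks. It re-enters finalize on purpose,
 * simulating an error path that fires during shutdown. */
static void teardown_frameworks(runtime_t *rt)
{
    rt->teardown_count++;
    (void)runtime_finalize(rt);  /* re-entrant call: must be a no-op */
}

/* Returns true if this call performed the shutdown, false if shutdown
 * was already in progress or complete. */
bool runtime_finalize(runtime_t *rt)
{
    assert(rt != NULL);
    if (rt->state != RUN_STATE_ACTIVE) {
        return false;  /* already finalizing or finalized: do nothing */
    }
    rt->state = RUN_STATE_FINALIZING;
    teardown_frameworks(rt);
    rt->state = RUN_STATE_FINALIZED;
    return true;
}
```

Setting the flag before teardown starts is the important ordering: any nested call sees a non-active state and returns without touching freed resources.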
Ralph Castain
0f797fd40b Few more cleanups
This commit was SVN r7159.
2005-09-02 21:01:59 +00:00
Ralph Castain
4ed7752681 Continue cleaning up memory leaks during launch
This commit was SVN r7158.
2005-09-02 20:41:24 +00:00
Ralph Castain
f352890732 Cleaning up memory leaks for proxy operations.
This commit was SVN r7157.
2005-09-02 19:26:21 +00:00
Ralph Castain
4bd25e0292 Few minor memory leak cleanups
This commit was SVN r7156.
2005-09-02 18:50:01 +00:00
Jeff Squyres
a7fbb0f95e Put in comments about why these assignments exist
This commit was SVN r7146.
2005-09-02 10:27:23 +00:00
Jeff Squyres
7e4f696501 Fix silly compiler warnings
This commit was SVN r7145.
2005-09-02 10:26:41 +00:00
Ralph Castain
66a215eae1 More memory cleanup...
1. Valgrind is good for something - chasing down memory leaks in registry led me to re-visit the dictionary functions and discover that I wasn't keeping track of the number of dictionary entries on each segment! Resulted in wasted time searching blank entries as well as leaked memory. This has now been fixed.

2. Fixed the orte_bitmap test. The init function for that class has been eliminated and the constructor adjusted to provide that functionality.

This commit was SVN r7136.
2005-09-02 00:26:58 +00:00
Ralph Castain
76e622a552 Clean up a few memory leaks - more to go...
This commit was SVN r7134.
2005-09-01 17:38:04 +00:00
Ralph Castain
cb128ab87b Be a little more friendly and tell the user we couldn't reach their specified universe... :-)
This commit was SVN r7132.
2005-09-01 15:52:37 +00:00
Ralph Castain
d0f7dafc47 Revise the universe connection logic. Two cases are now handled:
1. user does NOT specify the universe name. For the default universe case, if we detect an existing default universe and cannot connect to it, we quietly create an alternative default name by adding the pid to the orte_default_universe name and move on - we no longer provide a warning message for this case.

2. user specified a universe name. If we detect an existing universe of that name and cannot connect to it, we consider this an error condition and abort.

This commit was SVN r7131.
2005-09-01 15:50:38 +00:00
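The two cases in the commit above amount to a small decision function, sketched here under stated assumptions. `resolve_universe_name` and its parameters are hypothetical, and the `name-pid` separator is a guess: the log only says the pid is added to the default name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Pick the universe name to use.
 *   requested              user-specified name, or NULL if none was given
 *   default_name           the default universe name
 *   exists_but_unreachable true if a universe with the chosen name exists
 *                          but we could not connect to it
 * Writes the result to out and returns 0, or returns -1 on error. */
int resolve_universe_name(const char *requested, const char *default_name,
                          bool exists_but_unreachable, long pid,
                          char *out, size_t out_len)
{
    assert(default_name != NULL && out != NULL);
    int n;

    if (requested != NULL) {
        if (exists_but_unreachable) {
            return -1;  /* case 2: user named it explicitly, so this is an error */
        }
        n = snprintf(out, out_len, "%s", requested);
    } else if (exists_but_unreachable) {
        /* case 1: quietly fall back to an alternative default name */
        n = snprintf(out, out_len, "%s-%ld", default_name, pid);
    } else {
        n = snprintf(out, out_len, "%s", default_name);
    }
    /* Treat truncation as an error rather than returning a partial name. */
    return (n >= 0 && (size_t)n < out_len) ? 0 : -1;
}
```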
Ralph Castain
03e45e6723 Two quick additions:
1. Added OMPI_PROC_ARCH as a defined registry key and added the code so that the architecture info gets properly transmitted across all processes using the startup message.

2. Added an OMPI_MODEX_KEY definition and removed the hard-coded "modex" key from pml_modex_exchange

This commit was SVN r7129.
2005-09-01 15:05:03 +00:00
Jeff Squyres
3962c53e2e - Add to AM_CPPFLAGS $(OPAL_LTDL_CPPFLAGS) where necessary in order to
add a -I to find the included ltdl.h (vs. a system-installed ltdl.h)
- Clean up cruft in a bunch of Makefile.am's to remove now-unnecessary
  AM_CPPFLAGS settings to get static-components.h for each framework
- Move the component_repository API functions out of opal/mca/base/base.h
  and into opal/mca/base/mca_base_component_repository.h in order to
  decrease unnecessary dependencies (e.g., before this, almost
  everything in the tree depended on ltdl.h, which is unnecessary --
  only a small number of files really need ltdl.h)

This commit was SVN r7127.
2005-09-01 12:16:36 +00:00
Jeff Squyres
4c59058053 - Add some logic to configure to make a version of CFLAGS that doesn't
include any optimization flags
- Use these flags to always compile ompi/debuggers/* and orterun so
  that parallel debuggers (such as Totalview) can always see the
  debugging symbols (see comments in ompi/debuggers/Makefile.am and
  orte/tools/orterun/Makefile.am)
- Remove some obsolete LAM-named variables from configure.ac

This commit was SVN r7125.
2005-09-01 10:37:20 +00:00
Ralph Castain
96f4bb7a63 Hey, sports fans!! Guess what??
Here's the huge registry check-in you've all been waiting for with bated breath. The revised version sends a single message to all processes at the various stage gates, thus making the startup much more scalable. I could provide you with all the tawdry details, but won't for now - you are welcome to ask, though, and I'll merrily bore your ears to tears.

In addition, the commit contains the following:

1. set the ignore properties on ompi/debuggers and orte/mca/pls/poe

2. Added simplified subscribe and put functions to the registry's API. I have also converted all of the ompi functions that registered subscriptions to the new API, and caught their associated put's as well.

In a follow-on commit, I'll be adding support for George's hetero arch registry subscription (wanted to get this one in first).

This commit was SVN r7118.
2005-09-01 01:07:30 +00:00
Tim Woodall
d7ff284888 correct selection logic
This commit was SVN r7116.
2005-08-31 21:51:52 +00:00
Tim Woodall
33eabd500c close/re-init soh
This commit was SVN r7115.
2005-08-31 21:51:10 +00:00
Tim Woodall
35f96af472 non-destructive read of buffer
This commit was SVN r7114.
2005-08-31 21:21:54 +00:00
David Daniel
c6054662d5 Forgot to add new header to sources
This commit was SVN r7109.
2005-08-31 16:21:58 +00:00
David Daniel
a5eff8fc78 A little more clean-up. TotalView now works with --enable-debug build.
Tested with:
pls = rsh
totalview.6.6.0-2
Linux cadillac82.ccstar.lanl.gov 2.4.24 #1 SMP Thu Jul 1 15:28:04 MDT
2004 i686 i686 i386 GNU/Linux

This commit was SVN r7108.
2005-08-31 16:15:59 +00:00
Jeff Squyres
284328afe3 Add missing .h file so that it is included in the tarball
This commit was SVN r7107.
2005-08-31 11:01:28 +00:00
Rainer Keller
27f1174d0e - Only return the nodes actually allocated to the job.
(necessary when orted handles several jobs simultaneously).

This commit was SVN r7105.
2005-08-31 07:09:47 +00:00
George Bosilca
d64a702a5b There is a missing header. --enable-picky helps to track down this kind of error.
This commit was SVN r7102.
2005-08-31 00:47:52 +00:00
David Daniel
995641c1e6 Don't initialize proctable more than once (since the stage gate 1 trigger
seems to get fired at least twice).

This commit was SVN r7101.
2005-08-31 00:21:55 +00:00
David Daniel
ced11250e4 Basic totalview support for orterun. Close to working, but need to
check hostnames are obtained correctly.

This commit was SVN r7096.
2005-08-30 17:29:43 +00:00
George Bosilca
53ccf0e58c POE is working. It can spawn jobs, redirect the output and is able to kill the job (with or without CTRL_C).
This commit was SVN r7093.
2005-08-30 16:13:55 +00:00
David Daniel
6cb97e6ade Reverting totalview support to *not* use the as yet unimplemented
orte_jobgrp_t.  Now just need to work out where to call it...

This commit was SVN r7092.
2005-08-30 12:59:04 +00:00
Rainer Keller
d7901c97a5 - Del whitespaces, to make coming patch smaller.
This commit was SVN r7089.
2005-08-30 06:58:37 +00:00
Brian Barrett
77ebdf1c6f * Add some debugging output Ralph asked for when an unknown error code is
passed to opal_error

This commit was SVN r7087.
2005-08-29 23:36:53 +00:00
Brian Barrett
d8e5d80892 * add a reasonable first whack at a suppressions file for Valgrind to ignore
some stuff that we can't do anything about
* fix some more memory leaks in session_dir code

This commit was SVN r7086.
2005-08-29 23:05:52 +00:00
Brian Barrett
bf8a3632bb * bunch more memory leak / block in use fixes
This commit was SVN r7085.
2005-08-29 21:35:01 +00:00
Jeff Squyres
7d895a4f08 Add missing header file
This commit was SVN r7071.
2005-08-28 11:50:43 +00:00
Brian Barrett
fc71fd5744 * fix place where Jeff changed an exit to a return and we really wanted
it to be an exit.
* Put the srun process (or what is about to become the srun process) in
  its own process group so that group-wide signals (such as the
  SIGINT sent by hitting Ctrl-C in a shell) are not sent to the srun
  process. 

This commit was SVN r7068.
2005-08-27 17:08:48 +00:00
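The process-group change in the commit above relies on standard POSIX calls. The sketch below shows the usual fork/setpgid/exec pattern. It is not the Slurm PLS code, and `spawn_in_own_group` and `wait_for_exit` are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork and exec argv[0] with the child placed in its own process group,
 * so signals sent to the parent's group (e.g. SIGINT from Ctrl-C in a
 * shell) do not reach it. Returns the child's pid, or -1 if fork failed. */
pid_t spawn_in_own_group(char *const argv[])
{
    assert(argv != NULL && argv[0] != NULL);
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: become leader of a new process group before exec. */
        if (setpgid(0, 0) != 0) {
            _exit(126);
        }
        execvp(argv[0], argv);
        _exit(127);  /* only reached if exec failed */
    }
    if (pid > 0) {
        /* Parent repeats the call to close the window before the child
         * runs its own setpgid. Failure is harmless here: it happens
         * when the child has already called exec. */
        (void)setpgid(pid, pid);
    }
    return pid;
}

/* Wait for the child and return its exit code, or -1 if it did not exit normally. */
int wait_for_exit(pid_t pid)
{
    int status;
    if (waitpid(pid, &status, 0) != pid) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the child no longer shares the launcher's group, the launcher has to forward any signal it does want the child to receive.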
George Bosilca
5b59ffbe4f Handle multiple IP addresses for the OOB TCP module. We check the addresses in order, and we give up if
and only if all of them failed.

This commit was SVN r7067.
2005-08-27 17:03:19 +00:00
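The "try each address in order, give up only if all fail" rule from the commit above can be sketched as a short loop. The names `connect_fn`, `connect_first_reachable`, and `fake_connect` are hypothetical, and the connection attempt is passed in as a callback so the example runs without a network. The real OOB TCP module's interfaces are not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type: returns 0 if a connection to address succeeded. */
typedef int (*connect_fn)(const char *address);

/* Try each address in order. Returns the index of the first address that
 * connected, or -1 if every attempt failed (or the list was empty). */
int connect_first_reachable(const char *const addresses[], size_t count,
                            connect_fn try_connect)
{
    assert(try_connect != NULL);
    for (size_t i = 0; i < count; ++i) {
        if (try_connect(addresses[i]) == 0) {
            return (int)i;  /* stop at the first success */
        }
    }
    return -1;  /* give up only after all addresses failed */
}

/* Stand-in for a real TCP connect: pretends only 10.0.0.2 is reachable. */
int fake_connect(const char *address)
{
    return strcmp(address, "10.0.0.2") == 0 ? 0 : -1;
}
```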