1087 commits

Author SHA1 Message Date
Josh Hursey
8fd6d4ba09 add a newline so output is cleaner/clearer
This commit was SVN r14229.
2007-04-05 17:45:03 +00:00
Ralph Castain
e95539a16a Add two new test codes - orte_loop_spawn/child - to help debug issues surrounding multiple calls to comm_spawn
This commit was SVN r14217.
2007-04-04 21:02:18 +00:00
Jeff Squyres
2cbcb4abf1 Remove the French and strip the tests down to essentials (no need for
buffer attaching/detaching, for example).

This commit was SVN r14216.
2007-04-04 15:38:23 +00:00
Ralph Castain
d5b5cd2d3c Add test code for multiple comm_spawn calls.
Add ERROR_LOG calls to more clearly document failures in the rsh launcher.

This commit was SVN r14214.
2007-04-04 13:24:39 +00:00
Jeff Squyres
fe58753a23 Add a little documentation to iof.h.
This commit was SVN r14208.
2007-04-03 18:17:35 +00:00
George Bosilca
f2a6b9394f Deal with the include spree. Protect "environ" on Windows.
Some other minor modifications to make it
compile [again] on Windows.

This commit was SVN r14188.
2007-04-01 16:16:54 +00:00
George Bosilca
01a4f56369 Mostly DECLSPEC cleanups and some include corrections.
This commit was SVN r14186.
2007-04-01 16:08:27 +00:00
Tim Prins
2f74160a37 Fix some more memory leaks
This commit was SVN r14175.
2007-03-30 13:43:50 +00:00
George Bosilca
d367d9017c Need the definition of opal_output_close.
This commit was SVN r14167.
2007-03-29 01:18:26 +00:00
Tim Prins
9cb455272b Fix a pile of memory leaks in ORTE.
Fix a major memory leak in the SLURM RAS, and cleanup a bit of code there.

This commit was SVN r14164.
2007-03-29 00:50:56 +00:00
Sven Stork
44ead58103 - export component structure
This commit was SVN r14139.
2007-03-26 13:46:00 +00:00
Ralph Castain
0d98264097 Fix the nolocal option on the OMPI trunk
This commit was SVN r14138.
2007-03-24 16:16:16 +00:00
Galen Shipman
48d1fa830d A race condition exists on the free list of pending connections because
OPAL_FREE_LIST_WAIT/RETURN will not use locks in a non-threaded build.
Conditionally using locks around OPAL_FREE_LIST_WAIT/RETURN in a non-threaded
build seems to fix the issue.
Tested at 4K processes and seems to work.

This commit was SVN r14135.
2007-03-23 15:19:03 +00:00
Brian Barrett
d454395b51 Need to fall back on the event listen mode if the MCA parameter said use the
listen thread, but we're not the HNP.  This is better than not starting up
any listen mode, which is what we were doing before :/

This commit was SVN r14133.
2007-03-23 13:29:18 +00:00
Jeff Squyres
bcdfbacaa4 Oops -- typo from previous commit. :-(
This commit was SVN r14130.
2007-03-23 00:51:50 +00:00
Jeff Squyres
2105f444ec Add missing header file
This commit was SVN r14129.
2007-03-23 00:47:30 +00:00
Jeff Squyres
a3dd0f2e08 Connect --nolocal up to the MCA param rmaps_base_schedule_local, as it
should be (it's a mistake that it got left out).

This commit was SVN r14127.
2007-03-22 19:29:47 +00:00
Sven Stork
6111ca1152 - Let's try to detect the default nodefile directory because it can differ
  between sites. If we cannot detect the default then we fall back to
  the hard coded path.

This commit was SVN r14121.
2007-03-22 15:26:16 +00:00
Galen Shipman
e654604a25 remove invalid comment
This commit was SVN r14118.
2007-03-22 03:51:36 +00:00
Josh Hursey
3492fdeae3 Fix a couple of compiler warnings (errors?) caught by ICC testing at Cisco.
This commit was SVN r14080.
2007-03-20 14:12:13 +00:00
Rainer Keller
1322f9f346 - Further attributes mainly for opal/* functions, marking
__opal_attribute_nonnull__, __opal_attribute_warn_unused_result__,
   __opal_attribute_malloc__, __opal_attribute_sentinel__ and
   __opal_attribute_format__

This commit was SVN r14078.
2007-03-20 13:01:32 +00:00
Pak Lui
803655b555 * incorporated some of Jeff's comments regarding this fix.
This commit was SVN r14070.
2007-03-19 21:59:48 +00:00
Pak Lui
da4d41e0e7 * fixed the missing fclose and eliminated the call to get_slot_count
since it is not needed

This commit was SVN r14066.
2007-03-19 17:47:30 +00:00
Rich Graham
d2e799f6b5 add some stub functions for the cnos environment.
This commit was SVN r14065.
2007-03-19 17:35:46 +00:00
Josh Hursey
101a2abd09 - Be more careful with parens
- Run the destructor *before* shutting things down.

This commit was SVN r14064.
2007-03-19 17:33:20 +00:00
Brian Barrett
ea08a555f9 Fixed a compile error on OS X 10.3 introduced with 1.1.5 / 1.2. Thanks
to Marius Schamschula for reporting the issue.

This commit was SVN r14063.
2007-03-19 17:25:54 +00:00
Josh Hursey
a181c987cc Remove some old references to ft_enable parameter that no longer exists.
This was replaced by the "-am ft-enable-cr" AMCA parameter.

This commit was SVN r14055.
2007-03-17 20:02:42 +00:00
Josh Hursey
d03073e87d Make sure to protect the finalize call so tools like ompi_info
do not segv.

This commit was SVN r14054.
2007-03-17 19:47:54 +00:00
Josh Hursey
dadca7da88 Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD).
This merge adds Checkpoint/Restart support to Open MPI. The initial
frameworks and components support a LAM/MPI-like implementation.

This commit follows the risk assessment presented to the Open MPI core
development group on Feb. 22, 2007.

This commit closes trac:158

More details to follow.

This commit was SVN r14051.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r13912

The following Trac tickets were found above:
  Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158
2007-03-16 23:11:45 +00:00
Jeff Squyres
c000ee5328 Fixes trac:921
* Do not empty the list of in-flight frags during _close(); the OOB
  callback will still occur (_send_cb()) and try to remove the frag
  from the list, which will then result in an assert failure (debug
  builds).
* Add one more fix for a possible problem -- add an extra RETAIN /
  RELEASE pair on the endpoint to ensure that it is not actually
  freed before all in-flight frags have drained.

This commit was SVN r13953.

The following Trac tickets were found above:
  Ticket 921 --> https://svn.open-mpi.org/trac/ompi/ticket/921
2007-03-07 20:12:22 +00:00
Tim Prins
fe3ea0085f Fix minor memory leaks
This commit was SVN r13946.
2007-03-07 01:09:38 +00:00
Jeff Squyres
7b72ded10c Patch from Gotz Waschk to recognize zsh.
This commit was SVN r13907.
2007-03-03 01:42:03 +00:00
Li-Ta Lo
a0e5b6a27c minor clean up and treespawn support
This commit was SVN r13876.
2007-03-01 22:32:37 +00:00
Josh Hursey
0404444dbe * Added 2 new MCA parameters
  - mca_base_param_file_prefix
    (Default: NULL)
    This is the full name of the "-am" mpirun option. Used to specify a ':'
    separated list of AMCA parameter set files.
  - mca_base_param_file_path
    (Default: $SYSCONFDIR/amca-param-sets/:$CWD)
    The path to search for AMCA files with relative paths. A warning will be
    printed if the AMCA file cannot be found.

* Added a new function "mca_base_param_recache_files" that re-reads the file
configurations. This is used internally to help bootstrap the MCA system.

* Added a new orterun/mpirun command line option '-am' that is an alias for the
mca_base_param_file_prefix MCA parameter

* Exposed the opal_path_access function as it is generally useful in other
places in the code.

* New function "opal_cmd_line_make_opt_mca" which will allow you to append a
new command line option with MCA parameter identifiers to set at the same
time. Previously this could only be done at command line declaration time.

* Added a new directory under the $pkgdatadir named "amca-param-sets" where all
the 'shipped with' Open MPI AMCA parameter sets are placed. This is the first
place to search for AMCA sets with relative paths.

* An example.conf AMCA parameter set file is located in
contrib/amca-param-sets/.

* Jeff Squyres contributed an OpenIB AMCA set for benchmarking.

Note: You will need to autogen with this commit as it adds a configure param.
  Sorry :(

This commit was SVN r13867.
2007-03-01 13:39:20 +00:00
Rainer Keller
0889ebd59f - Eliminate warnings, that PGI-6.2.5 issues with -Minform=inform
This commit was SVN r13840.
2007-02-28 08:36:34 +00:00
George Bosilca
4bab882d17 These 2 ORTE_DECLSPEC are not required.
This commit was SVN r13825.
2007-02-27 15:45:40 +00:00
Sven Stork
d8a369936e - Fix more symbols that should be exported.
This commit was SVN r13824.
2007-02-27 15:17:17 +00:00
Sven Stork
a86deb460e - export required symbols
This commit was SVN r13810.
2007-02-27 09:43:32 +00:00
Tim Prins
c6f2efe4b8 These are orte functions, the structure should be named as such
This commit was SVN r13765.
2007-02-22 23:29:31 +00:00
George Bosilca
d29423b1f7 orted_globals_t should be global.
This commit was SVN r13684.
2007-02-16 18:16:06 +00:00
Brian Barrett
f6a5d58885 Rather than set the connect event timeout number to something big and hoping
it's bigger than the timeout for the connect() call, just don't register
the handler by default and fall back to connect() timing out.  Should give
much happier performance on big clusters.

This commit was SVN r13639.
2007-02-13 18:36:50 +00:00
Pak Lui
085826d94a * Remove the code for putting the bogus exit status of the user proc.
Also remove the smr set_proc_state since it's covered elsewhere.

This commit was SVN r13625.
2007-02-12 23:59:27 +00:00
Brian Barrett
8b28e5b33d Allow the OOB to connect between all MPI applications during MPI_INIT
without also establishing MPI connectivity. 

This commit was SVN r13595.
2007-02-09 20:17:37 +00:00
Brian Barrett
262cbbc5c9 Back out r13593, which contained a change that shouldn't be committed.
This commit was SVN r13594.

The following SVN revision numbers were found above:
  r13593 --> open-mpi/ompi@81472363ea
2007-02-09 20:13:02 +00:00
Brian Barrett
81472363ea Allow the OOB to connect between all MPI applications during MPI_INIT
without also establishing MPI connectivity.

This commit was SVN r13593.
2007-02-09 20:11:40 +00:00
Pak Lui
2d6b3776bf * fix the SEGV described in trac #892, in which an exit_status in the 200 range
  causes strsignal to show NULL as a result. Still trying to determine
  why exit_status is in that range.

This commit was SVN r13583.
2007-02-09 16:39:30 +00:00
Ralph Castain
5818a32245 Bring in a forgotten speed improvement for the TM launcher that was developed during SNL Tbird testing last year. Remove the redundant and slow calls to TM to resolve hostnames. Instead, read the host info from the PBS file during the RAS, and then just use that info in the PLS (rather than getting it again).
Adjust the RMAPS mapped_node object to propagate the required launch_id info now included in the ras_node object. This provides support for those few systems that don't use nodename to launch, but instead want some id (typically an index into the array of allocated nodes). This value gets set for each node in the RAS - the RMAPS just propagates it for easy launch.

This commit was SVN r13581.
2007-02-09 15:06:45 +00:00
George Bosilca
79d76b044a ORTE_DECL everything that can be used outside the base directory. I
wonder why this file is called private when it's included by all PLS ...

This commit was SVN r13573.
2007-02-09 03:16:19 +00:00
George Bosilca
7750ed22e0 Correct the Windows part of the universe detection.
This commit was SVN r13547.
2007-02-07 22:37:28 +00:00
Pak Lui
ccff0a6e65 * minor fix to correct the pid that always shows up as 0 in the abort
error message, e.g.:

  mpirun noticed that job rank 2 with PID 0 on node burl-ct-v440-4
  exited on signal 15 (Terminated).

This commit was SVN r13537.
2007-02-07 17:46:19 +00:00