Commit Graph

9352 Commits

Author SHA1 Message Date
Brian Barrett
241545a098 Back out r14073 - it speeds up TCP latency / bandwidth but at the same time
it kills ROMIO and one-sided performance when using only TCP.  The problem
is that it only allows those two to be progressed every couple of seconds,
leading to what looks like hangs in the one-sided tests (and the ROMIO stuff,
although people seem to not notice that at this point).

This commit was SVN r14142.

The following SVN revision numbers were found above:
  r14073 --> open-mpi/ompi@64fbbc20b8
2007-03-26 15:56:23 +00:00
Sven Stork
548c511700 - export required symbol
This commit was SVN r14140.
2007-03-26 13:54:20 +00:00
Sven Stork
44ead58103 - export component structure
This commit was SVN r14139.
2007-03-26 13:46:00 +00:00
Ralph Castain
0d98264097 Fix the nolocal option on the OMPI trunk
This commit was SVN r14138.
2007-03-24 16:16:16 +00:00
Galen Shipman
48d1fa830d A race condition exists on the free list of pending connections because
OPAL_FREE_LIST_WAIT/RETURN will not use locks in a non-threaded build.
Conditionally using locks around OPAL_FREE_LIST_WAIT/RETURN in a non-threaded
build seems to fix the issue.
Tested at 4K processes and seems to work.

This commit was SVN r14135.
2007-03-23 15:19:03 +00:00
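The conditional locking described in 48d1fa830d can be sketched roughly as follows. This is a minimal illustration, not the Open MPI code: `pending_list_t`, `pending_get`, `pending_put` and the `caller_must_lock` flag are hypothetical names, the unlocked helpers only stand in for OPAL_FREE_LIST_WAIT/RETURN in a build where they take no lock themselves, and the sketch returns NULL on an empty list instead of waiting.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical list element and list; not the OPAL types. */
typedef struct item { struct item *next; } item_t;

typedef struct {
    item_t *head;
    pthread_mutex_t lock;
    bool caller_must_lock;   /* true when the list primitives do not lock */
} pending_list_t;

/* Stand-ins for the free-list primitives in a non-threaded build. */
static item_t *pop_unlocked(pending_list_t *l) {
    item_t *it = l->head;
    if (it != NULL) {
        l->head = it->next;
        it->next = NULL;
    }
    return it;
}

static void push_unlocked(pending_list_t *l, item_t *it) {
    it->next = l->head;
    l->head = it;
}

/* Take the explicit lock only when the primitives will not do it. */
item_t *pending_get(pending_list_t *l) {
    assert(l != NULL);
    if (l->caller_must_lock) pthread_mutex_lock(&l->lock);
    item_t *it = pop_unlocked(l);
    if (l->caller_must_lock) pthread_mutex_unlock(&l->lock);
    return it;
}

void pending_put(pending_list_t *l, item_t *it) {
    assert(l != NULL && it != NULL);
    if (l->caller_must_lock) pthread_mutex_lock(&l->lock);
    push_unlocked(l, it);
    if (l->caller_must_lock) pthread_mutex_unlock(&l->lock);
}
```

The point of the flag is that a helper thread can touch the list even when the library itself was built without thread support, so the caller has to supply the mutual exclusion in that case.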
Josh Hursey
7c4ca3c420 remove some stale code
This commit was SVN r14134.
2007-03-23 14:11:12 +00:00
Brian Barrett
d454395b51 Need to fall back on the event listen mode if the MCA parameter said use the
listen thread, but we're not the HNP.  This is better than not starting up
any listen mode, which is what we were doing before :/

This commit was SVN r14133.
2007-03-23 13:29:18 +00:00
Jeff Squyres
bcdfbacaa4 Oops -- typo from previous commit. :-(
This commit was SVN r14130.
2007-03-23 00:51:50 +00:00
Jeff Squyres
2105f444ec Add missing header file
This commit was SVN r14129.
2007-03-23 00:47:30 +00:00
Jeff Squyres
a3dd0f2e08 Connect --nolocal up to the MCA param rmaps_base_schedule_local, as it
should be (it's a mistake that it got left out).

This commit was SVN r14127.
2007-03-22 19:29:47 +00:00
Sven Stork
6111ca1152 - Let's try to detect the default nodefile directory because it can differ
  between sites. If we cannot detect the default then we fall back to
  the hard coded path.

This commit was SVN r14121.
2007-03-22 15:26:16 +00:00
Gleb Natapov
e5450613b5 Add new SM BTL parameter btl_sm_cb_max_num. If set to a value greater than zero
it limits the number of circular buffers allocated between each pair of peers.
This allows for tighter memory usage control.

This commit was SVN r14120.
2007-03-22 12:21:42 +00:00
Gleb Natapov
efe0323d35 Initialize fifos at SM BTL init time instead of waiting for the first send. This
wastes slightly more memory, but prevents a problem where a fifo cannot be allocated
later during a job run once memory is exhausted.

This commit was SVN r14119.
2007-03-22 12:18:44 +00:00
Galen Shipman
e654604a25 remove invalid comment
This commit was SVN r14118.
2007-03-22 03:51:36 +00:00
Galen Shipman
ace68b1883 Change the way we handle unexpected messages:
if less than or equal to pml_ob1_unexpected_limit, just buffer in the PML level recv
fragment; otherwise allocate a buffer via the bucket allocator.

This commit was SVN r14117.
2007-03-22 01:00:34 +00:00
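The size-based choice described in ace68b1883 might look roughly like the sketch below. Everything here is an assumption made for illustration: `recv_frag_t`, `frag_store` and `frag_release` are hypothetical names, the constant 128 merely stands in for the value of pml_ob1_unexpected_limit, and plain `malloc` stands in for the bucket allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for pml_ob1_unexpected_limit; the real value is not stated here. */
enum { UNEXPECTED_LIMIT = 128 };

/* Hypothetical receive fragment with a small inline buffer. */
typedef struct {
    unsigned char inline_buf[UNEXPECTED_LIMIT];
    unsigned char *data;   /* points at inline_buf or at heap memory */
    size_t len;
} recv_frag_t;

/* Buffer an unexpected payload: inline if small, heap otherwise. */
int frag_store(recv_frag_t *f, const void *payload, size_t len) {
    assert(f != NULL && payload != NULL);
    if (len <= UNEXPECTED_LIMIT) {
        f->data = f->inline_buf;
    } else {
        f->data = malloc(len);   /* malloc stands in for the bucket allocator */
        if (f->data == NULL) return -1;
    }
    memcpy(f->data, payload, len);
    f->len = len;
    return 0;
}

/* Release heap memory only if the payload did not fit inline. */
void frag_release(recv_frag_t *f) {
    assert(f != NULL);
    if (f->data != f->inline_buf) free(f->data);
    f->data = NULL;
    f->len = 0;
}
```

Small messages avoid an allocator round trip entirely, while large ones do not inflate every fragment to the worst-case size.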
Tim Mattox
43ef61b808 Update the NEWS file for more 1.2.1 changes.
This commit was SVN r14115.
2007-03-21 20:39:51 +00:00
Gleb Natapov
c389c47d79 Fix SM connectivity calculations.
This commit was SVN r14109.
2007-03-21 13:29:19 +00:00
Jeff Squyres
3e2031e0e3 Finally commit something that has been sitting around in one of my
development trees since last year (had to wait for some intel tests to
run yesterday, so I finally took the time to finish this work):

 * Improve MPI API argument checking by also checking for NULL values
   (especially helps when invalid Fortran MPI handles are passed,
   because the various MPI_*f2c functions are supposed to return an
   "invalid" MPI handle [meaning NULL] when this happens).  So now
   OMPI will generate an MPI exception rather than a segv.
 * Removed a few redundant DATATYPE_NULL checks.
 * Also check for some other forms of "invalid" handles (e.g., already
   been freed, etc.) in some cases.  We could probably be a bit more
   stringent in this regard if we really wanted to.
 * Change MPI_Get_processor_name to zero out the string up to
   MPI_MAX_PROCESSOR_NAME characters, because the MPI spec says that
   the string must be at least that long.  We were already passing
   that length to gethostname(), anyway.

This commit was SVN r14100.
2007-03-21 11:10:42 +00:00
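The argument checks listed in 3e2031e0e3 can be illustrated with a small self-contained sketch. It does not use the MPI API: `comm_t`, `comm_f2c`, `comm_size`, `processor_name`, the error codes and the two-entry handle table are all hypothetical, `MAX_NAME` stands in for MPI_MAX_PROCESSOR_NAME, and the `host` argument stands in for whatever gethostname() would return.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical error codes and name length. */
enum { OK = 0, ERR_COMM = 1, ERR_ARG = 2 };
enum { MAX_NAME = 256 };

/* Hypothetical handle: a size plus an "already freed" marker. */
typedef struct { int size; int freed; } comm_t;

/* Tiny handle table; the second entry models a freed handle. */
static comm_t table[2] = { { 4, 0 }, { 8, 1 } };

/* Fortran-to-C style conversion: an unknown index yields NULL. */
comm_t *comm_f2c(int findex) {
    if (findex < 0 || findex >= 2) return NULL;
    return &table[findex];
}

/* Report an error for NULL or freed handles instead of dereferencing them. */
int comm_size(comm_t *c, int *size) {
    if (c == NULL || c->freed) return ERR_COMM;
    if (size == NULL) return ERR_ARG;
    *size = c->size;
    return OK;
}

/* Zero the whole caller buffer, which must hold MAX_NAME chars, then copy. */
int processor_name(const char *host, char *name, int *resultlen) {
    if (host == NULL || name == NULL || resultlen == NULL) return ERR_ARG;
    memset(name, 0, MAX_NAME);
    strncpy(name, host, MAX_NAME - 1);
    assert(name[MAX_NAME - 1] == '\0');   /* internal invariant only */
    *resultlen = (int)strlen(name);
    return OK;
}
```

With checks like these, a bad Fortran handle surfaces as an error code that the caller can turn into an exception, rather than as a segmentation fault.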
Gleb Natapov
a1a14aa4c3 Add memory barriers during SM btl initialization.
This commit was SVN r14099.
2007-03-21 10:25:10 +00:00
Gleb Natapov
435565590f Don't rely on opcode to decide how to progress a pending message.
This commit was SVN r14098.
2007-03-21 07:59:59 +00:00
Josh Hursey
299332ecac fix small compiler warning
This commit was SVN r14097.
2007-03-21 04:44:54 +00:00
Tim Mattox
26b8858029 Tweak a NEWS entry.
This commit was SVN r14095.
2007-03-21 01:10:50 +00:00
Tim Mattox
72b90cb866 Update the NEWS file for v1.2.1.
This commit was SVN r14093.
2007-03-21 00:58:31 +00:00
Brian Barrett
464d536928 remove debugging printf
This commit was SVN r14088.
2007-03-20 21:28:28 +00:00
Josh Hursey
3492fdeae3 Fix a couple of compiler warnings (errors?) caught by ICC testing at Cisco.
This commit was SVN r14080.
2007-03-20 14:12:13 +00:00
Rainer Keller
1322f9f346 - Further attributes mainly for opal/* functions, marking
__opal_attribute_nonnull__, __opal_attribute_warn_unused_result__,
   __opal_attribute_malloc__, __opal_attribute_sentinel__ and
   __opal_attribute_format__

This commit was SVN r14078.
2007-03-20 13:01:32 +00:00
Sven Stork
d67565b042 - use include path relative to opal/include or this header file will not work when installed "--with-devel-headers"
This commit was SVN r14077.
2007-03-20 12:38:06 +00:00
George Bosilca
8c9e4baa47 Add multi-link capabilities to the TCP BTL. This is useful for systems where the
latency is high and the network relatively fast. This will allow for more kernel
level buffering, which allows overlap between system calls and communications.
Somehow, even on fast clusters there is an improvement (not significant).

This patch creates multiple modules for the same device, which in turn will
create multiple sockets between the peers. By default the number of BTLs per
device is set to 1, so there is no fundamental difference with the current
version. Change the value of btl_tcp_links to enable multiple links between
peers.

This commit was SVN r14076.
2007-03-20 11:50:17 +00:00
George Bosilca
0edd770644 Nothing really relevant.
This commit was SVN r14075.
2007-03-20 11:21:23 +00:00
George Bosilca
4332295b32 Typos.
This commit was SVN r14074.
2007-03-20 11:18:05 +00:00
George Bosilca
64fbbc20b8 Switch the event engine to a blocking mode if there is no high performance
networks available.

This commit was SVN r14073.
2007-03-20 11:15:08 +00:00
Rainer Keller
249abd29c2 - Mark some deprecated functions (two still commented) and fix to
not use opal_cmd_line_make_opt anymore.

This commit was SVN r14072.
2007-03-20 10:08:58 +00:00
Gleb Natapov
e551c5f1a3 Get rid of separate sm BTLs for different shared memory base addresses. Now
that we precalculate most of the addresses, there is no point in having a separate
BTL for this. The sm_progress() code becomes much simpler as a result.

This commit was SVN r14071.
2007-03-20 08:15:58 +00:00
Pak Lui
803655b555 * incorporated some of Jeff's comment regarding this fix.
This commit was SVN r14070.
2007-03-19 21:59:48 +00:00
Josh Hursey
7ab741c1e2 - Add some debugging hooks for the CR runtime MCA params
- Add signal handler BLCR register (helps with debugging)
- ifdef out the cr_request_file section for checkpointing self.
  There is a bug with the 0.4.2 version of BLCR such that this
  does not handle moving checkpoint files around.
  I'm following up with the BLCR folks on this one (and checking
  the newest release).

This commit was SVN r14069.
2007-03-19 21:18:03 +00:00
Jelena Pjesivac-Grbovic
d6402b6898 Adding in-order binary tree algorithm for non-commutative reduce operations.
I tested the algorithm with the Intel and IBM tests and it passed again, so it should work.

This commit was SVN r14068.
2007-03-19 21:03:57 +00:00
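To see why an in-order tree matters for the non-commutative reduce in d6402b6898, consider the sketch below. It is a sequential illustration of the ordering idea only, not the collective algorithm from the commit: `val_t`, `op` and `reduce_inorder` are hypothetical names, and string concatenation is used as a simple associative but non-commutative operation.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical value type: a short string per rank. */
typedef struct { char s[32]; } val_t;

/* Associative but non-commutative operation: concatenation. */
static val_t op(val_t left, val_t right) {
    val_t out;
    snprintf(out.s, sizeof out.s, "%s%s", left.s, right.s);
    return out;
}

/*
 * Reduce contributions of ranks lo..hi over a binary tree in which every
 * node's left subtree holds only lower ranks and its right subtree only
 * higher ranks. Combining as (left op self) op right keeps rank order.
 */
val_t reduce_inorder(const val_t *contrib, int lo, int hi) {
    assert(lo <= hi);
    int root = lo + (hi - lo) / 2;
    val_t acc = contrib[root];
    if (lo < root) acc = op(reduce_inorder(contrib, lo, root - 1), acc);
    if (root < hi) acc = op(acc, reduce_inorder(contrib, root + 1, hi));
    return acc;
}
```

Because operands are always combined with lower ranks on the left, the result matches a left-to-right reduction over ranks 0 to n-1 even though the work is arranged as a tree.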
Pak Lui
da4d41e0e7 * fixed the missing fclose and eliminate the call to get_slot_count
since it is not needed

This commit was SVN r14066.
2007-03-19 17:47:30 +00:00
Rich Graham
d2e799f6b5 add some stub functions for the cnos environment.
This commit was SVN r14065.
2007-03-19 17:35:46 +00:00
Josh Hursey
101a2abd09 - Be more careful with parens
- Run the destructor *before* shutting things down.

This commit was SVN r14064.
2007-03-19 17:33:20 +00:00
Brian Barrett
ea08a555f9 Fixed a compile error on OS X 10.3 introduced with 1.1.5 / 1.2. Thanks
to Marius Schamschula for reporting the issue.

This commit was SVN r14063.
2007-03-19 17:25:54 +00:00
Jeff Squyres
a754caf85f Updates from v1.1 tree
This commit was SVN r14060.
2007-03-19 13:04:49 +00:00
Josh Hursey
e1a18fa149 Patch from Gleb
Always set opcode appropriately before calling ibv_post_send.

This commit was SVN r14056.
2007-03-18 13:33:15 +00:00
Josh Hursey
a181c987cc Remove some old references to ft_enable parameter that no longer exists.
This was replaced by the "-am ft-enable-cr" AMCA parameter.

This commit was SVN r14055.
2007-03-17 20:02:42 +00:00
Josh Hursey
d03073e87d Make sure to protect the finalize call so tools like ompi_info
do not segv.

This commit was SVN r14054.
2007-03-17 19:47:54 +00:00
Josh Hursey
6d29146748 fix dumb logic break in the PML selection finalization
This commit was SVN r14053.
2007-03-17 16:33:43 +00:00
Josh Hursey
dadca7da88 Merging in the jjhursey-ft-cr-stable branch (r13912 : HEAD).
This merge adds Checkpoint/Restart support to Open MPI. The initial
frameworks and components support a LAM/MPI-like implementation.

This commit follows the risk assessment presented to the Open MPI core
development group on Feb. 22, 2007.

This commit closes trac:158

More details to follow.

This commit was SVN r14051.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r13912

The following Trac tickets were found above:
  Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158
2007-03-16 23:11:45 +00:00
Josh Hursey
924cb0af11 revert Sanity check...
This commit was SVN r14048.
2007-03-16 22:15:21 +00:00
Josh Hursey
a26e636e81 Sanity check...
This commit was SVN r14047.
2007-03-16 22:14:47 +00:00
Brian Barrett
1229ea7d4f Update news to relate to r14045
This commit was SVN r14046.

The following SVN revision numbers were found above:
  r14045 --> open-mpi/ompi@01d6121c7f
2007-03-16 21:29:05 +00:00
Brian Barrett
01d6121c7f * The MoreBacktrace code supplied by Apple doesn't work on 64 bit Intel
builds, so disable it there
  * On 10.4.8 (and possibly others), siginfo is NULL in the signal
    callback on 64 bit Intel builds, so account for that in the signal
    callback.

This commit was SVN r14045.
2007-03-16 21:27:19 +00:00