Back out r14073 - it improves TCP latency and bandwidth, but at the same time
it kills ROMIO and one-sided performance when using only TCP. The problem
is that it only allows those two to be progressed every couple of seconds,
leading to what looks like hangs in the one-sided tests (and the ROMIO stuff,
although people seem to not notice that at this point).
This commit was SVN r14144.
The following SVN revision numbers were found above:
r14073 --> open-mpi/ompi@64fbbc20b8
r14142 --> open-mpi/ompi@241545a098
it kills ROMIO and one-sided performance when using only TCP. The problem
is that it only allows those two to be progressed every couple of seconds,
leading to what looks like hangs in the one-sided tests (and the ROMIO stuff,
although people seem to not notice that at this point).
This commit was SVN r14142.
The following SVN revision numbers were found above:
r14073 --> open-mpi/ompi@64fbbc20b8
it limits the number of circular buffers allocated between each pair of peers.
This allows tighter control of memory usage.
This commit was SVN r14120.
waste slightly more memory, but prevents problems when a fifo cannot be allocated
later during a job run because memory resources are exhausted.
This commit was SVN r14119.
if less than or equal to pml_ob1_unexpected_limit, just buffer in the PML-level recv
fragment; else allocate a buffer via the bucket allocator
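
The size test described above can be sketched in plain C as follows. This is only an illustration under stated assumptions: the struct unexpected_frag, the constant UNEXPECTED_LIMIT (standing in for pml_ob1_unexpected_limit), and the use of malloc() in place of the bucket allocator are all hypothetical and are not taken from the ob1 code.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the pml_ob1_unexpected_limit value. */
#define UNEXPECTED_LIMIT 128

/* Hypothetical receive fragment with a small inline buffer. */
struct unexpected_frag {
    char   inline_buf[UNEXPECTED_LIMIT];
    char  *data;   /* points at inline_buf or at heap memory */
    size_t len;
    int    heap;   /* nonzero if data was allocated separately */
};

/* Copy an unexpected payload into the fragment.
 * Returns 0 on success, -1 if the separate allocation fails. */
static int frag_store(struct unexpected_frag *frag,
                      const void *payload, size_t len)
{
    if (len <= UNEXPECTED_LIMIT) {
        /* Small message: buffer it inside the fragment itself. */
        frag->data = frag->inline_buf;
        frag->heap = 0;
    } else {
        /* Large message: malloc() stands in for the bucket allocator. */
        frag->data = malloc(len);
        if (NULL == frag->data) {
            return -1;
        }
        frag->heap = 1;
    }
    memcpy(frag->data, payload, len);
    frag->len = len;
    return 0;
}

/* Release any separately allocated buffer and reset the fragment. */
static void frag_release(struct unexpected_frag *frag)
{
    if (frag->heap) {
        free(frag->data);
    }
    frag->data = NULL;
    frag->len = 0;
    frag->heap = 0;
}
```

With this layout a payload of exactly UNEXPECTED_LIMIT bytes still lands in the inline buffer, and only larger payloads cost an allocation.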
This commit was SVN r14117.
development trees since last year (had to wait for some intel tests to
run yesterday, so I finally took the time to finish this work):
* Improve MPI API argument checking by also checking for NULL values
(especially helps when invalid Fortran MPI handles are passed,
because the various MPI_*f2c functions are supposed to return an
"invalid" MPI handle [meaning NULL] when this happens). So now
OMPI will generate an MPI exception rather than a segv.
* Removed a few redundant DATATYPE_NULL checks.
* Also check for some other forms of "invalid" handles (e.g., already
been freed, etc.) in some cases. We could probably be a bit more
stringent in this regard if we really wanted to.
* Change MPI_Get_processor_name to zero out the string up to
MPI_MAX_PROCESSOR_NAME characters, because the MPI spec says that
the string must be at least that long. We were already passing
that length to gethostname(), anyway.
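
As a rough illustration of the NULL checking and the MPI_Get_processor_name change listed above, here is a minimal sketch in plain C. It is not the OMPI implementation: sketch_get_processor_name and SKETCH_MAX_PROCESSOR_NAME are hypothetical names, the value 256 is an assumption, and the -1 return stands in for raising an MPI exception.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* for gethostname() */
#endif
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for MPI_MAX_PROCESSOR_NAME; the real value
 * is implementation-defined. */
#define SKETCH_MAX_PROCESSOR_NAME 256

/* Fill name (at least SKETCH_MAX_PROCESSOR_NAME bytes long) with the
 * host name and store its length in *resultlen.
 * Returns 0 on success, -1 on a NULL argument or a lookup failure. */
static int sketch_get_processor_name(char *name, int *resultlen)
{
    /* Reject NULL arguments instead of dereferencing them. */
    if (NULL == name || NULL == resultlen) {
        return -1;
    }

    /* Zero the whole buffer, not just the bytes the name will use. */
    memset(name, 0, SKETCH_MAX_PROCESSOR_NAME);

    if (0 != gethostname(name, SKETCH_MAX_PROCESSOR_NAME)) {
        return -1;
    }
    /* gethostname() may leave a truncated name unterminated. */
    name[SKETCH_MAX_PROCESSOR_NAME - 1] = '\0';

    *resultlen = (int) strlen(name);
    return 0;
}
```

Because the buffer is cleared up front, every byte after the returned length is guaranteed to be zero, which is convenient for callers that copy the full fixed-size buffer.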
This commit was SVN r14100.
latency is high and the network relatively fast. This will allow for more kernel
level buffering, which allows overlap between system calls and communications.
Somehow, even on fast clusters there is an improvement (though not a significant one).
This patch creates multiple modules for the same device, which in turn will
create multiple sockets between the peers. By default the number of BTLs per
device is set to 1, so there is no fundamental difference with the current
version. Change the value of btl_tcp_links to enable multiple links between
peers.
This commit was SVN r14076.
when we precalculate most of the addresses there is no point in having a separate
BTL for this. The sm_progress() code becomes much simpler as a result.
This commit was SVN r14071.
This merge adds Checkpoint/Restart support to Open MPI. The initial
frameworks and components support a LAM/MPI-like implementation.
This commit follows the risk assessment presented to the Open MPI core
development group on Feb. 22, 2007.
This commit closes trac:158
More details to follow.
This commit was SVN r14051.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r13912
The following Trac tickets were found above:
Ticket 158 --> https://svn.open-mpi.org/trac/ompi/ticket/158
In that case, sendcount and sendtype are not valid and we need to use
recvcount and recvtype.
This commit fixes trac:943. Reviewed by Jelena Pjesivac-Grbovic.
This commit was SVN r14022.
The following Trac tickets were found above:
Ticket 943 --> https://svn.open-mpi.org/trac/ompi/ticket/943
Queue_empty is determined by the reader, and is its local view.
However, the writer may continue writing to this queue. The decision
to go on to the next cb_fifo is done in an atomic region, checking the
writer's view. The writer also "changes its view" in an atomic
region protected by the same lock.
This commit was SVN r13968.
test, Sun's darray test, and an internal LANL test code. I would not
assume it will work properly on other codes, as I'm still not sure I
completely understand what the standard says this function is supposed to
do.
Refs trac:65
This commit was SVN r13967.
The following Trac tickets were found above:
Ticket 65 --> https://svn.open-mpi.org/trac/ompi/ticket/65
- fixing line lengths and some of the comments
- possible bug fix (but I do not think we exposed it in any tests so far):
temporary buffers were allocated as multiples of extent instead of
true_extent + (count - 1) * extent.
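
The difference between the two sizing rules can be sketched in a few lines of C. The function names are hypothetical, and the sketch assumes non-negative extents held in size_t (real MPI extents are signed MPI_Aint values).

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for count elements when each element occupies
 * true_extent bytes of real data and consecutive elements are
 * extent bytes apart. */
static size_t temp_size_true_extent(size_t true_extent, size_t extent,
                                    size_t count)
{
    assert(count > 0);
    return true_extent + (count - 1) * extent;
}

/* The older rule: a plain multiple of extent. */
static size_t temp_size_extent_only(size_t extent, size_t count)
{
    return count * extent;
}
```

For a type with extent 8 and true_extent 12, four elements need 12 + 3 * 8 = 36 bytes, while the multiple-of-extent rule gives only 32, so the last element would run past the end of the buffer. When true_extent equals extent, both rules agree.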
Everything is still passing Intel tests over tcp and btl mx up to 64 nodes.
This commit was SVN r13956.
The original code was not compensating for the space used by the header.
When memory got tight, the allocator would return a pointer to memory that
did not exist, resulting in a SEGV for the application. This is a partial
fix for ticket #929.
Reviewed by Rich Graham.
This commit was SVN r13950.
create a duplicate type, because any duplicate type loses the PREDEFINED flag.
An MPI_LB (respectively MPI_UB) without the PREDEFINED flag is useless, as it's
not a marker anymore. The solution is to return the same pointer, but only after
the reference count has been increased. In order for this to work, I allowed
the destruction to check for the reference count of an object before complaining
about destroying a predefined type.
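
A minimal sketch of that idea in plain C is shown below. Everything in it is hypothetical (struct sketch_datatype, SKETCH_FLAG_PREDEFINED, and both functions); it only illustrates returning the same pointer with a bumped reference count and checking that count before complaining on destruction.

```c
#include <stddef.h>

#define SKETCH_FLAG_PREDEFINED 0x1u   /* hypothetical flag bit */

/* Hypothetical, heavily reduced datatype object. */
struct sketch_datatype {
    unsigned flags;
    int      refcount;
};

/* "Duplicate" a marker type: hand back the same object, so the
 * PREDEFINED flag is preserved, after taking one more reference. */
static struct sketch_datatype *sketch_dup_marker(struct sketch_datatype *type)
{
    type->refcount++;
    return type;
}

/* Drop one reference.
 * Returns 0 if only a reference was dropped, 1 if the object was
 * really destroyed, and -1 if the caller tried to destroy the last
 * reference to a predefined type. */
static int sketch_release(struct sketch_datatype *type)
{
    if (type->refcount > 1) {
        /* Extra references (e.g. from a dup) are released silently. */
        type->refcount--;
        return 0;
    }
    if (type->flags & SKETCH_FLAG_PREDEFINED) {
        return -1;   /* complain: predefined types must not go away */
    }
    type->refcount = 0;
    return 1;
}
```

Releasing the handle obtained from the dup therefore succeeds quietly, and the complaint is raised only for an attempt to remove the original reference.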
This fixed ticket #317.
This commit was SVN r13942.
Currently 3 algorithms are available:
- non-overlapping: reduce + scatterv (works for non-commutative operations)
- recursive halving algorithm (copied from basic module)
- ring algorithm (similar to allreduce ring, for large messages)
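
To make the first variant concrete, here is a serial C simulation of what a reduce followed by a scatterv computes. It is a sketch of the semantics only, not the tuned collective code: the function names are hypothetical, the operation is fixed to integer sum, and all "ranks" live in one array.

```c
#include <stddef.h>

/* Offset of rank's segment in the reduced vector: the sum of the
 * receive counts of all lower ranks. */
static int sketch_segment_offset(const int *recvcounts, int rank)
{
    int offset = 0;
    for (int r = 0; r < rank; r++) {
        offset += recvcounts[r];
    }
    return offset;
}

/* in:  nranks rows of total ints each, row r being rank r's input,
 *      where total is the sum of recvcounts[0..nranks-1]
 * out: total ints; rank r owns recvcounts[r] of them starting at
 *      sketch_segment_offset(recvcounts, r) */
static void sketch_reduce_scatter_sum(const int *in, int nranks,
                                      const int *recvcounts, int *out)
{
    int total = sketch_segment_offset(recvcounts, nranks);

    /* Step 1 (reduce): combine the full vectors strictly in rank
     * order 0, 1, ..., nranks-1. Keeping a fixed order is what lets
     * this scheme work for non-commutative operations. */
    for (int i = 0; i < total; i++) {
        int acc = in[i];
        for (int r = 1; r < nranks; r++) {
            acc = acc + in[(size_t) r * (size_t) total + (size_t) i];
        }
        out[i] = acc;
    }

    /* Step 2 (scatterv): nothing to move in this serial model; each
     * rank simply reads its own segment of out. */
}
```

The two steps do not overlap: the whole vector is reduced first and only then split up, which is simple and order-preserving but moves more data than the other two algorithms.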
This commit was SVN r13929.
MCA_BTL_SM_FRAG_SEND) and status success/fail in low bits of pointers we
are passing through the circular buffer. The rank that receives an ACK doesn't need
to look into the data it received, and this is a big win since this data is not in
the cache of the rank's CPU. (Note that we can use low bits of pointers because
free_list always returns pointers aligned at least to cache line size).
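
The low-bit trick can be sketched in plain C as follows. The bit layout (bit 0 for the message type, bit 1 for the status) and all sketch_* names are assumptions made for illustration; the actual encoding used by the sm BTL is not shown in this message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: bit 0 = type (0 send, 1 ack),
 *                      bit 1 = status (0 success, 1 failure). */
#define SKETCH_TAG_MASK ((uintptr_t) 0x3)

/* Fold type and status into the unused low bits of an aligned
 * fragment pointer. */
static void *sketch_pack(void *frag, unsigned is_ack, unsigned failed)
{
    uintptr_t raw = (uintptr_t) frag;

    /* Only valid when the allocator guarantees enough alignment. */
    assert(0 == (raw & SKETCH_TAG_MASK));

    raw |= (uintptr_t) (is_ack & 1u);
    raw |= (uintptr_t) (failed & 1u) << 1;
    return (void *) raw;
}

/* Recover the original fragment pointer. */
static void *sketch_ptr(void *tagged)
{
    return (void *) ((uintptr_t) tagged & ~SKETCH_TAG_MASK);
}

/* Read the type bit without touching the fragment's memory. */
static unsigned sketch_is_ack(void *tagged)
{
    return (unsigned) ((uintptr_t) tagged & 1u);
}

/* Read the status bit without touching the fragment's memory. */
static unsigned sketch_failed(void *tagged)
{
    return (unsigned) (((uintptr_t) tagged >> 1) & 1u);
}
```

The receiver can classify an entry as an ACK and read its status from the pointer value alone, so the fragment itself, which may sit in another CPU's cache, is never dereferenced on that path.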
This commit was SVN r13922.