got a whole lot smaller, decreasing the memory footprint of the
running application. How much? That's a good question. Here is a
breakdown:
- in mca_bml_base_endpoint_t: 3 * size_t + 1 * uint32_t
- in mca_bml_base_btl_t: 1 * int + 1 * double - 1 * float
  + 6 * size_t + 9 * (void*)
The decrease in mca_bml_base_endpoint_t is for each peer and the
decrease in mca_bml_base_btl_t is for each BTL for each peer.
So, if we consider the most convenient case where there is only
one network between all peers, this decreases the memory footprint
per peer by
9 * size_t + 9 * (void*) + 2 * int32_t + 1 * double - 1 * float.
On a 64-bit machine this is 156 bytes per peer.
Now we access all these fields directly from the underlying BTL
structure, and as this structure is common to multiple BML endpoints,
we are a lot more cache friendly. Even if this does not improve the
latency, it makes the SM performance graph a lot smoother.
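The per-peer figure quoted above can be checked by plugging type sizes
into the formula. A minimal sketch in C; the struct and function names
are hypothetical (they are not part of the code base), and the 64-bit
sizes in the note below assume a typical LP64 platform:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the type sizes used by the formula. */
struct type_sizes {
    size_t sz_size_t;
    size_t sz_ptr;
    size_t sz_int32;
    size_t sz_double;
    size_t sz_float;
};

/* Bytes saved per peer when a single BTL connects all peers:
 *   9 * size_t + 9 * (void*) + 2 * int32_t + 1 * double - 1 * float */
static size_t savings_per_peer(const struct type_sizes *s)
{
    assert(s != NULL);
    return 9 * s->sz_size_t
         + 9 * s->sz_ptr
         + 2 * s->sz_int32
         + 1 * s->sz_double
         - 1 * s->sz_float;
}

/* Same computation using the sizes of the platform compiling this. */
static size_t savings_per_peer_native(void)
{
    struct type_sizes s = {
        sizeof(size_t), sizeof(void *), sizeof(int32_t),
        sizeof(double), sizeof(float)
    };
    return savings_per_peer(&s);
}
```

With sizes {8, 8, 4, 8, 4} the helper returns 72 + 72 + 8 + 8 - 4 = 156,
which matches the number given above. Structure padding is ignored here,
as it is in the formula itself.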
This commit was SVN r19659.
There was an argument that was barely used, and on return at the PML
level it contained nothing usable. It has been removed, so now we're
using less memory ...
This commit was SVN r19657.
(related to the presence of posix threads and ptmalloc2) is now a
little outdated: since we don't build ptmalloc2 as part of libopal
anymore, the openib BTL's requirements are not directly tied to
ptmalloc2's anymore. Specifically, I altered the test to:
1. At compile time, if no threads are found, the ptmalloc2 component
is going to be built, '''and the ptmalloc2 component is going to be
inside libopal,''' then refuse to build the openib BTL.
1. At run time, if no threads were available at compile time and the
ptmalloc2 component is part of the process, then refuse to use the
openib BTL.
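The two rules above reduce to two small predicates. A minimal sketch in
C; the function and parameter names are hypothetical and only restate
the logic described in this message, they are not the actual configure
or BTL code:

```c
#include <assert.h>
#include <stdbool.h>

/* Compile-time rule: refuse to build the openib BTL only when no
 * threads were found AND the ptmalloc2 component will be built AND it
 * will end up inside libopal. */
static bool openib_refuse_build(bool have_threads,
                                bool ptmalloc2_will_build,
                                bool ptmalloc2_in_libopal)
{
    return !have_threads && ptmalloc2_will_build && ptmalloc2_in_libopal;
}

/* Run-time rule: refuse to use the openib BTL when no threads were
 * available at compile time and ptmalloc2 is part of the process. */
static bool openib_refuse_use(bool had_threads_at_compile_time,
                              bool ptmalloc2_in_process)
{
    return !had_threads_at_compile_time && ptmalloc2_in_process;
}

/* Usage example: a few combinations and the decisions they produce. */
static void openib_rules_example(void)
{
    assert(openib_refuse_build(false, true, true));
    assert(!openib_refuse_build(false, true, false)); /* not in libopal */
    assert(!openib_refuse_use(true, true));           /* threads present */
}
```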
Fixes trac:1537.
This commit was SVN r19652.
The following Trac tickets were found above:
Ticket 1537 --> https://svn.open-mpi.org/trac/ompi/ticket/1537
always in a heterogeneous way in order to be able to support extern32. It
doesn't really matter as it is outside the critical path.
This commit was SVN r19651.
Rationale:
1. This value has already changed since v1.2 (v1.2 MPI_MAX_PORT_NAME
== 36). Hence, this commit simply increases the value from a
previous change.
1. The change does increase OMPI's memory footprint slightly, but
only when using MPI-2 dynamics. So it is expected that the change
will have minimal impact on the overall footprint.
1. The change is helpful for nodes that have 4 or more IP networks
(e.g., regular ethernet and multiple IP-over-<pick your favorite
high-speed network> networks). Without this change, invoking
MPI_COMM_SPAWN on hosts with 4 or more IP networks will fail
because we'll exceed 256 bytes for the port name. Some OMPI
developer test clusters already have this kind of configuration
(e.g., Cisco); it is expected that this is not too common in the
real world yet, but with "manycore" coming, having multiple
IP-based networks in a single server will likely become more
common.
This commit was SVN r19638.
we already have them in orte_process_info. Refs trac:1523.
This commit was SVN r19615.
The following Trac tickets were found above:
Ticket 1523 --> https://svn.open-mpi.org/trac/ompi/ticket/1523
help messages so that users only see the message once instead of N
times when their MPI app crashes.
Note that there is a tradeoff here -- we now call malloc in this
particular "show the error" code path. This shouldn't usually be a
problem, because the errors typically displayed through this mechanism
are MPI API argument problems (e.g., sending a negative count to
MPI_SEND), and not memory errors. But such API argument errors could
be a consequence of a prior memory error, so there's a nonzero
chance that the error message will fail to print because malloc
failed. In this case, the user can disable help message aggregation
(via the orte_base_want_aggregate MCA parameter) and we'll fall back
to the no-malloc code path (but without aggregation).
Note that we won't aggregate before MPI_INIT or after MPI_FINALIZE.
So if you call an MPI function before MPI_INIT / after MPI_FINALIZE,
you'll still see the error message N times. Nothing we can do about
that; we need ORTE to do the aggregation properly (which is obviously
unavailable before MPI_INIT / after MPI_FINALIZE).
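The malloc tradeoff described above can be illustrated with a small
sketch. This is a simplified, single-process illustration and not the
ORTE implementation (which aggregates across processes); all names are
hypothetical, and the want_aggregate flag merely stands in for the
orte_base_want_aggregate MCA parameter:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record of a help topic that was already shown. */
struct seen_topic {
    char *topic;
    int count;                  /* how many times it was reported */
    struct seen_topic *next;
};

static struct seen_topic *seen_list = NULL;

/* Returns true if the message was printed, false if it was folded
 * into an earlier report of the same topic. */
static bool show_help_sketch(const char *topic, const char *msg,
                             bool want_aggregate)
{
    assert(topic != NULL && msg != NULL);

    if (!want_aggregate) {
        /* No-malloc path: always print, never remember anything. */
        fprintf(stderr, "%s\n", msg);
        return true;
    }

    for (struct seen_topic *s = seen_list; s != NULL; s = s->next) {
        if (strcmp(s->topic, topic) == 0) {
            s->count++;         /* already shown once: just count it */
            return false;
        }
    }

    struct seen_topic *s = malloc(sizeof(*s));
    char *copy = malloc(strlen(topic) + 1);
    if (s == NULL || copy == NULL) {
        /* malloc failed: still print, but skip the bookkeeping. */
        free(s);
        free(copy);
        fprintf(stderr, "%s\n", msg);
        return true;
    }
    strcpy(copy, topic);
    s->topic = copy;
    s->count = 1;
    s->next = seen_list;
    seen_list = s;

    fprintf(stderr, "%s\n", msg);
    return true;
}
```

Calling show_help_sketch() twice with the same topic and aggregation
enabled prints the message once and records a count of two; with
aggregation disabled every call prints and no memory is allocated.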
This commit was SVN r19611.
Terry and George in the non-sparse-groups scenarios. Fixes trac:1464.
Will file a new ticket to actually resolve IDs when sparse groups are
used.
This commit was SVN r19610.
The following Trac tickets were found above:
Ticket 1464 --> https://svn.open-mpi.org/trac/ompi/ticket/1464
Thanks to George and Jeff for pointing out a better way to do this.
This commit was SVN r19573.
The following SVN revision numbers were found above:
r19566 --> open-mpi/ompi@351c3a3a86
figure it out at runtime (really meaning: we'll still default to "0"
unless something explicitly overrides to 1, such as the openib BTL).
This way, ompi_info doesn't confusingly report 0 for
mpi_leave_pinned when we actually end up running with mpi_leave_pinned==1.
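The "figure it out at runtime" behavior can be pictured as a tri-state
setting. A minimal sketch in C, assuming a sentinel value of -1 for
"undecided" and assuming that an explicitly set value is kept as-is;
both are assumptions made for illustration, and all names are
hypothetical:

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed sentinel: the parameter was not set explicitly. */
#define LEAVE_PINNED_UNDECIDED (-1)

/* Hypothetical resolver. `param` is the value as it would be reported
 * to the user; `component_requests_it` models something (such as the
 * openib BTL) explicitly overriding the default to 1 at runtime. */
static int resolve_leave_pinned(int param, bool component_requests_it)
{
    assert(param >= -1 && param <= 1);
    if (param != LEAVE_PINNED_UNDECIDED) {
        return param;   /* assumption: an explicit setting is kept */
    }
    /* Undecided: still defaults to 0 unless something asks for 1. */
    return component_requests_it ? 1 : 0;
}
```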
Fixes trac:1502.
This commit was SVN r19571.
The following Trac tickets were found above:
Ticket 1502 --> https://svn.open-mpi.org/trac/ompi/ticket/1502
remove the unconditional opal_output calls when mmap() fails, and instead,
conditionally output the failure message via btl_base_verbose settings.
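The pattern is to report the mmap() failure only when verbosity was
requested. A minimal POSIX sketch in C; verbose_level is a hypothetical
stand-in for a btl_base_verbose-style setting, fprintf stands in for
the real output routine, and messages_emitted exists only to make the
behavior observable:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

static int verbose_level = 0;     /* hypothetical verbosity setting */
static int messages_emitted = 0;  /* counts failure messages printed */

/* Map `len` bytes of `fd`; returns the mapping, or NULL on failure.
 * The failure is reported only if verbosity is enabled. */
static void *map_segment(int fd, size_t len)
{
    assert(len > 0);
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        if (verbose_level > 0) {
            fprintf(stderr, "mmap failed: %s\n", strerror(errno));
            messages_emitted++;
        }
        return NULL;
    }
    return addr;
}
```

The caller still sees the failure through the NULL return value; only
the diagnostic output becomes optional.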
This commit was SVN r19547.
file. Breaks Windows compilation. See r19502.
This commit was SVN r19544.
The following SVN revision numbers were found above:
r19502 --> open-mpi/ompi@ce42e749a0
The new component fixes a number of problems with the old component. The core algorithm is the same, but by changing the data structures a bit we have improved performance and memory utilization.
There are still a couple of corner cases that need some work. However, I did not want to delay bringing this into the trunk (and v1.3 branch) for too much longer.
This commit was SVN r19537.
This fixes trac:1477.
Help provided by Jeff and Terry.
This commit was SVN r19533.
The following Trac tickets were found above:
Ticket 1477 --> https://svn.open-mpi.org/trac/ompi/ticket/1477
checking for contiguous datatypes by using our native DDT engine
(rather than several MPI_* calls). The majority of the work is in the
IO ROMIO module.c file, but there's a small part in
adio/common/iscontig.c that we're also submitting upstream.
This commit was SVN r19509.
mostly don't use this mechanism, as we have to be thread safe in order to
really take full advantage of it. The unexpected handler is called by one
of the MX threads, and we have no control over when that happens. In a
non-threaded case, this will completely destroy our recv request queues, so
the safest approach is to not use the unexpected handler.
This commit was SVN r19496.