George wrote the initial patch, I extended it slightly and am responsible for all bugs found.
Refs trac:587
This commit was SVN r13023.
The following Trac tickets were found above:
Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
This is somewhat limited currently; for example, if you have 3 ports on Node A and 5 ports
on Node B then the peers will use 3 ports to communicate with each other.
This is on a subnet basis, so for any pair of nodes we take the
intersection of the available ports within a subnet.
We use subnets to determine reachability for lazy connection establishment. So
if Node A and Node B each have two HCAs (on separate networks) then the
subnets must be distinct, otherwise we will try to wire up HCAs on separate
networks.
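A minimal sketch of the per-subnet intersection described above, assuming each port is represented only by its subnet ID. The helper names (count_on_subnet, usable_ports) are hypothetical and are not the BTL's actual functions:

```c
#include <stddef.h>
#include <stdint.h>

/* Count how many entries of `ports` carry the given subnet ID. */
static size_t count_on_subnet(const uint64_t *ports, size_t n, uint64_t subnet)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (ports[i] == subnet) {
            count++;
        }
    }
    return count;
}

/* Hypothetical helper: number of ports two peers can pair up on one
 * subnet, i.e. the smaller of the two per-subnet port counts. */
static size_t usable_ports(const uint64_t *a, size_t na,
                           const uint64_t *b, size_t nb, uint64_t subnet)
{
    size_t ca = count_on_subnet(a, na, subnet);
    size_t cb = count_on_subnet(b, nb, subnet);
    return ca < cb ? ca : cb;
}
```

With 3 ports on Node A and 5 on Node B in the same subnet, this yields 3; ports on a subnet the peer does not share yield 0 and are never wired up.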
This commit was SVN r12978.
* Make sure that the pval always writes to the correct portion of the
lval. This only matters on 32 bit big endian machines.
* On 32 bit machines when assigning to pval, the other 4 bytes of lval
weren't being written, which could lead to bogus data
We use macros so that there aren't casts all over the code and the pval
assignment can occur to the correct 4 bytes. Refs trac:587
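To illustrate the problem, here is a simplified sketch, not the macros from this commit. The union and macro names are hypothetical. It always stores through the 64-bit member, so all 8 bytes are written and the pointer value sits in the low-order bytes on any byte order:

```c
#include <stdint.h>

/* Hypothetical stand-in for a pointer that travels in a 64-bit slot. */
typedef union {
    uint64_t lval;   /* full 64-bit view */
    void    *pval;   /* native pointer view, only 4 bytes wide on 32-bit */
} ptr64_sketch_t;

/* Writing u.pval directly on a 32-bit machine touches only 4 of the
 * 8 bytes, and on big endian they are the high-order 4 bytes of lval.
 * Going through lval avoids both problems. */
#define PTR64_SET(u, p)  ((u).lval = (uint64_t)(uintptr_t)(p))
#define PTR64_GET(u)     ((void *)(uintptr_t)(u).lval)
```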
This commit was SVN r12974.
The following Trac tickets were found above:
Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
is allocated on a per comm_world instance, with the lowest rank
in comm_world on the given host creating and initializing the file,
and then notifying the remaining files via the OOB.
Reviewed: Ralph Castain, Brian Barrett
Addressing ticket #674.
This commit was SVN r12949.
completion of the RDMA operation associated with the fragment. The
PML will call the BML free which in turn will call the BTL free. The MX
BTL will not release the fragment if it is not tagged with 0xff.
This commit was SVN r12947.
inuse decrement for the increment that was at the start of the procs loop.
Otherwise, the inuse count can end up higher than it actually is and a btl
can end up in the progress loop when it isn't active to any peer.
Refs trac:543
This commit was SVN r12938.
The following Trac tickets were found above:
Ticket 543 --> https://svn.open-mpi.org/trac/ompi/ticket/543
the data was buffered by the MX library. If it's the case then we declare
the send as completed and disable the completion event for the mx request.
This commit was SVN r12935.
protocol over the MX BTL. Now, we have only one matching, the one in Open
MPI.
The problem is that when the unexpected handler is triggered, not all the
message is on the host memory. In the best case we get one MX fragment (internal
MX fragment), in the worst we get NULL. The only way to fit this with the
design of the PML is to force the eager protocol at the MX internal fragment
size, and to limit the send/receive protocol at the same size. Tests show
the outcome is not far from optimal (if the pipeline depth is increased
a little bit).
Set MX_PIPELINE_LOG in order to allow MX to use internal fragments of 4K.
This commit was SVN r12930.
performance on a 2G Myrinet card, as it looks like pipelining the messages
by 1M is faster than a simple send/receive. However, when using a 10G card
the send/receive will limit the maximum bandwidth to 2.5Gb/s. The reason is
the scarce bus resources that have to be shared between the Myrinet hardware
and the memcpy operation. The PUT protocol removes the memcpy, so we now have a
true zero-copy mechanism. But there is no pipelining yet, as it looks like the
RDMA pipeline somehow disappeared from the OB1 PML ...
This commit was SVN r12925.
It contains four algorithms:
Bruck (ceil(log P) steps), Recursive Doubling (log(P) steps for power-of-2 process counts), Ring (P-1 steps),
and Neighbor Exchange (P/2 steps for even number of processes).
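The step counts quoted above can be tabulated with a few helpers. This is only a sketch of the arithmetic, assuming base-2 logarithms and P >= 1; the function names are hypothetical, and -1 marks a process count the algorithm is not stated to cover:

```c
/* ceil(log2(p)) for p >= 1, using integer arithmetic only. */
static int ceil_log2(int p)
{
    int steps = 0;
    unsigned long long reach = 1;
    while (reach < (unsigned long long)p) {
        reach *= 2;
        steps++;
    }
    return steps;
}

/* Bruck: ceil(log2 P) steps for any P. */
static int bruck_steps(int p) { return ceil_log2(p); }

/* Recursive doubling: log2 P steps, only for power-of-2 P. */
static int recursive_doubling_steps(int p)
{
    return (p & (p - 1)) == 0 ? ceil_log2(p) : -1;
}

/* Ring: P-1 steps for any P. */
static int ring_steps(int p) { return p - 1; }

/* Neighbor exchange: P/2 steps, only for even P. */
static int neighbor_exchange_steps(int p)
{
    return (p % 2 == 0) ? p / 2 : -1;
}
```

For the 56 processes used in testing, Bruck needs 6 steps, Ring 55 and Neighbor Exchange 28, while Recursive Doubling does not apply because 56 is not a power of 2.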
All algorithms passed occ, IMB-2.3, and intel verification tests from ompi-tests/ for up to 56 processes.
The fixed decision function is based on results collected over MX on the Grig cluster at
the University of Tennessee at Knoxville.
I have also added (and commented out) a copy of the MPICH2 decision function for allgather
(from their IJHPCA 2005 paper).
This commit was SVN r12910.
which can cause segfaults on shutdown. Calling mx_finalize() isn't enough
to shut down the thread, so we must close the endpoints as well.
Refs trac:513
This commit was SVN r12908.
The following Trac tickets were found above:
Ticket 513 --> https://svn.open-mpi.org/trac/ompi/ticket/513
if the remote architecture differs from the local architecture and the
btl doesn't support heterogeneous transport.
Refs trac:587
This commit was SVN r12879.
The following Trac tickets were found above:
Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
udapl/openib/vapi/gm mpools are deprecated. The rdma mpool has a parameter that
limits its size: mpool_rdma_rcache_size_limit (default is 0, meaning unlimited).
This commit was SVN r12878.
Move the req_mtl structure back to the end of each of the structures in
the CM PML. The req_mtl structure is cast into a mtl_*_request_structure
for each MTL, which is larger than the req_mtl itself. The cast will cause
the *_request to overwrite parts of the heavy requests if the req_mtl
isn't the *LAST* thing on each structure (hence the comment). This was
moved as an optimization at some point, which caused buffer sends to fail...
Refs trac:669
This commit was SVN r12873.
The following SVN revision numbers were found above:
r12871 --> open-mpi/ompi@597598b712
The following Trac tickets were found above:
Ticket 669 --> https://svn.open-mpi.org/trac/ompi/ticket/669
CM PML. The req_mtl structure is cast into a mtl_*_request_structure for
each MTL, which is larger than the req_mtl itself. The cast will cause
the *_request to overwrite parts of the heavy requests if the req_mtl
isn't the *LAST* thing on each structure (hence the comment). This was
moved as an optimization at some point, which caused buffer sends to
fail...
Refs trac:669
This commit was SVN r12871.
The following Trac tickets were found above:
Ticket 669 --> https://svn.open-mpi.org/trac/ompi/ticket/669
Add ability for ini files to recognize "use_eager_rdma" flag. Set the
default to "no" (because we should assume that HCAs cannot support the
property necessary for using RDMA for eager messages -- that the last
byte of the message is guaranteed to be written to memory last --
unless proven otherwise. For example, iWARP cards apparently do not
provide this guarantee), and then set all Mellanox and IBM HCAs to
override the default to enable this behavior on these cards.
This commit was SVN r12851.
The following Trac tickets were found above:
Ticket 366 --> https://svn.open-mpi.org/trac/ompi/ticket/366
I found only two places that were looking at the tokens:
1. the odls - we used the tokens to separately process the globals container data from everything else. In this case, I left the subscription that returned the globals data alone, but "stripped" the subscription that returned the launch data for the procs. These subscriptions have nothing to do with the xcast message.
2. the pml_base_modex - the callback function was getting process names from the returned tokens. Actually, this function was doing a very bad thing - it was assuming that the first token returned was *always* the process name. This is currently true, but is one of those assumptions that someone could have easily changed - and suddenly found the system inexplicably failing. I modified the function to (a) get the name sent back to us, (b) "strip" the value structures of tokens and segment strings, and (c) correctly obtain process names from the returned values. I also reindented the heck out of the code so it was legible (at least, to my old eyes).
This commit was SVN r12813.
usually is ok on little-endian systems, as the upper 32 bits will likely
be ignored, but on 32-bit big-endian systems, lval is complete junk.
Use ival if 32 bit mode, lval if 64.
Mixing of 32 and 64 bit architectures won't work without more changes.
This commit was SVN r12802.
* Do not add new procs to the global list during modex callback or
when sharing orte names during accept/connect. For modex, we
cache the modex info for later, in case that proc ever does get
added to the global proc list. For accept/connect orte name
exchange between the roots, we only need the orte name, so no
need to add a proc structure anyway. The procs will be added
to the global process list during the proc exchange later in
the wireup process
* Rename proc_get_namebuf and proc_get_proclist to proc_pack
and proc_unpack and extend them to include all information
needed to build that proc struct on a remote node (which
includes ORTE name, architecture, and hostname). Change
unpack to call pml_add_procs for the entire list of new
procs at once, rather than one at a time.
* Remove ompi_proc_find_and_add from the public proc
interface and make it a private function. This function
would add a half-created proc to the global proc list, so
making it harder to call is a good thing.
This means that there are only two ways to add new procs into the global proc list at this time: during MPI_INIT via the call to ompi_proc_init, where my job is added to the list, and via ompi_proc_unpack using a buffer from a packed proc list sent to us by someone else. Currently, this is enough to implement MPI semantics. We can extend the interface more if we like, but that may require HNP communication to get the remote proc information and I wanted to avoid that if at all possible.
Refs trac:564
This commit was SVN r12798.
The following Trac tickets were found above:
Ticket 564 --> https://svn.open-mpi.org/trac/ompi/ticket/564
r12714) for supporting compilers / architectures with different
padding rules.
This commit was SVN r12749.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r12491
r12714
hits the buffer on the other side. For this kind of BTL we need to send the
FIN through the same BTL the PUT was performed with, so the network will handle
ordering for us. If we use another BTL, the receiver can get the FIN before the
data hits the buffer and complete the request prematurely. We mark such
problematic BTLs with the MCA_BTL_FLAGS_FAKE_RDMA flag (this kind of RDMA
is really fake, because the real one guarantees that the sender will see the
completion only after the receiver's NIC has confirmed that all the data was
received).
This commit was SVN r12732.
It calls mca_pml_ob1_send_fin_btl() which may fail and doesn't check the return
code. This breaks all RDMA transports even when only one BTL is used. Revert
it for now; I am working on a real fix for the problem (I hope).
This commit was SVN r12731.
The following SVN revision numbers were found above:
r12720 --> open-mpi/ompi@3e3689320b
regression from v1.1 was reviewed and put on the v1.2 branch. So revert this part
of r12721.
This commit was SVN r12730.
The following SVN revision numbers were found above:
r12433 --> open-mpi/ompi@82f7c0dd69
r12721 --> open-mpi/ompi@3edd850d2e
protocol when multiple NICs are available between 2 peers. The fix forces
the FIN message to take exactly the same path as the fragment it describes
(i.e. same path means same BTL). Otherwise, the FIN can be received by
the peer before the RDMA completes and the request will get freed
too early.
This commit was SVN r12720.
- consistent error message when something fails (via BTL_ERROR macro)
- decrease the number of jumps.
- cleanup some parts of the code.
This commit was SVN r12719.
The temporary solution is to switch into EV_NONBLOCK mode earlier (right after the mx_connect loop) so that there isn't a giant slowdown when processes enter the stage gate 2 barrier before other processes. They will now not block in the event library for any period of time, which appears to give a 50% speedup when running at > 64 procs.
Refs trac:645
This commit was SVN r12713.
The following Trac tickets were found above:
Ticket 645 --> https://svn.open-mpi.org/trac/ompi/ticket/645
so this isn't an issue there either. Refs trac:488
This commit was SVN r12675.
The following Trac tickets were found above:
Ticket 488 --> https://svn.open-mpi.org/trac/ompi/ticket/488
* Fix a counter roll-over issue that could result from a large (but
not excessive) number of outstanding put/get/accumulate calls
during a single synchronization issues (Refs trac:506)
* Fix epoch issue with rdma component that would affect PWSC
synchronization (Refs trac:507)
This commit was SVN r12673.
The following Trac tickets were found above:
Ticket 506 --> https://svn.open-mpi.org/trac/ompi/ticket/506
Ticket 507 --> https://svn.open-mpi.org/trac/ompi/ticket/507
* use one-sided datatype check instead of send/receive and check both
the origin and target datatypes
* allow error handler to be set on MPI_WIN_NULL, per standard
* Allow recursive calls into the pt2pt osc component's progress
function
* Fix an uninitialized variable problem in the unlock header
This commit was SVN r12667.
because they are in ORTE, not OMPI. Also, remove the ORTE_PROCESS_NAME macros
in iof base as they are duplicates of the ones that were in ns_types, which
meant that bad things happened if you changed what an orte_process_name_t
looked like.
This commit was SVN r12646.
the same time, remove some of the MPI-related options from OPAL:
- provide mechanism to change at runtime whether sched_yield() should
be called when the progress engine is idle
- provide mechanism for changing the rate at which the event engine
is called when there are "no" users of the event engine (ie, when
using MPI but not TCP)
- fix some function names in the progress engine to better match
their intended use (and remove MPI naming scheme)
- remove progress_mpi_enable / progress_mpi_disable because
we can now use the functions to set the sched_yield and
tick rate interfaces
- rename opal_progress_events() to opal_progress_set_event_flag()
because the first really isn't descriptive of what the function
does and I always got confused by it
This commit was SVN r12645.
Accordingly, there are new APIs to the name service to support the ability to get a job's parent, root, immediate children, and all its descendants. In addition, the terminate_job, terminate_orted, and signal_job APIs for the PLS have been modified to accept attributes that define the extent of their actions. For example, doing a "terminate_job" with an attribute of ORTE_NS_INCLUDE_DESCENDANTS will terminate the given jobid AND all jobs that descended from it.
I have tested this capability on a MacBook under rsh, Odin under SLURM, and LANL's Flash (bproc). It worked successfully on non-MPI jobs (both simple and including a spawn), and MPI jobs (again, both simple and with a spawn).
This commit was SVN r12597.
Same sort of problem and fix as described in r12323 - mca_pml_ob1_recv_frag_progress() was segfaulting due to a NULL req_proc pointer. The path leading to this was through the mca_pml_ob1_check_cantmatch_for_match() function, where we can match a frag using the same macros as mca_pml_ob1_frag_match() and never initialize the req_proc pointer.
This commit was SVN r12582.
The following SVN revision numbers were found above:
r12323 --> open-mpi/ompi@c752502dee
The fix is to set the opcode to SEND at the entrance to the send function, before
checking credits and putting the fragment on the pending list. We do the same thing
in the put/get functions, i.e. setting the opcode at the entrance to the function.
This commit was SVN r12559.
- consistent arguments checking (not allowing to select an algorithm which
is not available)
- consistent way of computing the segcount (number of datatypes by segment).
- small cleanups.
- more informative debugging messages.
This commit was SVN r12545.
description. Most of the bcast algorithms can be completed using this
generic function once we create the tree structure. Add all kinds of
trees.
There are 2 versions of the generic bcast function: one using overlapping
between receives (for intermediary nodes) and then blocking sends to all
children, and another where all sends are non-blocking. I still have to
figure out which one gives the smallest overhead.
This commit was SVN r12530.
is that if one adds "pml=" to the configuration file, really bad things
happen. All PMLs will get initialized, and each of them will initialize
all BTLs. This patch forces the mca_pml_base_pml to get initialized in
all cases before we go out of the mca_pml_base_open function.
This commit was SVN r12527.
set it up before the match when we know the peer, saving some
time on the critical path. If the receive is ANY_SOURCE then
we initialize the convertor on _MATCHED. Anyway, we will set it
up only once per receive.
This commit was SVN r12484.
N gatherv's:
    for (i = 0 ... size)
        MPI_Gatherv(..., root = i, ...)
The new algorithm simply does (effectively):
    MPI_Gatherv(..., root = 0, ...)
    MPI_Bcast(..., root = 0, ...)
This commit was SVN r12469.
mca_btl_openib_endpoint_connect_eager_rdma() is called recursively. He also
noticed that orte_pointer_array_add() can't fail because we allocate max number
of elements at init time. So just remove error handling and locking. No locking
- no deadlocks.
This commit was SVN r12388.
something is going wrong down in the code it is removed from the array. So add a
mutex to prevent concurrent access to the array from different threads.
This commit was SVN r12385.
What's happening is that we're holding openib_btl->eager_rdma_lock when
we call mca_btl_openib_endpoint_send_eager_rdma() on
btl_openib_endpoint.c:1227. This in turn calls
mca_btl_openib_endpoint_send() on line 1179. Then, if the endpoint
state isn't MCA_BTL_IB_CONNECTED or MCA_BTL_IB_FAILED, we call
opal_progress(), where we eventually try to lock
openib_btl->eager_rdma_lock at btl_openib_component.c:997.
The fix removes this lock altogether. Instead we atomically set the local RDMA
pointer to prevent other threads from creating an RDMA buffer for the same endpoint.
And we increment eager_rdma_buffers_count atomically, so the polling thread doesn't
need a lock around it.
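The lock-free idea can be sketched with C11 atomics. This is an illustration under assumptions, not the openib BTL code: the struct, the rdma_local field and claim_eager_rdma() are hypothetical names, and the real component uses Open MPI's own atomic primitives rather than <stdatomic.h>:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical endpoint: the pointer is NULL until one thread claims it. */
typedef struct {
    _Atomic(void *) rdma_local;
} endpoint_sketch_t;

/* Counter read by the polling thread; updated atomically, so no lock. */
static atomic_int eager_rdma_buffers_count;

/* Only the thread whose compare-and-swap replaces NULL gets to set up
 * the buffer; every other caller sees a non-NULL pointer and backs off. */
static bool claim_eager_rdma(endpoint_sketch_t *ep, void *buffer)
{
    void *expected = NULL;
    if (!atomic_compare_exchange_strong(&ep->rdma_local, &expected, buffer)) {
        return false;
    }
    atomic_fetch_add(&eager_rdma_buffers_count, 1);
    return true;
}
```

Because no lock is held across the call, a recursive trip through opal_progress() can no longer try to re-acquire it.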
This commit was SVN r12369.
This commit essentially caches the invoking comm/win/file on the
ompi_request_t. This, paired with the req_type field, allows us to
retrieve the invoking MPI object and invoke the proper errhandler.
The patch is missing most updates for the MPI-2 one-sided stuff (i.e.,
the patch mainly fixes comms and files); I didn't really understand
that code and didn't want to hazard trying to figure it out when Brian
can probably do it much more quickly.
So #250 will still stay open, pending MPI-2 one-sided updates for this
stuff.
This commit was SVN r12339.
The following Trac tickets were found above:
Ticket 250 --> https://svn.open-mpi.org/trac/ompi/ticket/250
* Create a new request type: NOOP (described below)
* For all MPI_*_INIT functions, OBJ_NEW an ompi_request_t and set its
type to NOOP
* Ensure that the NOOP requests are OBJ_RELEASE'd when they are done
* MPI_START looks at the request type; if NOOP, just return success. If
not, call the PML start() function
* MPI_STARTALL always pass the entire array of requests back to the PML
(see next point)
* Make the PMLs only process PML requests (i.e., ignore/skip anything
that isn't of type PML -- such as the NOOP requests)
* Add a little more param error checking in STARTALL
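The dispatch described in the list above could look roughly like this. It is a sketch with hypothetical types and function names standing in for ompi_request_t, MPI_START, MPI_STARTALL and the PML start() function; the `started` flag exists only to make the effect observable:

```c
#include <stddef.h>

typedef enum { REQ_TYPE_PML, REQ_TYPE_NOOP } req_type_sketch_t;

typedef struct {
    req_type_sketch_t type;
    int started;   /* set by the mock PML below */
} request_sketch_t;

/* Mock PML start: processes only PML requests and skips anything else. */
static int pml_start_sketch(size_t count, request_sketch_t **reqs)
{
    for (size_t i = 0; i < count; i++) {
        if (reqs[i]->type != REQ_TYPE_PML) {
            continue;
        }
        reqs[i]->started = 1;
    }
    return 0;
}

/* MPI_START-like path: a NOOP request succeeds without reaching the PML. */
static int start_sketch(request_sketch_t *req)
{
    if (req->type == REQ_TYPE_NOOP) {
        return 0;
    }
    return pml_start_sketch(1, &req);
}

/* MPI_STARTALL-like path: always hand the whole array to the PML. */
static int startall_sketch(size_t count, request_sketch_t **reqs)
{
    return pml_start_sketch(count, reqs);
}
```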
This commit was SVN r12338.
The following Trac tickets were found above:
Ticket 529 --> https://svn.open-mpi.org/trac/ompi/ticket/529
allocation logic is completely done outside the data-type engine (in the PML) there is
no need for any special case inside the data-type engine. There are fewer arguments for
the ompi_convertor_pack and ompi_convertor_unpack as well (the last field free_after is
not required anymore as there is no memory allocated in the engine itself). This change
affects all components using datatypes. I tested most of them, but I might have
missed some ... If that's the case please let me know (don't shoot the pianist!!).
This commit was SVN r12331.
the default decision functions (for broadcast, reduce and barrier) are based on a
high performance network (not TCP). It should give good performance (really good) for
any network having the following characteristics: small latency (5 microseconds) and good
bandwidth (more than 1Gb/s).
+ Cleanup of the reduce algorithms, plus 2 new algorithms (binary and binomial). Now most
of the reduce algorithms use a generic tree based function for completing the reduce.
+ Added macros for computing the trees (they are used for bcast and reduce right now).
+ Allow the usage of all 5 topologies.
+ Jelena's implementation of a binary tree that can be used for non commutative operations.
Right now only the tree building function is there, it will get activated soon.
+ Some other minor cleanups.
This commit was SVN r12326.
A segfault would occur in mca_pml_ob1_recv_request_progress() when trying to prepare the convertor for unpacking, because the request's req_proc field was NULL.
Turns out that we weren't setting the req_proc field in the MCA_PML_OB1_CHECK_SPECIFIC_AND_WILD_RECEIVES_FOR_MATCH macro. Instead of just setting it there I removed the other place req_proc was being set correctly, and instead took care of all the cases at once in mca_pml_ob1_recv_frag_match().
This commit was SVN r12323.
parameter. For optimisation purposes only this BTL is used to send packets,
instead of trying to send packets through all BTLs. But actually the
code was wrong. It simply used the provided bml_btl, which may represent a different
endpoint from the packet's destination. The fixed code checks whether the packet's
destination is reachable through the BTL, finds the appropriate bml_btl and only
then tries to send it through the correct bml_btl.
This commit was SVN r12319.
is done to assure alignment so strictly aligned CPUs (like SPARC) do not
sigbus. This may benefit other platforms too.
This commit fixes trac:494.
This commit was SVN r12312.
The following Trac tickets were found above:
Ticket 494 --> https://svn.open-mpi.org/trac/ompi/ticket/494
mentioned in the comment the completion/callback of the triggered
send operation can happen before the call returns. If this happens and
if the pipeline depth is 0 before we triggered the send operation and
this is the last send operation of the request then the completion detection
code will decrement the pipeline depth and check it for equality to 0.
Because (0-1) != 0 the pml completion function for this request will
*not* be called.
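The arithmetic behind the missed completion can be reproduced in a few lines. This sketch uses hypothetical names and a stand-in transport whose callback fires before the send call returns. The early-increment ordering shown is one way to avoid the problem and is an assumption, since the description of the actual change is not part of this message:

```c
typedef struct {
    int pipeline_depth;   /* sends currently in flight */
    int completed;        /* set when request-level completion fires */
} send_req_sketch_t;

/* Completion callback: retire one send, complete the request at depth 0. */
static void send_complete_sketch(send_req_sketch_t *req)
{
    if (--req->pipeline_depth == 0) {
        req->completed = 1;
    }
}

/* Stand-in transport: the callback runs before the call returns. */
static void transport_send_sketch(send_req_sketch_t *req)
{
    send_complete_sketch(req);
}

/* Problematic ordering: depth is bumped only after the send is triggered,
 * so the callback computes 0 - 1 and never sees 0. */
static void schedule_late_increment(send_req_sketch_t *req)
{
    transport_send_sketch(req);
    req->pipeline_depth++;
}

/* Safe ordering (assumed): account for the send before triggering it. */
static void schedule_early_increment(send_req_sketch_t *req)
{
    req->pipeline_depth++;
    transport_send_sketch(req);
}
```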
This is part 2 of the fix for ticket #246.
This commit was SVN r12292.
all platforms. The only exceptions (and I will not deal with them
anytime soon) are on Windows:
- the write functions which require the length to be an int when it's
a size_t on all UNIX variants.
- all iovec manipulation functions where the iov_len is again an int
when it's a size_t on most of the UNIXes.
As these only happen on Windows, I think we're set for now :)
This commit was SVN r12215.
size and displacement of data-type. After this patch all data can contain size_t bytes
and the displacements are defined as ptrdiff_t. All of the files I was able to compile
have been modified to match this requirement.
This commit was SVN r12146.
The UD BTL isn't gone - the latest version is in my afriedle-ud branch. This version on the trunk was very old, ompi_ignore'd, lacked performance, and probably contained bugs. The maintained version on my branch is working solid, and will eventually come back, but not for v1.2.
This commit was SVN r12144.
where a window was in both the passive and active side of a lock sequence.
Refs trac:488
This commit was SVN r12112.
The following Trac tickets were found above:
Ticket 488 --> https://svn.open-mpi.org/trac/ompi/ticket/488
constrained:
* Make sure we always have a number of eager fragments available
that scales with the number of processes communicating with
a given proc over shared memory
* Use FREE_LIST_GET instead of FREE_LIST_WAIT to return an
error to the PML when resource exhaustion occurs
* Don't dereference the frag during alloc unless we're sure
it's not NULL
Reviewed by: Galen
Refs trac:413
This commit was SVN r12053.
The following Trac tickets were found above:
Ticket 413 --> https://svn.open-mpi.org/trac/ompi/ticket/413
First, move the OPAL_THREAD_LOCK out to the same level as its corresponding UNLOCK. It was possible to hit the UNLOCK without ever acquiring the lock.
Since the OPAL_THREAD_ADD64() is now protected by this lock, we can just do the decrement non-atomically.
This commit was SVN r11958.
Don't try to acquire ompi_request_lock here, which in all cases is already held. Avoids deadlock that occurs even when threads are enabled and we're running a THREAD_SINGLE app.
Reviewed by Galen.
This commit was SVN r11957.
The following Trac tickets were found above:
Ticket 183 --> https://svn.open-mpi.org/trac/ompi/ticket/183
I do something else" rule screws me up again. If we're in a FENCE, but
not in ACCESS | EXPOSE, put us in ACCESS | EXPOSE, as we now know we are
in a real fence epoch. Yay silly MPI standards
Refs trac:441
This commit was SVN r11865.
The following Trac tickets were found above:
Ticket 441 --> https://svn.open-mpi.org/trac/ompi/ticket/441
on the number of known local procs, with a high and low watermark. Right
now, we default to a low watermark point of 64MB, a per-proc scaling
factor of 32MB, and a high watermark point of 512MB
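One plausible reading of these defaults is a clamp: scale by the number of local procs, then bound the result by the two watermarks. The sketch below assumes that reading and 1MB = 1024*1024 bytes; sm_pool_size_sketch() is a hypothetical name, not the component's function:

```c
#include <stddef.h>

#define SKETCH_MB ((size_t)1024 * 1024)

/* Scale with the number of local procs, bounded by the low and high
 * watermarks quoted above. */
static size_t sm_pool_size_sketch(size_t nprocs)
{
    const size_t low      = 64 * SKETCH_MB;
    const size_t per_proc = 32 * SKETCH_MB;
    const size_t high     = 512 * SKETCH_MB;

    /* Check against the cap first so the multiplication cannot overflow. */
    if (nprocs > high / per_proc) {
        return high;
    }
    size_t size = nprocs * per_proc;
    return size < low ? low : size;
}
```

Under these assumptions, one or two local procs get the 64MB floor, four procs get 128MB, and sixteen or more are capped at 512MB.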
Refs trac:212
This commit was SVN r11824.
The following Trac tickets were found above:
Ticket 212 --> https://svn.open-mpi.org/trac/ompi/ticket/212
GIDs (there can be more than one) and not GIDs of the HCA on the network. Entry
zero always has to be initialized so we use it, and warn the user if there is more
than one port active and the default subnet is configured on at least one of them.
This commit was SVN r11815.
the component is configured successfully. Otherwise, we can end up
trying to run make in the romio directory without any Makefiles. This
really only happens on the targets that recurse into DIST_SUBDIRS - ie
dist, maintainer-clean, and distclean
refs trac:411
This commit was SVN r11807.
The following Trac tickets were found above:
Ticket 411 --> https://svn.open-mpi.org/trac/ompi/ticket/411
tell if the remote proc should be in an exposure epoch or not.
Refs trac:325
This commit was SVN r11746.
The following Trac tickets were found above:
Ticket 325 --> https://svn.open-mpi.org/trac/ompi/ticket/325
epoch's control data could overwrite the previous epoch's data because
we were reusing data structures between PW and SC. Instead, we now
have explicit post_msg and complete_msg counters for completion.
refs trac:354
* Only register the rdma osc callback once, as it turns out that some
btls (MX) do something more than update a table during the register
call, and each register call sucks up valuable fragments...
This commit was SVN r11745.
The following Trac tickets were found above:
Ticket 354 --> https://svn.open-mpi.org/trac/ompi/ticket/354
long ago) supposed to be used as a cache for accessing the PML procs. But in
all of the PMLs the PML proc contains only one field, i.e. a pointer to the ompi_proc.
This pointer can be accessed using the c_remote_group easily. Therefore, there is no
point in keeping the PML procs around. Slim fast commit ...
This commit was SVN r11730.
if we want to be able to reuse the request. If not, the request will never be freed
even if the user calls MPI_Request_free.
This commit was SVN r11717.
This provides support for the Infinipath interconnect using the PSM API.
Of note:
This version has a "hackaround": we always return 1 or greater from
the MTL PSM progress function; this should be examined further.
This commit was SVN r11655.
George: ompi_ddt_type_size() returns a signed int only because of the
MPI spec; it will never return a negative value. So casting the
return value out of it to a (uint32_t) is safe, and makes the
comparisons be between two unsigned values.
This commit was SVN r11639.
The following SVN revision numbers were found above:
r11619 --> open-mpi/ompi@8667648a1b
todos: macroize it as we do it 10 different ways, add mca params to control handling (push up size, no change, switch off segmenting)
This commit was SVN r11619.
* Print a warning error message if a target is not in an exposure epoch
and an update is received. This results in the app continuing with
that call having never happened, rather than evil hangs.
refs trac:325
This commit was SVN r11514.
The following Trac tickets were found above:
Ticket 325 --> https://svn.open-mpi.org/trac/ompi/ticket/325
bunch of code changed indenting level and some code got moved out of
one function and made into its own subroutine.
- Gleb pointed out that I wasn't taking into account values from the
default section of the INI file (and not finding values in the INI
file is not an error).
- I incorrectly thought that 0x5ad was Mellanox's vendor ID. Turns
out that 0x5ad is Cisco's ID, while 0x2c9 is Mellanox.
Specifically, Cisco burns its own firmware into the HCA which
replaces the vendor ID, although the part ID stays the same. So
it's Mellanox hardware with Cisco firmware. And apparently several
of us do that. :-) So I expanded the concept of the vendor_id in
the INI file to allow for lists of vendor IDs.
- Along with that, I updated the default INI file to list all the IB
vendors (that I am aware of -- certainly open to putting more data
in there from other vendors) who overwrite Mellanox's vendor_id with
their own for the part numbers that we have on file.
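Matching a device against a list of vendor IDs could be sketched as follows. The comma-separated value format (for example "0x2c9,0x5ad") and the function name vendor_id_matches() are assumptions for illustration, not the INI parser in the component:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Return true if `vendor` appears in a comma-separated list such as
 * "0x2c9,0x5ad". Base 0 lets strtoul accept hex and decimal entries. */
static bool vendor_id_matches(const char *list, uint32_t vendor)
{
    const char *p = list;
    while (*p != '\0') {
        char *end;
        unsigned long value = strtoul(p, &end, 0);
        if (end == p) {
            return false;   /* malformed entry: give up on this list */
        }
        if (value == vendor) {
            return true;
        }
        p = end;
        while (*p == ',' || *p == ' ') {
            p++;            /* skip separators before the next entry */
        }
    }
    return false;
}
```

With such a list, an HCA reporting 0x5ad (Cisco firmware) and one reporting 0x2c9 (Mellanox) can both pick up the same section.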
This commit was SVN r11506.