adding new procs that the remote proc's pml is the same as our local pml.
Turns the hangs from mismatched PMLs into an abort, which is better,
I think.
This commit was SVN r13582.
needlessly registered in multiple different places, and none of them
had a good help string. There was also an inconsistent check for
setting both mpi_leave_pinned and mpi_leave_pinned_pipeline (i.e., it
was only in ob1). This commit moves the registration of these params
to one central place (ompi/runtime/ompi_mpi_params.c, with all other
mpi_* MCA params) and uses globals to propagate the values as
relevant. The error check was also moved to the central location to
ensure that we can consistency everywhere.
This commit was SVN r13226.
OB1 always use first element from array of BTLs available for RDMA. The patch
change the array creation algorithm, it puts different BTL in the first element
in round robin fashion.
This commit was SVN r13174.
components that use configure.m4 for configuration or are always built.
The macro has not been needed since moving to configure types other than
configure.stub
Fixes trac:590
This commit was SVN r13031.
The following Trac tickets were found above:
Ticket 590 --> https://svn.open-mpi.org/trac/ompi/ticket/590
George wrote the initial patch, I extended it slightly and am responsible for all bugs found.
Refs trac:587
This commit was SVN r13023.
The following Trac tickets were found above:
Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
* Make sure that the pval always writes to the correct portion of the
lval. This only matters on 32 bit big endian machines.
* On 32 bit machines when assigning to pval, the other 4 bytes of lval
weren't being written, which could lead to bogus data
We use macros so that there aren't casts all over the code and the pval
assignment can occur to the correct 4 bytes. Refs trac:587
This commit was SVN r12974.
The following Trac tickets were found above:
Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
udapl/openib/vapi/gm mpools a deprecated. rdma mpool has parameter that allows
to limit its size mpool_rdma_rcache_size_limit (default is 0 - unlimited).
This commit was SVN r12878.
r12714) for supporting compilers / architectures with different
padding rules.
This commit was SVN r12749.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r12491
r12714
hits the buffer on the other side. For this kind of BTLs we need to send
FIN through the same BTL, PUT was performed with so network will handle
ordering for us. If we will use another BTL, receiver can get FIN before
data will hit the buffer and complete request prematurely. We mark such
problematic BTLs with MCA_BTL_FLAGS_FAKE_RDMA flag (this kind of RDMA
is really fake, because the real one guaranties that sender will see the
completion only after receiver's NIC confirmed that all the data was
received).
This commit was SVN r12732.
It calls mca_pml_ob1_send_fin_btl() which may fail and doesn't check return
code. This breaks all RDMA transports event when only one BTL is used. Revert
it for now, I am working on a real fix for the problem (I hope).
This commit was SVN r12731.
The following SVN revision numbers were found above:
r12720 --> open-mpi/ompi@3e3689320b
regresion from v1.1 was reviewed and put to v1.2 branch. So revert this part
of r12721 back.
This commit was SVN r12730.
The following SVN revision numbers were found above:
r12433 --> open-mpi/ompi@82f7c0dd69
r12721 --> open-mpi/ompi@3edd850d2e
protocol when multiple NICS are available between 2 peers. The fix force
the FIN message to take exactly the same path as the fragment it describe
(i.e. same path means same BTL). Otherwise, the FIN can be received by
the peer before the RDMA complete and the request will get freed
too early.
This commit was SVN r12720.
Same sort of problem and fix as described in r12323 - mca_pml_ob1_recv_frag_progress() was segfaulting due to a NULL req_proc pointer. The path leading to this was through the mca_pml_ob1_check_cantmatch_for_match() function, where we can match a frag using the same macros as mca_pml_ob1_frag_match() and never initialize the req_proc pointer.
This commit was SVN r12582.
The following SVN revision numbers were found above:
r12323 --> open-mpi/ompi@c752502dee
set it up before the match when we know the peer, saving some
time on the critical path. If the receive is ANY_SOURCE then
we initialize the convertor on _MATCHED. Anyway, we will set it
up only once per receive.
This commit was SVN r12484.
This commit essentially caches the invoking comm/win/file on the
ompi_request_t. This, paired with the req_type field, allows us to
retrieve the invoking MPI object and invoke the proper errhandler.
The patch is missing most updates for the MPI-2 one-sided stuff (i.e.,
the patch mainly fixes comms and files); I didn't really understand
that code and didn't want to hazard trying to figure it out when Brian
can probably do it much more quickly.
So #250 will still stay open, pending MPI-2 one-sided updates for this
stuff.
This commit was SVN r12339.
The following Trac tickets were found above:
Ticket 250 --> https://svn.open-mpi.org/trac/ompi/ticket/250
allocation logic is completely done outside the data-type engine (in the PML) there is
no need for any special case inside the data-type engine. There is less arguments for
the ompi_convertor_pack and ompi_convertor_unpack as well (the last field free_after is
not required anymore as there is no memory allocated in the engine itself). This change
affect all components using datatypes. I test most of them, but it might happens that I
miss some ... If it's the case please let me know (don't shoot the pianist!!).
This commit was SVN r12331.
A segfault would occur in mca_pml_ob1_recv_request_progress() when trying to prepare the convertor for unpacking, because the request's req_proc field was NULL.
Turns out that we weren't setting the req_proc field in the MCA_PML_OB1_CHECK_SPECIFIC_AND_WILD_RECEIVES_FOR_MATCH macro. Instead of just setting it there I removed the other place req_proc was being set correctly, and instead took care of all the cases at once in mca_pml_ob1_recv_frag_match().
This commit was SVN r12323.
parameter. For optimisation purpose only this BTL is used to send packet
through instead of trying to send packets through all BTLs. But actually the
code was wrong. It simply used provided bml_btl and it may represent different
endpoint from packet's destination. The fixed code checks if packet's
destination is reachable through the BTL, finds appropriate bml_btl and only
then tries to send it through correct bml_btl.
This commit was SVN r12319.
mentioned in the comment the completion/callback of the triggered
send operation can happen before the call returns. If this happens and
if the pipeline depth is 0 before we triggered the send operation and
this is the last send operation of the request then the completion detection
code will decrement the pipeline depth and check it for equality to 0.
Because (0-1) != 0 the pml completion function for this request will
*not* be called.
This part 2 of the fix for ticket #246.
This commit was SVN r12292.
all platforms. The only exceptions (and I will not deal with them
anytime soon) are on Windows:
- the write functions which require the length to be an int when it's
a size_t on all UNIX variants.
- all iovec manipulation functions where the iov_len is again an int
when it's a size_t on most of the UNIXes.
As these only happens on Windows, so I think we're set for now :)
This commit was SVN r12215.
size and diplacement of data-type. After this patch all data can contain size_t bytes
and the displacements are defined as ptrdiff_t. All of the files I was able to compile
have been modified to match this requirement.
This commit was SVN r12146.
long ago) supposed to be used as a cache for accessing the PML procs. But in
all of the PMLs the PML proc contain only one field i.e. a pointer to the ompi_proc.
This pointer can be accessed using the c_remote_group easily. Therefore, there is no
meaning of keeping the PML procs around. Slim fast commit ...
This commit was SVN r11730.
if we want to be able to reuse the request. If not, the request will never be freed
even if the user call MPI_Request_free.
This commit was SVN r11717.
different macros, one for each project. Therefore, now we have OPAL_DECLSPEC,
ORTE_DECLSPEC and OMPI_DECLSPEC. Please use them based on the sub-project.
This commit was SVN r11270.
There was some old code regarding the convertor which does not have to be there
(the problem was corrected a while ago). In the PML we already know how the progress
function is defined, so call the BML progress instead, which will save one function
call.
The macro MCA_PML_OB1_COMPUTE_SEGMENT_LENGTH is already defined in the pml_ob1.h
so it should not be in the endpoint.h.
Remove a double definition of the mca_pml_ob1_progress function in the pml_ob1.h.
This commit was SVN r10775.
is the one provided by the user. For the buffered send the real datatype used
for the communication is always MPI_BYTE and the count can be retrieved from
the req_bytes_packed field. This will decrease the size of the request by
one pointer and one size_t (8 bytes or 16 bytes depending on the architecture).
This commit was SVN r10680.
interconnects that provide matching logic in the library.
Currently includes support for MX and some support for
Portals
* Fix overuse of proc_pml pointer on the ompi_proc structuer,
splitting into proc_pml for pml data and proc_bml for
the BML endpoint data
* bug fixes in bsend init code, which wasn't being used by
the OB1 or DR PMLs...
This commit was SVN r10642.
standard). This macro allow us to specify the length of the fragment. Now we are
able to know how the message is fragmented between the network devices or inside
the communication protocol.
This commit was SVN r10508.
by the BTL (btl_max_rdma_size). Now the PUT protocol is pipelined even if there
is just one network between the 2 peers. Unfortunately, this problem is present
the 1.1 (no pipeline for the PUT protocol).
This commit was SVN r10499.
call to the _UNPACK macro, in the case where the length of the received data
is zero. This might happens on the PUT protocol.
This commit was SVN r10431.
Sender can receive and complete PUT request before it gets completion on the first rndv packet. senreq struct may be reused for the next MPI_Send and unexpected completion mess up the things. I sometimes got SEGV and sometimes data corruption.
This commit was SVN r10301.
message from a known peer (not MPI_ANY_SOURCE) then we can attach the
remote proc and initialize the convertor as soon as we know the data-type,
and the count (so basically in the _INIT macro). If it's not the case, then
create them in the _MATCHED macro (as in the original version). Of course,
beforeinitializing the convertor we check that there will be some data
in the message.
This commit, plus the convertor improvements from few days ago, lower the
latency for my test case environment (mvapi) by 0.1 microseconds. The convertor
now is as slim as it can be, I don't think there is anything else to
remove/improve.
This commit was SVN r9843.
We support all the events in the PERUSE specifications, but right now only one event
of each type can be attached to a communicator. This will be worked out in the future.
The events were places in such a way, that we will be able to measure the overhead
for our threading implementation (the cost of the synchronization objects).
This commit was SVN r9500.
add object size to opal class
no longer need the size when allocating a new object as this is stored in
the class structure
--- dr changes
Previous rev. maintained state on the communicator used for acking duplicate
fragments, but the communicator may be destroyed prior to successfull
delivery of an ack to the peer. We must therefore maintain this state
globally on a per peer, not a per peer, per communicator basis.
This requires that we use a global rank on the wire and translate this as
appropriate to a local rank within the communicator.
This commit was SVN r9454.
only as a pointer reference completely confuse some compilers (gcc 4.1
included). Removing the inline (it was there before when the function
was used in the same file) seems to solve the problem. However, the most
strange thing is that the bug only appear when we compile directly in
the trunk directory. It just don't happens when we're using the VPATH
build.
This commit was SVN r9408.
flag, new flags to be included when convertor is initialized
- modified pml/btl module defs and added stub functions for diagnostic
output routines to dump state of queues / endpoints
- updates to data reliability pml
This commit was SVN r9329.
to let the PML (or io, more generally the low level request manager)
to have it's own release function (what was before the req_fini). This
function will only be called from the low level while the req_free will
be called from the upper level (MPI layer) in order to mark the request
as not used by the user anymore.
From the request point of view the requests will be marked as inactive
everytime we read their status (true for persistent as well). As
MPI_REQUEST_NULL is already marked as inactive, the test and wait functions
are simpler. The drawback is that now we have to change in the
ompi_request_{test|wait} the req_status of the request once we get it's
status.
This commit was SVN r9290.
- initial support for gm progress thread
- corrected threading issue in pml
- added polling progress for a configurable number of cycles to wait for threaded case
This commit was SVN r9188.
around, since OB1 currently doesn't do the right thing there, but that should
not happen in the near future because the R2 BML should not make any RDMA
networks available between machines with different architectures
* Clean up the #ifs a little bit so that we don't do unneeded work when
on big endian machines and heterogeneous support is disabled...
This commit was SVN r9184.
- moved hton64 and ntoh64 from the bunch of places it had been copied
into one header file
- properly set and use the btl_tcp's nbo option to put things in
network byte order on the wire if both sides don't have the same
endianness
- Put the OB1 PML's headers (with a couple exceptions I need to discuss
with Tim) in network byte order on the wire if both sides don't have
the same endianness
- since it was needed for the TCP BTL, move the orte_process_name_t
HTON and NTOH macros from the TCP OOB to ns_types.h
This commit was SVN r9145.
counterparts, the reset to MPI_REQUEST_NULL of the upper
struct ompi_request_t was broken. Nightly mpi_test_suite
failed, e.g.
mpirun -np 2 ./mpi_test_suite -t "Ring Isend"
This commit was SVN r9028.
The following SVN revision numbers were found above:
r8945 --> open-mpi/ompi@83f83e5730
progress function in the mca_pml to the BML progress one, avoid having a cascade of
call to the progress function and speed up a little bit the execution.
This commit was SVN r9007.
- move files out of toplevel include/ and etc/, moving it into the
sub-projects
- rather than including config headers with <project>/include,
have them as <project>
- require all headers to be included with a project prefix, with
the exception of the config headers ({opal,orte,ompi}_config.h
mpi.h, and mpif.h)
This commit was SVN r8985.
Correct the problem introduced by the commit 8933 (thanks Tim). In order to avoid to much
trafic on the bus we do not compute the bytes_delivered (require an atomic size_t add)
we have to set it in the begining or otherwise we will report the wrong count in the
MPI status.
This commit was SVN r8968.
1. remove all useless macros from the proc header file
2. merge 2 of the match macros (they share the same logic except one list)
This commit was SVN r8946.
of request we are playing with (send or receive). Therefore, it's useless to have another
switch inside this macro and make the code bigger. Now, we have 2 versions
MCA_PML_OB1_SEND_REQUEST_FREE and MCA_PML_OB1_RECV_REQUEST_FREE.
This commit was SVN r8945.
and add a new macro that can be used for both sends and receives.
Move to atomic operations to manage the length of the sended or received
status. There is one instance where the atomic operation is not required
as the code can cannot be executed in same time by 2 differents threads.
This commit was SVN r8933.
second instance of the ompi_proc from the send and receive request. This
information is already available on the base request, so there is no
need for duplication. The drawback is that now (in order to avoid a
second lookup in the communicator array of procs) we have to set the base
proc in the PML's _ALLOC macro.
This commit was SVN r8900.
if-clause, getting rid of local schedule variable.
This commit was SVN r8778.
The following SVN revision numbers were found above:
r8771 --> open-mpi/ompi@2fadddebc8
- remove windows socket initialization (it's already in the TCP component)
- protect all used header files
- remove the unused ones.
This commit was SVN r8434.
- when processing out of order list - reset match to null on each iteration
- check matched request type and if probe - complete probe and queue fragment
on unexpected list
This commit was SVN r8339.
the base send and receive request from the pml_base, we can solve our problem
if we construct the convertor attached to any request in the pml_base_construct
function. At the end of the life time for each request (here life time is
related to one utilisation, without taking in account the cache) we release
all information attached to the convertors in the _FINI macro by calling the
ompi_convertor_cleanup.
This commit was SVN r7910.
convertor (when prepared) increase the reference count on the used datatype. This reference count
will be released only when the OBJ_DESTRUCT is called on a convertor. However, having to call
OBJ_CONSTRUCT and OBJ_DESTRUCT on each request every time we want to use it (even when it come
from the cache) is an expensive operation. This can be avoided is the OBJ_DESTRUCT will leave the
convertor in exactly the same state as OBJ_CONSTRUCT. With this approach we just have to call
OBJ_CONSTRUCT for each convertor once when we initially create the request.
This commit was SVN r7813.
- corrected memory hook callback to catch all allocations (need to optimize this)
- don't attempt to consolidate allocations
This commit was SVN r7600.
searching the tree. Modified the memory callback to search the tree at each
page boundary for registrations. This is necessary as an application may
malloc memory and send out of any portion of that memory, even discontiguous
regions.
This commit was SVN r7510.
add misc asserts to check for proper reference counting.
ugly hack 1 -- use mallopt to never release memory ala sbrk - this is
commented out in mca_btl_mvapi_component_init
ugly hack 2 -- test registrations comming out of the tree via rcache_find, for
an unknown reason the tree is returning registrations where the address is not
within the base or bound of the registration. If this happens, we return
NULL.
comment out code to enable mem hooks if leave_pinned is set, note we can do
this via an mca param and will default it to leave_pinned with mem_hooks when
we iron out these issues.
I am adding a unit test for the rcache. Note that we have a unit test for the
rb tree but the compare function is significantly different than that used for
registrations. After we have tracked down the issues with rcache_rb we will
remove the above hacks.
This commit was SVN r7499.
AM_INIT_AUTOMAKE, instead of the deprecated version.
* Work around dumbness in modern AC_INIT that requires the version
number to be set at autoconf time (instead of at configure time, as
it was before). Set the version number, minus the subversion r number,
at autoconf time. Override the internal variables to include the r
number (if needed) at configure time. Basically, the right thing
should always happen. The only place it might not is the version
reported as part of configure --help will not have an r number.
* Since AM_INIT_AUTOMAKE taks a list of options, no need to specify
them in all the Makefile.am files.
* Addes support for subdir-objects, meaning that object files are put
in the directory containing source files, even if the Makefile.am is
in another directory. This should start making it feasible to
reduce the number of Makefile.am files we have in the tree, which
will greatly reduce the time to run autogen and configure.
This commit was SVN r7211.
* don't know what I was thinking, but can't use the MCA_PML_CALL macro on
the two data values, as they don't have things that the macro can
expand into
This commit was SVN r6868.
the values to the PML structure. This will allow PMLs that want to do
hardware matching at the cost of a smaller range of valid tags and cids.
Updated all the places that used the MPI_TAG_UB_VALUE constant to instead
look at the pml struct.
This commit was SVN r6778.
Change all the places where they are used to fit the new name.
Remove the code to check the remote arch from the PML. We will have a GPR mechanism
in ompi_mpi_initialize to do that.
This commit was SVN r6750.
support in OMPI. Currently only enables/disables the architecture
sharing modex in ob1 pml.
* Add sds framework to ompi_info
* Figure out table ids to use for Portals BTL at configure time, since
we should use 30 & 31 on Red Storm, but the reference implementation
only supports 0-8.
* Some bug fixes in Portals UTCP sds
This commit was SVN r6650.
- only call sched_yield if it exists
- don't fail out if modex doens't work in ob1
- bunch of fixes for Portals BTL
- add cnos rml component
- add NULL gpr component (should only be used if replica AND proxy
fail to load)
This commit was SVN r6629.
multiple components to share a single mpool module (e.g., the
ptl/btl and coll sm components).
- Re-tool the ptl, btl, and coll sm components to first look for the
target mpool module, and if they don't find it, to create it.
- coll sm component now correctly identifies when it is supposed to
run or not (i.e., if all the processes in the communicator are on
the same host). Now we just need to fill in some algorithms. :-)
This commit was SVN r6530.
is a lot more difficult than a PTL, and it can adapt it's behavior to the level of threading required
by the user. In this case the behavior is the priorit of the PML. Therefore this information is never
availale before the init function (of the PML) is called. So I try to keep nearly the same structure
as it was before, with one change. When a PML get initialized it does not necessarily means it has been
selected, so it does not means it has to create all it's internal structures (and select the PTL and
all this stuff). They can all be done later, when a PML knows that it definitively get selected
(when the enable function is called with the argument set to true). Thus, in the case of a PML close
one have to check if the PML has been selected or not before trying to clean up the internals.
I had to change the MPI_Init function to allow the PML to be enabled before we start adding procs inside.
This commit was SVN r6434.
components to succeed with --enable-dist. Instead, just add them to
all_components and make dist will still work - we're going to stamp out
the Makefiles no matter what
* Add missing header to ob1 pml for make dist
* Clean up the Portals BTL configure code
This commit was SVN r6413.
- After long discussions and ruminations on how we run components in
LAM/MPI, made the decision that, by default, all components included
in Open MPI will use the version number of their parent project
(i.e., OMPI or ORTE). They are certaint free to use a different
number, but this simplification makes the common cases easy:
- components are only released when the parent project is released
- it is easy (trivial?) to distinguish which version component goes
with with version of the parent project
- removed all autogen/configure code for templating the version .h
file in components
- made all ORTE components use ORTE_*_VERSION for version numbers
- made all OMPI components use OMPI_*_VERSION for version numbers
- removed all VERSION files from components
- configure now displays OPAL, ORTE, and OMPI version numbers
- ditto for ompi_info
- right now, faking it -- OPAL and ORTE and OMPI will always have the
same version number (i.e., they all come from the same top-level
VERSION file). But this paves the way for the Great Configure
Reorganization, where, among other things, each project will have
its own version number.
So all in all, we went from a boatload of version numbers to
[effectively] three. That's pretty good. :-)
This commit was SVN r6344.
* rename ompi_basename to opal_basename
* rename ompi bitop functions to opal
* rename ompi_cmd_line to opal_cmd_line
* rename ompi_sizet2int to opal_sizet2int
* rename orte_daemon_init to opal_daemon_init
* rename ompi_few to opal_few
This commit was SVN r6330.