Commit graph

10458 commits

Author SHA1 Message Date
Jeff Squyres
f08cce16db Fix Coverity CID 468: remove unused variable.
This commit was SVN r15996.
2007-08-29 01:21:17 +00:00
Brian Barrett
59b22533f2 Enable RDMA for heterogeneous situations. Currently done by overloading
the ompi_convertor_need_buffers function to only return 0 if the convertor
is homogeneous (which it never does on the trunk, but does on v1.2, but
that's a different issue).  Only enable the heterogeneous rdma code for
a btl if it supports it (via a flag), as some btls need some work for this
to work properly.  Currently only TCP and OpenIB extensively tested

This commit was SVN r15990.
2007-08-28 21:23:44 +00:00
Brian Barrett
dcf678dbab Fix heterogeneous issue with non-blocking RML receive, where the sender
field could be in the wrong endianness

This commit was SVN r15989.
2007-08-28 20:54:52 +00:00
Gleb Natapov
fa69c5cc10 If memory on the sender's side is not registered, don't register it on the receive
side either. Otherwise the content of the recvreq->req_rdma array is replaced later
without freeing the previous content, and the refcount on the registration in the mpool becomes
wrong.

This commit was SVN r15978.
2007-08-28 07:43:06 +00:00
Tim Mattox
2c29a2b4ee Resync the trunk NEWS file's 1.2.4 section with the 1.2 branch NEWS.
This commit was SVN r15977.
2007-08-28 04:07:19 +00:00
Rich Graham
bc97d22182 remove tabs. Remove old code that was commented out.
This commit was SVN r15975.
2007-08-28 03:08:36 +00:00
Rich Graham
4d58f9aed7 Add comments. Move temporary receive object from a free list object to
a stack object.

This commit was SVN r15971.
2007-08-27 21:41:04 +00:00
Pak Lui
75c7d4e03b Temporary workaround for making Totalview be able to get those opal
symbols and load into the library when compiled with a Sun Studio C compiler

This commit was SVN r15970.
2007-08-27 19:04:56 +00:00
Gleb Natapov
e1a1d9d90e Receive request converter can be accessed in parallel by a thread that receives
data and a thread that runs the RDMA schedule function. Protect access to the
converter by a lock.

This commit was SVN r15967.
2007-08-27 11:41:42 +00:00
Gleb Natapov
33196d972b The post_send() function is called without the endpoint lock held from the explicit credits
update function, so eager_rdma_remote.head has to be updated in a thread-safe
manner.

This commit was SVN r15966.
2007-08-27 11:37:01 +00:00
Gleb Natapov
32a61c3bf2 The credit fragment is not protected properly from concurrent access. There is a
race that can prevent further explicit credit updates from being sent. Fix the
race.

This commit was SVN r15965.
2007-08-27 11:34:59 +00:00
Gleb Natapov
065d04dfde Do not free recvreq while schedule function is running in another thread.
This commit was SVN r15964.
2007-08-27 11:31:40 +00:00
Brad Benton
ccda5c9c74 Modified the MCA_BTL_TCP_CONNECTED case in mca_btl_tcp_endpoint_send_handler()
to always first check for a NULL frag pointer before trying to send the
fragment.  This avoids an issue in multi-threaded execution in which 
multiple threads working on the same endpoint can result in a thread 
finding itself here with nothing to send.

This commit was SVN r15963.
2007-08-26 23:40:02 +00:00
George Bosilca
475073c684 Be as user friendly as possible and provide more information. We now distinguish
between the user-specified length and the one available from
Open MPI (this makes truncated receives visible). Moreover, if the
data-type used is named, we now print the count as well as the name of
the data-type used.

This commit was SVN r15962.
2007-08-26 23:07:14 +00:00
Jeff Squyres
18db56e270 Fix Coverity defect 675: possible NULL dereference in an error
condition.

This commit was SVN r15957.
2007-08-25 12:18:55 +00:00
Jeff Squyres
b69c7688a0 Fix Coverity defect 676: possible NULL dereference in an error
condition.

This commit was SVN r15956.
2007-08-25 12:17:02 +00:00
George Bosilca
a6723b34ea Cleanup the code. Remove all debugging messages.
This commit was SVN r15955.
2007-08-24 02:58:09 +00:00
Edgar Gabriel
a2f5cada1a Convert the hierarch component to the new structure. More testing is required before we remove the .ompi_ignore flag again.
This commit was SVN r15954.
2007-08-23 20:41:29 +00:00
George Bosilca
daaf5a9bf1 Correct the assert macro.
This commit was SVN r15953.
2007-08-23 19:48:04 +00:00
George Bosilca
db19f927e8 A lot of cleanups. Verbose is enabled right now as we're tracking down
an issue with the ompi_communicator_t structure.

This commit was SVN r15951.
2007-08-23 16:40:07 +00:00
Rainer Keller
b385f8a790 - ompi_comm_set(): PML add_comm may return something != OMPI_SUCCESS
Use OMPI_SUCCESS throughout. 
 - ompi_comm_allocate(): Initialize new_comm=NULL to get rid of
   warnings.

This commit was SVN r15948.
2007-08-23 07:40:40 +00:00
Rainer Keller
1b5fa48a29 - Add missing PERUSE_COMM_REQ_REMOVE_FROM_POSTED_Q when matching
from the posted generic_recv-queue.
 - Move the PERUSE_COMM_MSG_MATCH_POSTED_REQ from
   MCA_PML_OB1_RECV_REQUEST_MATCHED to
   mca_pml_ob1_recv_frag_match() as suggested by Terry Dontje
   Only post, if this is not a probe/iprobe request.
 - Do not post PERUSE_COMM_REQ_MATCH_UNEX for probes / iprobes and
   do in correct order before PERUSE_COMM_MSG_REMOVE_FROM_UNEX_Q

This commit was SVN r15947.
2007-08-23 07:09:43 +00:00
Rainer Keller
c175801f98 - Initialize in the order of mca_pml_ob1_comm_proc_t...
This commit was SVN r15946.
2007-08-23 05:56:22 +00:00
Rainer Keller
b0df55d53b - For MPI_Probe/MPI_Iprobe, we should not have a
PERUSE_COMM_REQ_ACTIVATE event.
   Therefore move the PERUSE_TRACE_COMM_EVENT for this event from
   MCA_PML_BASE_SEND_REQUEST_INIT / MCA_PML_BASE_RECV_REQUEST_INIT
   to the proper places into pml_ob1_isend.c / pml_ob1_irecv.c right
   after the MCA_PML_OB1_SEND_REQUEST_INIT /
   MCA_PML_OB1_RECV_REQUEST_INIT.

This commit was SVN r15945.
2007-08-23 05:52:33 +00:00
George Bosilca
b5af2ba6f2 Correctly retrieve the MPI_SOURCE field for receives.
This commit was SVN r15944.
2007-08-22 22:35:30 +00:00
Gleb Natapov
becf4aa9c9 ompi_pointer_array_get_size doesn't return how many elements are actually in an
array, so count them ourselves.

This commit was SVN r15943.
2007-08-22 09:31:12 +00:00
Shiqing Fan
a497a3fcad - Fix some small bugs, copy-paste mistakes.
This commit was SVN r15941.
2007-08-21 19:57:28 +00:00
Josh Hursey
5a029a47bd forgot to separate the arguments
This commit was SVN r15940.
2007-08-21 19:43:41 +00:00
Sven Stork
3985a35c35 - export required symbol
This commit was SVN r15939.
2007-08-21 18:46:11 +00:00
Rainer Keller
08634e7e4a - Properly unlock the ompi_proc_lock in case of errors and otherwise.
This commit was SVN r15936.
2007-08-20 16:47:13 +00:00
Jeff Squyres
ad784a9ab0 Make "simultaneous" be a size_t; there's already a check to ensure
that it is >= 1, so making it a size_t makes it easier to interact
with all the other size_t variables and removes a compiler warning.

This commit was SVN r15935.
2007-08-20 13:22:46 +00:00
Jeff Squyres
3653bfcbe7 This function returns void.
This commit was SVN r15934.
2007-08-20 13:12:38 +00:00
Gleb Natapov
d8f3063895 Create only one CQ for all BTLs on the same HCA. Many BTLs can be created for
one HCA: multiple ports, LMC, multiple BTLs per LID. Having only one CQ for
all of them substantially reduces polling time.

This commit was SVN r15933.
2007-08-20 12:28:25 +00:00
Gleb Natapov
5596aa5f53 The sizes of mca_pml_ob1_send_request_t and mca_pml_ob1_recv_request_t depend
on a parameter and are determined at run time. r15346 removed the calculation of the
correct sizes for these structures. This patch adds it back and fixes trac:1116, #1114.

This commit was SVN r15932.

The following SVN revision numbers were found above:
  r15346 --> open-mpi/ompi@433f8a7694

The following Trac tickets were found above:
  Ticket 1116 --> https://svn.open-mpi.org/trac/ompi/ticket/1116
2007-08-20 12:06:27 +00:00
Josh Hursey
db79f2392e Make sure to enable C/R support for the HNP when restarting.
This commit was SVN r15931.
2007-08-19 20:43:33 +00:00
George Bosilca
7e8bd529dc A better management of the receive requests. If the request was matched
then we can report the real source (in both local and global rank) as
well as the real amount of data transferred.

This commit was SVN r15929.
2007-08-19 20:16:48 +00:00
George Bosilca
7ef49614fc Correctly read a bool from the application memory. Report back if the
receive request was matched or not.

This commit was SVN r15928.
2007-08-19 20:05:09 +00:00
George Bosilca
c7e0ab93ae Don't forget to include string.h for the strcmp function.
This commit was SVN r15927.
2007-08-19 19:59:15 +00:00
George Bosilca
5ac19c62a1 Translate between global and local ranks. A global rank is the rank
of the process in the MPI_COMM_WORLD, while a local rank is the rank
of the process in the communicator where the request was posted. In
order to get the message graph nicely, each request has to have the
global rank set correctly.

This commit was SVN r15926.
2007-08-19 19:50:26 +00:00
Brian Barrett
af4e86c25f Update collectives selection logic to allow for multiple components to be
used at once (up to one unique collective module per collective function).
Matches r15795:15921 of the tmp/bwb-coll-select branch

This commit was SVN r15924.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r15795
  r15921
2007-08-19 03:37:49 +00:00
Brian Barrett
2b8af283de Add ability to completely turn off MPI one-sided support, so that users
can experiment with using ROMIO directly.

This commit was SVN r15922.
2007-08-18 21:35:51 +00:00
Josh Hursey
729c63cf9d Fix invalid MCA 'base' names so they appear in ompi_info.
A subset of this patch needs to be applied to v1.2

Refs trac:928

This commit was SVN r15918.

The following Trac tickets were found above:
  Ticket 928 --> https://svn.open-mpi.org/trac/ompi/ticket/928
2007-08-18 03:05:45 +00:00
Rolf vandeVaart
797078115d Fix the case where mpi_preconnect_oob=1 and
mpi_preconnect_oob_simultaneous > np.  Need to scale back
simultaneous to equal np in those cases.  Reviewed by Brian.

This commit fixes trac:1064.

This commit was SVN r15916.

The following Trac tickets were found above:
  Ticket 1064 --> https://svn.open-mpi.org/trac/ompi/ticket/1064
2007-08-17 20:18:42 +00:00
George Bosilca
b9ea4c92e7 Don't show the requests having a negative tag (they are internal
requests for Open MPI). Add a variable to allow Open MPI developers
to see all internal messages.

This commit was SVN r15915.
2007-08-17 18:59:57 +00:00
Brian Barrett
1b76d46bd7 Fix error that was preventing anyone with LT 1.5.x from running autogen.sh.
This commit was SVN r15911.
2007-08-17 17:11:36 +00:00
Edgar Gabriel
0684002812 fixes: 1127
fix some of the multi-threading problems for the cid allocation. Two bugs
specifically:
 - since we do not have a queue for incoming fragments of unknown cid, we need
 to synchronize all processes before exiting the communicator creation. This
 synchronization was/is located in comm_activate, which was however too late
 for the multi-threaded case. Thus, for multi-threaded scenarios we are now
 synchronizing 'before' we allow another thread to enter the cid-allocation
 loop.

- for synchronization, we used allreduce operations for the sake of
  simplicity. It turns out that these operations interfered with the
 allreductions in the cid-allocation routine, which led to nonsensical results
  in the cid-allocation and potentially to endless loops.

Multi-threaded communicator creation seems to work now, but is still 'very
very' slow. I think the busy wait of threads is killing the performance of
the active threads in the cid allocation. But this is another topic.

This commit was SVN r15910.
2007-08-17 16:15:26 +00:00
Jeff Squyres
e333de3fc7 Update NEWS for some items that just went into 1.2.4.
This commit was SVN r15909.
2007-08-17 15:16:58 +00:00
Brian Barrett
2d4918b09d Support versions of the Libtool 2.1a snapshots after the lt_dladvise code
was brought in.  This supersedes the GLOBL patch that we had been using
with Libtool 2.1a versions prior to the lt_dladvise code.  Autogen
tries to figure out which version you're on, so either will now work with
the trunk.

This commit was SVN r15903.
2007-08-17 04:08:23 +00:00
Brian Barrett
3b98b5f0a1 The reference implementation of Portals (which runs over TCP on Linux) is
only static libraries.  Previously, we were linking the libraries
directly into the common, btl, and mtl code.  This seemed to work fine
for me on my Opteron Fedora box, but caused Lisa some issues (PtlNIInit
would succeed, but the network handle would fail when used with
PtlEQAlloc).

Instead, link the portals libraries directly into libmpi and not at
all into the common, btl, or mtl components.  Then use some linker
tricks to force the linker to bring in the public interface for the
reference implementation (which thankfully is pretty small).

This commit was SVN r15902.
2007-08-17 03:56:49 +00:00
Brian Barrett
c9e3654a85 Allow OMPI components to modify the link options for libmpi.so. This
functionality used to exist, but I removed it like a year ago because
it wasn't being used.  Well, now I need it :).

This commit was SVN r15901.
2007-08-17 03:52:53 +00:00