1
1

3068 Коммитов

Автор SHA1 Сообщение Дата
Gleb Natapov
33196d972b post_send() function is called without endpoint lock held from explicit credits
update function so eager_rdma_remote.head have to be updated in a thread safe
manner.

This commit was SVN r15966.
2007-08-27 11:37:01 +00:00
Gleb Natapov
32a61c3bf2 Credit fragment is not protected properly from concurrent access. There is a
race that can prevent further explicit credits update from been sent. Fix the
race.

This commit was SVN r15965.
2007-08-27 11:34:59 +00:00
Gleb Natapov
065d04dfde Do not free recvreq while schedule function is running in another thread.
This commit was SVN r15964.
2007-08-27 11:31:40 +00:00
Brad Benton
ccda5c9c74 Modified the MCA_BTL_TCP_CONNECTED case in mca_btl_tcp_endpoint_send_handler()
to always first check for a NULL frag pointer before trying to send the
fragment.  This avoids an issue in multi-threaded execution in which 
multiple threads working on the same endpoint can result in a thread 
finding itself here with nothing to send.

This commit was SVN r15963.
2007-08-26 23:40:02 +00:00
George Bosilca
475073c684 Be as user friendly as possible and provide more information. Now we make the
difference between the user specified length, and the one available from
Open MPI (this allow to se the truncated receives). Moreover, if the
data-type used is named we now print the count as well as the name of
the used data-type.

This commit was SVN r15962.
2007-08-26 23:07:14 +00:00
Jeff Squyres
18db56e270 Fix Coverity defect 675: possible NULL dereference in an error
condition.

This commit was SVN r15957.
2007-08-25 12:18:55 +00:00
Jeff Squyres
b69c7688a0 Fix Coverity defect 676: possible NULL dereference in an error
condition.

This commit was SVN r15956.
2007-08-25 12:17:02 +00:00
George Bosilca
a6723b34ea Cleanup the code. Remove all debugging messages.
This commit was SVN r15955.
2007-08-24 02:58:09 +00:00
Edgar Gabriel
a2f5cada1a convert the hiearch component to the new structure. More testing required before we remove the .ompi_ignore flag again.
This commit was SVN r15954.
2007-08-23 20:41:29 +00:00
George Bosilca
daaf5a9bf1 Correct the assert macro.
This commit was SVN r15953.
2007-08-23 19:48:04 +00:00
George Bosilca
db19f927e8 A lot of cleanups. Verbose is enabled right now as we're tracking down
an issue with the ompi_communicator_t structure.

This commit was SVN r15951.
2007-08-23 16:40:07 +00:00
Rainer Keller
b385f8a790 - ompi_comm_set(): PML add_comm may return something != OMPI_SUCCESS
Use OMPI_SUCCESS throughout. 
 - ompi_comm_allocate(): Initialize new_comm=NULL to get rid of
   warnings.

This commit was SVN r15948.
2007-08-23 07:40:40 +00:00
Rainer Keller
1b5fa48a29 - Add missing PERUSE_COMM_REQ_REMOVE_FROM_POSTED_Q when matching
from the posted generic_recv-queue.
 - Move the PERUSE_COMM_MSG_MATCH_POSTED_REQ from
   MCA_PML_OB1_RECV_REQUEST_MATCHED to
   mca_pml_ob1_recv_frag_match() as suggested by Terry Dontje
   Only post, if this is not a probe/iprobe request.
 - Do not post PERUSE_COMM_REQ_MATCH_UNEX for probes / iprobes and
   do in correct order before PERUSE_COMM_MSG_REMOVE_FROM_UNEX_Q

This commit was SVN r15947.
2007-08-23 07:09:43 +00:00
Rainer Keller
c175801f98 - Initialize in the order of mca_pml_ob1_comm_proc_t...
This commit was SVN r15946.
2007-08-23 05:56:22 +00:00
Rainer Keller
b0df55d53b - For MPI_Probe/MPI_Iprobe, we should not have a
PERUSE_COMM_REQ_ACTIVATE event.
   Therefore move the PERUSE_TRACE_COMM_EVENT for this event from
   MCA_PML_BASE_SEND_REQUEST_INIT / MCA_PML_BASE_RECV_REQUEST_INIT
   to the proper places into pml_ob1_isend.c / pml_ob1_irecv.c right
   after the MCA_PML_OB1_SEND_REQUEST_INIT /
   MCA_PML_OB1_RECV_REQUEST_INIT.

This commit was SVN r15945.
2007-08-23 05:52:33 +00:00
George Bosilca
b5af2ba6f2 Correctly retrieve the MPI_SOURCE field for receives.
This commit was SVN r15944.
2007-08-22 22:35:30 +00:00
Gleb Natapov
becf4aa9c9 ompi_pointer_array_get_size doesn't return how much elements are actually in an
array, so count them by ourselves.

This commit was SVN r15943.
2007-08-22 09:31:12 +00:00
Shiqing Fan
a497a3fcad - Fix some small bugs, copy-paste mistakes.
This commit was SVN r15941.
2007-08-21 19:57:28 +00:00
Sven Stork
3985a35c35 - export required symbol
This commit was SVN r15939.
2007-08-21 18:46:11 +00:00
Rainer Keller
08634e7e4a - Properly unlock the ompi_proc_lock in case of errors and otherwise.
This commit was SVN r15936.
2007-08-20 16:47:13 +00:00
Jeff Squyres
ad784a9ab0 Make "simultaneous" be a size_t; there's already a check to ensure
that it is >= 1, so making it a size_t makes it easier to interact
with all the other size_t variables and removes a compiler warning.

This commit was SVN r15935.
2007-08-20 13:22:46 +00:00
Gleb Natapov
d8f3063895 Create only one CQ for all BTLs on the same HCA. Many BTLs can be created for
one HCA. Multiple ports, LMC, multiple BTLs per one LID. Having only one CQ for
all of them substantially reduce polling time.

This commit was SVN r15933.
2007-08-20 12:28:25 +00:00
Gleb Natapov
5596aa5f53 The sizes of mca_pml_ob1_send_request_t and mca_pml_ob1_recv_request_t depend
on a parameter and are determined in runtime. r15346 removed calculation of
correct sizes for this structures. This patch adds it back and fixes trac:1116, #1114.

This commit was SVN r15932.

The following SVN revision numbers were found above:
  r15346 --> open-mpi/ompi@433f8a7694

The following Trac tickets were found above:
  Ticket 1116 --> https://svn.open-mpi.org/trac/ompi/ticket/1116
2007-08-20 12:06:27 +00:00
George Bosilca
7e8bd529dc A better management of the receive requests. If the request was matched
then we can report the real source (in both local and global rank) as
well as the real amount of data transfered.

This commit was SVN r15929.
2007-08-19 20:16:48 +00:00
George Bosilca
7ef49614fc Correctly read a bool from the application memory. Report back if the
receive request was matched or not.

This commit was SVN r15928.
2007-08-19 20:05:09 +00:00
George Bosilca
c7e0ab93ae Don't forget to include string.h for the strcmp function.
This commit was SVN r15927.
2007-08-19 19:59:15 +00:00
George Bosilca
5ac19c62a1 Translate between global and local ranks. A global rank is the rank
of the process in the MPI_COMM_WORLD, while a local rank is the rank
of the process in the communicator where the request was posted. In
order to get the message graph nicely, each request has to have the
global rank set correctly.

This commit was SVN r15926.
2007-08-19 19:50:26 +00:00
Brian Barrett
af4e86c25f Update collectives selection logic to allow for multiple components to be
used at nce (up to one unique collective module per collective function).
Matches r15795:15921 of the tmp/bwb-coll-select branch

This commit was SVN r15924.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r15795
  r15921
2007-08-19 03:37:49 +00:00
Brian Barrett
2b8af283de Add ability to completely turn off MPI one-sided support, so that users
can experiment with using ROMIO directly.

This commit was SVN r15922.
2007-08-18 21:35:51 +00:00
Josh Hursey
729c63cf9d Fix invalid MCA 'base' names so they appear in ompi_info.
A subset of this patch needs to be applied to v1.2

Refs trac:928

This commit was SVN r15918.

The following Trac tickets were found above:
  Ticket 928 --> https://svn.open-mpi.org/trac/ompi/ticket/928
2007-08-18 03:05:45 +00:00
Rolf vandeVaart
797078115d Fix the case where mpi_preconnect_oob=1 and
mpi_preconnect_oob_simultaneous > np.  Need to scale back
simultaneous to equal np in those cases.  Reviewed by Brian.

This commit fixes trac:1064.

This commit was SVN r15916.

The following Trac tickets were found above:
  Ticket 1064 --> https://svn.open-mpi.org/trac/ompi/ticket/1064
2007-08-17 20:18:42 +00:00
George Bosilca
b9ea4c92e7 Don't show the requests having a negative tag (they are internal
requests for Open MPI). Add a variable to allow Open MPI developers
to see all internal messages.

This commit was SVN r15915.
2007-08-17 18:59:57 +00:00
Edgar Gabriel
0684002812 fixes: 1127
fix some of the multi-threading problems for the cid allocation. Two bugs
specifically:
 - since we do not have a queue for incoming fragments of unknown cid, we need
 to synchronize all processes before exiting the communicator creation. This
 synchronization was/is located in comm_activate, which was however too late
 for the multi-threaded case. Thus, for multi-threaded scenarios we are now
 synchronizing 'before' we allow another thread to enter the cid-allocation
 loop.

- for synchronization, we used for the sake of simplicity allreduce
  operations. It turns out, that these operations interefered with the
 allreductions in the cid-allocation routine, which lead to non-sense results
  in the cid-allocation and potentially to endless loops.

Multi-threaded communicator creation seems to work now, is however still 'very
very' slow. I think, the busy wait of threads is killing the performance of
the active threads in the cid allocation. But this is another topic.

This commit was SVN r15910.
2007-08-17 16:15:26 +00:00
Brian Barrett
3b98b5f0a1 The reference implementation of Portals (which runs over TCP on Linux) is
only static libraries.  Previously, we were linking the libraries into 
directly into the common, btl, and mtl code.  This seemed to work fine
for me on my Opteron Fedora box, but caused Lisa some issues (PtlNIInit
would succeed, but the network handle would fail when used with
PtlEQAlloc).

Instead, link the portals libraries directly into libmpi and not at
all into the common, btl, or mtl components.  THen use some linker
tricks to force the linker to bring in the public interface for the
reference implementation (which thankfully is pretty small).

This commit was SVN r15902.
2007-08-17 03:56:49 +00:00
Brian Barrett
c9e3654a85 Allow OMPI components to modify the link options for libmpi.so. This
functionality used to exist, but I removed it like a year ago because
it wasn't being used.  Well, now I need it :).

This commit was SVN r15901.
2007-08-17 03:52:53 +00:00
George Bosilca
b1082c95ff Remove all output. This final commit should solve all [hopefully]
problems with the integration with parallel debuggers.

This commit was SVN r15898.
2007-08-17 02:35:14 +00:00
George Bosilca
4dacd163cc Don't grab the information from the MPI_Status for the send requests.
This commit was SVN r15897.
2007-08-17 02:31:50 +00:00
George Bosilca
2086b7b445 Optimize the group creation. Don't create a new group if there is
already one containing the same nodes (useful for MPI_Comm_dup).

This commit was SVN r15896.
2007-08-17 02:19:34 +00:00
George Bosilca
7efffdb1da Be smart about parsing the communicators list. Based on the values of
lowest_free and number_free detect if the communicator list has changed.
If not, there is no reason to rebuild it, just use the old one.

This commit was SVN r15895.
2007-08-16 22:51:55 +00:00
George Bosilca
9cce0eb4bc Deal with int/size_t differences. This introduce some small problems
with the reported length of the receive requests, but I'll fix
it soon.

This commit was SVN r15893.
2007-08-16 21:02:24 +00:00
Brad Benton
c254645383 Fixes trac:1134.
Fixed a condition test while checking that all segments are empty.
Without this fix, a NULL segment pointer could make it past the
test, resulting in a SegV when dereferenced.

This commit was SVN r15891.

The following Trac tickets were found above:
  Ticket 1134 --> https://svn.open-mpi.org/trac/ompi/ticket/1134
2007-08-16 19:39:52 +00:00
Brad Benton
1ddba9ec65 Lock the endpoint before doing endpoint_state processing. This ensures
that the subsequent unlock is valid.

This commit was SVN r15890.
2007-08-16 18:11:29 +00:00
Aurelien Bouteiller
3a83c61c40 Fixed a bug with available space in sender based.
This commit was SVN r15889.
2007-08-16 17:54:26 +00:00
George Bosilca
51e726ee8c Remove some old [and unused] code.
This commit was SVN r15887.
2007-08-16 17:06:17 +00:00
Mohamad Chaarawi
b18129c260 Fix to the invalid process lookup cause by group_intersection
This commit was SVN r15886.
2007-08-16 17:01:01 +00:00
George Bosilca
1ae49b7143 Don't use a fprintf. Instead a plain print will do the job.
This commit was SVN r15883.
2007-08-16 14:55:44 +00:00
Tim Prins
5a795128af Change it so that different components in orte use unique rml tags
This commit was SVN r15881.
2007-08-16 14:02:35 +00:00
Jeff Squyres
4b7a6cd922 Fix the following compile error:
{{{
ompi/debuggers/ompi_dll.c:102: error: initializer element is not
constant
}}}

The fix is stupid and I suspect that we'll want to ''not'' print out
all this debugging information all the time.  But I'll leave that to
George to fix...  :-)

This commit was SVN r15880.
2007-08-16 11:51:06 +00:00
George Bosilca
33a73a88c6 Add a lot more information about the requests (pending, matched,
completed). Correctly detect the tag is a receive was matched.

This commit was SVN r15879.
2007-08-16 07:11:56 +00:00
Aurelien Bouteiller
77565d60d9 Heavy modification of the pml_v framework.
* Code cleanup and rationalization
* Fixed: mca_pml_base_send/recv_request are now allocated before recreation by the PML-V
* Fixed: pointer arithmetic bug in sender based that crashed 
* Changed: directory structure. This is one step forward using autogen.sh to build static-components.h (it needs to have the directory structure of a mca framework for this). 

This commit was SVN r15878.
2007-08-16 05:52:30 +00:00