
55 Commits

Author SHA1 Message Date
Gleb Natapov
90fb58de4f When frags are allocated from mpool by free_list, the frag structure is also
allocated from mpool memory (which is registered memory for RDMA transports).
This is not a problem for small jobs, but for a large number of ranks the
amount of wasted memory is large.

This commit was SVN r13921.
2007-03-05 14:17:50 +00:00
Jeff Squyres
f820e44112 Remove a gcc-ism from the code (defining an anonymous union in the
middle of a struct).  Now we properly define and name the union
outside the struct and simply create an instance of it inside the
struct. 

This commit was SVN r13709.
2007-02-19 18:21:57 +00:00
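
The change described in f820e44112 can be pictured with a minimal sketch. All type and member names here are hypothetical, chosen only for illustration; they are not the identifiers touched by the commit. The union is defined and named on its own, and the struct holds an instance of it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration only; not taken from the commit. */

/* The union is defined and named outside the struct. */
union example_key {
    uint64_t as_u64;
    void *as_ptr;
};

/* The struct simply contains a named instance of that union. */
struct example_segment {
    uint32_t length;
    union example_key key;   /* accessed as seg.key.as_u64 */
};

/* Access always goes through the named member, which any C compiler accepts. */
static uint64_t example_segment_key(const struct example_segment *seg)
{
    return seg->key.as_u64;
}
```

Naming the union costs one extra level in member access, but it avoids relying on a compiler extension.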
Brian Barrett
8900d3ae43 Second take at fixing the issues with using ompi_ptr_t. Add helper functions for converting from .pval to .lval and vice-versa. Users of ompi_ptr_t types should only use one of the fields in the union unless using the helper conversion functions. For the BTLs, local pointers will always be stored in the .pval field and remote pointers always stored in the .lval field.
George wrote the initial patch, I extended it slightly and am responsible for all bugs found.

Refs trac:587

This commit was SVN r13023.

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2007-01-07 01:48:57 +00:00
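
The helper-function approach described in 8900d3ae43 might look roughly like the sketch below. The type `example_ptr_t` and both helper names are hypothetical stand-ins; the actual ompi_ptr_t definition and its helpers may differ. The point is that conversion goes through an integer cast rather than through reading the other union field, so all 8 bytes of the 64-bit value are defined even on 32-bit machines.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for ompi_ptr_t; the real definition may differ. */
typedef union {
    uint64_t lval;  /* 64-bit integer form, used for remote pointers */
    void *pval;     /* native pointer form, used for local pointers */
} example_ptr_t;

/* Convert a native pointer to the 64-bit form (hypothetical helper).
 * Going through uintptr_t zero-extends the value on 32-bit machines. */
static uint64_t example_ptr_ptol(void *ptr)
{
    return (uint64_t)(uintptr_t)ptr;
}

/* Convert the 64-bit form back to a native pointer (hypothetical helper). */
static void *example_ptr_ltop(uint64_t value)
{
    return (void *)(uintptr_t)value;
}
```

Under this convention a caller writes only one field of the union directly and uses the helpers whenever the other representation is needed.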
Brian Barrett
48ec0b2071 Revert out r12974, 12976, and 12991 as George has provided a less intrusive fix
for now...

This commit was SVN r12997.

The following SVN revision numbers were found above:
  r12974 --> open-mpi/ompi@27cea44a9c
2007-01-04 22:07:37 +00:00
Brian Barrett
27cea44a9c Fix a number of issues with the ompi_ptr_t:
  * Make sure that the pval always writes to the correct portion of the
    lval.  This only matters on 32 bit big endian machines.
  * On 32 bit machines when assigning to pval, the other 4 bytes of lval
    weren't being written, which could lead to bogus data

We use macros so that there aren't casts all over the code and the pval
assignment can occur to the correct 4 bytes.  Refs trac:587

This commit was SVN r12974.

The following Trac tickets were found above:
  Ticket 587 --> https://svn.open-mpi.org/trac/ompi/ticket/587
2007-01-03 19:47:48 +00:00
Gleb Natapov
190e7a27cd Merge with gleb-mpool branch. All RDMA components now use the same mpool (rdma).
The udapl/openib/vapi/gm mpools are deprecated. The rdma mpool has a parameter,
mpool_rdma_rcache_size_limit, that limits its size (default is 0 - unlimited).

This commit was SVN r12878.
2006-12-17 12:26:41 +00:00
Brian Barrett
33320b7165 Rework the opal_progress interface to better support dynamic processes and at
the same time, remove some of the MPI-related options from OPAL:

  - provide mechanism to change at runtime whether sched_yield() should 
    be called when the progress engine is idle
  - provide mechanism for changing the rate at which the event engine
    is called when there are "no" users of the event engine (ie, when
    using MPI but not TCP)
  - fix some function names in the progress engine to better match
    their intended use (and remove MPI naming scheme)
  - remove progress_mpi_enable / progress_mpi_disable because 
    we can now use the functions to set the sched_yield and
    tick rate interfaces
  - rename opal_progress_events() to opal_progress_set_event_flag()
    because the first really isn't descriptive of what the function
    does and I always got confused by it

This commit was SVN r12645.
2006-11-22 02:06:52 +00:00
Ralph Castain
6d6cebb4a7 Bring over the update to terminate orteds that are generated by a dynamic spawn such as comm_spawn. This introduces the concept of a job "family" - i.e., jobs that have a parent/child relationship. Comm_spawn'ed jobs have a parent (the one that spawned them). We track that relationship throughout the lineage - i.e., if a comm_spawned job in turn calls comm_spawn, then it has a parent (the one that spawned it) and a "root" job (the original job that started things).
Accordingly, there are new APIs to the name service to support the ability to get a job's parent, root, immediate children, and all its descendants. In addition, the terminate_job, terminate_orted, and signal_job APIs for the PLS have been modified to accept attributes that define the extent of their actions. For example, doing a "terminate_job" with an attribute of ORTE_NS_INCLUDE_DESCENDANTS will terminate the given jobid AND all jobs that descended from it.

I have tested this capability on a MacBook under rsh, Odin under SLURM, and LANL's Flash (bproc). It worked successfully on non-MPI jobs (both simple and including a spawn), and MPI jobs (again, both simple and with a spawn).

This commit was SVN r12597.
2006-11-14 19:34:59 +00:00
George Bosilca
3db5c0487d typos.
This commit was SVN r12168.
2006-10-18 17:12:25 +00:00
Galen Shipman
73e9ef46fc use int32_t not size_t (ORTE interface change)..
This commit was SVN r11323.
2006-08-22 17:13:10 +00:00
Galen Shipman
2667c52a5d Track fragments by list, not by size..
-- reviewed by Brian, needs to hit all the branches.. 

This commit was SVN r10078.
2006-05-25 18:07:26 +00:00
George Bosilca
b3cc3d82d3 Activate the OOB while we setup connections for MVAPI. Same thing should be done for the
Open IB ...

This commit was SVN r9640.
2006-04-14 20:53:42 +00:00
Gleb Natapov
b6ab1f4262 fix compilation warnings.
This commit was SVN r9515.
2006-04-02 11:32:25 +00:00
Gleb Natapov
ea11582191 Porting of short message RDMA from openib BTL. Endpoint registers a circular buffer and sends its address and rkey to the peer. Peer uses this buffer to eagerly RDMA small messages into it. Endpoint polls the buffer for message arrival before checking HP/LP QPs. Set btl_mvapi_use_eager_rdma to 1 to enable it.
This commit was SVN r9474.
2006-03-30 12:55:31 +00:00
George Bosilca
ecc3e00362 Various cleanups.
This commit was SVN r9002.
2006-02-12 21:36:07 +00:00
Brian Barrett
566a050c23 Next step in the project split, mainly source code re-arranging
  - move files out of toplevel include/ and etc/, moving them into the
    sub-projects
  - rather than including config headers with <project>/include, 
    have them as <project>
  - require all headers to be included with a project prefix, with
    the exception of the config headers ({opal,orte,ompi}_config.h,
    mpi.h, and mpif.h)

This commit was SVN r8985.
2006-02-12 01:33:29 +00:00
George Bosilca
9f1357fb89 Remove all the useless includes. Most of the endpoints do not depend on the
orte includes.

This commit was SVN r8932.
2006-02-08 05:10:48 +00:00
Galen Shipman
c8045bf397 Fixup for ORTE datatype checkin,
- use appropriate header files 
- change calls from orte_dps to orte_dss 

This commit was SVN r8920.
2006-02-07 15:20:44 +00:00
Ralph Castain
4b9f015c0b Merge in the new data support subsystem for ORTE. MPI folks should not notice a difference. Longer explanation will be sent to developers mailing list.
This commit was SVN r8912.
2006-02-07 03:32:36 +00:00
Tim Woodall
9d484916db remove locks already held
This commit was SVN r8853.
2006-01-31 14:23:08 +00:00
George Bosilca
0f1c6d79e8 Make the MVAPI BTL thread safe again. The problem was a double locking on the endpoint mutex.
It's still not very clean as we still lock the mvapi_btl mutex inside a critical section
protected by the endpoint mutex ...

This commit was SVN r8810.
2006-01-25 23:14:06 +00:00
Tim Woodall
51ec050647 port of revised flow control from openib
This commit was SVN r8799.
2006-01-24 23:44:30 +00:00
Brian Barrett
38391e3406 disable shared receive queue support at compile time if the mvapi implementation
does not support shared receive queues (such as the one shipped by SilverStorm / 
Infinicon for OS X).  Reviewed by Galen.

This commit was SVN r8389.
2005-12-06 15:46:30 +00:00
Galen Shipman
eb3ccdb4d8 make compiler happy on false positive warning..
This commit was SVN r8192.
2005-11-18 18:48:11 +00:00
Tim Woodall
58dd6c2493 - merge from release branch
This commit was SVN r8174.
2005-11-17 05:32:30 +00:00
Tim Woodall
01b94862df merge from release branch
This commit was SVN r8168.
2005-11-16 17:12:44 +00:00
Tim Woodall
142b7cc682 merge from release branch
This commit was SVN r8167.
2005-11-16 17:10:49 +00:00
Tim Woodall
2013104d1a SRQ cleanup
This commit was SVN r8104.
2005-11-10 20:51:56 +00:00
Tim Woodall
985c2ca943 cleanup
This commit was SVN r8093.
2005-11-10 15:40:27 +00:00
Tim Woodall
78522ed454 send credits on correct qp
This commit was SVN r8050.
2005-11-08 22:59:44 +00:00
Tim Woodall
2d9c509add flow control
This commit was SVN r8039.
2005-11-08 16:50:07 +00:00
Jeff Squyres
42ec26e640 Update the copyright notices for IU and UTK.
This commit was SVN r7999.
2005-11-05 19:57:48 +00:00
Galen Shipman
4d2d39b0a6 initial checkin of SRQ flow control support for mvapi
This commit was SVN r7796.
2005-10-18 14:55:11 +00:00
Galen Shipman
67d38b7896 Add multi-nic support to openib
Fix connection establishment race in openib 
Other misc 

This commit was SVN r7570.
2005-09-30 22:58:09 +00:00
Galen Shipman
8239e635b9 fix misc warnings, cleanup macro..
This commit was SVN r7547.
2005-09-30 03:13:51 +00:00
Tim Woodall
a74ca0062a reductions to initial memory footprint
This commit was SVN r7455.
2005-09-21 19:10:56 +00:00
Tim Woodall
1b73d3856e possible race condition - set endpoint state before sending connect ack
This commit was SVN r7448.
2005-09-20 21:03:55 +00:00
Galen Shipman
d932cfd342 merge of rcache work into the trunk.. lotsa fun ;-)..
I regression tested before the merge, I will regression test tonight and
correct issues that might have crept in. 

This commit was SVN r7329.
2005-09-12 22:28:23 +00:00
Tim Woodall
dfe52fceef minor changes to thread locking
This commit was SVN r7154.
2005-09-02 16:27:01 +00:00
Galen Shipman
589b1b8b5a Additional changes to add_proc and tokens
This commit was SVN r7152.
2005-09-02 15:18:36 +00:00
Galen Shipman
c8a23106c0 More fixes for sq tokens,
Additional work on multi-rail support. 

This commit was SVN r7139.
2005-09-02 03:04:28 +00:00
Tim Woodall
636ab23fdb atomic increment/test
This commit was SVN r7130.
2005-09-01 15:09:50 +00:00
Galen Shipman
29f7b4deda Changed send tokens to both send/rdma tokens for both low and high priority
queue pairs. Tested on intel p2p with 16 procs - Passed. 

This commit was SVN r7119.
2005-09-01 02:41:44 +00:00
Galen Shipman
c7e9563377 Added sender side per qp send tokens to limit the number of outstanding
sends. 

This commit was SVN r7112.
2005-08-31 20:28:42 +00:00
Galen Shipman
09873f299f Fixed a race in connection establishment..
This commit was SVN r7110.
2005-08-31 19:43:22 +00:00
Galen Shipman
73757b300c Added BTL_VERBOSE and OMPI_MCA_btl_base_debug: if set to 1, DEBUG output;
if set to 2, VERBOSE output.

This commit was SVN r6783.
2005-08-09 17:49:39 +00:00
Tim Woodall
2214f0502d - first cut at tcp btl (working but not optimal)
- reworked btl error logging macros

This commit was SVN r6701.
2005-08-02 13:20:50 +00:00
Galen Shipman
9437cea964 Added support for shared receive queue. Note that this is a run-time option
using OMPI_MCA_btl_mvapi_use_srq=1 and is disabled by default. 

This commit was SVN r6602.
2005-07-25 21:15:41 +00:00
Galen Shipman
fd969ac833 More code cleanup.. Also converted post receive requests to macros..
This commit was SVN r6566.
2005-07-20 17:43:31 +00:00
Galen Shipman
2f67ab82bb Working version of openib btl ;-)
Fixed receive descriptor counts that limited mvapi and openib to 2 procs.
Begin porting error messages to use the BTL_ERROR macro. 

This commit was SVN r6554.
2005-07-19 21:04:22 +00:00