1
1

344 Коммитов

Автор SHA1 Сообщение Дата
Tim Woodall
8bf6ed7a36 - corrected locking in gm btl - gm api is not thread safe
- initial support for gm progress thread
- corrected threading issue in pml
- added polling progress for a configurable number of cycles to wait for threaded case

This commit was SVN r9188.
2006-03-02 00:39:07 +00:00
Brian Barrett
579e74290f * make gm wire-up endian safe
This commit was SVN r9179.
2006-02-28 02:03:46 +00:00
Brian Barrett
bfd49d248b * (hopefully) fix MPI_BOTTOM for portals, same way as oll the other RDMA btls from
eons ago...

This commit was SVN r9172.
2006-02-27 17:07:24 +00:00
Brian Barrett
9b19e3fef0 * remove some debugging output that shouldn't have been committed. Doh!
This commit was SVN r9171.
2006-02-27 16:23:52 +00:00
Brian Barrett
285581dff2 More endian-related cleanups:
- moved hton64 and ntoh64 from the bunch of places it had been copied
    into one header file
  - properly set and use the btl_tcp's nbo option to put things in
    network byte order on the wire if both sides don't have the same
    endianness
  - Put the OB1 PML's headers (with a couple exceptions I need to discuss
    with Tim) in network byte order on the wire if both sides don't have
    the same endianness
  - since it was needed for the TCP BTL, move the orte_process_name_t
    HTON and NTOH macros from the TCP OOB to ns_types.h

This commit was SVN r9145.
2006-02-26 00:45:54 +00:00
Jeff Squyres
628125599d Fix the TCL btl module endpoint matching during setup for the scenario
when running an MPI job spanning a node that has two TCP NICs and a
node that has one TCP NIC.  Previously, for the 2 NIC/module process,
we would return the first peer IP address if we couldn't find a subnet
match with any of the peer's published IP addresses -- this was to
support running OMPI across subnet boundaries.  Changed the behavior
to only do that behavior if the IP address we're trying to match is
public (i.e., not 10.x.y.z, 192.168.x.y, or 172.16.x.y) *and* any of
the remote peer's addresses are public (working on the assumption that
if we both have public addresses, they're routable to each other).

This definitely will not work in all scenarios, such as when we go to
WAN kinds of executions, and will need to be revisited at that time.

This commit was SVN r9119.
2006-02-23 02:02:19 +00:00
Galen Shipman
e58b758031 standardize behavior of btl_alloc, if the size is larger than the max send
size, btl_alloc returns NULL. 

This commit was SVN r9114.
2006-02-22 17:37:59 +00:00
Brian Barrett
0d098c9d57 * after talking with galen, take into account we might truncate
This commit was SVN r9113.
2006-02-22 17:03:04 +00:00
Brian Barrett
765d2ffc29 * the self btl should set the segment size field on alloc like the other btls
* clean up duplicate free in long message accumulates that looks like it was
  a cut-n-paste error

This commit was SVN r9112.
2006-02-22 16:20:13 +00:00
Brian Barrett
08747dcaf8 * Throttle the number of incoming receives so that we don't overrun our receive
event queue and lose receive messages.

This commit was SVN r9006.
2006-02-13 15:59:54 +00:00
Brian Barrett
20d06e889e * revert out some of the attempts to better use the Portals 3.3.2-2 user-space
run-time support, as it appears to be doing bad things to memory.  Update
  the hack to get the local nid to match the recent TCP nal changes, and
  update the P3RT api useage

This commit was SVN r9005.
2006-02-13 15:41:00 +00:00
George Bosilca
ecc3e00362 Various cleanups.
This commit was SVN r9002.
2006-02-12 21:36:07 +00:00
Brian Barrett
566a050c23 Next step in the project split, mainly source code re-arranging
- move files out of toplevel include/ and etc/, moving it into the
    sub-projects
  - rather than including config headers with <project>/include, 
    have them as <project>
  - require all headers to be included with a project prefix, with
    the exception of the config headers ({opal,orte,ompi}_config.h
    mpi.h, and mpif.h)

This commit was SVN r8985.
2006-02-12 01:33:29 +00:00
Andrew Friedley
b37e18916f Many different things, the big ones:
- Start filling in the progress function, focusing on connection establishment.
 - Initialize udapl mpool and free lists
 - Create/destroy a protection zone with each IA
 - Misc organization as I learn how things work

This commit was SVN r8969.
2006-02-10 21:49:15 +00:00
George Bosilca
9f1357fb89 Remove all the useless includes. Most of the endpoint do not depend on the
orte includes.

This commit was SVN r8932.
2006-02-08 05:10:48 +00:00
Galen Shipman
c8045bf397 Fixup for ORTE datatype checkin,
- use appropriate header files 
- change calls from orte_dps to orte_dss 

This commit was SVN r8920.
2006-02-07 15:20:44 +00:00
George Bosilca
20fd358327 A BTL cannot depend on the orte data-types.
This commit was SVN r8915.
2006-02-07 06:05:36 +00:00
Ralph Castain
4b9f015c0b Merge in the new data support subsystem for ORTE. MPI folks should not notice a difference. Longer explanation will be sent to developers mailing list.
This commit was SVN r8912.
2006-02-07 03:32:36 +00:00
Rainer Keller
7ac0ffc349 - Instead of doing the unlock inside the if, just move the if-statement
later after the mandatory unlock.

This commit was SVN r8885.
2006-02-02 17:32:22 +00:00
Tim Woodall
a2fde48f2f changes from release branch
This commit was SVN r8858.
2006-01-31 16:17:18 +00:00
Tim Woodall
bcd6c525f8 removed duplicate locks
This commit was SVN r8857.
2006-01-31 16:12:37 +00:00
Tim Woodall
9d484916db remove locks already held
This commit was SVN r8853.
2006-01-31 14:23:08 +00:00
Brian Barrett
b1d2424013 Merge in present work on the MPI-2 onesided chapter. The current code is not
complete, but stable enough that it will have no impact on general development,
so into the trunk it goes.  Changes in this commit include:

 - Remove the --with option for disabling MPI-2 onesided support.  It
   complicated code, and has no real reason for existing
 - add a framework osc (OneSided Communication) for encapsulating
   all the MPI-2 onesided functionality
 - Modify the MPI interface functions for the MPI-2 onesided chapter
   to properly call the underlying framework and do the required
   error checking
 - Created an osc component pt2pt, which is layered over the BML/BTL
   for communication (although it also uses the PML for long message
   transfers).  Currently, all support functions, all communication
   functions (Put, Get, Accumulate), and the Fence synchronization
   function are implemented.  The PWSC active synchronization
   functions and Lock/Unlock passive synchronization functions are
   still not implemented

This commit was SVN r8836.
2006-01-28 15:38:37 +00:00
George Bosilca
0f1c6d79e8 Make the MVAPI BTL thread safe again. The problem was a double locking on the endpoint mutex.
It's still not very clean as we still lock the mvapi_btl mutex inside a critical section
protected by the endpoint mutex ...

This commit was SVN r8810.
2006-01-25 23:14:06 +00:00
Galen Shipman
ddc22d8c7e Use endpoint_lock, not ib_lock (copy paste error from openib btl)
This commit was SVN r8806.
2006-01-25 15:04:37 +00:00
Andrew Friedley
ec995160e6 Checkpoint for switch to mpool work:
- Remove printing of CFLAGS in configure.m4
 - Set MCA_BTL_FLAGS_SEND flag
 - Improved error handling during module initialization
 - Extract the address of each interface with dat_ia_query
 - Start playing around with fragment stuff - probably wrong
 - Misc code cleanup (removal of GM-specific code)

This commit was SVN r8801.
2006-01-25 02:21:34 +00:00
Tim Woodall
51ec050647 port of revised flow control from openib
This commit was SVN r8799.
2006-01-24 23:44:30 +00:00
Tim Woodall
e861158fcd - removed debug code
- removed extraneous memset

This commit was SVN r8798.
2006-01-24 23:38:41 +00:00
Galen Shipman
1e0ea9dd6d Major fixes for the RDMA registration cache (leave_pinned).
This commit fixes issues with HPL runs on node counts > 4. 

This commit was SVN r8793.
2006-01-23 22:51:50 +00:00
Brian Barrett
8b2a285f8f * update reference implementation support to match latest release
This commit was SVN r8783.
2006-01-22 04:56:18 +00:00
Galen Shipman
d657052510 misc cleanup..
This commit was SVN r8731.
2006-01-18 16:20:50 +00:00
Galen Shipman
84a09e4f4e use #if not #ifdef..
This commit was SVN r8720.
2006-01-17 21:07:34 +00:00
Galen Shipman
0c81c0a6ce use ibv_get_device_list if present
(submitted from roland)

This commit was SVN r8712.
2006-01-17 16:23:35 +00:00
Andrew Friedley
5ccab7bcda Checkpoint:
- Move mca_btl_udapl_error/mca_btl_module_init to mca_btl_udapl.c and rename it
 - White space cleanups
 - Free the uDAPL evd and ia handles in mca_btl_udapl_finalize

This commit was SVN r8705.
2006-01-16 21:54:50 +00:00
Andrew Friedley
a4abe3bdbe Checkpoint:
- Borrow configure.m4 from the mvapi btl.  One of the uDAPL headers emits a
   warning when -pedantic is enabled, so strip it out.
 - Change function check in ompi_check_dapl.m4 from dat_ia_open to
   dat_registry_list_providers.. dat_ia_open wasn't working right
 - Make the references to prepare_dst, put, and get NULL for now
 - Add opal_output() calls in all the udapl interface functions for debugging
 - Add evd_qlen component parameter to control event dispatcher queue length
 - First stab at component_init and module_init
 - Misc cleanups - whitespace, dead code removal
 - Update copyrights to 2006

This commit was SVN r8701.
2006-01-16 03:01:12 +00:00
George Bosilca
d4699037f7 Protect an assert if the endpoint cache is not activated.
This commit was SVN r8695.
2006-01-14 21:10:09 +00:00
George Bosilca
3317bf81ad A better implementation for the TCP endpoint cache + few comments.
This commit was SVN r8692.
2006-01-14 20:21:44 +00:00
Tim Woodall
a584c60dbe re-worked flow control logic to take into account the return
of credits from the peer prior to local completion, so that
we don't overrun the number of send wqes available.

This commit was SVN r8683.
2006-01-12 23:42:44 +00:00
Andrew Friedley
c0bad339af - Use the GM BTL as a template instead, per Tim's suggestion
- Begin adding uDAPL-specific stuff
- Added config/ompi_check_udapl.m4 - hopefully I did this right

This commit was SVN r8681.
2006-01-12 04:05:02 +00:00
Tim Woodall
63d0438991 merge in changes from release branch
This commit was SVN r8637.
2006-01-04 16:34:45 +00:00
George Bosilca
479d510eaf Use the common SM component to unmap the shared memory file.
This commit was SVN r8623.
2005-12-31 15:07:48 +00:00
George Bosilca
228f966798 A little trick to force qsort to order the sm modules in the order we expect.
On some systems (like windows) qsort can modify the order of modules when the
comparaison function return 0. As we expect to have the
mca_btl_sm_add_procs_same_base_addr called before mca_btl_sm_add_procs we have
force the qsort to return the SM modules in the correct order. Giving to the
same_addr module a slightly higher priority solve this problem.

This commit was SVN r8620.
2005-12-31 14:59:53 +00:00
Tim Woodall
4ff5316b2d correct copy/paste error
This commit was SVN r8601.
2005-12-22 18:07:46 +00:00
Tim Woodall
5d91c492d6 improve diagnostic error messages
This commit was SVN r8600.
2005-12-22 18:01:47 +00:00
Tim Woodall
e9498f7a75 improve error reporting when registrations fail
This commit was SVN r8598.
2005-12-22 16:05:28 +00:00
Andrew Friedley
f402854a96 Initial commit of uDAPL BTL component.
- Copied the template BTL and renamed everything
 - Compiles and shows up correctly in ompi_info, not tested past that
 - Should be ignored for everyone but me

This commit was SVN r8544.
2005-12-19 16:37:05 +00:00
Brian Barrett
a5af07cd6b fixes suggested by Ralf for supporting both Libtool 1 and 2 in Open MPI...
This commit was SVN r8538.
2005-12-19 03:10:23 +00:00
George Bosilca
1b667067d6 I need to know the number of iovec attached to the fragment.
This commit was SVN r8447.
2005-12-10 23:28:16 +00:00
George Bosilca
e5158142b9 The lb should be extracted from the datatype not from the convertor.
This commit was SVN r8446.
2005-12-10 23:27:20 +00:00
George Bosilca
01b0db91ae Get the lower-bound from the data not from the convertor.
This commit was SVN r8444.
2005-12-10 22:38:25 +00:00