openmpi/ompi
Jeff Squyres c42ab8ea37 Fixes trac:1210, #1319
Commit from a long-standing Mercurial tree that ended up incorporating a lot of things:

 * A few fixes for CPC interface changes in all the CPCs
 * Attempts (but not yet finished) to fix shutdown problems in the IB CM CPC
 * #1319: add CTS support (i.e., initiator guarantees to send first message; automatically activated for iWARP over the RDMA CM CPC)
   * Some variable and function renamings to make this be generic (e.g., alloc_credit_frag became alloc_control_frag)
   * CPCs no longer post receive buffers; they only post a single receive buffer for the CTS if they use CTS. Instead, the main BTL now posts the main sets of receive buffers. 
   * CPCs allocate a CTS buffer only if they're about to make a connection
 * RDMA CM improvements:
   * Use threaded mode openib fd monitoring to wait for RDMA CM events
   * Synchronize endpoint finalization and disconnection between main thread and service thread to avoid/fix some race conditions
   * Converted several structs to be OBJs so that we can use reference counting to know when to invoke destructors
   * Make some new OBJ's have opal_list_item_t's as their base, thereby eliminating the need for the local list_item_t type
   * Renamed many variables to be internally consistent
   * Centralize the decision in an inline function as to whether this process or the remote process is supposed to be the initiator
   * Add oodles of OPAL_OUTPUT statements for debugging (hard-wired to output stream -1; to be activated by developers if they want/need them) 
   * Use rdma_create_qp() instead of ibv_create_qp()
 * openib fd monitoring improvements:
   * Renamed a bunch of functions and variables to be a little more obvious as to their true function
   * Use pipes to communicate between main thread and service thread
   * Add ability for main thread to invoke a function back on the service thread 
   * Ensure to set initiator_depth and responder_resources properly, by putting max_qp_rd_atom and max_qp_init_rd_atom in the modex (see rdma_connect(3))
   * Ensure to set the source IP address in rdma_resolve() to ensure that we select the correct OpenFabrics source port
   * Make new MCA param: openib_btl_connect_rdmacm_resolve_timeout
 * Other improvements:
   * btl_openib_device_type MCA param: can be "iw" or "ib" or "all" (or "infiniband" or "iwarp")
   * Somewhat improved error handling
   * Bunches of spelling fixes in comments, VERBOSE, and OUTPUT statements
   * Oodles of little coding style fixes
   * Changed shutdown ordering of btl; the device is now an OBJ with ref counting for destruction
   * Added some more show_help error messages
   * Change configury to only build IBCM / RDMACM if we have threads (because we need a progress thread) 
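
Illustration of the "pipes between main thread and service thread" item above: a minimal sketch, not the code from this commit. It shows one way a function pointer and its argument can be written to a pipe so that whoever reads the other end invokes the function. All names here (service_cmd_t, service_post, service_run_one, bump_counter, demo_pipe_callback) are hypothetical, and the demo runs both ends in a single thread for brevity; in a threaded design the reading side would sit in the service thread's loop.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical command record; not a type from the openib BTL. */
typedef void (*service_fn_t)(void *context);

typedef struct {
    service_fn_t fn;   /* function to run on the reading side */
    void *context;     /* opaque argument handed to fn */
} service_cmd_t;

/* Writing side (the main thread, in a threaded design): queue one call.
 * POSIX guarantees that pipe writes of at most PIPE_BUF bytes are atomic,
 * so a record this small is never interleaved with another writer's. */
static int service_post(int write_fd, service_fn_t fn, void *context)
{
    service_cmd_t cmd = { fn, context };
    ssize_t n;

    assert(fn != NULL);
    n = write(write_fd, &cmd, sizeof(cmd));
    return (n == (ssize_t) sizeof(cmd)) ? 0 : -1;
}

/* Reading side (the service thread, in a threaded design): read exactly
 * one record, then invoke the function it carries. */
static int service_run_one(int read_fd)
{
    service_cmd_t cmd;
    size_t got = 0;

    while (got < sizeof(cmd)) {
        ssize_t n = read(read_fd, (char *) &cmd + got, sizeof(cmd) - got);
        if (n <= 0) {
            return -1;  /* EOF or error; a real loop would also handle EINTR */
        }
        got += (size_t) n;
    }
    cmd.fn(cmd.context);
    return 0;
}

/* Example callback: increments the int that context points to. */
static void bump_counter(void *context)
{
    int *counter = context;
    ++*counter;
}

/* Single-threaded demo: post two calls, then drain them.
 * Returns the final counter value, or -1 on any failure. */
int demo_pipe_callback(void)
{
    int fds[2];
    int counter = 0;

    if (pipe(fds) != 0) {
        return -1;
    }
    if (service_post(fds[1], bump_counter, &counter) != 0 ||
        service_post(fds[1], bump_counter, &counter) != 0 ||
        service_run_one(fds[0]) != 0 ||
        service_run_one(fds[0]) != 0) {
        counter = -1;
    }
    close(fds[0]);
    close(fds[1]);
    return counter;
}
```

Because the pipe's read end is an ordinary file descriptor, a service thread can wait on it together with its other descriptors, which is one reason to prefer a pipe over a shared in-memory queue for this kind of wake-up.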

This commit was SVN r19686.

The following Trac tickets were found above:
  Ticket 1210 --> https://svn.open-mpi.org/trac/ompi/ticket/1210
2008-10-06 00:46:02 +00:00
..
attribute Repair the MPI-2 dynamic operations. This includes: 2008-07-03 17:53:37 +00:00
class Small fix for including unistd.h header file. 2008-06-27 16:25:31 +00:00
communicator Ensure that the mutex is properly constructed/destructed. 2008-09-09 12:57:45 +00:00
contrib/vt Removed - This file will be created by autotools 2008-09-19 15:09:46 +00:00
datatype Remove the protection around computing the remote size. This has to be done 2008-09-26 23:11:53 +00:00
debuggers Disable global ID resolution when sparse groups are used. Tested by 2008-09-23 16:27:01 +00:00
errhandler Based on a review by Ralph, no need to call getpid() or gethostname(); 2008-09-23 20:04:34 +00:00
etc Many thanks to Ralf W. for finding a subtle bug in these Makefile.am's 2008-06-04 01:28:03 +00:00
file - Initialize the lock as well 2008-09-09 08:01:41 +00:00
group Fix 2 dereferenced NULL variables (Coverity fix 474 & 476). 2008-08-06 15:50:54 +00:00
include Increase the size of MPI_MAX_PORT_NAME from 256 to 1024. 2008-09-25 16:47:17 +00:00
info Effectively revert the orte_output system and return to direct use of opal_output at all levels. Retain the orte_show_help subsystem to allow aggregation of show_help messages at the HNP. 2008-06-09 14:53:58 +00:00
mca Fixes trac:1210, #1319 2008-10-06 00:46:02 +00:00
mpi Fix CID 1117: ensure to check return values. 2008-09-19 13:27:30 +00:00
op - double declaration of extern "C" makes the MS compiler complain. Change them to *_C_DECLS. 2008-08-27 15:49:40 +00:00
peruse - Move the OMPI_DECLSPEC from .c to .h 2008-09-11 12:26:33 +00:00
proc Fix CID 839: minor memory leak on error 2008-08-11 20:46:27 +00:00
request - As shown in ticket #1349, the status is not copied 2008-09-02 15:36:10 +00:00
runtime Sometimes we don't have a valid error code, so don't segv if 2008-10-01 21:42:08 +00:00
tools Very tiny modification of the output when displaying mca param values to clarify that ones found in the environment could have also been set on the cmd line - we don't have a way to distinguish them internally. 2008-09-25 13:08:17 +00:00
win Replace the ompi_pointer_array with opal_pointer_array. The next step 2007-12-21 06:02:00 +00:00
Makefile.am Some more work on the man pages: 2008-08-07 19:20:40 +00:00