openmpi

Author	SHA1	Message	Date
Jeff Squyres	5ded50df0e	* Fix a > that should be == * Ensure to destroy the correct QP (local->id[num]->qp will always have a valid pointer in it, even if we setup a dummy qp) * Note two notable places where we need to figure out how to propagate errors up from the CPC to the main BTL / PML when errors occur. Probably have the same issue in IBCM, too. This commit was SVN r18700.	2008-06-20 22:09:30 +00:00
Jeff Squyres	0074126886	Per #1352 , most iWARP adapters today cannot handle connections between two processes on the same server (!). So for today, we'll simply mark all local processes that use iWARP adapters as "unreachable". More details in #1352. This commit was SVN r18699.	2008-06-20 22:08:00 +00:00
Jeff Squyres	f4145fce7a	Ensure that we don't try to shut down a thread that is not [yet] there (e.g., if you're excluding some devices, their destructors will be invoked before the async event thread was setup for them). This commit was SVN r18698.	2008-06-20 19:30:51 +00:00
Jeff Squyres	ed17b51204	Adjust the max_inline default size down so that it can be accepted on multiple adapters (e.g., Chelsio T3). But we need to figure out how to determine a good value for the resident adapter(s) at runtime. It's problematic because, for example, Mellanox ConnectX and Chelsio T3 report max_inline values differently at run-time. If you ibv_create_qp with a max_inline value of 0, ConnectX reports back a value that is a formula based on a few other values (e.g., max_send_sge and max_recv_sge). But T3 always reports back "64". We're looking into this to figure out the best way -- reducing the default right now should allow other adapters to run while we figure it out. This commit was SVN r18697.	2008-06-20 18:24:04 +00:00
George Bosilca	54e7e03695	One less warning. This commit was SVN r18695.	2008-06-20 17:50:19 +00:00
Jeff Squyres	7905db57bd	Slightly decrease the number of buffers for the NetXen adapter This commit was SVN r18691.	2008-06-20 01:00:22 +00:00
Pavel Shamis	4537827973	Making the qp allocation more optimized. - sq parameter was replaced with max_inline parameter - inline is allocated only for relevant QPs This commit was SVN r18675.	2008-06-19 08:40:39 +00:00
Lenny Verkhovsky	f4811d6c4d	NUMA Awareness support. Gleb's patch This commit was SVN r18658.	2008-06-15 13:43:28 +00:00
Galen Shipman	44cd373a87	I also forgot to initialize the convertor max_data, george probably copied this dumb mistake from me. This commit was SVN r18653.	2008-06-13 18:33:43 +00:00
George Bosilca	170b9c344e	Mea culpa. I forget to initialize the max_data before the call to the convertor. This commit was SVN r18651.	2008-06-12 17:24:39 +00:00
Pavel Shamis	dc3f14736d	Fixing QP initialization stuff. This commit was SVN r18650.	2008-06-11 16:31:39 +00:00
Galen Shipman	a239877b78	revert my previous boneheadedness This commit was SVN r18634.	2008-06-10 01:19:04 +00:00
Galen Shipman	4ef4a9520f	remove showhelp.. This commit was SVN r18628.	2008-06-09 20:53:01 +00:00
Galen Shipman	9efbec0383	fix normal send path remove unneeded checks This commit was SVN r18624.	2008-06-09 20:25:27 +00:00
Ralph Castain	9613b3176c	Effectively revert the orte_output system and return to direct use of opal_output at all levels. Retain the orte_show_help subsystem to allow aggregation of show_help messages at the HNP. After much work by Jeff and myself, and quite a lot of discussion, it has become clear that we simply cannot resolve the infinite loops caused by RML-involved subsystems calling orte_output. The original rationale for the change to orte_output has also been reduced by shifting the output of XML-formatted vs human readable messages to an alternative approach. I have globally replaced the orte_output/ORTE_OUTPUT calls in the code base, as well as the corresponding .h file name. I have test compiled and run this on the various environments within my reach, so hopefully this will prove minimally disruptive. This commit was SVN r18619.	2008-06-09 14:53:58 +00:00
Jeff Squyres	1a748bc7be	First cut at the NetEffect NE020 NIC. This commit was SVN r18599.	2008-06-05 20:24:24 +00:00
Jeff Squyres	f0d465c30a	Slightly simplify the code and remove a compiler warning. This commit was SVN r18596.	2008-06-05 19:08:08 +00:00
Jeff Squyres	b1999bbba3	* Use inclusive NIC/HCA language * Add a description of receive_queues This commit was SVN r18595.	2008-06-05 19:07:22 +00:00
Pavel Shamis	7b9024bc05	Updating Mellanox's Copyright in files touched in 2008 This commit was SVN r18592.	2008-06-05 13:40:26 +00:00
Pavel Shamis	379e00050c	Fixing openib btl finalize flow. Bug fix for #1286 . This commit was SVN r18590.	2008-06-05 12:20:13 +00:00
Jeff Squyres	91a281080a	Fix a compiler warning for a case that would never really happen anyway. Rename a variable to be a bit more descriptive. This commit was SVN r18585.	2008-06-04 19:10:23 +00:00
Jeff Squyres	bc584dedd6	Remove a compiler warning that would never happen in practice. This commit was SVN r18584.	2008-06-04 19:03:02 +00:00
Jeff Squyres	6e37dd0ef0	Fix some 32/64 printf errors once and for all This commit was SVN r18582.	2008-06-04 14:39:37 +00:00
Pavel Shamis	0a8321e08d	Calls to APM functions should be protected with OMPI_HAVE_THREADS. This commit was SVN r18581.	2008-06-04 14:27:41 +00:00
Jeff Squyres	5e918ad25d	Add first cut of NetXen iWARP NIC definition. May still be refined with more experimentation. This commit was SVN r18580.	2008-06-04 12:11:45 +00:00
Pavel Shamis	c73ed2b256	Updating cpc name from xrc to xoob. This commit was SVN r18571.	2008-06-04 08:50:30 +00:00
Ralph Castain	c992e99035	Remove the tags from orte_output_open and the filtering operation from orte_output - this will be handled differently to improve the XML output interface This commit was SVN r18557.	2008-06-03 14:24:01 +00:00
Jeff Squyres	69d78c6739	Fixes trac:1215: adds specific show_help messages about PP vs. SRQ/XRC RNR retry exceeded errors. This commit was SVN r18554. The following Trac tickets were found above: Ticket 1215 --> https://svn.open-mpi.org/trac/ompi/ticket/1215	2008-06-02 11:03:48 +00:00
Jeff Squyres	8c267d50a3	Fixes trac:1121. We already show_help when we fail to create queues, so I just made the message a little more verbose such that it may be that OMPI is trying to use a feature that is not supported on the hardware. This commit was SVN r18553. The following Trac tickets were found above: Ticket 1121 --> https://svn.open-mpi.org/trac/ompi/ticket/1121	2008-05-30 19:03:58 +00:00
George Bosilca	e361bcb64c	Send optimizations. 1. The send path gets shorter. The BTL is allowed to return > 0 to specify that the descriptor was pushed to the networks, and that the memory attached to it is available again for the upper layer. The MCA_BTL_DES_SEND_ALWAYS_CALLBACK flag can be used by the PML to force the BTL to always trigger the callback. Unmodified BTLs will continue to work as expected, as they will return OMPI_SUCCESS which forces the PML to have exactly the same behavior as before. Some BTLs have been modified: self, sm, tcp, mx. 2. Add send immediate interface to BTL. The idea is to have a mechanism of allowing the BTL to take advantage of send optimizations such as the ability to deliver data "inline". Some network APIs such as Portals allow data to be sent using a "thin" event without packing data into a memory descriptor. This interface change allows the BTL to use such capabilities and allows for other optimizations in the future. All existing BTLs except for Portals and sm have this interface set to NULL. This commit was SVN r18551.	2008-05-30 03:58:39 +00:00
Jeff Squyres	728ee47be4	Just check for the presence of $sysfsdir/class/infiniband and check that it's a directory. That's good enough to know that the OpenFabrics kernel drivers have been loaded. If you have no RDMA devices and don't want to see the OMPI warning about not finding any devices, then don't start the OpenFabrics kernel drivers. This commit was SVN r18540.	2008-05-29 14:19:51 +00:00
Nysal Jan	25ac3629e9	eHCA does not have SRQ. Adding receive_queues value so that it works out of the box This commit was SVN r18537.	2008-05-29 13:55:39 +00:00
Jeff Squyres	d5bf8fe005	Remove unused variables. This commit was SVN r18532.	2008-05-29 11:58:16 +00:00
Jeff Squyres	e5ea9d08ca	Fixes trac:1305: check to see if $sysfsdir/class/infiniband exists and is non-empty. If not, then exit the openib btl silently. This addresses the case where libibverbs is installed (which is getting more common) and therefore the openib BTL was built/installed, but the kernel drivers are not loaded (assumedly because there is no RDMA hardware present). In this case, "mpirun a.out" will not issue a warning. There appears to be no good way to definitely tell if there are no RDMA hardware devices present. For example, if libibverbs/the openib BTL is installed, there are no RDMA devices present, but the RDMA hardware kernel drivers ''are'' loaded, OMPI will warn that it was unable to find suitable devices. This warning is easily eliminated by unloading the kernel drivers. This commit was SVN r18530. The following Trac tickets were found above: Ticket 1305 --> https://svn.open-mpi.org/trac/ompi/ticket/1305	2008-05-28 22:05:47 +00:00
Pavel Shamis	28c763f751	Fixing the error flow when somebody tries to use XRC without XOOB. This commit was SVN r18527.	2008-05-28 15:56:04 +00:00
Pavel Shamis	2c81b0ab9a	Fixing compilation warning in btl_openib_connect_ibcm.c This commit was SVN r18526.	2008-05-28 15:20:48 +00:00
Pavel Shamis	879a9fe45c	setup_qps() may exit with error. This commit was SVN r18523.	2008-05-28 11:36:38 +00:00
Pavel Shamis	e657a03143	Fixing broken XRC initialization flow. This commit was SVN r18522.	2008-05-28 11:31:38 +00:00
Pavel Shamis	6596d19c90	Adding new ConnectX vendor_part_id. Fix for ticket #1310 . This commit was SVN r18495.	2008-05-26 12:25:49 +00:00
Jeff Squyres	e1f118d0e6	Remove unused variable This commit was SVN r18491.	2008-05-24 13:05:04 +00:00
Jeff Squyres	1b50e5f6a5	Use the right variable in the output This commit was SVN r18487.	2008-05-23 13:11:12 +00:00
Jeff Squyres	8faeeab81a	Style cleanup only: s/struct foo/foo_t/g to conform to rest of code base This commit was SVN r18483.	2008-05-22 19:26:00 +00:00
Jeff Squyres	1f7f0e1f96	Fixes trac:1281 * s/port/tcp_port/g where relevant to disambiguate TCP port from device port * Rework ipaddrcheck to make it work in the LMC>0 case This commit was SVN r18482. The following Trac tickets were found above: Ticket 1281 --> https://svn.open-mpi.org/trac/ompi/ticket/1281	2008-05-22 19:18:15 +00:00
Jon Mason	d0e26b1cf6	Add pretty comments for _iwarp. This commit was SVN r18478.	2008-05-22 18:02:20 +00:00
Jeff Squyres	62ac6533e0	* Add proper copyrights * Ensure _iwarp.h is always included, or you'll get warnings on platforms that don't have the RDMACM * Add skeleton for function descriptions in comments in iwarp.h This commit was SVN r18477.	2008-05-22 17:41:43 +00:00
Jeff Squyres	28b56c389a	Only check if the opal_ifindex is >= 0 (opal_ifbegin() and opal_ifnext() return -1 upon completion); don't check it against opal_ifcount() -- the interface indexes aren't necessarily related to how many interfaces were found. This commit was SVN r18476.	2008-05-22 02:10:23 +00:00
Jeff Squyres	27978b29f8	Fixes trac:1302: ensure to also use the LID for identifying an incoming IBCM request (not just the port number). This commit was SVN r18475. The following Trac tickets were found above: Ticket 1302 --> https://svn.open-mpi.org/trac/ompi/ticket/1302	2008-05-22 01:28:34 +00:00
George Bosilca	df2156568d	The Elan BTL is now thread safe, and can be built in all conditions. This commit was SVN r18471.	2008-05-21 20:44:37 +00:00
Pak Lui	1585789e8b	Fix the undeclared variable. This commit was SVN r18470.	2008-05-21 04:09:54 +00:00
Jon Mason	b9c25efbd2	Modify to comply with the "prefix rule" and remove "static inline" for the non-rdmacm enabled case. This should fix Ticket #1294. This commit was SVN r18468.	2008-05-20 23:28:59 +00:00
Jeff Squyres	64f61ebd07	Fixes trac:1285. Really. This commit has the same commit message as r18450, but without the extra bonus memory corruption that was introduced. This commit was SVN r18467. The following SVN revision numbers were found above: r18450 --> open-mpi/ompi@5295902ebe The following Trac tickets were found above: Ticket 1285 --> https://svn.open-mpi.org/trac/ompi/ticket/1285	2008-05-20 21:53:42 +00:00
Jeff Squyres	01a7f7eeb6	Switch orte_output* -> OPAL_OUTPUT* for two reasons: 1. We can't use orte_output in the CPC service thread because orte is not thread safe 1. Use the macro version so that they're compiled out of production builds This commit was SVN r18455.	2008-05-19 17:42:51 +00:00
Jeff Squyres	76fc8dd188	Revert r18450 -- there is some memory badness in there somewhere... This commit was SVN r18451. The following SVN revision numbers were found above: r18450 --> open-mpi/ompi@5295902ebe	2008-05-18 19:11:45 +00:00
Jeff Squyres	5295902ebe	Fixes trac:1285: * allow receive_queues to be specified in the INI file * detect when multiple different receive_queues are specified and gracefully abort However, accomplishing these goals ran into multiple difficulties. By putting receive_queues in the INI file: 1. we may not find the value until we've already traversed multiple HCAs 1. we may find multiple different receive_queues values But since the openib btl initializes as it discovers each HCA/port/LID (including the BSRQ data), if we find a new receive_queues value late in the discovery process, then all the BSRQ data that was previously initialized will likely be invalid. So I had to pull all the BSRQ initialization out until after the rest of the discovery / initialization process. Additionally, note that if the user specifies the MCA parameter btl_openib_receive_queues, it trumps whatever was in the INI file. So in this case, there can never be a receive_queues conflict. This commit does the following (Jon wrote part of this, too): * adapt _ini.c to accept the "receive_queues" field in the file * move 90% of _setup_qps() from _ini.c to _component.c * move what was left of _setup_qps() into the main _register_mca_params() function * adapt init_one_hca() to detect conflicting receive_queues values from the INI file * after the _component.c loop calling init_one_hca(): * call setup_qps() to parse the final receive_queues string value * traverse all resulting btls and initialize their HCAs (if they weren't already): setup some lists and call prepare_hca_for_use() I tested this code on a dual-HCA system where I artificially put in differing receive_queues values in the INI file for the two different types of HCAs that I have and it all seemed to work. This commit was SVN r18450. The following Trac tickets were found above: Ticket 1285 --> https://svn.open-mpi.org/trac/ompi/ticket/1285	2008-05-18 18:50:56 +00:00
Jeff Squyres	caacaadb0a	Minor shuffling of code: no need to query the GID in the iWARP case. This commit was SVN r18446.	2008-05-16 03:36:48 +00:00
Jeff Squyres	9f1b5237fe	Ensure to return an error rather than continue This commit was SVN r18445.	2008-05-16 03:36:11 +00:00
Jeff Squyres	6546898f09	Minor style cleanups; nothing very important in this commit. This commit was SVN r18444.	2008-05-16 03:28:20 +00:00
Jeff Squyres	5c91f53848	Fix a minor memory leak This commit was SVN r18443.	2008-05-16 03:27:42 +00:00
Jeff Squyres	671f0c379d	Remove a whole pile of orte/util/show_help.h's that I missed. :-( This commit was SVN r18437.	2008-05-14 11:32:33 +00:00
Jeff Squyres	e7ecd56bd2	This commit represents a bunch of work on a Mercurial side branch. As such, the commit message back to the master SVN repository is fairly long. = ORTE Job-Level Output Messages = Add two new interfaces that should be used for all new code throughout the ORTE and OMPI layers (we already make the search-and-replace on the existing ORTE / OMPI layers): * orte_output(): (and corresponding friends ORTE_OUTPUT, orte_output_verbose, etc.) This function sends the output directly to the HNP for processing as part of a job-specific output channel. It supports all the same outputs as opal_output() (syslog, file, stdout, stderr), but for stdout/stderr, the output is sent to the HNP for processing and output. More on this below. * orte_show_help(): This function is a drop-in-replacement for opal_show_help(), with two differences in functionality: 1. the rendered text help message output is sent to the HNP for display (rather than outputting directly into the process' stderr stream) 1. the HNP detects duplicate help messages and does not display them (so that you don't see the same error message N times, once from each of your N MPI processes); instead, it counts "new" instances of the help message and displays a message every ~5 seconds when there are new ones ("I got X new copies of the help message...") opal_show_help and opal_output still exist, but they only output in the current process. The intent for the new orte_* functions is that they can apply job-level intelligence to the output. As such, we recommend that all new ORTE and OMPI code use the new orte_* functions, not the opal_* functions. === New code === For ORTE and OMPI programmers, here's what you need to do differently in new code: * Do not include opal/util/show_help.h or opal/util/output.h. Instead, include orte/util/output.h (this one header file has declarations for both the orte_output() series of functions and orte_show_help()). 
* Effectively s/opal_output/orte_output/gi throughout your code. Note that orte_output_open() takes a slightly different argument list (as a way to pass data to the filtering stream -- see below), so if you explicitly call opal_output_open(), you'll need to slightly adapt to the new signature of orte_output_open(). * Literally s/opal_show_help/orte_show_help/. The function signature is identical. === Notes === * orte_output'ing to stream 0 will do similar to what opal_output'ing did, so leaving a hard-coded "0" as the first argument is safe. * For systems that do not use ORTE's RML or the HNP, the effect of orte_output_* and orte_show_help will be identical to their opal counterparts (the additional information passed to orte_output_open() will be lost!). Indeed, the orte_* functions simply become trivial wrappers to their opal_* counterparts. Note that we have not tested this; the code is simple but it is quite possible that we mucked something up. = Filter Framework = Messages sent via the new orte_* functions described above and messages output via the IOF on the HNP will now optionally be passed through a new "filter" framework before being output to stdout/stderr. The "filter" OPAL MCA framework is intended to allow preprocessing of messages before they are sent to their final destinations. The first component that was written in the filter framework was to create an XML stream, segregating all the messages into different XML tags, etc. This will allow 3rd party tools to read the stdout/stderr from the HNP and be able to know exactly what each text message is (e.g., a help message, another OMPI infrastructure message, stdout from the user process, stderr from the user process, etc.). Filtering is not active by default. Filter components must be specifically requested, such as: {{{ $ mpirun --mca filter xml ... }}} There can only be one filter component active. 
= New MCA Parameters = The new functionality described above introduces two new MCA parameters: * '''orte_base_help_aggregate''': Defaults to 1 (true), meaning that help messages will be aggregated, as described above. If set to 0, all help messages will be displayed, even if they are duplicates (i.e., the original behavior). * '''orte_base_show_output_recursions''': An MCA parameter to help debug one of the known issues, described below. It is likely that this MCA parameter will disappear before v1.3 final. = Known Issues = * The XML filter component is not complete. The current output from this component is preliminary and not real XML. A bit more work needs to be done to configure.m4 search for an appropriate XML library/link it in/use it at run time. * There are possible recursion loops in the orte_output() and orte_show_help() functions -- e.g., if RML send calls orte_output() or orte_show_help(). We have some ideas how to fix these, but figured that it was ok to commit before feature freeze with known issues. The code currently contains sub-optimal workarounds so that this will not be a problem, but it would be good to actually solve the problem rather than have hackish workarounds before v1.3 final. This commit was SVN r18434.	2008-05-13 20:00:55 +00:00
Jon Mason	125eb5a2ed	Convert from the Linux ifaddrs to the OMPI ifaddrs, which should unbreak Solaris. This commit was SVN r18433.	2008-05-13 18:34:22 +00:00
Jeff Squyres	d8e5608053	Remove all retransmission code; the IBCM kernel module handles all of that for us. This commit was SVN r18432.	2008-05-13 16:10:34 +00:00
Jon Mason	74bf1ae25f	Fix compiler warnings This commit was SVN r18431.	2008-05-13 16:01:58 +00:00
Jon Mason	4ead9442b5	Add in IDs for all Chelsio iWARP capable adapters This commit was SVN r18428.	2008-05-12 21:59:03 +00:00
Jeff Squyres	6b26895ad4	A little style update -- constants on the left... This commit was SVN r18426.	2008-05-12 12:05:16 +00:00
Jeff Squyres	16cde0e5fa	Fix compile error on older OFED systems This commit was SVN r18425.	2008-05-12 11:56:14 +00:00
Gleb Natapov	6844ff32ba	Return OMPI_ERR_RESOURCE_BUSY from sm->btl_send() function if there is no place in cb. This will prevent OB1 from doing early completion of small sends. This commit was SVN r18424.	2008-05-12 07:15:29 +00:00
Gleb Natapov	0827e537fa	Don't include rdma/rdma_cma.h if !OMPI_HAVE_RDMACM. This commit was SVN r18422.	2008-05-11 11:58:02 +00:00
Jon Mason	99ab66e131	RDMACM code cleanup This patch adds some much needed comments, reduces the amount of code wrapping, and rearranges and removes redundant code. This commit was SVN r18417.	2008-05-08 21:20:12 +00:00
Jon Mason	88e5f2a339	Abstract iWARP subnet ID functions (sans build break) The iWARP subnet ID determination should not be in the RDMACM cpc, as it was in the preversion, as this violates the cpc abstract that is present throughout the code. Also, this patch uses the opal_list_t data struct instead of using its own linked lists. This attempt includes iwarp.c and iwarp.h This commit was SVN r18414.	2008-05-08 14:38:14 +00:00
Jeff Squyres	60f39a30f6	Revert r18409; that commit broke the build because it forgot to add the btl_openib_iwarp.c and btl_openib_iwarp.h files. This commit was SVN r18410. The following SVN revision numbers were found above: r18409 --> open-mpi/ompi@056bbb68c8	2008-05-08 00:22:21 +00:00
Jon Mason	056bbb68c8	Abstract iWARP subnet ID functions The iWARP subnet ID determination should not be in the RDMACM cpc, as it was in the preversion, as this violates the cpc abstract that is present throughout the code. Also, this patch uses the opal_list_t data struct instead of using its own linked lists. This commit was SVN r18409.	2008-05-07 23:59:43 +00:00
Ralph Castain	7c7b9b0486	Do a little cleanup on the opal graph class and opal carto framework to conform to OMPI naming conventions and avoid potential conflict with user applications - no change in functionality, passes carto test program This commit was SVN r18407.	2008-05-07 19:33:49 +00:00
Jeff Squyres	157cea378f	* A few fixes to make IP address and port number comparisons properly * A few indenting and style fixes This commit was SVN r18405.	2008-05-07 16:56:07 +00:00
Jeff Squyres	bfae8ea828	The comment wasn't long enough; I felt the need to make it longer (and explain a little more ;-) ). This commit was SVN r18404.	2008-05-07 16:53:05 +00:00
Jeff Squyres	63abb3eb9b	Clarify a comment / fix typos. This commit was SVN r18402.	2008-05-07 14:51:36 +00:00
Jon Mason	502d164908	Create subnet IDs for iWARP. This enables subnet differentiation for iWARP devices, and rearranges initialization so that the services are available when they are needed. This commit was SVN r18393.	2008-05-06 22:43:52 +00:00
Jon Mason	9c724128f8	Handle no IP Address in rdmacm more resiliently If there is no IP Address, have rdmacm log the correct error and let another cpc have a go at it. This is being done by splitting off the IP address checking logic for the modex message creation, and having it log the correct error in the error case. This commit was SVN r18392.	2008-05-06 22:31:29 +00:00
Jon Mason	46bfd42c09	Fix compile warnings in rdmacm Fix some reported compiler warnings and make the code a little prettier. This commit was SVN r18391.	2008-05-06 22:19:28 +00:00
Jon Mason	9066168cd1	Prevent iWARP qp flush errors. For iWARP, the TCP connection is tied to the QP once the QP is in RTS. And destroying the QP is thus tied to connection teardown for iWARP. This is a key distinction from IB, I think. Anyway, to destroy the connection in iWARP you must move the QP out of RTS, either into CLOSING for a nice graceful close, or to ERROR if you want to be rude. In both cases, all pending non-completed SQ and RQ WRs must be flushed. This patch ignores all flush errors reaped by the cq and removes an earlier attempt to work around this in the rdmacm cpc. This commit was SVN r18388.	2008-05-06 21:57:40 +00:00
Jeff Squyres	a06d4023b8	Oops -- missed one sys_errlist -> strerror(). This commit was SVN r18378.	2008-05-06 13:22:36 +00:00
Jeff Squyres	4154e587de	strerror() is much better. This commit was SVN r18376.	2008-05-05 21:06:07 +00:00
Jon Mason	a3bf503e01	Remove error on rdma cm If there are multiple QP's, RDMACM will not send a message if the qpnum != 0. In doing so, it will log an error unnecessarily. This removes that. This commit was SVN r18363.	2008-05-02 20:12:01 +00:00
Jon Mason	3989981578	Enable support of num_proc > num_nodes Add the logic to support using port numbers, instead of simply using the IP address of the sending node to determine which endpoint to connect. Since each process calls the cpc query function, it will generate its own port to listen on thus enabling this to work. This commit was SVN r18362.	2008-05-02 16:20:28 +00:00
Jeff Squyres	ba5615a18f	Merge in /tmp-public/cpc3 branch to trunk. oob/xoob still remains the default CPC. This commit was SVN r18356.	2008-05-02 11:52:33 +00:00
Donald Kerr	843a35094f	adding local work queue accounting This commit was SVN r18352.	2008-05-01 21:01:51 +00:00
George Bosilca	a69ac964df	Allow any order in the list of Elan vpid. This commit was SVN r18350.	2008-05-01 20:32:03 +00:00
Pavel Shamis	61cc8843bf	The r17940 broke the XRC code. The endpoint may be appended to list during XOOB connection bring up. This commit was SVN r18328. The following SVN revision numbers were found above: r17940 --> open-mpi/ompi@ebfdd133f5	2008-04-29 13:22:40 +00:00
Brad Penoff	c699236be2	updating SCTP BTL to configure properly with FreeBSD 7 This commit was SVN r18324.	2008-04-28 04:19:10 +00:00
Adrian Knoth	c53d3c3c22	reverted r18169,r18170 due to connection reset by peer on odin/sif This commit was SVN r18255. The following SVN revision numbers were found above: r18169 --> open-mpi/ompi@20473bfda2 r18170 --> open-mpi/ompi@d34dfbe12c	2008-04-23 15:26:15 +00:00
Jeff Squyres	c40740947f	Fix minor spelling error. This commit was SVN r18229.	2008-04-22 13:11:50 +00:00
Galen Shipman	27c425b304	make portals level ack's optional (require ACK by default) This commit was SVN r18228.	2008-04-21 22:22:18 +00:00
Ralph Castain	fa082cafa9	Shift the architecture calculation from the ompi/datatype engine to the opal/util area. This allows us to compute the architecture earlier in the launch and communicate it outside of the modex. Note: this is an early preliminary step in the movement of portions of the datatype engine to the opal layer. This commit was SVN r18198.	2008-04-17 20:43:56 +00:00
Adrian Knoth	d34dfbe12c	fixed misleading comment. This commit was SVN r18170.	2008-04-16 11:26:15 +00:00
Adrian Knoth	20473bfda2	on incoming connections, compare with every possible source address. Rationale (taken from the code): /* This is PITA. We never know which source address an * incoming/outgoing packet will have, so even with * btl_tcp_if_include/exclude on the remote end, we * might get a different source address. * * If this address isn't included in btl_proc->proc_addrs, * we would erroneously drop the connection */ merge -r18165:18167 to the trunk. This commit was SVN r18169. The following SVN revisions from the original message are invalid or inconsistent and therefore were not cross-referenced: r18165 r18167	2008-04-16 11:24:09 +00:00
Adrian Knoth	e981a259bb	btl_tcp_disable_family=4 and btl_tcp_disable_family=6 are mutually exclusive, so this should result in "unreachable" when set differently between peers. This commit was SVN r18168.	2008-04-16 10:14:58 +00:00
Adrian Knoth	75c54616c7	renamed opal_sockaddr2str to opal_net_get_hostname for WANT_PEER_DUMP=1 This commit was SVN r18154.	2008-04-15 19:23:47 +00:00
Jeff Squyres	72af302360	Remove unused variable. This commit was SVN r18151.	2008-04-15 14:58:32 +00:00
Aurelien Bouteiller	0f311ed824	Make sure the function returns NULL when no elan adapter is available instead of a random value. This commit was SVN r18136.	2008-04-11 21:03:01 +00:00
Aurelien Bouteiller	20592cbcbf	Fixes a warning about mallocing 0 bytes when no elan adapter is available. This commit was SVN r18135.	2008-04-11 20:59:12 +00:00


1160 Commits