Often, orte/util/show_help.h is included although none of its functionality
is required -- most often what is actually needed is opal_output.h or
orte/mca/rml/rml_types.h.
Please see orte_show_help_replacement.sh, committed next.
- Local compilation (Linux/x86_64) w/ -Wimplicit-function-declaration
actually showed two *missing* #include "orte/util/show_help.h"
in orte/mca/odls/base/odls_base_default_fns.c and
in orte/tools/orte-top/orte-top.c
Manually added these.
Let's let MTT have the last word.
This commit was SVN r20557.
by r20496 for the sm BTL, openib BTL on iWarp, and the sm & sm2 coll modules.
This commit was SVN r20515.
The following SVN revision numbers were found above:
r20496 --> open-mpi/ompi@4cdf91a8d4
The prior ompi_proc_t structure had a uint8_t flag field in it, where only one
bit was used to flag that a proc was "local". In that context, "local" was
constrained to mean "local to this node".
This commit provides a greater degree of granularity on the term "local", to include tests
to see if the proc is on the same socket, PC board, node, switch, CU (computing
unit), and cluster.
Add #define's to designate which bits stand for which local condition. This
was added to the OPAL layer to avoid conflicting with the proposed movement of
the BTLs. To make it easier to use, a set of macros has been defined - e.g.,
OPAL_PROC_ON_LOCAL_SOCKET - that test the specific bit. These can be used in
the code base to clearly indicate which sense of locality is being considered.
All locations in the code base that looked at the current proc_t field have
been changed to use the new macros.
Also modify the orte_ess modules so that each returns a uint8_t (to match the
ompi_proc_t field) that contains a complete description of the locality of this
proc. Obviously, not all environments will be capable of providing such detailed
info. Thus, getting a "false" from a test for "on_local_socket" may simply
indicate a lack of knowledge.
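
The bit-flag scheme described above can be sketched as follows. This is only an illustration: the EX_ names and the bit values are assumptions modeled on the OPAL_PROC_ON_LOCAL_SOCKET macro mentioned in this message, not the definitions that were actually committed.

```c
#include <stdint.h>

/* Hypothetical bit assignments, one bit per sense of "local".
 * The real values are defined in the OPAL layer and may differ. */
#define EX_LOC_CLUSTER 0x01u
#define EX_LOC_CU      0x02u
#define EX_LOC_SWITCH  0x04u
#define EX_LOC_NODE    0x08u
#define EX_LOC_BOARD   0x10u
#define EX_LOC_SOCKET  0x20u

/* Test macros: each checks exactly one bit of the uint8_t flag field. */
#define EX_PROC_ON_LOCAL_CLUSTER(flags) (((flags) & EX_LOC_CLUSTER) != 0)
#define EX_PROC_ON_LOCAL_NODE(flags)    (((flags) & EX_LOC_NODE) != 0)
#define EX_PROC_ON_LOCAL_SOCKET(flags)  (((flags) & EX_LOC_SOCKET) != 0)

/* Example of what an ess module might return for a peer that is known
 * to share our node and cluster, but whose socket is unknown. */
uint8_t ex_locality_same_node(void)
{
    return (uint8_t) (EX_LOC_CLUSTER | EX_LOC_NODE);
}
```

With flags built this way, a cleared bit means "not known to be local in that sense", which matches the caveat above that a "false" result may simply reflect missing information.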
This commit was SVN r20496.
we remembered to use strcasecmp() every time I see this entry in the
file... (we did, but I just don't want to have to keep remembering
that ;-) )
This commit was SVN r20461.
btl_openib_connect_rdmacm_reject_causes_connect_error (yes, it's
still long -- on purpose :-) )
* Add INI file parameter rdmacm_reject_causes_connect_error
* Now only treat CONNECT_ERROR events as a REJECT if:
* It's on a connection where we were expecting a REJECT, ''and''
* The MCA parameter is true ''or'' the INI parameter for this
device is true
* Set the INI parameter to true for the NE020
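
The decision rule in the list above reduces to a single boolean expression. A minimal sketch, assuming a hypothetical helper name and plain bool arguments in place of the real endpoint, MCA, and INI state:

```c
#include <stdbool.h>

/* Hypothetical helper: decide whether a CONNECT_ERROR event should be
 * handled as if it were a REJECT.
 *   expecting_reject - this connection was one where a REJECT was expected
 *   mca_param        - value of the MCA parameter
 *   ini_param        - value of the per-device INI parameter */
bool ex_connect_error_is_reject(bool expecting_reject,
                                bool mca_param,
                                bool ini_param)
{
    return expecting_reject && (mca_param || ini_param);
}
```

Any CONNECT_ERROR for which this returns false would still be treated as a genuine error.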
This commit was SVN r20459.
versions of the NE driver will report the OUI while others will report
the PCI ID. We'll put in the Intel values when we get them (may not
be for a few more weeks).
This commit was SVN r20457.
properly at all. NetEffect's current driver (OFED 1.4.0) will return
a CONNECT_ERROR event to the initiator rather than the REJECTED event.
Doh! Additionally -- unfortunately -- NetEffect's vendor_id and
vendor_part_id are reported as 0 in OFED 1.4.0, so we can't
automatically detect these cards and work around the problem. So all
we can do is add a new MCA parameter
(btl_openib_connect_rdmacm_ignore_connect_errors -- yes, it's long on
purpose ;-) ) that says that if we get a CONNECT_ERROR, basically
treat it exactly as a REJECT for the WRONG_DIRECTION reason (which is
a "good" reject). This allows OMPI to function with NetEffect/Intel
cards on OFED 1.4.0.
Note that NetEffect has been bought by Intel; I'm waiting for
information from them to update the ini file for their new OUI/PCI
ID's and/or new vendor_part_id values.
This commit was SVN r20454.
of more than one of the btl_openib_if_include, btl_openib_if_exclude,
btl_openib_ipaddr_include, or btl_openib_ipaddr_exclude MCA parameters.
This commit was SVN r20053.
determining which IP address to use when transmitting data. Also it adds logic
to prevent usage of more than one of the btl_openib_if_include,
btl_openib_if_exclude, btl_openib_ipaddr_include, or btl_openib_ipaddr_exclude
MCA parameters.
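
The "at most one of the four parameters" check can be sketched like this. The function name and the convention that an unset parameter is NULL or an empty string are assumptions made for illustration; the actual check in the openib BTL may be structured differently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns 1 if the string parameter carries a value, 0 otherwise. */
static int ex_is_set(const char *value)
{
    return (value != NULL && value[0] != '\0') ? 1 : 0;
}

/* Hypothetical validity check: the four filter parameters are mutually
 * exclusive, so at most one of them may be set. */
bool ex_openib_filters_valid(const char *if_include,
                             const char *if_exclude,
                             const char *ipaddr_include,
                             const char *ipaddr_exclude)
{
    int count = ex_is_set(if_include) + ex_is_set(if_exclude) +
                ex_is_set(ipaddr_include) + ex_is_set(ipaddr_exclude);
    return count <= 1;
}
```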
This should complete the code modifications needed for ticket 1665.
This commit was SVN r20052.
determination of the IP subnet. The netmask was being used improperly when
determining which subnet each connection is on. Part two is the ability to
include/exclude specific subnets.
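
For reference, one conventional way to derive a subnet from an IPv4 address and a netmask is sketched below. It assumes addresses in host byte order and a prefix length (CIDR style); the function names are hypothetical and this is not the code from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Mask an IPv4 address (host byte order) down to its subnet,
 * given a prefix length such as 24 for 255.255.255.0. */
uint32_t ex_ipv4_subnet(uint32_t addr, unsigned prefix_len)
{
    uint32_t mask;

    if (prefix_len == 0) {
        mask = 0;                       /* avoid shifting by 32 bits */
    } else if (prefix_len >= 32) {
        mask = 0xFFFFFFFFu;
    } else {
        mask = 0xFFFFFFFFu << (32u - prefix_len);
    }
    return addr & mask;
}

/* Two addresses are on the same subnet when their masked values match. */
bool ex_same_subnet(uint32_t a, uint32_t b, unsigned prefix_len)
{
    return ex_ipv4_subnet(a, prefix_len) == ex_ipv4_subnet(b, prefix_len);
}
```

An include/exclude list of subnets can then be matched by comparing each connection's masked address against the masked list entries.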
This patch fixes ticket #1665
This commit was SVN r20016.
* If max_inline_data == -1 perform runtime detection
* If max_inline_data >= 0 use the value provided
* If the user does not explicitly set this via command line, use the value from INI file
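
The three rules above can be read as a small precedence function. A rough sketch, assuming a hypothetical helper that receives the already-parsed user value, the INI value, and the result of runtime detection (the real code obtains these from the MCA parameter system, the INI file, and the device):

```c
#include <stdbool.h>

/* Hypothetical resolution of the effective max_inline_data value.
 *   user_set    - true if the user set the MCA parameter explicitly
 *   user_value  - the user's value (only meaningful if user_set)
 *   ini_value   - value from the INI file (assumed -1 if none is listed)
 *   detected    - value found by runtime detection */
int ex_resolve_max_inline_data(bool user_set, int user_value,
                               int ini_value, int detected)
{
    /* An explicit user setting wins; otherwise fall back to the INI file. */
    int value = user_set ? user_value : ini_value;

    /* -1 requests runtime detection; anything >= 0 is used as given. */
    return (value == -1) ? detected : value;
}
```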
This commit fixes trac:1662
This commit was SVN r19995.
The following Trac tickets were found above:
Ticket 1662 --> https://svn.open-mpi.org/trac/ompi/ticket/1662
This will sit in trunk for a few days - would like to actually see some errors reported to syslog before moving the code to 1.3
This commit was SVN r19986.
<infiniband/driver.h> cannot be included because it will fail to
compile. So per advice from Roland, in that case, just manually
include the one ibv_*() prototype that we need. Use an undocumented
feature of AC_CHECK_HEADERS to check for <infiniband/driver.h> that I
was told about on the autoconf mailing list (see the comment for more
details).
This commit was SVN r19857.
THREAD_MULTIPLE. There's a new (hidden) MCA parameter to re-enable
these BTLs in the presence of THREAD_MULTIPLE:
btl_base_thread_multiple_override. This MCA parameter should ''only''
be used by developers who are working on making their BTLs thread safe;
it should ''not'' be used by end-users!
This commit was SVN r19826.
The following Trac tickets were found above:
Ticket 1588 --> https://svn.open-mpi.org/trac/ompi/ticket/1588
There is still a problem with OpenIB and threads (external to C/R functionality). It has been reported in Ticket #1539.
Additionally:
* Fix a file cleanup bug in CRS Base.
* Fix a possible deadlock in the TCP ft_event function
* Add a mca_base_param_deregister() function to MCA base
* Add whole process checkpoint timers
* Add support for BTL: OpenIB, MX, Shared Memory
* Add support for Mpool: rdma, sm
* Sundry bounds checking and cleanup in some scattered functions
This commit was SVN r19756.
I ran IMB exchange on two QS22 machines with r19674 and it got stuck after 256 or 512 bytes every time.
After applying r19717 the test passed, so I guess this is an essential patch.
This commit was SVN r19752.
The following SVN revision numbers were found above:
r19674 --> open-mpi/ompi@15c47a2473
r19717 --> open-mpi/ompi@0a765cd788
following names are all new for v1.3, and therefore haven't been
officially released yet:
* btl_openib_of_cq_size
* btl_openib_of_max_inline_data
* btl_openib_of_pkey
* btl_openib_of_psn
* btl_openib_of_mtu
The "_of_" (for OpenFabrics) in there is redundant. It used to be
"_ib_", indicating that these values are pretty much passed directly
to the verbs stack. But I think the "openib" in the name implies this
already; having "_of_" in there just seems redundant, makes the name
longer, and seems redundant. It's also redundant.
So I took those "_of_"'s out of the MCA names. The old (v1.2) names
are still valid (but deprecated), such as btl_openib_ib_cq_size.
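
To make the renaming concrete, the sketch below maps an "_ib_" or "_of_" style name onto the shortened form. It is purely illustrative: the helper is hypothetical, and the MCA parameter system handles deprecated names through its own mechanism rather than by string rewriting like this.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: rewrite "btl_openib_ib_*" or "btl_openib_of_*"
 * to "btl_openib_*". Other names are copied unchanged.
 * Returns 0 on success, -1 if the output buffer is too small. */
int ex_canonical_param_name(const char *name, char *out, size_t out_len)
{
    static const char *const old_prefixes[] = {
        "btl_openib_ib_", "btl_openib_of_"
    };
    static const char new_prefix[] = "btl_openib_";
    const char *suffix = NULL;
    int written;

    for (size_t i = 0; i < sizeof(old_prefixes) / sizeof(old_prefixes[0]); ++i) {
        size_t len = strlen(old_prefixes[i]);
        if (strncmp(name, old_prefixes[i], len) == 0) {
            suffix = name + len;    /* part after the old prefix */
            break;
        }
    }

    if (suffix != NULL) {
        written = snprintf(out, out_len, "%s%s", new_prefix, suffix);
    } else {
        written = snprintf(out, out_len, "%s", name);
    }
    return (written >= 0 && (size_t) written < out_len) ? 0 : -1;
}
```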
This commit was SVN r19718.
* Change name: mca_btl_openib_of_pkey_value -> mca_btl_openib_of_pkey
(since now there's no index, the "_value" suffix is somewhat
superfluous)
* Put in a better help message for the _pkey MCA param (to agree with
the new help message in v1.2.8)
This commit was SVN r19716.
Commit from a long-standing Mercurial tree that ended up incorporating a lot of things:
* A few fixes for CPC interface changes in all the CPCs
* Attempts (but not yet finished) to fix shutdown problems in the IB CM CPC
* #1319: add CTS support (i.e., initiator guarantees to send first message; automatically activated for iWARP over the RDMA CM CPC)
* Some variable and function renamings to make this be generic (e.g., alloc_credit_frag became alloc_control_frag)
* CPCs no longer post receive buffers; they only post a single receive buffer for the CTS if they use CTS. Instead, the main BTL now posts the main sets of receive buffers.
* CPCs allocate a CTS buffer only if they're about to make a connection
* RDMA CM improvements:
* Use threaded mode openib fd monitoring to wait for RDMA CM events
* Synchronize endpoint finalization and disconnection between main thread and service thread to avoid/fix some race conditions
* Converted several structs to be OBJs so that we can use reference counting to know when to invoke destructors
* Make some new OBJ's have opal_list_item_t's as their base, thereby eliminating the need for the local list_item_t type
* Renamed many variables to be internally consistent
* Centralize the decision in an inline function as to whether this process or the remote process is supposed to be the initiator
* Add oodles of OPAL_OUTPUT statements for debugging (hard-wired to output stream -1; to be activated by developers if they want/need them)
* Use rdma_create_qp() instead of ibv_create_qp()
* openib fd monitoring improvements:
* Renamed a bunch of functions and variables to be a little more obvious as to their true function
* Use pipes to communicate between main thread and service thread
* Add ability for main thread to invoke a function back on the service thread
* Ensure to set initiator_depth and responder_resources properly, by putting max_qp_rd_atom and max_qp_init_rd_atom in the modex (see rdma_connect(3))
* Ensure to set the source IP address in rdma_resolve() to ensure that we select the correct OpenFabrics source port
* Make new MCA param: openib_btl_connect_rdmacm_resolve_timeout
* Other improvements:
* btl_openib_device_type MCA param: can be "iw" or "ib" or "all" (or "infiniband" or "iwarp")
* Somewhat improved error handling
* Bunches of spelling fixes in comments, VERBOSE, and OUTPUT statements
* Oodles of little coding style fixes
* Changed shutdown ordering of btl; the device is now an OBJ with ref counting for destruction
* Added some more show_help error messages
* Change configury to only build IBCM / RDMACM if we have threads (because we need a progress thread)
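
The fd monitoring items above mention pipes between the main thread and the service thread, and the ability to invoke a function on the service thread. The general technique can be sketched as below. All names here are hypothetical and the real implementation is more involved; the sketch only shows a command (function pointer plus context) being written to a pipe on one side and read and executed on the other.

```c
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* A command is a callback plus an opaque context pointer. */
typedef void (*ex_service_fn_t)(void *context);

typedef struct {
    ex_service_fn_t fn;
    void *context;
} ex_service_cmd_t;

/* Main-thread side: queue a function call by writing the command to the
 * pipe. The command is far smaller than PIPE_BUF, so the write is atomic. */
int ex_post_to_service(int write_fd, ex_service_fn_t fn, void *context)
{
    ex_service_cmd_t cmd = { fn, context };
    ssize_t n = write(write_fd, &cmd, sizeof(cmd));
    return (n == (ssize_t) sizeof(cmd)) ? 0 : -1;
}

/* Service-thread side: read one command from the pipe and run it.
 * In a real service loop this would be called when the read end of the
 * pipe becomes readable. */
int ex_service_handle_one(int read_fd)
{
    ex_service_cmd_t cmd;
    ssize_t n = read(read_fd, &cmd, sizeof(cmd));
    if (n != (ssize_t) sizeof(cmd) || cmd.fn == NULL) {
        return -1;
    }
    cmd.fn(cmd.context);
    return 0;
}

/* Sample callback used for illustration: increments an int counter. */
void ex_increment_counter(void *context)
{
    ++*(int *) context;
}
```

Because both ends only exchange a small fixed-size record, the service thread can watch the read end of the pipe alongside its other file descriptors.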
This commit was SVN r19686.
The following Trac tickets were found above:
Ticket 1210 --> https://svn.open-mpi.org/trac/ompi/ticket/1210