1. The code that looks at btl_tcp_if_exclude before doing a
modex_send uses strcmp rather than strncmp. That means that
"lo0" gets sent even though "lo" is excluded.
2. The code that determines whether a particular local TCP
interface can connect to a particular remote interface doesn't
check for loopback interfaces. With this fix, users can now
enable "lo" and be assured that it will only be used for intra-
node communication.
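The prefix-matching behavior described in item 1 can be sketched as follows. This is a minimal illustration, not the Open MPI source: the helper name `if_is_excluded` and the comma-separated list format are assumptions made for the example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not an Open MPI function): returns 1 if if_name
 * starts with any entry of a comma-separated exclude list, else 0.
 * Using strncmp() bounded by the entry length makes "lo" match "lo0". */
static int if_is_excluded(const char *if_name, const char *exclude_list)
{
    assert(if_name != NULL && exclude_list != NULL);

    const char *p = exclude_list;
    while (*p != '\0') {
        size_t len = strcspn(p, ",");   /* length of the current entry */
        if (len > 0 && strncmp(if_name, p, len) == 0) {
            return 1;                   /* entry is a prefix of if_name */
        }
        p += len;
        if (*p == ',') {
            p++;                        /* skip the separator */
        }
    }
    return 0;
}
```

With an exact strcmp() comparison, `if_is_excluded("lo0", "lo")` would report 0; with the prefix comparison above it reports 1. The trade-off is that a short entry also excludes every interface whose name begins with it.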
This commit was SVN r22762.
for printing size_t use "%lu" and cast to (unsigned long).
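A small sketch of that convention, assuming a hypothetical helper named `format_size` (C99 also offers "%zu", but the cast works with older C libraries as well):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes a size_t into buf in decimal.
 * size_t has no portable pre-C99 conversion specifier, so the value is
 * cast to unsigned long and printed with "%lu". */
static int format_size(char *buf, size_t buflen, size_t value)
{
    assert(buf != NULL);
    return snprintf(buf, buflen, "%lu", (unsigned long) value);
}
```

On platforms where size_t is wider than unsigned long, the cast truncates very large values, which is usually acceptable for diagnostic output.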
This commit was SVN r21238.
The following SVN revision numbers were found above:
r21234 --> open-mpi/ompi@22b6177fb9
parameters on the connecting side also. Also move define of IF_NAMESIZE
into if.h file. And lastly, add one verbose debug message which may be
useful if we run into other issues like this.
This commit fixes trac:1573.
This commit was SVN r19932.
The following Trac tickets were found above:
Ticket 1573 --> https://svn.open-mpi.org/trac/ompi/ticket/1573
After much work by Jeff and myself, and quite a lot of discussion, it has become clear that we simply cannot resolve the infinite loops caused by RML-involved subsystems calling orte_output. The original rationale for the change to orte_output has also been reduced by shifting the output of XML-formatted vs human readable messages to an alternative approach.
I have globally replaced the orte_output/ORTE_OUTPUT calls in the code base, as well as the corresponding .h file name. I have test compiled and run this on the various environments within my reach, so hopefully this will prove minimally disruptive.
This commit was SVN r18619.
such, the commit message back to the master SVN repository is fairly
long.
= ORTE Job-Level Output Messages =
Add two new interfaces that should be used for all new code throughout
the ORTE and OMPI layers (we have already done the search-and-replace on
the existing ORTE / OMPI layers):
* orte_output(): (and corresponding friends ORTE_OUTPUT,
orte_output_verbose, etc.) This function sends the output directly
to the HNP for processing as part of a job-specific output
channel. It supports all the same outputs as opal_output()
(syslog, file, stdout, stderr), but for stdout/stderr, the output
is sent to the HNP for processing and output. More on this below.
* orte_show_help(): This function is a drop-in-replacement for
opal_show_help(), with two differences in functionality:
1. the rendered text help message output is sent to the HNP for
display (rather than outputting directly into the process' stderr
stream)
1. the HNP detects duplicate help messages and does not display them
(so that you don't see the same error message N times, once from
each of your N MPI processes); instead, it counts "new" instances
of the help message and displays a message every ~5 seconds when
there are new ones ("I got X new copies of the help message...")
opal_show_help and opal_output still exist, but they only output in
the current process. The intent for the new orte_* functions is that
they can apply job-level intelligence to the output. As such, we
recommend that all new ORTE and OMPI code use the new orte_*
functions, not the opal_* functions.
=== New code ===
For ORTE and OMPI programmers, here's what you need to do differently
in new code:
* Do not include opal/util/show_help.h or opal/util/output.h.
Instead, include orte/util/output.h (this one header file has
declarations for both the orte_output() series of functions and
orte_show_help()).
* Effectively s/opal_output/orte_output/gi throughout your code.
Note that orte_output_open() takes a slightly different argument
list (as a way to pass data to the filtering stream -- see below),
so if you explicitly call opal_output_open(), you'll need to
slightly adapt to the new signature of orte_output_open().
* Literally s/opal_show_help/orte_show_help/. The function signature
is identical.
=== Notes ===
* orte_output'ing to stream 0 will behave much like
opal_output'ing did, so leaving a hard-coded "0" as the first
argument is safe.
* For systems that do not use ORTE's RML or the HNP, the effect of
orte_output_* and orte_show_help will be identical to their opal
counterparts (the additional information passed to
orte_output_open() will be lost!). Indeed, the orte_* functions
simply become trivial wrappers to their opal_* counterparts. Note
that we have not tested this; the code is simple but it is quite
possible that we mucked something up.
= Filter Framework =
Messages sent via the new orte_* functions described above and
messages output via the IOF on the HNP will now optionally be passed
through a new "filter" framework before being output to
stdout/stderr. The "filter" OPAL MCA framework is intended to allow
preprocessing of messages before they are sent to their final
destinations. The first component that was written in the filter
framework was to create an XML stream, segregating all the messages
into different XML tags, etc. This will allow 3rd party tools to read
the stdout/stderr from the HNP and be able to know exactly what each
text message is (e.g., a help message, another OMPI infrastructure
message, stdout from the user process, stderr from the user process,
etc.).
Filtering is not active by default. Filter components must be
specifically requested, such as:
{{{
$ mpirun --mca filter xml ...
}}}
There can only be one filter component active.
= New MCA Parameters =
The new functionality described above introduces two new MCA
parameters:
* '''orte_base_help_aggregate''': Defaults to 1 (true), meaning that
help messages will be aggregated, as described above. If set to 0,
all help messages will be displayed, even if they are duplicates
(i.e., the original behavior).
* '''orte_base_show_output_recursions''': An MCA parameter to help
debug one of the known issues, described below. It is likely that
this MCA parameter will disappear before v1.3 final.
= Known Issues =
* The XML filter component is not complete. The current output from
this component is preliminary and not real XML. A bit more work
needs to be done in configure.m4 to search for an appropriate XML
library, link it in, and use it at run time.
* There are possible recursion loops in the orte_output() and
orte_show_help() functions -- e.g., if RML send calls orte_output()
or orte_show_help(). We have some ideas how to fix these, but
figured that it was ok to commit before feature freeze with known
issues. The code currently contains sub-optimal workarounds so
that this will not be a problem, but it would be good to actually
solve the problem rather than have hackish workarounds before v1.3 final.
This commit was SVN r18434.
Rationale (taken from the code):
/* This is PITA. We never know which source address an
* incoming/outgoing packet will have, so even with
* btl_tcp_if_include/exclude on the remote end, we
* might get a different source address.
*
* If this address isn't included in btl_proc->proc_addrs,
* we would erroneously drop the connection
*/
merge -r18165:18167 to the trunk.
This commit was SVN r18169.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r18165
r18167
We loop over all peer addresses and accept when one of them matches.
Note that this might break functionality: mca_btl_tcp_proc_insert now
always inserts the same endpoint. (is the lack of endpoints the problem?
should there be one for every remote address?)
Re #1206
This commit was SVN r17307.
- the registration array is now global instead of one by BTL.
- each framework has to declare the entries it reserves in the registration array. Then
it has to define the internal way of sharing (or not) these entries between all
components. As an example, the PML will not share as there is only one active PML
at any moment, while the BTLs will have to. The tag is 8 bits long, the first 3
are reserved for the framework while the remaining 5 are used internally by each
framework.
- The registration function is optional. If a BTL does not provide such a function,
nothing happens. However, in the case where such a function is provided in the BTL
structure, it will be called by the BML when a tag is registered.
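The 3 + 5 bit tag layout described above can be sketched like this. The macro and function names are hypothetical, and the sketch assumes that "the first 3" bits means the three most significant bits; the actual definitions in the tree may differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: | framework (3 bits) | internal (5 bits) | */
#define TAG_FRAMEWORK_BITS 3
#define TAG_INTERNAL_BITS  5
#define TAG_INTERNAL_MASK  ((1u << TAG_INTERNAL_BITS) - 1u)   /* 0x1f */

/* Combine a framework id (0..7) and a framework-private id (0..31). */
static uint8_t tag_make(unsigned framework, unsigned internal)
{
    assert(framework < (1u << TAG_FRAMEWORK_BITS));
    assert(internal <= TAG_INTERNAL_MASK);
    return (uint8_t) ((framework << TAG_INTERNAL_BITS) | internal);
}

/* Extract the framework id from the upper 3 bits. */
static unsigned tag_framework(uint8_t tag)
{
    return (unsigned) tag >> TAG_INTERNAL_BITS;
}

/* Extract the framework-private id from the lower 5 bits. */
static unsigned tag_internal(uint8_t tag)
{
    return (unsigned) tag & TAG_INTERNAL_MASK;
}
```

Under this assumed layout, each framework gets 32 private tags and a single global array of 256 entries can be indexed directly by the tag.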
Now, it's time for the second step... Converting OB1 from a switch based PML to an
active message one.
This commit was SVN r17140.
than just the PML/BTLs these days. Also clean up the code so that it
handles the situation where not all nodes register information for a given
node (rather than just spinning until that node sends information, like
we do today).
Includes r15234 and r15265 from the /tmp/bwb-modex branch.
This commit was SVN r15310.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r15234
r15265
* Move ipv6comat.h code into opal_config_bottom.h and change into some
more intelligent testing of structures
* Change opal's if interface to use sockaddr instead of sockaddr_storage,
as the RFCs suggest we do
* Move the networking code in opal that isn't directly related to if
detection into net.h
* Add a quick function to get the port out of either a sockaddr_in
or sockaddr_in6, saving a bunch of code in the oob.
* Update TCP oob and btl with new interface
This commit was SVN r14679.
during the IPv6 patch. The most important is the multi BTL support. There
was a quite interesting bug. Instead of setting up the multiple connections
over different physical devices, based on the time when these connections
were created most of the time they were all using the same physical network.
Which, of course, was not the intended goal, as we top out at the maximum
bandwidth available over one device instead of gathering all available
bandwidth from all devices.
Second, the IPv6 RFC suggests using sockaddr_storage as a holder for the
IP information, but use a sockaddr* when we pass it to functions. This is
only partially corrected by this patch.
Some other minor cleanups.
This commit was SVN r14544.
- make opal_sockaddr2str() take a sockaddr_storage instead of a sockaddr_in6
so that it works for IPv4 and IPv6 addresses, and remove a whole bunch
of #ifs in the OOB code.
- Fix a compiler warning in the TCP BTL due to run-time determined
array size by making it a dynamically allocated array.
- Fix the unpacking code of IPv4 addresses when using IPv6 support, so
that the address is in the correct location (instead of in an IPv6
structure, use an IPv4 structure). Refs trac:1005.
This commit was SVN r14514.
The following Trac tickets were found above:
Ticket 1005 --> https://svn.open-mpi.org/trac/ompi/ticket/1005
dereference through it. It is legal for endpoint_addr to be NULL in the
destructor because if btl_tcp_add_procs() -> btl_tcp_proc_insert()
returns UNREACH, then endpoint_addr will be NULL and we'll OBJ_RELEASE
it.
This commit was SVN r9940.
- moved hton64 and ntoh64 from the bunch of places it had been copied
into one header file
- properly set and use the btl_tcp's nbo option to put things in
network byte order on the wire if both sides don't have the same
endianness
- Put the OB1 PML's headers (with a couple exceptions I need to discuss
with Tim) in network byte order on the wire if both sides don't have
the same endianness
- since it was needed for the TCP BTL, move the orte_process_name_t
HTON and NTOH macros from the TCP OOB to ns_types.h
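For reference, a 64-bit byte-order conversion of the kind mentioned above can be built from htonl(). This is a sketch under assumed names (`hton64_sketch`, `ntoh64_sketch`); it is not the code from the shared header.

```c
#include <arpa/inet.h>
#include <stdint.h>

/* Hypothetical 64-bit host-to-network conversion built on htonl(). */
static uint64_t hton64_sketch(uint64_t host)
{
    /* htonl(1) == 1 only when host order is already network order. */
    if (htonl(1) == 1) {
        return host;
    }
    uint32_t hi = (uint32_t) (host >> 32);
    uint32_t lo = (uint32_t) (host & 0xffffffffu);
    /* Swap the two halves and byte-swap each half. */
    return ((uint64_t) htonl(lo) << 32) | (uint64_t) htonl(hi);
}

/* The conversion is its own inverse, so ntoh64 can reuse it. */
static uint64_t ntoh64_sketch(uint64_t net)
{
    return hton64_sketch(net);
}
```

Keeping a single definition in one header avoids the copies drifting apart, which is the motivation given in the first item of the list.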
This commit was SVN r9145.
when running an MPI job spanning a node that has two TCP NICs and a
node that has one TCP NIC. Previously, for the 2 NIC/module process,
we would return the first peer IP address if we couldn't find a subnet
match with any of the peer's published IP addresses -- this was to
support running OMPI across subnet boundaries. Changed the behavior
to only do that if the IP address we're trying to match is
public (i.e., not 10.x.y.z, 192.168.x.y, or 172.16.x.y) *and* any of
the remote peer's addresses are public (working on the assumption that
if we both have public addresses, they're routable to each other).
This definitely will not work in all scenarios, such as when we go to
WAN kinds of executions, and will need to be revisited at that time.
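The public-address rule described above can be sketched as follows. The function names are hypothetical, and the sketch treats the whole RFC 1918 block 172.16.0.0/12 as private, which is an assumption that goes slightly beyond the "172.16.x.y" wording of this message. Loopback and link-local addresses are not handled.

```c
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: 1 if addr is outside the RFC 1918 private ranges. */
static int ipv4_is_public(const struct in_addr *addr)
{
    assert(addr != NULL);
    uint32_t ip = ntohl(addr->s_addr);

    if ((ip & 0xff000000u) == 0x0a000000u) return 0;   /* 10.0.0.0/8     */
    if ((ip & 0xfff00000u) == 0xac100000u) return 0;   /* 172.16.0.0/12  */
    if ((ip & 0xffff0000u) == 0xc0a80000u) return 0;   /* 192.168.0.0/16 */
    return 1;
}

/* Hypothetical decision: when no subnet match was found, fall back to a
 * peer address only if the local address is public *and* at least one
 * of the peer's addresses is public. */
static int may_fall_back(const struct in_addr *local,
                         const struct in_addr *peers, size_t npeers)
{
    if (!ipv4_is_public(local)) {
        return 0;
    }
    for (size_t i = 0; i < npeers; i++) {
        if (ipv4_is_public(&peers[i])) {
            return 1;
        }
    }
    return 0;
}
```

With this rule, a private address on the 2-NIC node is never paired with an unrelated peer address just because no subnet matched.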
This commit was SVN r9119.
Change all the places where they are used to fit the new name.
Remove the code to check the remote arch from the PML. We will have a GPR mechanism
in ompi_mpi_initialize to do that.
This commit was SVN r6750.