r10877:
Add a warm-up connection option. Of course, this only warms up the first
eager BTL, but this should be adequate for now.
r10881:
Consulted with Galen and did a few things:
- Fix the algorithm to actually make the connections that we want
- Rename the MCA param to mpi_preconnect_all
- Cleanup the code a bit:
- move the logic to a separate .c file
- check return codes properly
This commit was SVN r11114.
The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
r10877
r10881
did pre-libevent update. The problem is that the behavior of
OPAL_EVLOOP_ONCE was changed by the OMPI team, which then broke things
during the update, so it had to be reverted to the old meaning of
loop until one event occurs. OPAL_EVLOOP_ONELOOP will go through the
event loop once (like EVLOOP_NONBLOCK) but will pause in the event
library for a bit (like EVLOOP_ONCE).
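
The difference between the two loop modes can be sketched with a toy model.
This is only an illustration under stated assumptions: the names
toy_event_loop, toy_poll, TOY_EVLOOP_ONCE and TOY_EVLOOP_ONELOOP are
hypothetical and are not the OPAL or libevent API, and the "pause in the
event library" is reduced to a comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flags modelled on the description above; not the real OPAL values. */
enum toy_loop_mode {
    TOY_EVLOOP_ONCE,    /* keep making passes until at least one event fires */
    TOY_EVLOOP_ONELOOP  /* make exactly one pass, then return */
};

/* Toy event source: a single event becomes ready on pass ready_at_pass (>= 1). */
struct toy_source {
    int ready_at_pass;  /* 1-based pass on which the event is ready */
    int passes;         /* passes made so far */
};

/* One pass over the event source; returns the number of events handled (0 or 1).
   A real library would briefly block here, e.g. in poll() with a timeout. */
static int toy_poll(struct toy_source *src)
{
    src->passes++;
    return (src->passes == src->ready_at_pass) ? 1 : 0;
}

/* Runs the toy loop in the given mode and returns the number of events handled.
   In ONCE mode this only terminates if ready_at_pass is at least 1. */
int toy_event_loop(struct toy_source *src, enum toy_loop_mode mode)
{
    assert(src != NULL);
    int handled = 0;
    do {
        handled += toy_poll(src);
    } while (mode == TOY_EVLOOP_ONCE && handled == 0);
    return handled;
}
```

With an event that becomes ready on the third pass, the ONCE mode makes
three passes and handles one event, while the ONELOOP mode returns after a
single pass with nothing handled.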
fixes trac:234
This commit was SVN r11081.
The following Trac tickets were found above:
Ticket 234 --> https://svn.open-mpi.org/trac/ompi/ticket/234
users mailing list:
http://www.open-mpi.org/community/lists/users/2006/07/1680.php
Warning: this log message is not for the weak. Read at your own
risk.
The problem was that we had several variables in Fortran common blocks
of various types, but their C counterparts were all of a type
equivalent to a fortran double complex. This didn't seem to matter
for the compilers that we tested, but we never tested static builds
(which is where this problem seems to occur, at least with the Intel
compiler: the linker complains that the variable in the common block
in the user's .o file was of one size/alignment but the one in the C
library was a different size/alignment).
So this patch fixes the sizes/types of the Fortran common block
variables and their corresponding C instantiations to be of the same
sizes/types.
But wait, there's more.
We recently introduced a fix for the OSX linker where some C versions
of the fortran common block variables (e.g.,
_ompi_fortran_status_ignore) were not being found when linking
ompi_info (!). Further research shows that the code path for
ompi_info to require ompi_fortran_status_ignore is, unfortunately,
necessary (a quirk of how various components pull in different
portions of the code base -- nothing in ompi_info itself requires
fortran or MPI knowledge, of course).
Hence, the real problem was that there was no code path from ompi_info
to the portion of the code base where the C globals corresponding to
the Fortran common block variables were instantiated. This is because
the OSX linker does not automatically pull in .o files that only
contain uninitialized global variables; the OSX linker typically only
pulls in a .o file from a library if it either has a function that is
used or has a global variable that is initialized (that's the short
version; lots of details and corner cases omitted). Hence, we changed
the global C variables corresponding to the fortran common blocks to
be initialized, thereby causing the OSX linker to pull them in
automatically -- problem solved. At the same time, we moved the
constants to another .c file with a function, just for good measure.
However, this didn't really solve the problem:
1. The function in the file with the C versions of the fortran common
block variables (ompi/mpi/f77/test_constants_f.c) did not have a
code path that was reachable from ompi_info, so the only reason
that the constants were found (on OSX) was because they were
initialized in the global scope (i.e., causing the OSX linker to
pull in that .o file).
2. Initializing these variables in the global scope causes problems for
some linkers where -- once all the size/type problems mentioned
above were fixed -- the alignments of fortran common blocks and C
global variables do not match (even though the types of the Fortran
and C variables match -- wow!). Hence, initializing the C
variables would not necessarily match the alignment of what Fortran
expected, and the linker would issue a warning (i.e., the alignment
warnings referenced in the original post).
The solution is two-fold:
1. Move the Fortran variables from test_constants_f.c to
ompi/mpi/runtime/ompi_mpi_init.c where there are other global
constants that *are* initialized (that had nothing to do with
fortran, so the alignment issues described above are not a factor),
and therefore all linkers (including the OSX linker) will pull in
this .o file and find all the symbols that it needs.
2. Do not initialize the C variables corresponding to the Fortran
common blocks in the global scope. Indeed, never initialize them
at all (because we never need their *values* - we only check for
their *locations*). Since nothing is ever written to these
variables (particularly in the global scope), the linker does not
see any alignment differences during initialization, but does make
both the C and Fortran variables have the same addresses (this
method has been working in LAM/MPI for over a decade).
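
As a rough illustration of point 2, the C side can be as small as an
uninitialized global whose address is used as a sentinel. This is a
minimal sketch, not the Open MPI source: the names
toy_fortran_status_ignore and toy_is_status_ignore and the array length 6
are hypothetical, chosen only for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a C global that mirrors a Fortran common block.
   It deliberately has no initializer: its value is never read, only its
   address is compared. Whether the symbol is emitted as a "common" symbol
   that the linker can merge with the Fortran block depends on the compiler
   and its flags (GCC 10 and later default to -fno-common, for example). */
int toy_fortran_status_ignore[6];

/* Returns true if the caller passed the sentinel rather than a real status. */
bool toy_is_status_ignore(const void *status)
{
    return status == (const void *) toy_fortran_status_ignore;
}
```

Because only the location is checked, nothing ever needs to be written to
the variable, which is the property the fix above relies on.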
There were some comments here in the OMPI code base and in the LAM
code base that stated/implied that C variables corresponding to
Fortran common blocks had to have the same alignment as the Fortran
common blocks (i.e., 16). There were attempts in both code bases to
ensure that this was true. However, the attempts were wrong (in both
code bases), and I have now read enough Fortran compiler documentation
to convince myself that matching alignments is not required (indeed,
it's beyond our control). As long as C variables corresponding to
Fortran common blocks are not initialized in the global scope, the
linker will "figure it out" and adjust the alignment to whatever is
required (i.e., the greater of the alignments). Specifically (to
counter comments that no longer exist in the OMPI code base but still
exist in the LAM code base):
- there is no need to make attempts to specially align C variables
corresponding to Fortran common blocks
- the types and sizes of C variables corresponding to Fortran common
blocks should match, but do not need to be on any particular
alignment
Finally, as a side effect of this effort, I found a bunch of
inconsistencies with the intent of status/array_of_statuses
parameters. For all the functions that I modified they should be
"out" (not inout).
This commit was SVN r11057.
- move files out of toplevel include/ and etc/, moving it into the
sub-projects
- rather than including config headers with <project>/include,
have them as <project>
- require all headers to be included with a project prefix, with
the exception of the config headers ({opal,orte,ompi}_config.h,
mpi.h, and mpif.h)
This commit was SVN r8985.
complete, but stable enough that it will have no impact on general development,
so into the trunk it goes. Changes in this commit include:
- Remove the --with option for disabling MPI-2 onesided support. It
complicated code, and has no real reason for existing
- add a framework osc (OneSided Communication) for encapsulating
all the MPI-2 onesided functionality
- Modify the MPI interface functions for the MPI-2 onesided chapter
to properly call the underlying framework and do the required
error checking
- Created an osc component pt2pt, which is layered over the BML/BTL
for communication (although it also uses the PML for long message
transfers). Currently, all support functions, all communication
functions (Put, Get, Accumulate), and the Fence synchronization
function are implemented. The PSCW active synchronization
functions and Lock/Unlock passive synchronization functions are
still not implemented
This commit was SVN r8836.
Thanks to Anthony Chan for pointing this out.
Note that these will only work for the Fortran compiler that Open MPI
was configured with -- since these values are, by definition,
single-value, they cannot support all 4 values that Open MPI may
generate for the different Fortran name-mangling schemes. A lengthy
comment in ompi_mpi_init.c explains this in more detail. Added to the
README to explain this situation, as well as the forthcoming
.TRUE. Fortran fixes.
This commit was SVN r8231.
querying some of the collective components. Up to now, it just worked
somehow, but now with correct reference counting for ops in place, it
refused :-)
This commit was SVN r7866.
Here's the huge registry check-in you've all been waiting for with bated breath. The revised version sends a single message to all processes at the various stage gates, thus making the startup much more scalable. I could provide you with all the tawdry details, but won't for now - you are welcome to ask, though, and I'll merrily bore your ears to tears.
In addition, the commit contains the following:
1. set the ignore properties on ompi/debuggers and orte/mca/pls/poe
2. Added simplified subscribe and put functions to the registry's API. I have also converted all of the ompi functions that registered subscriptions to the new API, and caught their associated puts as well.
In a follow-on commit, I'll be adding support for George's hetero arch registry subscription (wanted to get this one in first).
This commit was SVN r7118.
tree.
- fix up #include's throughout the tree (yay contrib/search_replace.pl!)
- remove a few extraneous #include's
- remove orte_sys_info*() from opal_init()/opal_finalize() (it's
already in orte_init_stage1() and orte_system_finalize())
- remove dependencies in opal on orte_system_info -- util/os_path.c
and util/os_create_dirpath.c (they only used path_sep, anyway --
easily changed to #defines)
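
Replacing a runtime path_sep lookup with a compile-time constant could
look roughly like the following. This is a sketch under assumptions: the
macro TOY_PATH_SEP and the helper toy_join_path are hypothetical names and
are not the identifiers used in opal.

```c
#include <stdio.h>

/* Hypothetical compile-time separator, replacing a field that would
   otherwise be read from a system-info structure at run time. */
#ifdef _WIN32
#define TOY_PATH_SEP "\\"
#else
#define TOY_PATH_SEP "/"
#endif

/* Joins dir and file into out using the separator.
   Returns 0 on success, -1 if the result does not fit in outlen bytes. */
int toy_join_path(char *out, size_t outlen, const char *dir, const char *file)
{
    int n = snprintf(out, outlen, "%s%s%s", dir, TOY_PATH_SEP, file);
    return (n >= 0 && (size_t) n < outlen) ? 0 : -1;
}
```

With a #define, the utility code no longer needs the system-info
structure to be initialized before it can build a path.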
This commit was SVN r7059.
- Change orte_base_infrastructre to orte_infrastructre to conform with
ompi_info's needs
- Move MCA Param registration in ORTE to a centralized function that is
called first in orte_init_stage1
- Set the infrastructure flag as an argument to orte_init
- Adjust initialization functions to properly pass down the infrastructure
flag.
This commit was SVN r7053.
API is still a bit unstable and may change.
- Add a primitive "first use" component that simply has each process
"touch" the pages that they want to use, thereby [hopefully] locking
them locally to a specific processor
- Add hooks in ompi_mpi_init to enable memory affinity when processor
affinity is used.
- Add hooks in ompi_mpi_finalize to shut down memory affinity when
it was initialized during ompi_mpi_init.
- Add the right hooks in ompi_info to display maffinity components.
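
The "first use" idea mentioned above can be sketched as a loop that
writes one byte per page. This is a minimal illustration, assuming a
POSIX system and an operating system that places a page near the
processor that first touches it; toy_touch_pages is a hypothetical name,
not the component's actual function.

```c
#include <stddef.h>
#include <unistd.h>

/* Writes one byte in every page-sized stride of buf so that the OS backs each
   page while this process runs on its current processor. The buffer contents
   are overwritten, so this is meant for freshly allocated memory.
   Returns the number of strides touched. */
size_t toy_touch_pages(volatile char *buf, size_t len)
{
    long ps = sysconf(_SC_PAGESIZE);
    size_t page = (ps > 0) ? (size_t) ps : 4096;  /* fall back to a common size */
    size_t touched = 0;

    for (size_t off = 0; off < len; off += page) {
        buf[off] = 0;        /* the write forces the page to be allocated */
        touched++;
    }
    if (len > 0) {
        buf[len - 1] = 0;    /* covers a final page when buf is not page-aligned */
    }
    return touched;
}
```

Whether the pages actually stay local to that processor depends on the
operating system's placement policy, hence the "[hopefully]" above.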
This commit was SVN r7044.
that this ORTE job is the only one on the nodes involved, and if
told what processors to assign the processes to, will bind MPI
processes to specific processors.
- Convert #include's to new style
- Convert some <tab>'s to spaces
This commit was SVN r6904.
Change all the places where they are used to fit the new name.
Remove the code to check the remote arch from the PML. We will have a GPR mechanism
in ompi_mpi_initialize to do that.
This commit was SVN r6750.
is a lot more difficult than a PTL, and it can adapt its behavior to the level of threading required
by the user. In this case the behavior is the priority of the PML. Therefore this information is never
available before the init function (of the PML) is called. So I tried to keep nearly the same structure
as before, with one change. When a PML gets initialized, that does not necessarily mean it has been
selected, so it does not have to create all its internal structures (or select the PTLs and
so on) at that point. That can all be done later, once the PML knows that it has definitely been selected
(when the enable function is called with the argument set to true). Thus, when a PML is closed,
one has to check whether the PML was selected before trying to clean up the internals.
I had to change the MPI_Init function to allow the PML to be enabled before we start adding procs.
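
The init / enable / close split described above can be sketched as
follows. This is an illustration only: struct toy_pml and the toy_pml_*
functions are hypothetical and do not reproduce the real PML interface;
the point is just that expensive state is created in enable(true) and
that close checks whether the component was ever selected.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical component state; not the real PML structures. */
struct toy_pml {
    bool selected;   /* true only after enable(true) succeeded */
    int *internals;  /* expensive state, created lazily on selection */
};

/* init: the component only reports that it could run; nothing is allocated. */
int toy_pml_init(struct toy_pml *pml)
{
    pml->selected = false;
    pml->internals = NULL;
    return 0;
}

/* enable(true) means "you have been selected": build internal structures now.
   enable(false), or a repeated enable(true), is a no-op. */
int toy_pml_enable(struct toy_pml *pml, bool enable)
{
    if (!enable || pml->selected) {
        return 0;
    }
    pml->internals = calloc(16, sizeof *pml->internals);
    if (pml->internals == NULL) {
        return -1;
    }
    pml->selected = true;
    return 0;
}

/* close: clean up internals only if this component was actually selected. */
int toy_pml_close(struct toy_pml *pml)
{
    if (pml->selected) {
        free(pml->internals);
        pml->internals = NULL;
        pml->selected = false;
    }
    return 0;
}
```

A component that was initialized but never enabled can be closed safely,
because close does not assume the internals exist.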
This commit was SVN r6434.
* mpi_show_mca_params
If set to true, this turns on the dumping of all MCA parameters when MPI_INIT is called.
Only the 'rank 0' processes will print the parameters.
* mpi_show_mca_params_file
(This value is only used if the first argument is set to true) If this value is non-NULL
it specifies the file to put the dump into. This file can then be used as input to mpirun
for debugging purposes. If this value is not set (and mpi_show_mca_params is set) then
the parameters are dumped to stdout.
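
The way the two parameters interact can be sketched as a small helper
that picks the output stream. This is a hedged illustration, not the code
in this commit: toy_open_param_dump and its arguments are hypothetical
names standing in for the values of mpi_show_mca_params and
mpi_show_mca_params_file and for the process rank.

```c
#include <stdbool.h>
#include <stdio.h>

/* Chooses where the parameter dump should go.
   Returns NULL if this process should not dump anything (dumping is off or
   rank is not 0) or if the file could not be opened; otherwise returns stdout
   or the opened file. *must_close tells the caller whether to fclose() it. */
FILE *toy_open_param_dump(bool show_params, const char *dump_file,
                          int rank, bool *must_close)
{
    *must_close = false;
    if (!show_params || rank != 0) {
        return NULL;            /* only rank 0 prints, and only when enabled */
    }
    if (dump_file == NULL) {
        return stdout;          /* no file given: dump to stdout */
    }
    FILE *fp = fopen(dump_file, "w");
    if (fp != NULL) {
        *must_close = true;     /* caller owns the file handle */
    }
    return fp;
}
```

The file name is ignored unless dumping is enabled, which matches the
note that the second parameter is only used when the first is true.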
This commit was SVN r6401.