This commit decouples OMPI deployment from the version(s) of the lower
layers of the stack by probing for UDP support.
Verbs applications assume a 40-byte header (there is no current
mechanism for querying payload offset). So to support a 42-byte UDP
header without causing existing applications like ibv_ud_pingpong or
older versions of OMPI to crash, we must inform libusnic_verbs that we
are aware of the nonstandard payload offset. We do this by overriding
the `transport_type` field of the device to be 42 before calling
`ibv_open_device`. If the library resets it to something else, then we
know the lower layers are UDP capable. Otherwise we use the older
custom-L2 format.
This necessitated some minor ugliness in common_verbs, but it's as tidy
as Jeff and I know how to make it right now.
This commit only adds support for UDP headers and connectivity over the
same L2 network, it does not touch routing or interface pairing.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
cmr=v1.7.5:ticket=trac:4253
This commit was SVN r30838.
The following Trac tickets were found above:
Ticket 4253 --> https://svn.open-mpi.org/trac/ompi/ticket/4253
We're going to be bringing a bunch of usnic code to the SVN trunk
soon, and I basically brought this commit over out of order. So I'm
reverting it for now; the same functionality will come back shortly.
This commit was SVN r30805.
The following SVN revision numbers were found above:
r30804 --> open-mpi/ompi@5bedcc15bf
These constants are now upstream (see
https://git.kernel.org/cgit/libs/infiniband/libibverbs.git/commit/?id=f57a9c67eabb9e7f19c624ac3c8c27b7be55796c),
so let's support them properly in Open MPI.
Added bonus: consolidating these checks up in
ompi_check_openfabrics.m4 allowed removing some custom checks and
AC_DEFINE's from the usnic configure.m4 script.
Also change the usnic/configure.m4 check for IBV_EVENT_GID_CHANGE to
use AC_CHECK_DECLS (vs. AC_CHECK_DECL).
cmr=v1.7.5:reviewer=dgoodell
This commit was SVN r30804.
- Move the ptrdiff_t tests up higher in configure.ac to be with the
rest of the type tests.
- Create new OMPI_FIND_MPI_AINT_COUNT_OFFSET for finding the
corresponding types of MPI_Aint, MPI_Count, and MPI_Offset.
Consolidate all the old C and Fortran tests into this new macro (and
.m4 file).
- Fix Fortran MPI_*_KIND tests that incorrectly keyed off assumed
types (e.g., int64_t) rather than whatever the corresponding C
MPI_Aint, MPI_Count, MPI_Offset types turned out to be.
- Add new logic to ensure that sizeof(MPI_Count) <= sizeof(size_t),
because our entire PML, BTL, and convertor infrastructure requires
this. As a side effect, just like MPI_Offset the same type of
MPI_Count (because MPI_Count has to be able to hold an MPI_Offset,
so we can't let MPI_Offset be larger than a MPI_Count).
This commit was SVN r30776.
The following Trac tickets were found above:
Ticket 4205 --> https://svn.open-mpi.org/trac/ompi/ticket/4205
This particular test can leave various <foo>.mod files in the OMPI top
build dir, which leads developers to say "what are these .mod files in
my 'svn status' output?".
So make a subdir, run the tests in there, and then remove the subdir.
Reviewed by Dave Goodell.
This is not 100% necessary for OMPI 1.7.4, but it would be nice. The
goal is to have this test removed by OMPI 1.7.5 (i.e., fix ticket
4157). So I leave it up to the RM to decide whether this goes into
1.7.4 or not.
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30524.
Ensure that these two flags are in all of mpif.h, the mpi module, and
the mpi_f08 module. Thanks to Rolf Rabenseifner for pointing out the
issue.
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30519.
for 32-bit architectures.
This commit also modifies _OMPI_CHECK_HEADER to use AC_CHECK_HEADERS instead
of AC_CHECK_HEADER. This allows components to check for multiple headers
instead of just one. The new semantics of the header check in OMPI_CHECK_PACKAGE
are to return success if at least one of the specified headers exists. The new
semantics will not break current usage.
cmr=v1.7.5:ticket=trac:4053
This commit was SVN r30476.
The following Trac tickets were found above:
Ticket 4053 --> https://svn.open-mpi.org/trac/ompi/ticket/4053
implementation does (that is not quite adherant to the Fortran
standard). If a compiler allows this behavior, build the mpi_f08
wrapper. For example, ifort allows it, but Pathscale/EKOPath 5.0 is
stricter in its Fortran compliance and disallows it.
This test is temporary; the real fix is to make OMPI adhere to Fortran
properly (i.e., see #4157). Once we fix#4157, this test should be
removed. The main reason for committing this test is to put it into
v1.7.4 so that we can release, but with the intent to remove it by
1.7.5 (or 1.8.x at the latest!).
Refs trac:4157
cmr=v1.7.4:reviewer=ompi-rm1.7:subject=Add mpi_f08-(non)compliance configure test
This commit was SVN r30440.
The following Trac tickets were found above:
Ticket 4157 --> https://svn.open-mpi.org/trac/ompi/ticket/4157
names longer than 32 characters.
Per discussion on the devel list starting here:
http://www.open-mpi.org/community/lists/devel/2014/01/13799.php we
need a new litmus test to disqualify older Fortran compilers (e.g.,
Pathscale 4.0.12) that *seem* to support all the Right Things, but a)
do not support BIND(C, name="super_long_name") or b) run into an
internal error when compiling our mpi_f08 module.
Testing for b) is sketchy at best. But OMPI has some BIND(C) names
that are >32 characters, and the same compilers that exhibit b) also
seem to not support BIND(C) names that are >32 characters (i.e., a)).
Hence, the following BIND(C) test checks to ensure that BIND(C,
name="foo") works, where "foo" is actually a name >32 characters.
cmr=v1.7.4:reviewer=rhc:subject=Update Fortran configure test to exclude older pathscale/open64 compilers from mpi_f08
This commit was SVN r30421.
fix the case, to enable oshmem-fortran when "with-shmem" was specified and "ompi-fortran" was enabled and happy.
fixed by Roman, reviewed by Miked
Refs trac:3763
This commit was SVN r30391.
The following Trac tickets were found above:
Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
As discovered by the nightly build, r30379 broke the case where
configure does not find a fortran compiler, and no form of
--enable-mpi-fortran was specified.
This commit specifically calls out the difference between a user
specifying that they want Fortran bindings and the default "try to
compile all the Fortran bindings" cases.
cmr=v1.7.4:ticket=4162
This commit was SVN r30386.
The following SVN revision numbers were found above:
r30379 --> open-mpi/ompi@7b28af54bb
The following Trac tickets were found above:
Ticket 4162 --> https://svn.open-mpi.org/trac/ompi/ticket/4162
So just take out this useless warning.
cmr=v1.7.5:ticket=4163
This commit was SVN r30383.
The following Trac tickets were found above:
Ticket 4163 --> https://svn.open-mpi.org/trac/ompi/ticket/4163
Each Fortran bindings layer builds on the other. Specifically:
* You can build just mpif.h support
* You can build mpif.h and "use mpi" support
* You can build mpif.h and "use mpi" and "use mpi_f08" support
You cannot build mpif.h and "use mpi_f08" support without also
building "use mpi" support.
This new functionality adds new capabilities to the existing
--enable-fortran-mpi switch. You can now pass the following values:
* --enable-fortran-mpi=no or none: synonyms for --disable-fortran-mpi
(i.e., build no Fortran bindings).
* --enable-fortran-mpi=mpifh: build only mpif.h support
* --enable-fortran-mpi=usempi: build mpif.h and "use mpi" support
* --enable-fortran-mpi=usempif08: build mpif.h, "use mpi", and "use
mpi_f08" support
* --enable-fortran-mpi=yes or all: synonyms for --enable-fortran-mpi
(i.e., no argument), which will attempt to build all 3 Fortran
bindings
cmr=v1.7.4:ticket=4162
This commit was SVN r30379.
The following Trac tickets were found above:
Ticket 4162 --> https://svn.open-mpi.org/trac/ompi/ticket/4162
The check to enable shmem fortran was too early, MPI can disable fortran but SHMEM fortran check was already done.
Refs trac:3763
This commit was SVN r30340.
The following Trac tickets were found above:
Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
Add a configure test to see if the Fortran compiler supports the
PROTECTED keyword. If it does, use in mpi-f08-types.F90 (via a macro
defined in configure-fortran-output-bottom.h).
This is needed to support the PGI 9 Fortran compiler, which does not
support the PROTECTED keyword.
Note that regardless of whether we want to support the PGI 9 Fortran
compiler + mpi_f08, we need to correctly detect whether PROTECTED
works or not, and then use that determination as a criteria for
building the mpi_f08 module. Previously, mpi-f08-types.F90 used
PROTECTED unconditionally, and we didn't test for it in configure. So
if a compiler (e.g., PGI 9) supported everything else but didn't
support PROTECTED, it would try to compile the mpi_f08 stuff and choke
on the use of PROTECTED.
Refs trac:4093
This commit was SVN r30273.
The following Trac tickets were found above:
Ticket 4093 --> https://svn.open-mpi.org/trac/ompi/ticket/4093
Just some minor updates to make some Fortran test outputs be a bit
more consistent with each other. This can definitely wait until
1.7.5, unless it causes conflicts with other changes that need to come
into 1.7.4.
cmr=v1.7.5:reviewer=dgoodell
This commit was SVN r30271.
This configure option was only relevant when we were generating TKR
"use mpi" interfaces for MPI subroutines with choice buffers. Now
that we aren't, the only interface that needs to accept a choice
buffer is MPI_SIZEOF (which we have to provide).
And since there's now only several dozen interfaces in the "mpi" TKR
module, there's no reason to not generate ''all'' possible array rank
values (when there were thousands of interfaces, generating 4-vs-7
array ranks per interface per type was a big deal). The default used
to be 4; now we can just hard-code it to 7, the max possible value for
Fortran 2003 (I think the max was raised ?to 11? in F2008, but let's
not go there for now).
cmr=v1.7.5:reviewer=dgoodell:subject=Remove even more dead Fortran configury
This commit was SVN r30257.
BIND(C), but not ''all'' of it. So expand our configure checks to
look for multiple different forms of BIND(C):
* ISO_C_BINDING
* SUBROUTINE ... BIND(C)
* TYPE, BIND(C)
* TYPE(foo), BIND(C, name="bar")
If the compiler supports all of these, then declare that we support
BIND(C), and the rest of the mpi_f08 checks can continue. If we miss
any one of those, don't bother continuing -- we won't build the
mpi_f08 module.
Also push the results of all of these tests down to ompi_info so that
they can be reported easily (e.g., "Hey, why doesn't my OMPI
installation have the mpi_f08 module?").
cmr=v1.7.4:reviewer=jsquyres:subject=Expand Fortran BIND(C) configure checks
This commit was SVN r30247.
This should have been part of r30151/#4057. Oops.
cmr=v1.7.4:reviewer=dgoodell
This commit was SVN r30246.
The following SVN revision numbers were found above:
r30151 --> open-mpi/ompi@52b5e17d97
upcoming GCC/gfortran 4.9's ignore TKR interface.
This was originally committed in a side mercurial repo, but I sadly
completely forgot about it until Tobias reminded me.
cmr=v1.7.5:reviewer=dgoodell:subject=Add support for gfortran 4.9 Fortran ignore TKR
This commit was SVN r30152.
This commit adds support for placing the send memory segment in a
traditional shared memory segment when XPMEM is not available. The
current default is to reserve 4MB for shared memory on each process.
The latest benchmarks show vader performing better than sm on both
Intel and AMD CPUs.
For large messages vader will now use CMA if it is available (and
XPMEM is not).
cmr=v1.7.5:reviewer=jsquyres
This commit was SVN r30123.
r30016 was not enough to solve the issue.
So properly prefix all the shell variables used in opal_setup_java.m4
(one of them had an orte_ prefix -- oops!). Now we won't get any
conflicts.
Refs trac:4015
This commit was SVN r30037.
The following SVN revision numbers were found above:
r30016 --> open-mpi/ompi@35dfd26f9e
The following Trac tickets were found above:
Ticket 4015 --> https://svn.open-mpi.org/trac/ompi/ticket/4015
This *should* fix following situation:
1 mxm.rpm puts /etc/ld.so.conf.d/mxm.conf file during rpm install with libpath to /opt/mellanox/mxm/lib
2 some1 can extract mxm.rpm into $HOME/mxm and compile OMPI with new mxm location
3 during runtime, OMPI from prev step will pick MXM from step (1) instead of from step (2)
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30005.
got linked together (work on one caused work in the other):
* Clean up a bunch of VAR_SCOPE issues in configure. This includes:
* Using VAR_SCOPE_PUSH and VAR_SCOPE_POP in more places
* Cleaning up the use of some shell variables (e.g., name them better)
* Add support for external libevent via
--with-libevent=<dir-to-libevent-install-tree>, as specifically
asked for by downstream packagers.
* Revamp how wrapper compiler RPATH (and RUNPATH) support is done.
The external libevent work exposed weakenesses in how the original
RPATH/RUNPATH work was done, so we had to re-do it to be a bit more
robust.
This work has not yet been tested on Solaris.
Refs trac:3694
This commit was SVN r29899.
The following Trac tickets were found above:
Ticket 3694 --> https://svn.open-mpi.org/trac/ompi/ticket/3694
more:
- Remove OPAL_ENABLE_MULTI_THREADS, since it didn't really do anything
correctly. Opal always has threads enabled at this point.
- Remove OMPI_ENABLE_PROGRESS_THREADS, since this hasn't worked in
8 years and it has performance issues we'll never be able to
overcome. Note that we have plans for re-adding async progress, using
a hybrid protocol of async and sync sends.
- OMPI_ENABLE_THREAD_MULTIPLE now determines whether the thread lock
macros do the check or not.
- Condition variables are ALWAYS polling right now, which fixes the thread
live-lock currently found when THREAD_MULTIPLE is turned on.
This commit was SVN r29891.
http://www.open-mpi.org/community/lists/users/2013/10/22882.php, fix
the value of MPI_STATUS_SIZE for the -i8 case. Thanks to Jim Parker
for bringing up the issue and providing the patch.
Separate patches are required for v1.6 and v1.7 (and will be attached
to their respective tickets), because this breaks ABI, so we need a
non-default configure option to fix the issue but knowingly break ABI.
cmr=v1.7.4:reviewer=bosilca:subject=Fix MPI_STATUS_SIZE for -i8 case
cmr=v1.6.6:reviewer=bosilca:subject=Fix MPI_STATUS_SIZE for -i8 case
This commit was SVN r29858.
r29830 -- Jeff will straighten this out with Alexander in person next
week (I can't test this myself because I have no access to libhcoll).
Sorry for the hassle...
Refs trac:3694
This commit was SVN r29842.
The following SVN revision numbers were found above:
r29830 --> open-mpi/ompi@3bd9c603ff
The following Trac tickets were found above:
Ticket 3694 --> https://svn.open-mpi.org/trac/ompi/ticket/3694
needed in the global scope ($ompi_check_fca_dir). This commit removes
it from the OPAL_VAR_SCOPE, so it should be fixed now.
Sorry about that, folks! :-(
This commit was SVN r29838.
The following SVN revision numbers were found above:
r29830 --> open-mpi/ompi@3bd9c603ff
This is helpful in the work for #3694: ensure that many places that
eventually end up in configure don't overly-pollute the global shell
variable space (because debugging accidental shell variable pollution
can be a real pain).
Refs trac:3694
This commit was SVN r29830.
The following Trac tickets were found above:
Ticket 3694 --> https://svn.open-mpi.org/trac/ompi/ticket/3694
This prevents bazillions of warnings from clang tha -Wno-long-double
isn't supported.
The warning that clang issues when it see -Wno-long-double is does not
use any of the words we were looking for, so I added "unknown" to the
list of words to look for. I also re-indented the two m4 tests so
that they're a bit more readable.
cmr=v1.7.4:reviewer=brbarret:subject=Update -Wno-long-double test to support clang
This commit was SVN r29681.
patch. See ticket #3885, comment 10 for an explination of why calling
_STRINGIFY on something that's not a numerical constant is always a bad idea.
This commit was SVN r29613.
The following SVN revision numbers were found above:
r29608 --> open-mpi/ompi@b71bd51cdd
OSX atomic support is disabled by default. Enable with --enable-osx-builtin-atomics.
Fixes trac:2120
This commit was SVN r29568.
The following Trac tickets were found above:
Ticket 2120 --> https://svn.open-mpi.org/trac/ompi/ticket/2120
Apologies for the breakage, I did my test build in the wrong window...
No reviewer.
cmr=v1.7.4:ticket=3865
This commit was SVN r29492.
The following SVN revision numbers were found above:
r29488 --> open-mpi/ompi@25dd719d4d
The following Trac tickets were found above:
Ticket 3865 --> https://svn.open-mpi.org/trac/ompi/ticket/3865
First cut does not attempt any "cross-check". As we discover compilers
which complain about __noinline__, we will add specific cross checks to
handle those cases.
Reviewed-by: Jeff Squyres <jsquyres@cisco.com>
This commit was SVN r29488.
Reworked ompi_info tool to be close with orte_info implementation.
ompi_info_register_types(), ompi_info_close_components() and
ompi_info_show_ompi_version() are moved to runtime/ompi_info_support.c.
Added runtime/oshmem_info_support layer that exports following api to be
used into oshmem_info tool as
oshmem_info_register_types()
oshmem_info_register_framework_params()
oshmem_info_close_components()
oshmem_info_show_oshmem_version()
These functions call ompi_info_support related interfaces as long as
Oshmem supports Open MPI/SHMEM combination.
Now orte_info/ompi_info/oshmem_info have identical implementation approach.
Possible improvement:
OSHMEM processing of --config option is the same as OMPI`s (code is duplicated).
Probably list of info_support interfaces can be extended by xxx_info_do_config().
developed by Igor, reviewed by miked
This commit was SVN r29429.
(http://www.open-mpi.org/community/lists/devel/2013/09/12889.php), I
renamed all "f77" and "f90" directory/file names to "fortran"
(including removing shmemf77 / shmemf90 wrapper compilers and
replacing them with "shmemfort").
2. Fixed several Fortran coding errors.
3. Removed lots of old/stale comments that were clearly the result of
copying from the OMPI layer and then not cleaning up afterwards (i.e.,
the comments were wholly inaccurate in the oshmem layer).
4. Removed both redundant and harmful code from oshmem_config.h.in.
5. Temporarily slave building the oshmem Fortran bindings to
--enable-mpi-fortran. This doesn't seem like a good long-term
solution, but at least you can now build all Fortran bindings (MPI +
oshmem) or not. *** SEE MY NOTE IN config/oshmem_configure_options.m4
FOR WORK THAT STILL NEEDS TO BE DONE!
This commit was SVN r29165.
configure-time dynamic allocation of flags. The net result for platforms
which only support BTL-based communication is a reduction of 8*nprocs bytes
per process. Platforms which support both MTLs and BTLs will not see
a space reduction, but will now be able to safely run both the MTL and BTL
side-by-side, which will prove useful.
This commit was SVN r29100.
*** THIS RFC INCLUDES A MINOR CHANGE TO THE MPI-RTE INTERFACE ***
Note: during the course of this work, it was necessary to completely separate the MPI and RTE progress engines. There were multiple places in the MPI layer where ORTE_WAIT_FOR_COMPLETION was being used. A new OMPI_WAIT_FOR_COMPLETION macro was created (defined in ompi/mca/rte/rte.h) that simply cycles across opal_progress until the provided flag becomes false. Places where the MPI layer blocked waiting for RTE to complete an event have been modified to use this macro.
***************************************************************************************
I am reissuing this RFC because of the time that has passed since its original release. Since its initial release and review, I have debugged it further to ensure it fully supports tests like loop_spawn. It therefore seems ready for merge back to the trunk. Given its prior review, I have set the timeout for one week.
The code is in https://bitbucket.org/rhc/ompi-oob2
WHAT: Rewrite of ORTE OOB
WHY: Support asynchronous progress and a host of other features
WHEN: Wed, August 21
SYNOPSIS:
The current OOB has served us well, but a number of limitations have been identified over the years. Specifically:
* it is only progressed when called via opal_progress, which can lead to hangs or recursive calls into libevent (which is not supported by that code)
* we've had issues when multiple NICs are available as the code doesn't "shift" messages between transports - thus, all nodes had to be available via the same TCP interface.
* the OOB "unloads" incoming opal_buffer_t objects during the transmission, thus preventing use of OBJ_RETAIN in the code when repeatedly sending the same message to multiple recipients
* there is no failover mechanism across NICs - if the selected NIC (or its attached switch) fails, we are forced to abort
* only one transport (i.e., component) can be "active"
The revised OOB resolves these problems:
* async progress is used for all application processes, with the progress thread blocking in the event library
* each available TCP NIC is supported by its own TCP module. The ability to asynchronously progress each module independently is provided, but not enabled by default (a runtime MCA parameter turns it "on")
* multi-address TCP NICs (e.g., a NIC with both an IPv4 and IPv6 address, or with virtual interfaces) are supported - reachability is determined by comparing the contact info for a peer against all addresses within the range covered by the address/mask pairs for the NIC.
* a message that arrives on one TCP NIC is automatically shifted to whatever NIC that is connected to the next "hop" if that peer cannot be reached by the incoming NIC. If no TCP module will reach the peer, then the OOB attempts to send the message via all other available components - if none can reach the peer, then an "error" is reported back to the RML, which then calls the errmgr for instructions.
* opal_buffer_t now conforms to standard object rules re OBJ_RETAIN as we no longer "unload" the incoming object
* NIC failure is reported to the TCP component, which then tries to resend the message across any other available TCP NIC. If that doesn't work, then the message is given back to the OOB base to try using other components. If all that fails, then the error is reported to the RML, which reports to the errmgr for instructions
* obviously from the above, multiple OOB components (e.g., TCP and UD) can be active in parallel
* the matching code has been moved to the RML (and out of the OOB/TCP component) so it is independent of transport
* routing is done by the individual OOB modules (as opposed to the RML). Thus, both routed and non-routed transports can simultaneously be active
* all blocking send/recv APIs have been removed. Everything operates asynchronously.
KNOWN LIMITATIONS:
* although provision is made for component failover as described above, the code for doing so has not been fully implemented yet. At the moment, if all connections for a given peer fail, the errmgr is notified of a "lost connection", which by default results in termination of the job if it was a lifeline
* the IPv6 code is present and compiles, but is not complete. Since the current IPv6 support in the OOB doesn't work anyway, I don't consider this a blocker
* routing is performed at the individual module level, yet the active routed component is selected on a global basis. We probably should update that to reflect that different transports may need/choose to route in different ways
* obviously, not every error path has been tested nor necessarily covered
* determining abnormal termination is more challenging than in the old code as we now potentially have multiple ways of connecting to a process. Ideally, we would declare "connection failed" when *all* transports can no longer reach the process, but that requires some additional (possibly complex) code. For now, the code replicates the old behavior only somewhat modified - i.e., if a module sees its connection fail, it checks to see if it is a lifeline. If so, it notifies the errmgr that the lifeline is lost - otherwise, it notifies the errmgr that a non-lifeline connection was lost.
* reachability is determined solely on the basis of a shared subnet address/mask - more sophisticated algorithms (e.g., the one used in the tcp btl) are required to handle routing via gateways
* the RML needs to assign sequence numbers to each message on a per-peer basis. The receiving RML will then deliver messages in order, thus preventing out-of-order messaging in the case where messages travel across different transports or a message needs to be redirected/resent due to failure of a NIC
This commit was SVN r29058.
Commit r27211 missed a config file change which broke ompi over
iwarp transports.
This fixes trac:3726 and should be added to cmr:v1.7.3:reviewer=jsquyres
This commit was SVN r29049.
The following SVN revision numbers were found above:
r27211 --> open-mpi/ompi@b27862e5c7
The following Trac tickets were found above:
Ticket 3726 --> https://svn.open-mpi.org/trac/ompi/ticket/3726
in generated executables on systems that support it. Use
--disable-wrapper-rpath to disable this behavior. See text in
README about --disable-wrapper-rpath for more details.
This commit was SVN r28479.
The following Trac tickets were found above:
Ticket 376 --> https://svn.open-mpi.org/trac/ompi/ticket/376
A few changes were required to support this move:
1. the PMI component used to identify rte-related data (e.g., host name, bind level) and package them as a unit to reduce the number of PMI keys. This code was moved up to the ORTE layer as the OPAL layer has no understanding of these concepts. In addition, the component locally stored data based on process jobid/vpid - this could no longer be supported (see below for the solution).
2. the hash component was updated to use the new opal_identifier_t instead of orte_process_name_t as its index for storing data in the hash tables. Previously, we did a hash on the vpid and stored the data in a 32-bit hash table. In the revised system, we don't see a separate "vpid" field - we only have a 64-bit opaque value. The orte_process_name_t hash turned out to do nothing useful, so we now store the data in a 64-bit hash table. Preliminary tests didn't show any identifiable change in behavior or performance, but we'll have to see if a move back to the 32-bit table is required at some later time.
3. the db framework was a "select one" system. However, since the PMI component could no longer use its internal storage system, the framework has now been changed to a "select many" mode of operation. This allows the hash component to handle all internal storage, while the PMI component only handles pushing/pulling things from the PMI system. This was something we had planned for some time - when fetching data, we first check internal storage to see if we already have it, and then automatically go to the global system to look for it if we don't. Accordingly, the framework was provided with a custom query function used during "select" that lets you seperately specify the "store" and "fetch" ordering.
4. the ORTE grpcomm and ess/pmi components, and the nidmap code, were updated to work with the new db framework and to specify internal/global storage options.
No changes were made to the MPI layer, except for modifying the ORTE component of the OMPI/rte framework to support the new db framework.
This commit was SVN r28112.
configure test to ensure that the Fortran compiler supports BIND(C)
with LOGICAL parameters (per
http://lists.mpi-forum.org/mpi-comments/2013/02/0076.php).
This may become moot shortly -- Pathscale tells me that they intend
upgrade their compiler to support BIND(C) with default LOGICAL in the
very near term (this week?). But we still want this configure test so
that Open MPI won't even try to build the F08 bindings with older
versions of the Pathscale compilers (or any compiler that doesn't
support BIND(C) and LOGICAL parameters).
This commit was SVN r28110.
The following Trac tickets were found above:
Ticket 3523 --> https://svn.open-mpi.org/trac/ompi/ticket/3523
* Clean up ${includedir} and ${libdir} for script wrapper compilers
* Update script wrapper compilers to work like the C wrapper compilers w.r.t static and dynamic linking
* Remove the ORTE script wrapper compilers since they didn't support the ${includedir} stuff and Ralph said they weren't used anymore.
This commit was SVN r28052.
this patch has already gone into the v1.6 branch.
Leif would like to revamp the ARM support in a different way, and will
submit a patch to do so in the future.
This commit was SVN r27961.
Leif would like to revamp the ARM support in a different way, and will
submit a patch to do so in the future.
This commit was SVN r27960.
The following SVN revision numbers were found above:
r27882 --> open-mpi/ompi@8649b5eece
flags, and mca flags are kept seperate until the very end. The main configure
wrapper flags should now be modified by using the OPAL_WRAPPER_FLAGS_ADD
macro. MCA components should either let <framework>_<component>_{LIBS,LDFLAGS}
be copied over OR set <framework>_<component>_WRAPPER_EXTRA_{LIBS,LDFLAGS}.
The situations in which WRAPPER CPPFLAGS can be set by MCA components was
made very small to match the one use case where it makes sense.
This commit was SVN r27950.
will be wrong (because it's called via a shell subroutine rather than
an m4 macro... doh!), but it's still helpful for searching around in
config.log.
This commit was SVN r27912.
case.
Fixing this issue allows the following case to start working again:
./configure --disable-dlopen --with-hwloc=/path/to/some/external/hwloc ...
Without this fix, the wrapper compilers would include -lhwloc, but
would not include -L/path/..., so -lhwloc would not be found.
This commit was SVN r27894.
meeting, and RFCed in mid-December (#3424): we no longer build the MPI
C++ bindings by default.
The C++ bindings are still ''there'' -- starting with 1.9, we'll just
be providing a little encouragement to no longer use them.
There are no definite plans to ''remove'' the C++ bindings yet. At
the earliest, we would remove them in the next feature series after
1.9.
This commit was SVN r27755.
which is the same size as a Fortran or C integer. This resulted in configure
coming up with Fortran's MAX_INT as -2^31, which obviously isn't a positive
number. Since we found the MAX_INT using the same broken loop in a couple
places and doing it right is complicated, added a new macro that is much
more careful about sign roll-over.
During the Fortran rework between v1.6 and v1.7, the variable which
indicates whether or not Fortran is being compiled changed, so on platforms
without Fortran compilers, we were trying to determine the max value for
Fortran INTEGERS where we previously didn't. I believe this is why
bug #3374 appeared as a regression.
Finally, since the OMPI code doesn't cope with OMPI_FORTRAN_HANDLE_MAX
being negative (which was the root cause of the segfault in $3374),
add a check at the end of the OMPI_FORTRAN_GET_HANDLE_MAX macro to
ensure that OMPI_FORTRAN_HANDLE_MAX is always non-negative.
This commit was SVN r27714.
config/ directory. We split them apart a while ago in the hopes that
it would simplify things, but it didn't really (e.g., because there
were still some ompi/opal .m4 files in the top-level config/
directory, resulting in developer confusion where any given m4 macro
was defined).
So this commit consolidates them back into the top-level directory for
simplicity.
There's still (at least) two changes that would be nice to make:
1. Split any generated .m4 file (e.g., autogen-generated .m4 files)
into a separate directory somewhere so that a top-level -Iconfig/
will only get our explicitly defined macros, not the autogen stuff
(e.g., with libevent2019 needing to get the visibility macro, but
NOT all the autogen-generated inclusion of component configure.m4
files).
1. Change configure to be of the form:
{{{
# ...a small amount of preamble/setup...
OPAL_SETUP
m4_ifdef([project_orte], [ORTE_SETUP])
m4_ifdef([project_ompi], [OMPI_SETUP])
# ...a small amount of finishing stuff...
}}}
I doubt we'll ever get anything as clean as that, but that would be
the goal to shoot for.
This commit was SVN r27704.
accidentally adding component names to the OMPI_MPIEXT_ALL_SUBDIRS
variable twice; once with mpiext/ and once without. And then
prefixing all of them with mpiext/ again at the end. That wasn't
really necessary :-) -- we only need to add it once (without mpiext/)
and then prefix at the end (for consistency with the
OMPI_MPI_EXT_*_SUBDIRS handling).
This commit was SVN r26981.
did so!). Previously, I forgot to include the case of not specifying
--with-verbs -- oops! So now the OPAL_CHECK_VERBS_DIR macro will:
1. $opal_want_verbs will be set to: "yes" if --with-verbs or
--with-verbs=DIR was specified "no" if --without-verbs was
specified) "optional" if neither --with-verbs* nor --without-verbs
was specified
'''NOTE:''' --with-openib* are deprecated synonyms for --with-verbs*.
1. $opal_verbs_dir and $opal_verbs_libdir with either both be set or
both be empty.
This commit was SVN r26654.
* Add new configure command line options and deprecate some old ones:
* --with-verbs replaces --with-openib
* --with-verbs-libdir replaces --with-openib-libdir
* If you specify --with-openib[-libdir] without
--with-verbs[-libdir], you'll get a "these options have been
deprecated!" warning, but then they'll act just like
--with-verbs[--libdir].
'''Sidenote:''' Note that we are not renaming any components at this
time, nor are we renaming the top-level OMPI_CHECK_OPENIB m4 macro
(which is pretty strongly tied to the openib BTL and is bastaridzed
by the ofud BTL). Note that there will likely be more changes in
this area coming soon (next week?) when some long-standing changes
move to the SVN trunk: some openib BTL infrastructure will move to
ompi/mca/common, and its configury gets split up / refactored.
We extend our philosophy of other --with-<foo> configure options of
--with-verbs to ''all'' verbs-lovin components:
* If you specify --with-verbs, then all verbs-lovin' components must
configure successfully (or abort). This currently means: OOB ud,
BTL ofud, BTL openib.
* If you specify --with-verbs=DIR, then all verbs-lovin' component
must configure successfully (or abort), and will use DIR to find
verbs headers and libraries.
* If you specify --without-verbs, then all verbs-lovin' components
will be ignored.
This commit also fixes a problem where the --with-openib=DIR form
would not use DIR for ''all'' verbs-lovin' components (I think only
BTL openib and BTL ofud used that DIR). Now all of them do, as does
hwloc (because hwloc has some !OpenFabrics helper functions that
require ibv types from verbs.h).
There's a little new m4 infrastructure worth mentioning:
* If you create a new verbs-lovin' component (i.e., a component that
need verbs), your configure.m4 should
AC_REQUIRE([OPAL_CHECK_VERBS_DIR]).
* You can then use three global shell variables: $opal_want_verbs,
$opal_verbs_dir, $opal_verbs_libdir, which will be set as follows:
* opal_want_verbs will be "yes" and opal_verbs_dir and
opal_verbs_libdir will both be set to directory values, '''OR'''
* opal_want_verbs will be "no" and opal_verbs_dir and
opal_verbs_libdir will both be set empty
This commit was SVN r26640.
problems (i.e., don't return a value from a void() function!). Thanks
to Eugene for identifying these issues. Hopefully this will fix up
the problems Oracle is having with compiling the new Fortran stuff.
This commit was SVN r26310.
the case where we have a really ancient Fortran compiler that does not
support ISO_C_BINDING, but need to test to be sure that the new
configury works.
This commit was SVN r26290.
1. New mpifort wrapper compiler: you can utilize mpif.h, use mpi, and use mpi_f08 through this one wrapper compiler
1. mpif77 and mpif90 still exist, but are sym links to mpifort and may be removed in a future release
1. The mpi module has been re-implemented and is significantly "mo' bettah"
1. The mpi_f08 module offers many, many improvements over mpif.h and the mpi module
This stuff is coming from a VERY long-lived mercurial branch (3 years!); it'll almost certainly take a few SVN commits and a bunch of testing before I get it correctly committed to the SVN trunk.
== More details ==
Craig Rasmussen and I have been working with the MPI-3 Fortran WG and Fortran J3 committees for a long, long time to make a prototype MPI-3 Fortran bindings implementation. We think we're at a stable enough state to bring this stuff back to the trunk, with the goal of including it in OMPI v1.7.
Special thanks go out to everyone who has been incredibly patient and helpful to us in this journey:
* Rolf Rabenseifner/HLRS (mastermind/genius behind the entire MPI-3 Fortran effort)
* The Fortran J3 committee
* Tobias Burnus/gfortran
* Tony !Goetz/Absoft
* Terry !Donte/Oracle
* ...and probably others whom I'm forgetting :-(
There's still opportunities for optimization in the mpi_f08 implementation, but by and large, it is as far along as it can be until Fortran compilers start implementing the new F08 dimension(..) syntax.
Note that gfortran is currently unsupported for the mpi_f08 module and the new mpi module. gfortran users will a) fall back to the same mpi module implementation that is in OMPI v1.5.x, and b) not get the new mpi_f08 module. The gfortran maintainers are actively working hard to add the necessary features to support both the new mpi_f08 module and the new mpi module implementations. This will take some time.
As mentioned above, ompi/mpi/f77 and ompi/mpi/f90 no longer exist. All the fortran bindings implementations have been collated under ompi/mpi/fortran; each implementation has its own subdirectory:
{{{
ompi/mpi/fortran/
base/ - glue code
mpif-h/ - what used to be ompi/mpi/f77
use-mpi-tkr/ - what used to be ompi/mpi/f90
use-mpi-ignore-tkr/ - new mpi module implementation
use-mpi-f08/ - new mpi_f08 module implementation
}}}
There's also a prototype 6-function-MPI implementation under use-mpi-f08-desc that emulates the new F08 dimension(..) syntax that isn't fully available in Fortran compilers yet. We did that to prove it to ourselves that it could be done once the compilers fully support it. This directory/implementation will likely eventually replace the use-mpi-f08 version.
Other things that were done:
* ompi_info grew a few new output fields to describe what level of Fortran support is included
* Existing Fortran examples in examples/ were renamed; new mpi_f08 examples were added
* The old Fortran MPI libraries were renamed:
* libmpi_f77 -> libmpi_mpifh
* libmpi_f90 -> libmpi_usempi
* The configury for Fortran was consolidated and significantly slimmed down. Note that the F77 env variable is now IGNORED for configure; you should only use FC. Example:
{{{
shell$ ./configure CC=icc CXX=icpc FC=ifort ...
}}}
All of this work was done in a Mercurial branch off the SVN trunk, and hosted at Bitbucket. This branch has got to be one of OMPI's longest-running branches. Its first commit was Tue Apr 07 23:01:46 2009 -0400 -- it's over 3 years old! :-) We think we've pulled in all relevant changes from the OMPI trunk (e.g., Fortran implementations of the new MPI-3 MPROBE stuff for mpif.h, use mpi, and use mpi_f08, and the recent Fujitsu Fortran patches).
I anticipate some instability when we bring this stuff into the trunk, simply because it touches a LOT of code in the MPI layer in the OMPI code base. We'll try our best to make it as pain-free as possible, but please bear with us when it is committed.
This commit was SVN r26283.
prototyping new MPI functionality. The C++ bindings are officially
deprecated, and are (currently) 1 vote away from being removed from
MPI-3 altogether. So let's whack the C++ stuff from mpiext.
This commit was SVN r26239.
If the profiling directory is present '/profile' then wire in the profiling stuff.
Only supports C and F77 (and kinda F90) at the moment.
This commit was SVN r26237.
OMPI supports multiple different repository systems (SVN, hg, git).
But the VERSION file has listed "want_svn" and "svn_r" as fields, even
though the actual repo system and version may not be SVN.
So search/replace those fields (and derrivative values that come from
those fields) with "want_repo_rev" and "repo_rev", respectively.
This commit was SVN r24405.
`autogen.pl` to properly handle all Sun Fortran version strings.
This commit was SVN r24080.
The following SVN revision numbers were found above:
r24076 --> open-mpi/ompi@ca0b0efada
Note: the ompi_check_libfca.m4 file had to be modified to avoid it stomping on global CPPFLAGS and the like. The file was also relocated to the ompi/config directory as it pertains solely to an ompi-layer component.
Forgive the mid-day configure change, but I know Shiqing is working the windows issues and don't want to cause him unnecessary redo work.
This commit was SVN r23966.
This merges the branch containing the revamped build system based around converting autogen from a bash script to a Perl program. Jeff has provided emails explaining the features contained in the change.
Please note that configure requirements on components HAVE CHANGED. For example. a configure.params file is no longer required in each component directory. See Jeff's emails for an explanation.
This commit was SVN r23764.
MPI_INIT and start of MPI_FINALIZE.
* Clean up MPI Extensions build system to acknowledge that OMPI's the only
project with extensions, as well as remove some build artifacts necessary
for more general components.
This commit was SVN r23616.
http://www.open-mpi.org/community/lists/devel/2010/02/7496.php
Increase the required versions of AM, AC, and LT:
* Autoconf: 2.65
* Automake: 1.11.1
* Libtool: 2.2.6b
And therefore removed a bunch of patches that we used to apply to make
older versions of these tools work.
Also updated the HACKING document to match these version numbers,
specifically mentioned Mercurial in a few places, and removed some
outdated language about running autogen.sh in subdirectories.
This commit was SVN r22896.
error if the file-based dloader (e.g., dlopen) fails to load a DSO for
a complex reason (e.g., unresolved symbol).
This has been widely reported upstream to the libltdl maintainers - a
general solution is difficult. This is very definitely an
OMPI-specific solution. Since our embedded libltdl is hidden behind
visibility flags, that's ok.
Note that this is a change to autogen.sh, but this commit does not
force re-running autogen.sh. You'll just get the new functionality
the next time you re-run autogen.sh.
This commit was SVN r22806.
continuation mark in column 6 is enough to split the lines, we do
the same continuation mark in column 73.
Now, while that would fit any msg., this would produce warnings when
including mpif-config.h with -Wall in gfortran and -warn in ifort.
Just get the SVN Version string short and forget it. Let's see
make check choke on that.
This additionally fixes the HG version string...
(Will mention this in ticket #2259)
This commit was SVN r22620.
* Fix typo in echo message
* Only traverse into orte/ and ompi/ if they exist (i.e., properly
handle --no-orte and --no-ompi tarballs). Note that this was only
a minor error -- before, they just output error messages and
(correctly) kept going. This fix just removes the error messages.
This commit was SVN r22353.
Continue the reorganization of the configure system. Move files from the main config directory to their appropriate level-specific config directories. Modify the configure system to correctly handle compiler detection, test, and setup so that all things pertaining to opal and orte are done at the lower level, with the ompi configure system only looking at mpi-specific options.
Ensure the wrapper compilers for orte and ompi only get built when appropriate. Add support for c++ to the orte wrapper compilers, both script and non-script versions.
This commit was SVN r22138.
Re-enable "./autogen.sh -no-ompi" again. If you -no-ompi, the entire OMPI
configury is skipped and the entire ompi/ subtree is not built. There's
some simple m4-isms that prune out the relevant parts.
I added ompi/config/, orte/config/, and opal/config/ directories. I moved a
bunch of m4 files from the top-level config/ dir into ompi/config/, and a few
into orte/config/.
Note that all 3 <project>/config directories have a config_files.m4 file. This
file contains the AC_CONFIG_FILES list for that project. The AC_CONFIG_FILES
call cannot be in an AC_DEFUN macro and conditionally called -- if it is
included at all, Autoconf will process it. Hence, these config_files.m4 files
don't AC_DEFUN -- they just have AC_CONFIG_FILES. m4_ifdef() is used to
conditionally include the files or not.
I moved a bunch of obvious OMPI-only m4 files from config/ to ompi/config/,
but I'm sure that there's more that could go. A ticket will be filed with
thoughts on future work in this area.
This commit was SVN r22113.
NOTE: the IPv6 support is currently marginally working and has problems when IPv6 headers are present but the interfaces are not configured to use that protocol, per several reports from users. It is unclear that anyone is willing/able to support this capability. Should that situation change, this change can be reconsidered.
This commit was SVN r22039.
* No need for OPAL_SIZEOF_BOOL and OPAL_SIZEOF_INT in comm_inln.h --
just use sizeof()
* Fix logic in ompi_setup_cxx.m4 to account for the case where we
''do'' have a C++ compiler (duh!!)
* Fix spelling error in a shell variable that ended up making a
bad/empty #define
This should bring the trunk back to being functional. Sorry for the
interruption, folks...
This commit was SVN r21758.
The following SVN revision numbers were found above:
r21755 --> open-mpi/ompi@90d6491737
C++ compiler in configure. If we have a C++ compiler, then the MPI
C++ bindings are built by default. If we don't have a C++ compiler,
then the MPI C++ bindings are not built by default.
--enable-mpi-cxx will now force an error if there is no C++ compiler
available. --disable-mpi-cxx (or the lack of a C++ compiler) will now
disable many of the C++ compiler checks in configure.
Note that there are a few items to clean up regarding the difference
between C's _Bool type and C++'s bool type. Right now, we assume that
they are the same. But they aren't, and they shouldn't be treated as
such. This cleanup will be forced in MPI-2.2 with the introduction of
the MPI_C_BOOL MPI datatype.
This commit was SVN r21755.
automatically pick up stdint.h by using AC_INCLUDES_DEFAULT. This is needed
for the types like int8_t which were added last week.
This commit was SVN r21721.
OMPI
and a language agnostic part in OPAL. The convertor is completely
moved into OPAL. This offers several benefits as described in RFC
http://www.open-mpi.org/community/lists/devel/2009/07/6387.php
namely:
- Fewer basic types (int* and float* types, boolean and wchar
- Fixing naming scheme to ompi-nomenclature.
- Usability outside of the ompi-layer.
- Due to the fixed nature of simple opal types, their information is
completely
known at compile time and therefore constified
- With fewer datatypes (22), the actual sizes of bit-field types may be
reduced
from 64 to 32 bits, allowing reorganizing the opal_datatype
structure, eliminating holes and keeping data required in convertor
(upon send/recv) in one cacheline...
This has implications to the convertor-datastructure and other parts
of the code.
- Several performance tests have been run, the netpipe latency does not
change with
this patch on Linux/x86-64 on the smoky cluster.
- Extensive tests have been done to verify correctness (no new
regressions) using:
1. mpi_test_suite on linux/x86-64 using clean ompi-trunk and
ompi-ddt:
a. running both trunk and ompi-ddt resulted in no differences
(except for MPI_SHORT_INT and MPI_TYPE_MIX_LB_UB do now run
correctly).
b. with --enable-memchecker and running under valgrind (one buglet
when run with static found in test-suite, commited)
2. ibm testsuite on linux/x86-64 using clean ompi-trunk and ompi-ddt:
all passed (except for the dynamic/ tests failed!! as trunk/MTT)
3. compilation and usage of HDF5 tests on Jaguar using PGI and
PathScale compilers.
4. compilation and usage on Scicortex.
- Please note, that for the heterogeneous case, (-m32 compiled
binaries/ompi), neither
ompi-trunk, nor ompi-ddt branch would successfully launch.
This commit was SVN r21641.
Tests based on preprocessor CPP output, won't help either, as they
don't spit out nice computed numerical values, but rather the
#define itselve.
This commit was SVN r21364.
produced object file does not contain external sumbols -- bad for the
configure tests to find out external symbols.
The only way to check for Fortran naming convention is to rid of
-ipo and -fast in case of ifort...
Thanks to Michel Devel for bringing this up.
This commit was SVN r21363.
Yes, friends, our favorite PCIE BTL has resurfaced as mgmt vacillates over its existence. This is an updated version that actually mostly works, in its final stages of debugging.
Some generalization still remains to be done...
This commit was SVN r21358.