Remove the --enable-progress-threads option as this is no longer functional, and hardcode OPAL_ENABLE_PROGRESS_THREADS to 0.
Replace the --enable-mpi-threads option with --enable-mpi-thread-multiple as this is clearer as to meaning. This option automatically turns "on" opal thread support if it wasn't already so specified. If the user specifies --disable-opal-multi-threads --enable-mpi-thread-multiple, we will error out with a message
Add a new --enable-opal-multi-threads option that turns "on" opal thread support without doing anything wrt mpi-thread-multiple
This commit was SVN r22841.
After talking to both Brian and George, the conensus was to just
remove the flag and the test function. Begone, evil spirits, BEGONE!
This commit was SVN r22831.
The following Trac tickets were found above:
Ticket 2273 --> https://svn.open-mpi.org/trac/ompi/ticket/2273
bug: libmpi_f90 had libmpi.la in its LIBADD instead of libmpi_f77.la.
Fixes trac:2244.
This commit was SVN r22704.
The following Trac tickets were found above:
Ticket 2244 --> https://svn.open-mpi.org/trac/ompi/ticket/2244
Add a ''map_bynode'' info key to determine if the job to be started by comm_spawn* should be mapped by node or by slot. Default is to map according to the default policy set when the parent job was started.
cmr:v1.5.1
This commit was SVN r22564.
In CMake 2.6 and earlier, this function add dependencies for targets and also link the target libraries automatically, but in CMake 2.8,this behavior has been changed, i.e. it will only add the dependencies but no link, which will cause linking errors at compilation time.
This commit was SVN r22405.
we should have also relaxed the error checking for MPI_GRAPH_CREATE.
Thanks to David Singleton for pointing this out.
This commit was SVN r22251.
The following SVN revision numbers were found above:
r21816 --> open-mpi/ompi@b8332ea2b2
other request-using frameworks.
- Rather than having mpi/c/* functions allocate requests explicitly,
pass the MPI_Request* down to the I/O component and have it
perform the allocation.
- While the I/O base provides a base request which can be used,
it is not required and all request management occurs within
the component.
- Push progress management into the component, rather than having it
happen in the base. Progress functions are now easily registered,
and not all (ie, the one existing) components use progress functions
in any rational way.
ROMIO switched to generalized requests instead of MPIO_Requests many
moons ago, and Open MPI now uses ROMIO's generalized requests, so there
is no reason to wrap those requests (which are OMPI requests) in another
level of request.
Now the file function passes the MPI_Request* to the ROMIO component,
which passes it to the underlying ROMIO function, which calls
MPI_Grequest_start to create an OMPI request, which is what gets set
as the request to the user. Much cleaner.
This patch has two motivations. One, a whole heck of a lot of code
just got removed, and request handling is now much cleaner for I/O
components. Two, by adding support for Argonne's proposed generalized
request extensions, we can allow ROMIO to provide async I/O through
generalized requests, which we couldn't rationally do in the old
setup due to the crazy request completion rules.
This commit was SVN r22235.
XML code in the F90 tree isn't used anymore, but we might as well
update it just so that everything is consistent).
This commit was SVN r22127.
The following Trac tickets were found above:
Ticket 2067 --> https://svn.open-mpi.org/trac/ompi/ticket/2067
from "MPI_*_errhandler_fn" to "MPI_*_errhandler_function" (and their
corresponding C++ types, too). Also updated the corresponding man
pages, and marked the typedefs to the now-deprecated types as
deprecated.
This commit was SVN r22122.
The following Trac tickets were found above:
Ticket 2060 --> https://svn.open-mpi.org/trac/ompi/ticket/2060
(yay cisco webex!). Make sure we only go up to OPAL max datatype, not
OMPI max datatype.
This commit was SVN r22016.
The following Trac tickets were found above:
Ticket 2028 --> https://svn.open-mpi.org/trac/ompi/ticket/2028
in the v1.2 series the cid's could never go above the max. allowed for a
particular pml. Because of that, pml_add_comm never checked for the cid, and
in fact pml_add_comm was called in comm_set, which is *before* we knew the
cid.
in the v1.3 series (and trunk) we check now the cid to detect overflow, and
because of that pml_add_comm has been moved *after* the cid allocation
routine, namely into the comm_activate routine.
in the v1.2 series, the comm_activate contained a synchronization step of the
old communicator in order to prevent incoming fragments on the new
communicator, with the main problem being that the allreduce in the
communicator allocation finished at different times on different processes,
and thus, this scenario could and did really occur.
in the v1.3 series, the comm_activate does not contain the synchronization
step anymore, since we introduced the new queue for fragments with unknown
cid. The problem is however, that whether a fragment is known or not is
decided by using ompi_comm_lookup(), which will return something useful as
soon as the cid allocation finished, even before pml_add_comm has been
called. So there is a small time gap where we will not post a message into
queue for unknown cid's, but we can also not look up the process structure
belonging to the rank in that comm ( that is in pml_ob1_match_recv_frag or
something like that).
The current fix reintroduces the synchronization step in comm_activate, and
ensures that no fragment can be received for a new communicator before the
synchronization occurs , and thus comm_nextcid() and pml_add_comm has been
called. It seems to be the safest and easiest way for now. Welcome back, v1.2.
This commit was SVN r21970.
#if defined (c_plusplus)
defined (__cplusplus)
followed by
extern "C" {
and the closing counterpart by BEGIN_C_DECLS and END_C_DECLS.
Notable exceptions are:
- opal/include/opal_config_bottom.h:
This is our generated code, that itself defines BEGIN_C_DECL and
END_C_DECL
- ompi/mpi/cxx/mpicxx.h:
Here we do not include opal_config_bottom.h:
- Belongs to external code:
opal/mca/backtrace/darwin/MoreBacktrace/MoreDebugging/MoreBacktrace.c
opal/mca/backtrace/darwin/MoreBacktrace/MoreDebugging/MoreBacktrace.h
- opal/include/opal/prefetch.h:
Has C++ specific macros that are protected:
- Had #if ... } #endif _and_ END_C_DECLS (aka end up with 2x
END_C_DECLS)
ompi/mca/btl/openib/btl_openib.h
- opal/event/event.h has #ifdef __cplusplus as BEGIN_C_DECLS...
- opal/win32/ompi_process.h: had extern "C"\n {...
opal/win32/ompi_process.h: dito
- ompi/mca/btl/pcie/btl_pcie_lex.l: needed to add *_C_DECLS
ompi/mpi/f90/test/align_c.c: dito
- ompi/debuggers/msgq_interface.h: used #ifdef __cplusplus
- ompi/mpi/f90/xml/common-C.xsl: Amend
Tested on linux using --with-openib and --with-mx
The following do not contain either opal_config.h, orte_config.h or
ompi_config.h
(but possibly other header files, that include one of the above):
ompi/mca/bml/r2/bml_r2_ft.h
ompi/mca/btl/gm/btl_gm_endpoint.h
ompi/mca/btl/gm/btl_gm_proc.h
ompi/mca/btl/mx/btl_mx_endpoint.h
ompi/mca/btl/ofud/btl_ofud_endpoint.h
ompi/mca/btl/ofud/btl_ofud_frag.h
ompi/mca/btl/ofud/btl_ofud_proc.h
ompi/mca/btl/openib/btl_openib_mca.h
ompi/mca/btl/portals/btl_portals_endpoint.h
ompi/mca/btl/portals/btl_portals_frag.h
ompi/mca/btl/sctp/btl_sctp_endpoint.h
ompi/mca/btl/sctp/btl_sctp_proc.h
ompi/mca/btl/tcp/btl_tcp_endpoint.h
ompi/mca/btl/tcp/btl_tcp_ft.h
ompi/mca/btl/tcp/btl_tcp_proc.h
ompi/mca/btl/template/btl_template_endpoint.h
ompi/mca/btl/template/btl_template_proc.h
ompi/mca/btl/udapl/btl_udapl_eager_rdma.h
ompi/mca/btl/udapl/btl_udapl_endpoint.h
ompi/mca/btl/udapl/btl_udapl_mca.h
ompi/mca/btl/udapl/btl_udapl_proc.h
ompi/mca/mtl/mx/mtl_mx_endpoint.h
ompi/mca/mtl/mx/mtl_mx.h
ompi/mca/mtl/psm/mtl_psm_endpoint.h
ompi/mca/mtl/psm/mtl_psm.h
ompi/mca/pml/cm/pml_cm_component.h
ompi/mca/pml/csum/pml_csum_comm.h
ompi/mca/pml/dr/pml_dr_comm.h
ompi/mca/pml/dr/pml_dr_component.h
ompi/mca/pml/dr/pml_dr_endpoint.h
ompi/mca/pml/dr/pml_dr_recvfrag.h
ompi/mca/pml/example/pml_example.h
ompi/mca/pml/ob1/pml_ob1_comm.h
ompi/mca/pml/ob1/pml_ob1_component.h
ompi/mca/pml/ob1/pml_ob1_endpoint.h
ompi/mca/pml/ob1/pml_ob1_rdmafrag.h
ompi/mca/pml/ob1/pml_ob1_recvfrag.h
ompi/mca/pml/v/pml_v_output.h
opal/include/opal/prefetch.h
opal/mca/timer/aix/timer_aix.h
opal/util/qsort.h
test/support/components.h
This commit was SVN r21855.
The following SVN revision numbers were found above:
r2 --> open-mpi/ompi@58fdc18855
* No need for OPAL_SIZEOF_BOOL and OPAL_SIZEOF_INT in comm_inln.h --
just use sizeof()
* Fix logic in ompi_setup_cxx.m4 to account for the case where we
''do'' have a C++ compiler (duh!!)
* Fix spelling error in a shell variable that ended up making a
bad/empty #define
This should bring the trunk back to being functional. Sorry for the
interruption, folks...
This commit was SVN r21758.
The following SVN revision numbers were found above:
r21755 --> open-mpi/ompi@90d6491737
note in VERSION file.
NOTE: the versions will ''always'' be 0:0:0 on the SVN trunk and
developer branches. They will only have meaningful values (starting
with 0:0:0 in 1.3.4) on release branches. Only RM's will modify these
values immediately preceeding a release.
This commit was SVN r21729.
OMPI
and a language agnostic part in OPAL. The convertor is completely
moved into OPAL. This offers several benefits as described in RFC
http://www.open-mpi.org/community/lists/devel/2009/07/6387.php
namely:
- Fewer basic types (int* and float* types, boolean and wchar
- Fixing naming scheme to ompi-nomenclature.
- Usability outside of the ompi-layer.
- Due to the fixed nature of simple opal types, their information is
completely
known at compile time and therefore constified
- With fewer datatypes (22), the actual sizes of bit-field types may be
reduced
from 64 to 32 bits, allowing reorganizing the opal_datatype
structure, eliminating holes and keeping data required in convertor
(upon send/recv) in one cacheline...
This has implications to the convertor-datastructure and other parts
of the code.
- Several performance tests have been run, the netpipe latency does not
change with
this patch on Linux/x86-64 on the smoky cluster.
- Extensive tests have been done to verify correctness (no new
regressions) using:
1. mpi_test_suite on linux/x86-64 using clean ompi-trunk and
ompi-ddt:
a. running both trunk and ompi-ddt resulted in no differences
(except for MPI_SHORT_INT and MPI_TYPE_MIX_LB_UB do now run
correctly).
b. with --enable-memchecker and running under valgrind (one buglet
when run with static found in test-suite, commited)
2. ibm testsuite on linux/x86-64 using clean ompi-trunk and ompi-ddt:
all passed (except for the dynamic/ tests failed!! as trunk/MTT)
3. compilation and usage of HDF5 tests on Jaguar using PGI and
PathScale compilers.
4. compilation and usage on Scicortex.
- Please note, that for the heterogeneous case, (-m32 compiled
binaries/ompi), neither
ompi-trunk, nor ompi-ddt branch would successfully launch.
This commit was SVN r21641.
not end up in OPAL
- Will post an updated patch for the OMPI_ALIGNMENT_ parts (within C).
This commit was SVN r21342.
The following SVN revision numbers were found above:
r21330 --> open-mpi/ompi@95596d1814
into the OPAL namespace, eliminating cases like opal/util/arch.c
testing for ompi_fortran_logical_t.
As this is processor- and compiler-related information
(e.g. does the compiler/architecture support REAL*16)
this should have been on the OPAL layer.
- Unifies f77 code using MPI_Flogical instead of opal_fortran_logical_t
- Tested locally (Linux/x86-64) with mpich and intel testsuite
but would like to get this week-ends MTT output
- PLEASE NOTE: configure-internal macro-names and
ompi_cv_ variables have not been changed, so that
external platform (not in contrib/) files still work.
This commit was SVN r21330.
happens when hierarch is used. . Two major items:
- modify the comm_activate step to take an additional argument, indicating
whether the new communicatio has to go through the collective selection
step. This is not required sometimes (e.g. when a process calls
MPI_COMM_SPLIT with color=MPI_UNDEFINED), and contributed significantly to
the exhaustion of cids.
- when freeing a communicator, check whether we can reuse the block of cids
assigned to that comm. This only works if the current front of the cid
assignment (cid_block_start) is right ater the block of cids assigned to this
comm.
Fixes trac:1904
Fixes trac:1926
This commit was SVN r21296.
The following Trac tickets were found above:
Ticket 1904 --> https://svn.open-mpi.org/trac/ompi/ticket/1904
Ticket 1926 --> https://svn.open-mpi.org/trac/ompi/ticket/1926
case the first process of the group was not represented at all in the second
group. Also added some cleanup of the code w.r.t. booleans vs. ints.
Thanks for Geoffrey Irving for reporting the bug and providing the initial
solution.
This commit was SVN r21192.
- due to the <= with we could overrun the array
- we didn't correctly test at _all_, since we never marked the
ranks already excluded / included...
- when returning in error, we should free (elements_int_list)...
This commit was SVN r21186.
OMPI_* to OPAL_*. This allows opal layer to be used more independent
from the whole of ompi.
NOTE: 9 "svn mv" operations immediately follow this commit.
This commit was SVN r21180.
- Delete unnecessary header files using
contrib/check_unnecessary_headers.sh after applying
patches, that include headers, being "lost" due to
inclusion in one of the now deleted headers...
In total 817 files are touched.
In ompi/mpi/c/ header files are moved up into the actual c-file,
where necessary (these are the only additional #include),
otherwise it is only deletions of #include (apart from the above
additions required due to notifier...)
- To get different MCAs (OpenIB, TM, ALPS), an earlier version was
successfully compiled (yesterday) on:
Linux locally using intel-11, gcc-4.3.2 and gcc-SVN + warnings enabled
Smoky cluster (x86-64 running Linux) using PGI-8.0.2 + warnings enabled
Lens cluster (x86-64 running Linux) using Pathscale-3.2 + warnings enabled
This commit was SVN r21096.
This has turned into an MPI spec interpretation issue. :-(
Open MPI has intrepreted the spec one way for the past several years;
these commits reflect a different interpretation that changes how we
treat the EXTRA_STATE parameter to the Fortran attribute copy and
delete callbacks. This new way breaks our internal copy of the Intel
Fortran attribute tests. So after talking with Terry/Sun, we're going
to back out these changes (both here and on the v1.3 branch) until we
get further clarification from the Forum.
This commit was SVN r21028.
The following SVN revision numbers were found above:
r20926 --> open-mpi/ompi@0a24eadaad
r20941 --> open-mpi/ompi@045b0e8871
r20950 --> open-mpi/ompi@73af921c22
We currently apply all of the MCA params in the parent job to the child. This commit allows a user to specify additional params for the child job, and to override any pre-existing params with the new value so they can better control behavior of the child job.
This commit was SVN r20989.
The fix for #1864 in r20926 caused gcc to emit some warnings. Also,
Jeff Squyres pointed out parallel bugs in mpi_type_create_keyval and
mpi_win_create_keyval.
This commit was SVN r20950.
The following SVN revision numbers were found above:
r20926 --> open-mpi/ompi@0a24eadaad
The EXTRA_STATE parameter is passed by reference, and thus should be
dereferenced before it is stored. Similarly, the stored value should
be passed by reference to the copy and delete routines.
This fixes trac:1864.
This commit was SVN r20926.
The following Trac tickets were found above:
Ticket 1864 --> https://svn.open-mpi.org/trac/ompi/ticket/1864
Now, while #include "ompi_config.h" is good and fine in order
to have OMPI_DECLSPEC,
here it led to stdint.h (with the uint8_t) being included early
but INSIDE a namespace "MPI" {}.
Of course it was included anymore (thinkg #define _STDINT_H), when
it was required in opal/class/opal_hash_list.h
NOT good.
- opal/class/opal_object.h: Yeah, one can have nested extern "C" {}
but it's not necessary. Instead just have the outer *_C_DECLS.
This commit was SVN r20837.
The following SVN revision numbers were found above:
r20817 --> open-mpi/ompi@6f808d9b05
In case we use memcmp, strlen, strup and friends include <string.h>
Also several constants.h are not included directly
- Let's have mca_topo_base_cart_create return ompi-errors in
ompi/mca/topo/base/topo_base_cart_create.c
This commit was SVN r20773.
Anyway, this is blocking the move: do not include pml.h
if not really needed, aka none of the following used:
mca_pml
MCA_PML_CALL
OMPI_ANY_TAG
OMPI_ANY_SOURCE
OMPI_PROC_NULL
- Notable exceptions (deleting in one header->adding):
- ompi/mca/mtl/psm/
- ompi/mca/osc/rdma/
- ompi/mca/btl/openib/btl_openib_endpoint.c depended on
pml_base_sendreq.h
- Tested on Linux/x86-64, this time including make check
(thanks Jeff and Ralph)
This commit was SVN r20725.
get bitten by header depending on having already included
the corresponding [opal|orte|ompi]_config.h header.
When separating, things like [OPAL|ORTE|OMPI]_DECLSPEC
are missed.
Script to add the corresponding header in front of all following
(taking care of possible #ifdef HAVE_...)
- Including some minor cleanups to
- ompi/group/group.h -- include _after_ #ifndef OMPI_GROUP_H
- ompi/mca/btl/btl.h -- nclude _after_ #ifndef MCA_BTL_H
- ompi/mca/crcp/bkmrk/crcp_bkmrk_btl.c -- still no need for
orte/util/output.h
- ompi/mca/pml/dr/pml_dr_recvreq.c -- no need for mpool.h
- ompi/mca/btl/btl.h -- reorder to fit
- ompi/mca/bml/bml.h -- reorder to fit
- ompi/runtime/ompi_mpi_finalize.c -- reorder to fit
- ompi/request/request.h -- additionally need ompi/constants.h
- Tested on linux/x86-64
This commit was SVN r20720.
devel list, it ''is'' within in the spirit of MPI to allow
MPI_REQUEST_NULL to be passed to MPI_REQUEST_GET_STATUS. I filed a
ticket proposal with MPI-2.2 to make this officially accepted:
https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/137
Plus, r20537 didn't revert out all of the machinery for allowing
MPI_REQUEST_NULL or inactive requests, anyway. So this commit simply
removes the parameter check that was added in r20537, and we're back
to where we were before this whole conversation. :-)
This commit was SVN r20616.
The following SVN revision numbers were found above:
r20537 --> open-mpi/ompi@38aab37bb3
Often, orte/util/show_help.h is included, although no functionality
is required -- instead, most often opal_output.h, or
orte/mca/rml/rml_types.h
Please see orte_show_help_replacement.sh commited next.
- Local compilation (Linux/x86_64) w/ -Wimplicit-function-declaration
actually showed two *missing* #include "orte/util/show_help.h"
in orte/mca/odls/base/odls_base_default_fns.c and
in orte/tools/orte-top/orte-top.c
Manually added these.
Let's have MTT the last word.
This commit was SVN r20557.
have different sizes:
1. Do not modify the read only parameter of the Fortran MPI interface (i.e be
standard compliant).
2. When Fortran integers are 64 bits long, don't generate unlawful code.
Thanks to Christoph van Wullen for the bug report.
This commit was SVN r20420.
* New "op" MPI layer framework
* Addition of the MPI_REDUCE_LOCAL proposed function (for MPI-2.2)
= Op framework =
Add new "op" framework in the ompi layer. This framework replaces the
hard-coded MPI_Op back-end functions for (MPI_Op, MPI_Datatype) tuples
for pre-defined MPI_Ops, allowing components and modules to provide
the back-end functions. The intent is that components can be written
to take advantage of hardware acceleration (GPU, FPGA, specialized CPU
instructions, etc.). Similar to other frameworks, components are
intended to be able to discover at run-time if they can be used, and
if so, elect themselves to be selected (or disqualify themselves from
selection if they cannot run). If specialized hardware is not
available, there is a default set of functions that will automatically
be used.
This framework is ''not'' used for user-defined MPI_Ops.
The new op framework is similar to the existing coll framework, in
that the final set of function pointers that are used on any given
intrinsic MPI_Op can be a mixed bag of function pointers, potentially
coming from multiple different op modules. This allows for hardware
that only supports some of the operations, not all of them (e.g., a
GPU that only supports single-precision operations).
All the hard-coded back-end MPI_Op functions for (MPI_Op,
MPI_Datatype) tuples still exist, but unlike coll, they're in the
framework base (vs. being in a separate "basic" component) and are
automatically used if no component is found at runtime that provides a
module with the necessary function pointers.
There is an "example" op component that will hopefully be useful to
those writing meaningful op components. It is currently
.ompi_ignore'd so that it doesn't impinge on other developers (it's
somewhat chatty in terms of opal_output() so that you can tell when
its functions have been invoked). See the README file in the example
op component directory. Developers of new op components are
encouraged to look at the following wiki pages:
https://svn.open-mpi.org/trac/ompi/wiki/devel/Autogenhttps://svn.open-mpi.org/trac/ompi/wiki/devel/CreateComponenthttps://svn.open-mpi.org/trac/ompi/wiki/devel/CreateFramework
= MPI_REDUCE_LOCAL =
Part of the MPI-2.2 proposal listed here:
https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/24
is to add a new function named MPI_REDUCE_LOCAL. It is very easy to
implement, so I added it (also because it makes testing the op
framework pretty easy -- you can do it in serial rather than via
parallel reductions). There's even a man page!
This commit was SVN r20280.
and ::SEEK_SET (duh); that's why it's listed in constants.h. So put
that back and make it (static const int) rather than extern, and then
remove the instantiation from mpicxx.cc. Ditto for the other 2.
This commit was SVN r20251.
The following Trac tickets were found above:
Ticket 623 --> https://svn.open-mpi.org/trac/ompi/ticket/623
in mpicxx.h a while ago, but somehow accidentally left "extern const
int" for SEEK_SET (and friends) in constants.h. This commit removes
the extraneous "extern" versions.
This commit was SVN r20250.
The following Trac tickets were found above:
Ticket 623 --> https://svn.open-mpi.org/trac/ompi/ticket/623
pondering about this problem, we came to the conclusion that the best approach
is to keep what we had before (i.e. the original approach).
The main reason for this is being nice with tool developers. In the current
incarnation, they can either catch the Fortran calls or the C calls. If they
provide both, then they will have to figure out how to cope with the double
calls (as your example highlight).
Here is the behavior Open MPI will stick too:
Fortran MPI -> C MPI
Fortran PMPI -> C MPI
However, the is another possible approach. This might avoid the double calls
while preserving the tool writers friendliness. This possible approach will do:
Fortran MPI -> C MPI
Fortran PMPI -> C PMPI
^
Unfortunately, we will have to heavily modify all files in the Fortran
interface layer in order to support this approach.
This commit was SVN r20079.
1. fix a bug in pml_ob1_recvreq/sendreq.c, buffer was made defined where the request has already been released.
2. complete memchecker support for collective functions.
3. change the wrongly spelled function name of memchecker, i.e. '*_isaddressible' should be '*_isaddressable'
This commit was SVN r20043.
I'm unable to split it in two parts, my patch and Edgar's one. So I just update
copyright information for both of us.
What this patch do:
- it use the unexpected queue create by commit r19562 to dispatch the
unexpected message to the right communicator (once this communicator
is created and initialized).
- delay the PML comm_add until we have the context_id for the new communicator.
- only do the PML comm_add on processes that really belong to the new
communicator. Please read the lengthy comment in the source code for the
reason behind this.
This commit was SVN r19929.
The following SVN revision numbers were found above:
r19562 --> open-mpi/ompi@acd3406aa7
unconditionally, which can result in a flood of messages to the user
if all MPI processes invoke abort. Additionally, some users were
confused because they saw the MPI_ABORT opal_output() messages from
''some'' MPI processes, but not ''all'' of them (despite the fact that
every MPI process supposedly invoked MPI_ABORT). The reason is that
calling MPI_ABORT triggers ORTE to kill all MPI processes, so it's a
race condition as to whether a) all MPI processes actually invoke
MPI_ABORT, and/or b) whether every process is able to opal_output()
before they are killed.
This commit does two simple things:
* Now use orte_show_help() for the MPI_ABORT message, so they are
aggregated.
* Add a note in the message that calling MPI_ABORT kills all
processes, so you might not see all output, yadda yadda yadda.
This commit was SVN r19735.
This fixes trac:1477.
Help provided by Jeff and Terry.
This commit was SVN r19533.
The following Trac tickets were found above:
Ticket 1477 --> https://svn.open-mpi.org/trac/ompi/ticket/1477
MPI::SEEK_SET and friends, suggested by Doug Gregor. This way allows
users to utilize SEEK_SET in a case statement, which they could not do
with our previous method.
This commit was SVN r19494.
way".
Don't modify coords in the top-level API function because coords is an
IN variable. Instead, as Nysal noted, the real cause of the problem
was a missing ! down in topo_base_cart_rank.c. Put a comment down in
topo_base_cart_rank.c explaining what's going on so that the code is
not so cryptic.
Refs trac:1363.
This commit was SVN r19487.
The following Trac tickets were found above:
Ticket 1363 --> https://svn.open-mpi.org/trac/ompi/ticket/1363
* Various changes to enable 0-dimensional cartesian communicators:
* Set various mtc_* members to NULL when there are 0 dimensions (and
don't bother trying to memcpy these arrays when duplicating the
communicator -- because they're NULL)
* adjust topo_base_cart_sub to correctly handle 0 dimensions
(simplified it a bit)
* adjust a few error codes to return ERR_OUT_OF_RESOURCE
* adjust error checking of CART_CREATE, CART_RANK
* Allow MPI_GRAPH_CREATE to accept 0 == nnodes.
* Bump reported MPI version in mpi.h to 2.1
This commit was SVN r19461.
The following Trac tickets were found above:
Ticket 1236 --> https://svn.open-mpi.org/trac/ompi/ticket/1236
for the F90 type create functions to the requirements of MPI 2.1 standard.
Advice to implementors. An application may often repeat a call to
MPI_TYPE_CREATE_F90_xxxx with the same combination of (xxxx,p,r).
The application is not allowed to free the returned predefined, unnamed
datatype handles. To prevent the creation of a potentially huge amount of
handles, the MPI implementation should return the same datatype handle for
the same (REAL/COMPLEX/INTEGER,p,r) combination. Checking for the
combination (p,r) in the preceding call to MPI_TYPE_CREATE_F90_xxxx and
using a hash-table to find formerly generated handles should limit the
overhead of finding a previously generated datatype with same combination
of (xxxx,p,r). (End of advice to implementors.)
This commit fixes trac:1239, and #712.
This commit was SVN r19458.
The following Trac tickets were found above:
Ticket 1239 --> https://svn.open-mpi.org/trac/ompi/ticket/1239
Possibly fixes CID 417
* ensure to initialize inner members upon default constructors using
the same syntax across all classes
* remove some member variables that aren't used anymore
* "initialize" the inner MPI_Status in the default constructor for
MPI::Status by calling the default constructor mpi_status(). This
may or may not silence CID 417; we'll see what happens in
subsequent Coverity Prevent runs.
This commit was SVN r19228.
* Make the creation of the build dir for the man pages a bit more
robust (thanks to suggestions from Ralf W.).
* Only distribute the .Xin files, not the .X man pages themselves.
* Make the .X files depend on opal_config.h so that if you re-run
configure and change opal_config.h (e.g., a new version), the man
pages should get rebuilt.
* Man pages are now cleaned with "distclean", not "maintainer-clean".
* Fix a typo in opal_crs.7in.
* Udpate make_dist_tarball to update "date" in the VERSION file.
* Make make_dist_tarball a bit friendlier to hg checkouts.
This commit was SVN r19219.
Revise the scope precedence in the MPI_Publish, Unpublish, and Lookup functions. If a global server was specified and is available, then default to using it for all three functions. If not, then default to using local scope.
If an info_key was provided, then it takes preference. We always follow the user's direction - this change only impacts the scope ordering if the user -doesn't- tell us the order to use.
This commit was SVN r19146.
versions, dates and build names.
Fixes trac:1387
Big thanks to Jeff and Brian for help and oversight.
This commit was SVN r19120.
The following Trac tickets were found above:
Ticket 1387 --> https://svn.open-mpi.org/trac/ompi/ticket/1387
"make distclean". It's not clear whether it's an Automake bug or
whether what I did simply is not supported (I've got pending mail into
Ralf W. asking about it). The short version is that during "make
distclean", ompi/mpi/f77/Makefile would rm -rf ompi/mpi/f77/.deps.
But ompi/Makefile still include's some .Plo files from that directory,
so Bad Things happened when "make distclean" unrolled from the
ompi/mpi/f77 dir back up to the ompi/ dir.
So I went with George's original suggestion and moved the f77 "base"
files in question into a new directory: ompi/mpi/f77/base and put a
Makefile.include in there. That way, this directory is not traversed
twice by distclean, and .deps is only removed when it is supposed to
be. Maybe we'll be able to do it a little better someday, but that's
the way it is now.
I'll check this with a fresh checkout once this is committed to SVN as
well; some of these kinds of problems don't show up until you do a
build from a completely fresh SVN checkout.
This commit was SVN r19054.
The following SVN revision numbers were found above:
r19040 --> open-mpi/ompi@9f4d4c4312
are properly linked against libmpi.la.
This required a little creative AM usage, inspired by discussion on
OMPI devel list:
* Make a new ompi/mpi/f77/Makefile_f77base.include; effectively move
the building of the f77 "base" glue stuff (libmpi_f77base.la) into
this Makefile and away from ompi/mpi/f77/Makefile.am. The sources
in question require some specific CPPFLAGS, so we couldn't just add
the raw sources into libmpi_la_SOURCES, unfortunately.
* Include this new Makefile in the top-level ompi/Makefile.am
* The libmpi_f77base.la LT convenience library was already sucked
into libmpi.la; breaking it out into its own Makefile allows us
to build it earlier and therefore complete buidling libmpi.la
earlier.
* Side effect: the ompi/mpi/Makefile.am is now mostly unnecessary; it
no longer specifies a SUBDIRS for each of the bindings directories
to traverse into (since they are now in the top-level SUBDIRS). As
such, the man pages are now also now included in the top-level
ompi/Makefile.am.
The end of the result is that libmpi.la -- including a few sources
from mpi/f77 -- is fully built before the C++, F77, and F90 bindings
are built. Therefore, the C++, F77, and F90 bindings libraries can
all link against libmpi.la.
This commit was SVN r19040.
The following Trac tickets were found above:
Ticket 1409 --> https://svn.open-mpi.org/trac/ompi/ticket/1409
of strings. We mostly did the Right Things already; I simplified the
code a bit and also had us not write to more characters in the C
bindings than we're supposed to (per language in the MPI-2.1 spec).
Fixes trac:1238.
This commit was SVN r18705.
The following Trac tickets were found above:
Ticket 1238 --> https://svn.open-mpi.org/trac/ompi/ticket/1238
* Put the variable in the MPI namespace; keeps it safely segregated
from user apps
* Need to actually "extern" the variable to make the compiler not
complain that the variable is never referenced
This commit was SVN r18693.
the char string ident in the C++ library be non-static so that other
places can see it. This makes the C++ library version string
analogous to all the other version strings.
This commit was SVN r18679.
After much work by Jeff and myself, and quite a lot of discussion, it has become clear that we simply cannot resolve the infinite loops caused by RML-involved subsystems calling orte_output. The original rationale for the change to orte_output has also been reduced by shifting the output of XML-formatted vs human readable messages to an alternative approach.
I have globally replaced the orte_output/ORTE_OUTPUT calls in the code base, as well as the corresponding .h file name. I have test compiled and run this on the various environments within my reach, so hopefully this will prove minimally disruptive.
This commit was SVN r18619.
such, the commit message back to the master SVN repository is fairly
long.
= ORTE Job-Level Output Messages =
Add two new interfaces that should be used for all new code throughout
the ORTE and OMPI layers (we already make the search-and-replace on
the existing ORTE / OMPI layers):
* orte_output(): (and corresponding friends ORTE_OUTPUT,
orte_output_verbose, etc.) This function sends the output directly
to the HNP for processing as part of a job-specific output
channel. It supports all the same outputs as opal_output()
(syslog, file, stdout, stderr), but for stdout/stderr, the output
is sent to the HNP for processing and output. More on this below.
* orte_show_help(): This function is a drop-in-replacement for
opal_show_help(), with two differences in functionality:
1. the rendered text help message output is sent to the HNP for
display (rather than outputting directly into the process' stderr
stream)
1. the HNP detects duplicate help messages and does not display them
(so that you don't see the same error message N times, once from
each of your N MPI processes); instead, it counts "new" instances
of the help message and displays a message every ~5 seconds when
there are new ones ("I got X new copies of the help message...")
opal_show_help and opal_output still exist, but they only output in
the current process. The intent for the new orte_* functions is that
they can apply job-level intelligence to the output. As such, we
recommend that all new ORTE and OMPI code use the new orte_*
functions, not thei opal_* functions.
=== New code ===
For ORTE and OMPI programmers, here's what you need to do differently
in new code:
* Do not include opal/util/show_help.h or opal/util/output.h.
Instead, include orte/util/output.h (this one header file has
declarations for both the orte_output() series of functions and
orte_show_help()).
* Effectively s/opal_output/orte_output/gi throughout your code.
Note that orte_output_open() takes a slightly different argument
list (as a way to pass data to the filtering stream -- see below),
so you if explicitly call opal_output_open(), you'll need to
slightly adapt to the new signature of orte_output_open().
* Literally s/opal_show_help/orte_show_help/. The function signature
is identical.
=== Notes ===
* orte_output'ing to stream 0 will do similar to what
opal_output'ing did, so leaving a hard-coded "0" as the first
argument is safe.
* For systems that do not use ORTE's RML or the HNP, the effect of
orte_output_* and orte_show_help will be identical to their opal
counterparts (the additional information passed to
orte_output_open() will be lost!). Indeed, the orte_* functions
simply become trivial wrappers to their opal_* counterparts. Note
that we have not tested this; the code is simple but it is quite
possible that we mucked something up.
= Filter Framework =
Messages sent view the new orte_* functions described above and
messages output via the IOF on the HNP will now optionally be passed
through a new "filter" framework before being output to
stdout/stderr. The "filter" OPAL MCA framework is intended to allow
preprocessing to messages before they are sent to their final
destinations. The first component that was written in the filter
framework was to create an XML stream, segregating all the messages
into different XML tags, etc. This will allow 3rd party tools to read
the stdout/stderr from the HNP and be able to know exactly what each
text message is (e.g., a help message, another OMPI infrastructure
message, stdout from the user process, stderr from the user process,
etc.).
Filtering is not active by default. Filter components must be
specifically requested, such as:
{{{
$ mpirun --mca filter xml ...
}}}
There can only be one filter component active.
= New MCA Parameters =
The new functionality described above introduces two new MCA
parameters:
* '''orte_base_help_aggregate''': Defaults to 1 (true), meaning that
help messages will be aggregated, as described above. If set to 0,
all help messages will be displayed, even if they are duplicates
(i.e., the original behavior).
* '''orte_base_show_output_recursions''': An MCA parameter to help
debug one of the known issues, described below. It is likely that
this MCA parameter will disappear before v1.3 final.
= Known Issues =
* The XML filter component is not complete. The current output from
this component is preliminary and not real XML. A bit more work
needs to be done to configure.m4 search for an appropriate XML
library/link it in/use it at run time.
* There are possible recursion loops in the orte_output() and
orte_show_help() functions -- e.g., if RML send calls orte_output()
or orte_show_help(). We have some ideas how to fix these, but
figured that it was ok to commit before feature freeze with known
issues. The code currently contains sub-optimal workarounds so
that this will not be a problem, but it would be good to actually
solve the problem rather than have hackish workarounds before v1.3 final.
This commit was SVN r18434.