MPI constants, which led to also find that MPI_ERR_WIN was mistakenly
not defined in the Fortran MPI constants.
This fix also needs to be applied to v1.7. Brian -- can you review?
cmr:v1.7
This commit was SVN r28102.
* Clean up ${includedir} and ${libdir} for script wrapper compilers
* Update script wrapper compilers to work like the C wrapper compilers w.r.t static and dynamic linking
* Remove the ORTE script wrapper compilers since they didn't support the ${includedir} stuff and Ralph said they weren't used anymore.
This commit was SVN r28052.
ompi_show_help, because opal_show_help is replaced with an
aggregating version when using ORTE, so there's no reason to
directly call orte_show_help.
This commit was SVN r28051.
r27987 - MTL MXM: ver. 2.0 interface changes.
This commit was SVN r28026.
The following SVN revision numbers were found above:
r27987 --> open-mpi/ompi@2735658d81
necessarily mean an error -- it could (and usually does) mean that the
peer realized that we both initiated a connect at the same time, and
therefore it decided to hang up.
I also added a friendly show_help error message for other cases where
recv_blocking() fails (i.e., "Something went wrong. Kaboom! Your job
will abort...").
This commit was SVN r28023.
The following Trac tickets were found above:
Ticket 3494 --> https://svn.open-mpi.org/trac/ompi/ticket/3494
- revised fix for OpenType Font conflicts based upon Jeff's comments in CMR #3497)
- renamed libotf to libopen-trace-format
- install libraries to LIBDIR (as before, no need for adjusting LD_LIBRARY_PATH)
This commit was SVN r28010.
- fixed build error when the Python bindings were enabled
- fixed conflicts with OpenType Fonts:
- install header files to INCLUDEDIR/otf1
(only when building outside VT/Open MPI)
- install libraries to LIBDIR/otf1
- renamed otfdump to otfprint
This commit was SVN r27983.
flags, and mca flags are kept seperate until the very end. The main configure
wrapper flags should now be modified by using the OPAL_WRAPPER_FLAGS_ADD
macro. MCA components should either let <framework>_<component>_{LIBS,LDFLAGS}
be copied over OR set <framework>_<component>_WRAPPER_EXTRA_{LIBS,LDFLAGS}.
The situations in which WRAPPER CPPFLAGS can be set by MCA components was
made very small to match the one use case where it makes sense.
This commit was SVN r27950.
all pending fragments when the destination goes down. This allows the PML
to recalibrate its behavior, either find an alternate route or just give up.
This commit was SVN r27881.
- library wrapping: Prevent calling dlerror, if the memory allocation wrappers are enabled. dlerror calls realloc which would ends up in an infinite recursion.
This commit was SVN r27869.
actually care if opal_pointer_array is limited to handle_max already passes
that in as the max_size during init, so don't need it there. The arch
constant was a bit more difficult, so pass that in during MPI init and
leave empty otherwise.
This is to help with the effort to allow building ompi against an external
opal or orte.
This commit was SVN r27817.
party configure.in scripts to be configure.ac so that Automake stops
complaining about them.
This commit was SVN r27791.
The following SVN revision numbers were found above:
r27790 --> open-mpi/ompi@675a2f5c48
using the modex or RML to share sm initialization information, have node rank 0
create a file containing initialization information in a well-known place. Then
during add_procs, the rest of the node processes requiring sm BTL initialization
will just read from that file to complete their initialization.
This commit was SVN r27789.
There was a race condition in the eager get protocol where the RDMA complete message could be received before the local completion of the SMSG message that started the eager get protocol.
cmr:v1.7
This commit was SVN r27740.
- configure: fixed warnings from automake 1.12.x
Changes to VT:
- general:
- incremented version number to 5.14.2
- configure:
- fixed warnings from automake 1.12.x
- VT libs / MPI wrappers:
- do initialize VT and enter the dummy main function ("user") if MPI_Initialized is the very first event to be recorded (fixed assertion error)
- leave the dummy main function on the same thread where it is entered (fixes potential stack underflow)
This commit was SVN r27734.
* btl sendi(): if message can be send inline try to avoid signal
* signal is requested one per 64 or when
there are no send wqes
when message can not be send inline
any other btl method then sendi()
This commit was SVN r27724.
- configure: pass the MPI configure options (e.g. --with-mpi-lib, --with-mpi-inc-dir) to the OTF configure, even if MPI compiler wrappers were found
This commit was SVN r27710.
config/ directory. We split them apart a while ago in the hopes that
it would simplify things, but it didn't really (e.g., because there
were still some ompi/opal .m4 files in the top-level config/
directory, resulting in developer confusion where any given m4 macro
was defined).
So this commit consolidates them back into the top-level directory for
simplicity.
There's still (at least) two changes that would be nice to make:
1. Split any generated .m4 file (e.g., autogen-generated .m4 files)
into a separate directory somewhere so that a top-level -Iconfig/
will only get our explicitly defined macros, not the autogen stuff
(e.g., with libevent2019 needing to get the visibility macro, but
NOT all the autogen-generated inclusion of component configure.m4
files).
1. Change configure to be of the form:
{{{
# ...a small amount of preamble/setup...
OPAL_SETUP
m4_ifdef([project_orte], [ORTE_SETUP])
m4_ifdef([project_ompi], [OMPI_SETUP])
# ...a small amount of finishing stuff...
}}}
I doubt we'll ever get anything as clean as that, but that would be
the goal to shoot for.
This commit was SVN r27704.
- compiler wrappers: removed invocation of pdbcomment; it removes essential information from the PDB file for instrumenting functions
This commit was SVN r27687.
- otfprofile: fixed build error when using the IBM XL C++ compiler
Changes to VT:
- configure:
- use AC_CHECK_TYPES instead of AC_CHECK_DECLS to check for PAPI's long_long type
- use AC_C_INLINE to check whether C 'inline' is present
- VT libs:
- set CUPTI tracing as default, if CUDA runtime wrapper has not been built
- added a common interface for all NVIDIA CUPTI interfaces (events, activity, callbacks)
- added support for concurrent kernel tracing (since CUDA 5.0)
- removed almost unused VT_DEBUG env. variable - replaced calls to vt_debug_msg(DBG_LEVEL,...) by vt_cntl_msg (DBG_LEVEL+10,...)
- added some more ifdefs to new CUDA 5 features
- added several guards for internal malloc() and free() calls in CUDA related source files
- revised memory allocation tracing:
- intercept memory (de)alloaction functions by library wrapping (replaces deprecated hook technique from the GNU C library)
- added support for multi-threaded applications
- added wrapper functions for memalign, posix_memalign, and valloc
- revised exec,system,fork tracing
- retitled to "Child Process Execution Tracing"
- introduced env. variable VT_EXECTRACE (marked VT_LIBCTRACE as deprecated)
- added wrapper functions for execvpe, fexecve, waitid, wait3, and wait4
- changes default function group name for
- memory (de)allocation functions: "MEM" -> "LIBC-MALLOC"
- I/O functions: "I/O" -> "LIBC-I/O"
- child process execution: "LIBC" -> "LIBC-EXEC"
- plugin counter interface:
- added check for initialized vt_plugin_cntr_info.info
- vtdyn:
- added missing header includes for Dyninst 8
- vtrun:
- do not preload the Dyninst Runtime library; it is loaded by Dyninst itself
This commit was SVN r27679.
* Add some comments in the *-wrapper-data-txt.in files just so that
someone doesn't forget in the future why we link in what we do in
the MPI and ORTE wrapper compilers.
* Update ompi_wrapper_script.in to match the new behavior.
* Update orte_wrapper_script.in to support --openmpi:linkall (which
is a no-op in this case)
This commit was SVN r27672.
The following Trac tickets were found above:
Ticket 3422 --> https://svn.open-mpi.org/trac/ompi/ticket/3422
additional functionality. Rationale (refs trac:3422):
* Normal MPI applications only ever use the MPI API. Hence, -lmpi is
sufficient (they'll never directly call ORTE or OPAL
functions). This is arguably the most common case.
* That being said, we do have some test programs (e.g., those in
orte/test/mpi) that call MPI functions but also call ORTE/OPAL
functions. I've also written the occasional MPI test program that
calls opal_output, for example (there even might be a few tests in
the IBM test suite that directly call ORTE/OPAL functions).
* Even though this is not a common case, these applications should
also compile/link with mpicc.
* So we should add a --openmpi:linkall option that will also link
in whatever is necessary to call ORTE/OPAL functions
* Yes, we could hard-code "-lopen-rte -lopen-pal" in Makefiles, but
we do reserve the right to change those library names and/or add
others someday, so it's better to abstract out the names and let
the wrapper supply whatever is necessary.
* ORTE programs, however, are different. They almost always call OPAL
functions (e.g., if they want to send a message, they must use the
OPAL DSS). As such, it seems like the ORTE programs should always
link in OPAL.
Therefore:
* Add undocumented --openmpi:linkall flag to the wrapper compilers.
See the comment in opal_wrapper.c for an explanation of what it
does. This flag is only intended for Open MPI developers -- not
end users. That's why it's undocumented.
* Update orte/test/mpi/Makefile.am to add --openmpi:linkall
* Make ortecc/ortec++'s wrapper data text files always explicitly
link in libopen-pal
This commit was SVN r27670.
The following SVN revision numbers were found above:
r27668 --> open-mpi/ompi@cf845897aa
The following Trac tickets were found above:
Ticket 3422 --> https://svn.open-mpi.org/trac/ompi/ticket/3422
1. Restore libopen-pal.la, libopen-rte.la, and libmpi.la to be
separate entities (i.e., don't have libopen-rte.la include
libopen-pal.la, and don't have libmpi.la include libopen-pal.la).
Yay!
1. Consequently, make the wrapper compilers look for flags indicating
that the user wants to compile statically (currently: -static,
!--static, -Bstatic, and "-Wl," in front of all of those). If it
is, follow a 6-way matrix for determinining which libraries to
list on the underlying command line.
1. To support that, add the name of a token static and dynamic
library to look for in each of the wrapper compiler data files.
1. Fix a long-standing typo in the opalcc wrapper data file.
This commit was SVN r27662.
An upcoming BTL from Cisco used ofud as a starting point, and should
probably be used as a starting point for any future UD-based BTL.
And this OFUD BTL is obviously still in history if anyone ever wants
to resurrect it.
This commit was SVN r27655.
of modules, print a BTL_ERROR and exit(1) (previous behavior was to
segv). This at least explicitly tells the developer that their BTL
component is behaving badly.
This commit was SVN r27634.
- general:
- incremented version number to 1.12.2
- OTF library:
- implemented workaround for handling bogus zlib sync. points during reading
Changes to VT:
- general:
- incremented version number ot 5.14.1
- configure:
- to prevent conflicts with the libtool RPM, append "/vampirtrace" to ${datadir}, if it doesn't points to the default (=${datarootdir})
(Fixes trac:3382)
This commit was SVN r27630.
The following Trac tickets were found above:
Ticket 3382 --> https://svn.open-mpi.org/trac/ompi/ticket/3382
fbtl modules. This implmentation in alignment with all other collective modules tries to
keep all the file-ops as contiguous as possible.
This commit was SVN r27611.
Reasoning: The old behavior was a little confusing. mca_base_components_open does not open an output stream so it is a little unexpected that mca_base_components_close does. To add to this several frameworks (that don't use mca_base_components_close) failed to close their output in the framework close function and others closed their output a second time. This change is an improvement to the symantics of mca_base_components_open/close as they are now symetric in their functionality.
This commit was SVN r27570.
pml/v:
- If vprotocol is not being used vprotocol_include_list is leaked. Assume vprotocol never takes ownership (see below) and always free the string.
coll/ml:
- (patch verified) calling mca_base_param_lookup_string after mca_base_param_reg_string is unnecessary. The call to mca_base_param_lookup_string causes the value returned by mca_base_param_reg_string to be leaked.
- Need to free mca_coll_ml_component.config_file_name on component close.
btl/openib:
- calling mca_base_param_lookup_string after mca_base_param_reg_string is unnecessary. The call to mca_base_param_lookup_string causes the value returned by mca_base_param_reg_string to be leaked.
vprotocol/base:
- There was no way for pml/v to determine if vprotocol took ownership of vprotocol_include_list. Fix by always never ownership (use strdup).
mca/base:
- param_lookup will result in storage->stringval to be a newly allocated string if the mca parameter has a string value. ensure this string is always freed.
cmr:v1.7
This commit was SVN r27569.
"not yet implemented" list (it was already in the mpi_api_files
macro, actually)
* Add mistakenly missed palltoall file. Thanks to Brian for
noticing its absence.
This commit was SVN r27554.
It appears the problem was not with the command line parser but the rsh plm. I don't know why this problem was not occuring before the command line parser changes but it appears to be resolved now.
This commit was SVN r27527.
The following SVN revision numbers were found above:
r27451 --> open-mpi/ompi@d59034e6ef
r27456 --> open-mpi/ompi@ecdbf34937
MPI_MAX_INFO_VAL (so that we don't go beyond the end of the string).
But otherwise, abibe by what says in MPI-2.2 / MPI-3.0 in the
description of MPI_INFO_GET (where "it" refers to the valuelen
argument):
If it is less than the actual size of the value, the value is
truncated. In C, valuelen should be one less than the amount of
allocated space to allow for the null terminator.
cmr:v1.7
This commit was SVN r27522.
The following SVN revision numbers were found above:
r27292 --> open-mpi/ompi@fb4af5e29c
Not sure what happened here, but the resulting trunk wouldn't even configure. After spending time fixing that problem, I found it wouldn't compile due to multiple syntax errors that had been introduced in both the OPAL and OMPI layer. This raised questions as to the completeness of the work.
Given that the author is departing, I pinged Jeff about it and we agreed to revert this for now. Hopefully, it can either be fixed by the author prior to actual departure, or someone else can pick it up (now that it is in the history) and fix it.
This commit was SVN r27511.
The following SVN revision numbers were found above:
r27508 --> open-mpi/ompi@12c3c743de
r27509 --> open-mpi/ompi@79e4a8ca38
r27510 --> open-mpi/ompi@1ad5ff625a
* Only register the progress function on first call to a non-blocking
collective operation, to try to reduce overall performance impact
* Fix tag management in roll-over case
This commit was SVN r27498.
# The processes register their information and continue.
# Actual printing of timing information happens at file close.
# Triggered by MCA parameter at runtime
This commit was SVN r27442.
# The processes register their information and continue.
# Actual printing of timing information happens at file close.
# Triggered by MCA parameter at runtime
This commit was SVN r27441.
# The processes register their information and continue.
# Actual printing of timing information happens at file close.
# Triggered by MCA parameter at runtime
This commit was SVN r27440.
# This is triggered based on a mca-paramater and can be used with all collective modules.
# Individual queues maintained for read and write.
# The additional communication to combine data is done at file-close so that the
actual timing of collective-operations will not get affected.
# The queues are initialized in file-open
This commit was SVN r27439.
- fix the Fortran layer to use new macros to convert Fortran-to-C status
- change the C internals to pull out old OMPI_SET_STATUS* macros
Also, change name of "status" argument in topo_test_f.c to "topo_type".
This commit was SVN r27403.
* Add OMPI_COMMON_VERBS_FLAGS_NOT_RC, which looks for a device that
does ''not'' support RC
* Add ompi_common_verbs_find_max_inline(), and remove that code from
the openib BTL component
This commit was SVN r27393.
ompi/mca/sbgp/basesmsocket
orte/mca/rmaps/lama
Remove stale configure.params files from the sbgp framework as the OMPI build system no longer looks at those files.
This commit was SVN r27377.
1. Multiple aggregator with non-contiguous datatype,
2. Memory corruption bugs.
Cleaned version, with proper initialization and memory management.
This commit was SVN r27370.
* Moved "check basics" sanity check from openib BTL to common/verbs
(which also allows us to have openib ''not'' include
<infiniband/driver.h>, which is a Very Good Thing)
* Add new ompi_common_verbs_qp_test() function, which tests to see
whether a device supports RC and/or UD QPs. The openib BTL now
uses this function to ensure that the device supports RC QPs.
* Rename ompi_common_verbs_find_ibv_ports() to be
ompi_common_verbs_find_ports() -- the "ibv" was redundant.
* Re-work ompi_common_verbs_find_ports() to use
ompi_common_verbs_qp_test() instead of testing for RC/UD QPs itself
* Add bunches of opal_output_verbose() to the find_ports() routine
(to help diagnosing connectivity problems -- imaging running with
--mca btl_base_verbose 10; you'll see all the find_ports() test
results)
* Make ompi_common_verbs_qp_test() warn if devices/ports are supplied
in the if_include/if_exclude strings that do not exists (quite
similar to what the openib BTL does today).
* Add ompi_common_verbs_mca_register() function, which registers
common verbs MCA params. It will also register MCA param synonyms
for thse MCA params to upper-level components (e.g.,
btl_<upper-level-component>_<the-mca-param>).
* common_verbs_warn_nonexistent_if: warn if
if_include/if_exclude-specified devices or ports do not exist.
This commit was SVN r27332.
- otfaux:
- fixed build error on Solaris and NetBSD (removed -lm from library dependencies)
Changes to VT:
- vtunify:
- disable OpenMP parallelization if PGI compiler version < 9 is used (threadprivate not supported)
This commit was SVN r27316.
* Minor man page tweaks
* Use existing ompi_mpi_thread_requested global
This commit was SVN r27308.
The following Trac tickets were found above:
Ticket 3309 --> https://svn.open-mpi.org/trac/ompi/ticket/3309
We ran into a case where the OMPI SVN trunk grew a new acceptable MCA
parameter value, but this new value was not accepted on the v1.6
branch (hwloc_base_mem_bind_failure_action -- on the trunk it accepts
the value "silent", but on the older v1.6 branch, it doesn't). If you
set "hwloc_base_mem_bind_failure_action=silent" in the default MCA
params file and then accidentally ran with the v1.6 branch, every OMPI
executable (including ompi_info) just failed because hwloc_base_open()
would say "hey, 'silent' is not a valid value for
hwloc_base_mem_bind_failure_action!". Kaboom.
The only problem is that it didn't give you any indication of where
this value was being set. Quite maddening, from a user perspective.
So we changed the ompi_info handles this case. If any framework open
function return OMPI_ERR_BAD_PARAM (either because its base MCA params
got a bad value or because one of its component register/open
functions return OMPI_ERR_BAD_PARAM), ompi_info will stop, print out
a warning that it received and error, and then dump out the parameters
that it has received so far in the framework that had a problem.
At a minimum, this will show the user the MCA param that had an error
(it's usually the last one), and ''where it was set from'' (so that
they can go fix it).
We updated ompi_info to check for O???_ERR_BAD_PARAM from each from
the framework opens. Also updated the doxygen docs in mca.h for this
O???_BAD_PARAM behavior. And we noticed that mca.h had MCA_SUCCESS
and MCA_ERR_??? codes. Why? I think we used them in exactly one
place in the code base (mca_base_components_open.c). So we deleted
those and just used the normal OPAL_* codes instead.
While we were doing this, we also cleaned up a little memory
management during ompi_info/orte-info/opal-info finalization.
Valgrind still reports a truckload of memory still in use at ompi_info
termination, but they mostly look to be components not freeing
memory/resources properly (and outside the scope of this fix).
This commit was SVN r27306.
The following Trac tickets were found above:
Ticket 3275 --> https://svn.open-mpi.org/trac/ompi/ticket/3275
- configure: do not build OpenMP support if CFLAGS contains a compiler flag for disabling handling OpenMP directives (e.g. -fno-openmp, -nomp, -hnoomp)
Fixes trac:3117
This commit was SVN r27282.
The following Trac tickets were found above:
Ticket 3117 --> https://svn.open-mpi.org/trac/ompi/ticket/3117
- general:
- incremented version number ro 1.11.3openmpi
- otfaux:
- fixed build error when using the Oracle compiler on Solaris. (removed usage of too recent rint() function)
Fixes trac:3257
This commit was SVN r27279.
The following Trac tickets were found above:
Ticket 3257 --> https://svn.open-mpi.org/trac/ompi/ticket/3257
patch that helped them test (r27184). Thanks Abosoft!
This commit was SVN r27277.
The following SVN revision numbers were found above:
r27184 --> open-mpi/ompi@a951a5ee99
ompi_comm_split(), and the entire set of periods from the old
communicator have already been copied to the new communicator. But up
here in mca_topo_base_cart_sub(), we need to subset the periods that
are actually stored on the new communicator according to remain_dims
(just like we did for the set of dimensions).
This commit renames a few variables to be a little less misleading,
and then adds a loop to copy over the periods information. I could
have added this into the first loop (that subset-copies the
dimensions), but this code is already confusing enough and this is not
a performance-critical section: so I made it a new loop.
Note that all the topo code will be revamped a bit when the new
MPI-2.2 topo stuff (currently off in a mercurial branch) finally makes
it back to the SVN trunk. But that new stuff will only get to v1.7 --
this commit will need to be CMR'ed to v1.6.x.
cmr:v1.7
cmr:v1.6.2
This commit was SVN r27248.
The following Trac tickets were found above:
Ticket 3294 --> https://svn.open-mpi.org/trac/ompi/ticket/3294