party/"vendor" import, the changes are actually far smaller than the
size of this changeset implies. Here's a list of the changes:
* Update the AMD license header in plpa_map.c to be less restrictive
(see https://svn.open-mpi.org/trac/plpa/changeset/262 for details)
-- '''this is the most/only important change of this update.''' No
code is changed by this; it only removes a clause from a license
header in plpa_map.c.
* Changes to the generated {{{configure}}}, {{{config.guess}}}, and
{{{config.sub}}} scripts (which aren't used by OMPI).
* soname version tracking changes (which also aren't used by OMPI;
they're only used when PLPA is built/installed in "standalone"
mode).
* Update the "get version" m4 (which was stolen from OMPI's m4 to
begin with, and is only used during OMPI's autogen.sh step).
* Update various PLPA version numbers to 1.3.2.
* Bug fix in plpa-taskset (which is not built in the OMPI PLPA build).
This commit was SVN r22367.
to Eugene, Jeff, and Briand for the help. This patch is supposed to
fix several outstanding issues, notably the one in ticket #2043.
This commit was SVN r22324.
:-)
Okay, clean up the prior commit so that the default component search path shows in ompi_info, and remains available in component_find.
This commit was SVN r22278.
"my_perfect_path":SYSTEM_DEFAULT:USER_DEFAULT
and OPAL will substitute its internally derived values for the defaults (instead of forcing the user to figure them out).
This commit was SVN r22272.
friends also receive &argc and &argv (George asked Jeff to Ralph to
review before committing). The thought is that passing argv and argc
to opal/orte_init may be useful to other projects outside of OMPI that are
using OPAL and/or ORTE (especially in conjunction with some other
bootstrapping code where it is helpful to modify argv). It's such a
small thing that it's easy to apply here to make others' lives a
little easier.
Ask George for more details; I'm just the messenger. :-)
Judging by the copyrights on this patch, it's been around for a
while. :-)
This commit was SVN r22260.
Add orte configuration option to control the use of the framework in the system. Although the code will build, it will not be active unless configured with --enable-bootstrap.
If bootstrap is enabled and the new opal_sysinfo framework can successfully determine the cpu model, pass that info to the application as an MCA param to support some work at Sun.
Also, have daemons report back the resources they find to guide process mapping in bootstrap operations (i.e., where the daemon starts at node boot as opposed to being launched at application start).
Adjust some platform files to enable these capabilities.
This commit was SVN r22244.
PARAM_WINDOWS_FILES is a mistake or not). Fixes trac:2079.
This commit was SVN r22171.
The following Trac tickets were found above:
Ticket 2079 --> https://svn.open-mpi.org/trac/ompi/ticket/2079
pass that on to callers of opal_cmd_line_make_opt_mca().
Thanks to Thomas Naughton III.
- Additionally, cmd-line parameters passed in table to opal_cmd_line_create()
may be wrong (e.g. OPAL_ERR_BAD_PARAM), which may be missed in the
loop.
This commit was SVN r22153.
Continue the reorganization of the configure system. Move files from the main config directory to their appropriate level-specific config directories. Modify the configure system to correctly handle compiler detection, test, and setup so that all things pertaining to opal and orte are done at the lower level, with the ompi configure system only looking at mpi-specific options.
Ensure the wrapper compilers for orte and ompi only get built when appropriate. Add support for c++ to the orte wrapper compilers, both script and non-script versions.
This commit was SVN r22138.
Re-enable "./autogen.sh -no-ompi" again. If you -no-ompi, the entire OMPI
configury is skipped and the entire ompi/ subtree is not built. There's
some simple m4-isms that prune out the relevant parts.
I added ompi/config/, orte/config/, and opal/config/ directories. I moved a
bunch of m4 files from the top-level config/ dir into ompi/config/, and a few
into orte/config/.
Note that all 3 <project>/config directories have a config_files.m4 file. This
file contains the AC_CONFIG_FILES list for that project. The AC_CONFIG_FILES
call cannot be in an AC_DEFUN macro and conditionally called -- if it is
included at all, Autoconf will process it. Hence, these config_files.m4 files
don't AC_DEFUN -- they just have AC_CONFIG_FILES. m4_ifdef() is used to
conditionally include the files or not.
I moved a bunch of obvious OMPI-only m4 files from config/ to ompi/config/,
but I'm sure that there's more that could go. A ticket will be filed with
thoughts on future work in this area.
This commit was SVN r22113.
'.', we should still find the executable - it is in a directory beneath us.
In other words, if someone gives us "foo/bar" instead of "./foo/bar", we should still be able to find bar
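The rule above can be sketched as a small C predicate. This is only an illustration, not the OMPI lookup code: `is_relative_to_cwd` is a hypothetical name, and the sketch assumes the same convention as POSIX execvp(), where a name containing a slash is resolved as given rather than searched for in PATH.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper (not OMPI code): true if `name` should be resolved
 * relative to the current working directory instead of via PATH.
 * "foo/bar" and "./foo/bar" both qualify; "bar" and "/abs/bar" do not. */
static bool is_relative_to_cwd(const char *name)
{
    if (name == NULL || name[0] == '\0' || name[0] == '/') {
        return false;   /* empty or absolute path */
    }
    /* Any embedded '/' means the caller named a directory beneath us. */
    return strchr(name, '/') != NULL;
}
```

With this check, "foo/bar" is treated exactly like "./foo/bar", while a bare "bar" still goes through the normal search.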
This commit was SVN r22110.
posix_memalign() will either return 0 or not, indicating success. And
if posix_memalign() fails, it's not always going to be due to
out-of-memory -- just return ERR_IN_ERRNO.
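A minimal sketch of the return-value check described above, assuming a hypothetical wrapper named `alloc_aligned` (this is not the OMPI code, and it hands back the raw error number rather than an OPAL error constant):

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper (not OMPI code): allocate `size` bytes aligned to
 * `alignment`. Returns 0 on success, or the error number reported by
 * posix_memalign(); *out is set to NULL on failure. */
static int alloc_aligned(void **out, size_t alignment, size_t size)
{
    void *p = NULL;
    int rc = posix_memalign(&p, alignment, size);

    if (rc != 0) {
        /* Failure is not always out-of-memory (ENOMEM): EINVAL means the
         * alignment was not a power of two or not a multiple of
         * sizeof(void *). */
        *out = NULL;
        return rc;
    }
    *out = p;
    return 0;
}
```

The caller tests the return value against 0 and does not inspect the pointer to detect failure.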
This commit was SVN r22070.
MAP_PRIVATE. We didn't catch this because we checked for a NULL
return, not a -1 return. Doh! Thanks again to Julian Seward for
continuing to track this down.
This commit was SVN r22062.
opposite of MAKE_MEM_DEFINED. Also add in a call to NOACCESS to
(mostly) reverse the effects of MAKE_MEM_DEFINED (technically, page 0
was accessible before this, even though it's a Bad Idea to access it).
This commit was SVN r22056.
This commit does a bunch of things:
* Address all remaining code review items from CMR #2023:
* Defer mmap setup to be lazy; only set it up the first time we
invoke a collective. In this way, we don't penalize apps that
make lots of communicators but don't invoke collectives on them
(per #2027).
* Remove the extra assignments of mca_coll_sm_one (fixing a
convertor count setup that was the real problem).
* Remove another extra/unnecessary assignment.
* Increase libevent polling frequency when using the RML to
bootstrap mmap'ed memory.
* Fix a minor procs-related memory leak in btl_sm.
* Commit a datatype fix that George and I discovered along the way to
fixing the coll sm.
* Improve error messages when mmap fails, potentially trying to
de-alloc any allocated memory when that happens.
* Fix a previously-unnoticed confusion between extent and true_extent
in coll sm reduce.
This commit was SVN r22049.
The following Trac tickets were found above:
Ticket 2023 --> https://svn.open-mpi.org/trac/ompi/ticket/2023
This commit looks larger than it really is since it includes a fair amount of code cleanup.
The SIGSTOP/SIGCONT+checkpointing work uses some of the functionality in r20391. Basic use case below (note that the checkpoint generated is useable as usual if the stopped application is terminated).
{{{
shell 1) mpirun -np 2 -am ft-enable-cr my-app
... running ...
shell 2) ompi-checkpoint --stop -v MPIRUN_PID
[localhost:001300] [ 0.00 / 0.20] Requested - ...
[localhost:001300] [ 0.00 / 0.20] Pending - ...
[localhost:001300] [ 0.01 / 0.21] Running - ...
[localhost:001300] [ 1.01 / 1.22] Stopped - ompi_global_snapshot_1234.ckpt
Snapshot Ref.: 0 ompi_global_snapshot_1234.ckpt
shell 2) killall -CONT mpirun
... Application Continues execution in shell 1 ...
}}}
Other items in this commit are mostly cleanup that has been sitting off-trunk for too long:
* Add a new {{{opal_crs_base_ckpt_options_t}}} type that encapsulates the various options that could be passed to the CRS. Currently only TERM and STOP, but this makes adding others ''much'' easier.
* Eliminate ORTE_SNAPC_CKPT_STATE_PENDING_TERM, since it served a redundant purpose with the new options type.
* Lay some basic ground work for some future features.
This commit was SVN r21995.
The following SVN revision numbers were found above:
r20391 --> open-mpi/ompi@0704b98668
masks between the mask argument and a local PLPA mask. There were three
problems:
1) The "get" function computed the number of bits as sizeof(mask),
which is the size of the pointer to the mask rather than the mask
itself. So, only 4 bits were copied with -m32 and 8 bits with -m64.
There are actually 1024 bits.
2) The "get" and "set" functions both copied a number of bits computed
from the sizeof() mask, but sizeof() reports the number of bytes.
We have to multiply by 8 to get the number of bits.
3) These two functions check to make sure that the mask argument is not
bigger than the PLPA mask. But, the set function copies a number
of bits in the PLPA mask, which is conceivably greater than the
number of bits in the mask argument. So, accesses to the mask
argument may overrun that argument.
Problems 1 and 2 meant that one would encounter errors when the number of
cores exceeded 4 (with -m32) or 8 (with -m64). Problem 3 probably caused
no errors.
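The three problems can be shown in a few lines of C. This is a sketch only: `demo_mask_t` is a hypothetical stand-in for a 1024-bit mask, not the real PLPA type, and the function names are made up for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical stand-in for a 1024-bit affinity mask (not the PLPA type). */
typedef struct {
    unsigned long bits[1024 / (CHAR_BIT * sizeof(unsigned long))];
} demo_mask_t;

/* Problems 1 and 2: sizeof(mask) is the size of the pointer, in bytes. */
static size_t num_bits_buggy(const demo_mask_t *mask)
{
    return sizeof(mask);
}

/* Fixed: size of the pointed-to object, converted from bytes to bits. */
static size_t num_bits_fixed(const demo_mask_t *mask)
{
    return sizeof(*mask) * CHAR_BIT;
}

/* Problem 3: never copy more bits than the smaller of the two masks holds. */
static size_t bits_to_copy(size_t arg_bits, size_t plpa_bits)
{
    return arg_bits < plpa_bits ? arg_bits : plpa_bits;
}
```

On a typical 64-bit build the buggy version yields 8 and the fixed version yields 1024, which matches the core-count limits described above.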
This commit was SVN r21993.
don't set the ref count to 1, it has been already set by the call to OBJ_NEW
when the type was allocated. This fixes ticket #2014.
This commit was SVN r21976.
The new options work by adding an ":if-avail" qualifier to the "bind-to-socket" and "bind-to-core" MCA params. If the system does not support this capability, the job will launch anyway. Without the qualifier, the job will abort with an error message indicating that the required functionality is not supported on this system.
This commit was SVN r21975.
As noted in http://www.open-mpi.org/community/lists/devel/2009/08/6741.php,
we do not correctly free a dupped predefined datatype.
The fix is a bit more involved. See ticket for details.
Tested with ibm tests and mpi_test_suite (though there are two "old" failures:
zero5.c and zero6.c)
Thanks to Lisandro Dalcin for bringing this up.
This commit was SVN r21929.
The following Trac tickets were found above:
Ticket 2014 --> https://svn.open-mpi.org/trac/ompi/ticket/2014