In the OPAL_ENABLE_FT_CR code path there used to be a variable
'mca_base_component_distill_checkpoint_ready' which got removed.
The FT code was not compiling, and while trying to get it to compile
again the old variable was #ifdef'd out. This re-introduces the
variable with a new name 'opal_base_distill_checkpoint_ready'
and enables the code previously #ifdef'd out.
This removes the last hack introduced to get the FT code to compile
again.
This commit was SVN r30928.
* Include opal_stdint.h so that we have uint32_t
cmr=v1.7.5:ticket=trac:4298
This commit was SVN r30890.
The following Trac tickets were found above:
Ticket 4298 --> https://svn.open-mpi.org/trac/ompi/ticket/4298
1. Changed rng_buff_t --> opal_rng_buff_t
2. All global variables obey the prefix rule
3. Old code has been removed
4. Found a couple of unnecessary includes
Refs trac:4298
This commit was SVN r30807.
The following Trac tickets were found above:
Ticket 4298 --> https://svn.open-mpi.org/trac/ompi/ticket/4298
This adds the code to actually checkpoint a process using CRIU
with the necessary variables to control the behaviour.
Right now only --np 1 and --mca oob tcp are supported.
The following parameters are supported:
* crs_criu_log: name of the log file
* crs_criu_log_level: verbosity level in the log file
* crs_criu_tcp_established: C/R established TCP connections
* crs_criu_shell_job: C/R shell jobs
* crs_criu_ext_unix_sk: allow external unix connections
* crs_criu_leave_running: leave tasks in running state after checkpoint
This commit was SVN r30772.
* Remove redundant/unnecessary uses of $2
* Change a bunch of logic from negative to positive
* Use OPAL_VAR_SCOPE_PUSH/POP to help reduce env var usage
* Only use "" in test statements with strings that require sanitization
* Removed redundant AC_MSG_WARN/ERROR. There's now only one check at
the bottom for whether the component is "good" or not. We'll
AC_MSG_WARN/ERROR in that one location.
Thanks to Jeff Squyres for this patch.
This commit was SVN r30739.
To be able to checkpoint/restart using criu (criu.org) a new
CRS component is added which is based on criu. This first commit
provides the minimal set of functions and configure script options
to enable --with-criu and link against libcriu.so.
No actual checkpoint/restart functionality is yet implemented.
This is only the framework which needs to be filled with the
actual functionality.
This commit was SVN r30666.
Wire the security check into ORTE's OOB handshake, and add a "version" check to ensure that both ends are from the same ORTE version. If not, report the mismatch and refuse the connection.
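The version check described above can be sketched roughly as follows. This is a minimal illustration only: the helper name peer_version_matches() and the use of plain version strings are assumptions, not the actual ORTE OOB code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not the actual ORTE OOB code: compare the version
 * string announced by the peer against our own and refuse on mismatch. */
static bool peer_version_matches(const char *local_version,
                                 const char *peer_version)
{
    assert(NULL != local_version);

    if (NULL == peer_version || 0 != strcmp(local_version, peer_version)) {
        /* Report the mismatch so the user can see why we refused. */
        fprintf(stderr,
                "version mismatch: local %s, peer %s; refusing connection\n",
                local_version,
                NULL == peer_version ? "(none)" : peer_version);
        return false;
    }
    return true;
}
```

A caller would close the socket when this returns false instead of completing the handshake.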
Fixes trac:4171
cmr=v1.7.5:reviewer=jsquyres:subject=Add a security framework for authenticating connections
This commit was SVN r30551.
The following Trac tickets were found above:
Ticket 4171 --> https://svn.open-mpi.org/trac/ompi/ticket/4171
Since Paul is the only one of the team with the required hardware to test it, and he has done so, consider this RM-approved.
cmr=v1.7.5:reviewer=ompi-gk1.7
This commit was SVN r30401.
Thanks to Paul Hargrove for tracking it down.
RM-approved
cmr=v1.7.4:reviewer=ompi-gk1.7
This commit was SVN r30397.
The following SVN revision numbers were found above:
r26226 --> open-mpi/ompi@12781482b9
Refs trac:4117. Please use this commit rather than the patch attached to
the ticket; the patch had a few mistakes in the tweaked wording.
This commit was SVN r30362.
The following SVN revision numbers were found above:
r30298 --> open-mpi/ompi@58479399c3
The following Trac tickets were found above:
Ticket 4117 --> https://svn.open-mpi.org/trac/ompi/ticket/4117
work around buggy NUMA node cpusets (i.e., buggy BIOSs).
Thanks to Jeff Becker for reporting the issue.
Submitted by Brice Goglin, reviewed by Jeff Squyres.
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30306.
The original code was passing a union by value, and doing odd things
on Solaris/SPARC (where "odd" rhymes with "SIGBUS"). Replace it with
an exploded switch/case block for all the enum values. Also use the
string literals so that we get compiler checking of the format string
vs. the type of the actual arguments.
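A minimal sketch of the switch/case approach described above. The enum, union, and function names are hypothetical stand-ins, not the actual MCA base var definitions. The union is passed by pointer and each case uses its own literal format string:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the variable types and storage union;
 * the real definitions in the MCA base var code may differ. */
typedef enum {
    VAR_TYPE_INT,
    VAR_TYPE_UNSIGNED_LONG,
    VAR_TYPE_BOOL,
    VAR_TYPE_STRING
} var_type_t;

typedef union {
    int intval;
    unsigned long ulval;
    bool boolval;
    const char *stringval;
} var_storage_t;

/* Format a value into buf. The union is passed by pointer, never by value,
 * and every case uses a literal format string so the compiler can check it
 * against the argument type. Returns the snprintf() result, or -1 for an
 * unknown type. */
static int var_value_to_string(var_type_t type, const var_storage_t *value,
                               char *buf, size_t len)
{
    assert(NULL != value && NULL != buf);

    switch (type) {
    case VAR_TYPE_INT:
        return snprintf(buf, len, "%d", value->intval);
    case VAR_TYPE_UNSIGNED_LONG:
        return snprintf(buf, len, "%lu", value->ulval);
    case VAR_TYPE_BOOL:
        return snprintf(buf, len, "%s", value->boolval ? "true" : "false");
    case VAR_TYPE_STRING:
        return snprintf(buf, len, "%s",
                        NULL != value->stringval ? value->stringval : "");
    }
    return -1;
}
```

Because there is no default case, a compiler with enum-switch warnings enabled can also flag a newly added enum value that lacks a case.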
cmr=v1.7.4:reviewer=hjelmn:subject=Fix MCA base var to not pass union by value
This commit was SVN r30276.
variable names for deprecated variables.
Closes trac:3270
cmr=v1.7.4:reviewer=jsquyres
This commit was SVN r30275.
The following Trac tickets were found above:
Ticket 3270 --> https://svn.open-mpi.org/trac/ompi/ticket/3270
* Fix some typos in macro names.
* Add case for OS's that have statfs() but no struct statfs (!).
* Add case for NetBSD with struct statvfs.f_fstypename.
Many thanks to Paul Hargrove who developed the majority of this patch.
Reviewed by Jeff Squyres.
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30255.
Change the logic in bind.c to only include <malloc.h> if we don't have posix_memalign.
In http://www.open-mpi.org/community/lists/devel/2014/01/13619.php,
Paul Hargrove found a compiler warning on OpenBSD where <malloc.h>
exists, but is not intended to be used (and doesn't error out, so
AC_CHECK_HEADERS says it's OK).
Reviewed by Brice Goglin.
cmr=v1.7.4:reviewer=ompi-rm1.7
This commit was SVN r30234.
As noted by Paul Hargrove, the #if's surrounding the use of statfs()
and statvfs() in opal/util/path.c have apparently gotten stale (e.g.,
modern flavors of *BSD OSs no longer define __BSD). Changes:
* Add statfs and statvfs to the AC_CHECK_FUNCS in configure.ac
* Add a sanity check to ensure that we have at least one of statfs()
or statvfs(). Add a similar sanity check in opal/util/path.c, just
as defensive programming.
* Use AC_CHECK_MEMBERS in configure.ac to check for specific struct
statfs/struct statvfs members that we use in opal/util/path.c
* In path.c, add some #includes as listed on the OS man page for
statfs(2) (OS X 10.8.5/Mountain Lion)
* The previous code used statvfs() on Solaris and statfs() everywhere
else. Attempting to replicate this with behavior-based configure
testing led to fairly complicated if/else logic, so the new code
uses whichever of the two is available (i.e., it might actually
use both -- OS X 10.8.5 and RHEL 6.5 have both statfs() and
statvfs()). The rationale here is that we don't really care which
of the two functions reports the answer; we'll take the answer
regardless of where it comes from. For example, if one function
returns a failure and the other does not, we'll use the results
from the successful function and ignore the failed one.
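The "use whichever is available" logic above can be sketched as follows. This is an illustration under stated assumptions, not the code in opal/util/path.c: the function name fs_block_size() is hypothetical, f_bsize is just an example field, and the sketch assumes POSIX statvfs() when no HAVE_* macro has been supplied by configure.

```c
#include <stdbool.h>
#include <stddef.h>

/* In a real build, AC_CHECK_FUNCS would define HAVE_STATFS / HAVE_STATVFS.
 * Assumption for this standalone sketch: fall back to POSIX statvfs(). */
#if !defined(HAVE_STATFS) && !defined(HAVE_STATVFS)
#define HAVE_STATVFS 1
#endif

#if defined(HAVE_STATFS)
#if defined(__linux__)
#include <sys/vfs.h>     /* statfs() on Linux */
#else
#include <sys/param.h>   /* statfs() on OS X and the BSDs */
#include <sys/mount.h>
#endif
#endif
#if defined(HAVE_STATVFS)
#include <sys/statvfs.h>
#endif

/* Hypothetical helper: report the filesystem block size for path.
 * Every available function is tried; a failure in one is ignored as long
 * as the other succeeds. Returns false only if nothing succeeded. */
static bool fs_block_size(const char *path, unsigned long *block_size)
{
    bool found = false;

#if defined(HAVE_STATFS)
    {
        struct statfs sfs;
        if (0 == statfs(path, &sfs)) {
            *block_size = (unsigned long) sfs.f_bsize;
            found = true;
        }
    }
#endif
#if defined(HAVE_STATVFS)
    {
        struct statvfs svfs;
        if (0 == statvfs(path, &svfs)) {
            *block_size = (unsigned long) svfs.f_bsize;
            found = true;
        }
    }
#endif
    return found;
}
```

The compile-time sanity check mentioned above would correspond to an #error when neither macro is defined; the sketch replaces that with the statvfs() fallback so it builds on its own.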
This new code seems to work on OS X and Linux. We'll have to see what
happens with MTT and future Paul Hargrove testing...
cmr=v1.7.4:reviewer=ompi-rm1.7:subject=Make statfs/statvfs more robust
This commit was SVN r30198.
configury/Makefile.am changes; this commit renames the internal
installdirs.h framework struct field names to match the configury macro
names:
* pkgdatadir -> ompidatadir
* pkglibdir -> ompilibdir
* pkgincludedir -> ompiincludedir
This commit was SVN r30145.
The following SVN revision numbers were found above:
r30140 --> open-mpi/ompi@8b778903d8
pkg{data,lib,includedir}, use our own ompi{data,lib,includedir}, which is
always set to {datadir,libdir,includedir}/openmpi. This will keep us from
having help files in prefix/share/open-rte when building without Open MPI,
but in prefix/share/openmpi when building with Open MPI.
This commit was SVN r30140.
Right now the C/R code fails because of a change introduced in
opal/mca/compress/base/compress_base_open.c and
opal/mca/crs/base/crs_base_open.c in 2013 with commit
git 734c724ff76d9bf814f3ab0396bcd9ee6fddcd1b
svn r28239
Update OPAL frameworks to use the MCA framework system.
That commit changed a lot, including the return value of functions from
OPAL_SUCCESS to OPAL_ERR_NOT_AVAILABLE.
This commit lets opal_compress_base_register() and opal_crs_base_open()
always return OPAL_SUCCESS and removes unneeded #includes.
This commit was SVN r30130.
The following SVN revision numbers were found above:
r28239 --> open-mpi/ompi@365cf48db5
r30086: make sure that a super item is constructed properly).
Refs trac:4035
This commit was SVN r30090.
The following SVN revision numbers were found above:
r30086 --> open-mpi/ompi@d1c63f878e
The following Trac tickets were found above:
Ticket 4035 --> https://svn.open-mpi.org/trac/ompi/ticket/4035
Thanks to Tetsuya Mishima for identifying the problem and providing the patch!
cmr=v1.7.4:reviewer=jsquyres:subject=Fix LAMA mapper for PGI compilers
This commit was SVN r30086.
Fix comm_spawn on a single host - with the new default mapping scheme, we were incorrectly computing the number of procs to put on the node.
Refs trac:4003
This commit was SVN r30033.
The following Trac tickets were found above:
Ticket 4003 --> https://svn.open-mpi.org/trac/ompi/ticket/4003
- Use ->boolval for booleans when creating a string.
- Solaris has some issue with the ?: used in one of the find functions. Use an if instead.
- Change all instances of index -> vari to avoid issues with redefining index.
cmr=v1.7.4:reviewer=jsquyres
This commit was SVN r29997.
atomics in the critical path and are not currently used. We can bring them back if there
turns out to be a good use for them.
cmr=v1.7.4:reviewer=brbarret
This commit was SVN r29994.
added on top of r29991 and:
* Consolidates the _debug variables in opal_datatype_internal.h and
opal_convertor.h
* Puts the DO_DEBUG macros back in the .c files, because they are
slightly different from each other
Refs trac:4004
This commit was SVN r29992.
The following SVN revision numbers were found above:
r29991 --> open-mpi/ompi@a88e143127
The following Trac tickets were found above:
Ticket 4004 --> https://svn.open-mpi.org/trac/ompi/ticket/4004
and extern'ed as an int in another. This caused a SIGBUS on
Solaris/SPARC.
This commit properly moves the extern to a .h file so that it's the
same in all files. It also moves the DO_DEBUG to the header file,
because it was defined to the same thing in multiple .c files.
cmr=v1.7.4:reviewer=bosilca:subject=fix SPARC SIGBUS in opal convertor code
This commit was SVN r29991.
The following Trac tickets were found above:
Ticket 3989 --> https://svn.open-mpi.org/trac/ompi/ticket/3989
and preventing access to potentially unaligned data.
Reviewed by Dave Goodell. Tested by Siegmarr Gross.
cmr=v1.7.4:reviewer=ompi-rm1.7:subject=fix SPARC SIGBUS in opal net code
This commit was SVN r29983.
The following Trac tickets were found above:
Ticket 3990 --> https://svn.open-mpi.org/trac/ompi/ticket/3990
Reset topology usage for each node as we bind, since multiple nodes may be linked to the same topology object. This will need to be revisited for scale, as it takes some non-zero time to reset the usage on each iteration. However, storing individual topology objects for every node consumes memory, so it's a tradeoff.
cmr=v1.7.4:reviewer=jsquyres:subject=Eliminate excessive binding/memory warnings
This commit was SVN r29978.
compiled with ummunotify support (which is the check that r29720 just
recently added).
This commit was SVN r29961.
The following SVN revision numbers were found above:
r29720 --> open-mpi/ompi@ae8c826527
includes various fixes all over the C/R code which are
hard to group like the other patches.
Changes from V1:
* explain why mca_base_component_distill_checkpoint_ready no longer works
* compare return result of opal functions with OPAL_* values
Changes from V2:
* use orte_rml_oob_ft_event() instead of referencing through the modules
* properly protect variable (thanks to --enable-picky)
This commit was SVN r29922.
should be already set to the right value. This fixes a problem
identified by Guillaume Gouaillardet, where using a single
persistent receive leads to leaking the convertor stack memory.
Refs trac:3956
cmr=v1.7.4:reviewer=jsquyres:subject=Correctly handle the convertor internal stack for persistent receives.
This commit was SVN r29920.
The following Trac tickets were found above:
Ticket 3956 --> https://svn.open-mpi.org/trac/ompi/ticket/3956
* default to bind-to core
* map-by slot if np=2
* map-by socket (balance across sockets on each node) if np > 2
* map-by <obj> will imply rank-by <obj> by default (leave default binding as above)
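The defaults listed above can be sketched as a small selection function. The type and function names are hypothetical, not the ORTE mapper code, and treating np=1 the same as np=2 is an assumption, since the list only names np=2 and np > 2.

```c
#include <assert.h>

/* Hypothetical policy types for illustration only. */
typedef enum { MAP_BY_SLOT, MAP_BY_SOCKET } map_policy_t;
typedef enum { BIND_TO_NONE, BIND_TO_CORE } bind_policy_t;

typedef struct {
    map_policy_t map_by;
    bind_policy_t bind_to;
} launch_defaults_t;

/* Pick defaults when the user gave no mapping or binding directive. */
static launch_defaults_t choose_defaults(int np)
{
    launch_defaults_t d;

    assert(np >= 1);

    /* Binding defaults to core regardless of the process count. */
    d.bind_to = BIND_TO_CORE;

    /* Assumption: np=1 is handled like np=2 (map-by slot); larger jobs
     * are balanced across the sockets of each node. */
    d.map_by = (np > 2) ? MAP_BY_SOCKET : MAP_BY_SLOT;
    return d;
}
```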
Fix a bug in the map-by <obj> mapper where we incorrectly computed the #procs to assign if the #slots > #procs.
cmr=v1.7.4:reviewer=jsquyres:subject=Update default binding and mapping values
This commit was SVN r29919.
got linked together (work on one caused work in the other):
* Clean up a bunch of VAR_SCOPE issues in configure. This includes:
* Using VAR_SCOPE_PUSH and VAR_SCOPE_POP in more places
* Cleaning up the use of some shell variables (e.g., name them better)
* Add support for external libevent via
--with-libevent=<dir-to-libevent-install-tree>, as specifically
asked for by downstream packagers.
* Revamp how wrapper compiler RPATH (and RUNPATH) support is done.
The external libevent work exposed weaknesses in how the original
RPATH/RUNPATH work was done, so we had to re-do it to be a bit more
robust.
This work has not yet been tested on Solaris.
Refs trac:3694
This commit was SVN r29899.
The following Trac tickets were found above:
Ticket 3694 --> https://svn.open-mpi.org/trac/ompi/ticket/3694
more:
- Remove OPAL_ENABLE_MULTI_THREADS, since it didn't really do anything
correctly. Opal always has threads enabled at this point.
- Remove OMPI_ENABLE_PROGRESS_THREADS, since this hasn't worked in
8 years and it has performance issues we'll never be able to
overcome. Note that we have plans for re-adding async progress, using
a hybrid protocol of async and sync sends.
- OMPI_ENABLE_THREAD_MULTIPLE now determines whether the thread lock
macros do the check or not.
- Condition variables are ALWAYS polling right now, which fixes the thread
live-lock currently found when THREAD_MULTIPLE is turned on.
This commit was SVN r29891.