openmpi/ompi
Ralph Castain b9893aacc5 Add a sensor framework to ORTE that monitors applications and notifies the errmgr when they exceed specified boundaries. Two modules are included here:
1. file activity - can monitor file size, access and modification times. If these fail to change over a specified number of sampling iterations (rate is an mca param), then the errmgr is notified.

2. memory usage - checks amount of memory used by a process. Limit and sampling rate can be set.

This support must be enabled by configuring --enable-sensors.

ompi_info and orte-info have been updated to include the new framework.

Also includes some initial steps toward restoring the recovery capability. Most notably, the ODLS API has been extended to include a "restart_proc" entry for restarting a local process, and the various ERRMGR framework globals are now organized into a single struct, as we do in the other ORTE frameworks. Also fixes an oversight in the ERRMGR framework where a pointer array was constructed but not initialized.

Implementation continues.

This commit was SVN r23043.
2010-04-26 22:15:57 +00:00
attribute - Since r22727 orte_app_idx_t was introduced, being a uint32_t (was 2010-03-08 22:56:33 +00:00
class fixes trac:2351 - race in use of ompi free lists 2010-03-25 03:38:14 +00:00
communicator Merge in the modified thread configure option branch per today's telecon. 2010-03-16 23:10:50 +00:00
config Merge in the modified thread configure option branch per today's telecon. 2010-03-16 23:10:50 +00:00
contrib - fixed detection of older PGI compilers on CrayXT platforms 2010-04-20 10:33:02 +00:00
datatype Refs trac:2273 2010-03-16 00:47:10 +00:00
debuggers - This bites us with make check (read MTT) on static builds (read jaguar) 2010-01-25 23:41:59 +00:00
errhandler Closes trac:2158. 2010-01-07 18:16:39 +00:00
etc Many thanks to Ralf W. for finding a subtle bug in these Makefile.am's 2008-06-04 01:28:03 +00:00
file Clean up request handling in the I/O framework to be more consistent with 2009-11-26 05:13:43 +00:00
group Fun typo. :-) 2009-08-20 21:23:54 +00:00
include Convert the line endings for the added header files. They were changed automatically by Windows when adding new files. 2010-02-16 17:24:44 +00:00
info - As discussed revert r21330, Fortran-configure info should 2009-06-01 19:02:34 +00:00
mca Fix segmentation fault on heterogeneous architectures. Don't mess with the 2010-04-23 15:14:55 +00:00
mpi Type casts for building dynamical Fortran libraries. 2010-04-22 15:48:27 +00:00
mpiext Add an "affinity" Open MPI extension (also describe the 2010-04-21 17:28:08 +00:00
op Replace jumps with returns. 2009-08-20 02:29:30 +00:00
peruse - Sanity check initialization and finalization of PERUSE. 2010-01-12 16:36:24 +00:00
proc Updates to make trunk run on Catamount again: 2010-02-03 05:07:40 +00:00
request Fix the configure logic for --with-ft so that it properly takes a comma separated list. 2010-03-12 23:57:50 +00:00
runtime Establish a method by which a process knows if it has been bound by mpirun. This helps resolve a problem where a process gets "bound" to all available resources, which looks to the opal paffinity system as "not bound". This can cause mpi_init to attempt to "bind" the process itself, causing unintended behavior. 2010-04-17 01:58:26 +00:00
tools Add a sensor framework to ORTE that monitors applications and notifies the errmgr when they exceed specified boundaries. Two modules are included here: 2010-04-26 22:15:57 +00:00
win - Replace combinations of 2009-08-20 11:42:18 +00:00
CMakeLists.txt Use variables instead of hard-coded compiler flags, in order to support various C/C++ compilers on Windows. 2010-04-21 12:45:00 +00:00
Makefile.am This commit converts us to the "one big libmpi" scheme that has been 2010-02-23 22:20:01 +00:00