openmpi/orte/mca/odls/process
Ralph Castain b9893aacc5 Add a sensor framework to ORTE that monitors applications and notifies the errmgr when they exceed specified boundaries. Two modules are included here:
1. file activity - can monitor file size, access and modification times. If these fail to change over a specified number of sampling iterations (rate is an mca param), then the errmgr is notified.

2. memory usage - checks amount of memory used by a process. Limit and sampling rate can be set.
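
The file activity check in item 1 can be illustrated with a minimal sketch in C. This is not the ORTE sensor module: `file_watch_t` and the `file_watch_*` functions are hypothetical names chosen for illustration, and the sketch returns a flag where the real module would notify the errmgr. It assumes a POSIX `stat()` and treats a file as stalled once its size, access time, and modification time have all stayed unchanged for `limit` consecutive samples.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical tracker state (not an ORTE type). */
typedef struct {
    off_t  last_size;      /* size seen at the previous sample */
    time_t last_access;    /* access time seen at the previous sample */
    time_t last_mod;       /* modification time seen at the previous sample */
    int    stalled_ticks;  /* consecutive samples with no change */
    int    limit;          /* unchanged samples allowed before reporting */
    bool   primed;         /* false until the first sample is recorded */
} file_watch_t;

void file_watch_init(file_watch_t *w, int limit)
{
    assert(limit > 0);
    w->last_size = 0;
    w->last_access = 0;
    w->last_mod = 0;
    w->stalled_ticks = 0;
    w->limit = limit;
    w->primed = false;
}

/* Feed one sample; returns true once the file has been unchanged
 * for `limit` consecutive samples. Any change resets the counter. */
bool file_watch_update(file_watch_t *w, off_t size, time_t atime, time_t mtime)
{
    if (!w->primed || size != w->last_size ||
        atime != w->last_access || mtime != w->last_mod) {
        w->last_size = size;
        w->last_access = atime;
        w->last_mod = mtime;
        w->stalled_ticks = 0;
        w->primed = true;
        return false;
    }
    w->stalled_ticks++;
    return w->stalled_ticks >= w->limit;
}

/* Sample a path with stat(): 1 = stalled, 0 = still active, -1 = stat failed. */
int file_watch_sample(file_watch_t *w, const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0) {
        return -1;
    }
    return file_watch_update(w, st.st_size, st.st_atime, st.st_mtime) ? 1 : 0;
}
```

A caller would invoke `file_watch_sample()` once per sampling interval and treat a return value of 1 as the point at which the error manager should be told. Keeping the comparison in `file_watch_update()` separate from the `stat()` call makes the stall logic testable without touching the filesystem.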

This support must be enabled by configuring with --enable-sensors.

ompi_info and orte-info have been updated to include the new framework.

Also includes some initial steps toward restoring the recovery capability. Most notably, the ODLS API has been extended to include a "restart_proc" entry for restarting a local process, and the various ERRMGR framework globals are now organized into a single struct, as we do in the other ORTE frameworks. Also fixes an oversight in the ERRMGR framework where a pointer array was constructed, but not initialized.

Implementation continues.

This commit was SVN r23043.
2010-04-26 22:15:57 +00:00
.windows Fix the bug caused by ADD_DEPENDENCIES() in different versions of CMake. 2010-01-14 18:10:20 +00:00
configure.m4 - Change the property of a few files, that obviously 2009-08-11 01:40:00 +00:00
configure.params - Change the property of a few files, that obviously 2009-08-11 01:40:00 +00:00
help-odls-process.txt - Change the property of a few files, that obviously 2009-08-11 01:40:00 +00:00
Makefile.am - Change the property of a few files, that obviously 2009-08-11 01:40:00 +00:00
odls_process_component.c - Change the property of a few files, that obviously 2009-08-11 01:40:00 +00:00
odls_process_module.c Add a sensor framework to ORTE that monitors applications and notifies the errmgr when they exceed specified boundaries. Two modules are included here: 2010-04-26 22:15:57 +00:00
odls_process.h - Change the property of a few files, that obviously 2009-08-11 01:40:00 +00:00