1
1
Граф коммитов

5 Коммитов

Автор SHA1 Сообщение Дата
Ralph Castain
de07a64599 Cleanup the sensor code:
* use the global flags for linux and apple being found instead of re-doing the case statements

* update select procedure to ignore components that measure the same thing (e.g., resusage and sigar), taking the higher priority module

cmr=v1.7.5:reviewer=jsquyres:subject=Cleanup the sensor code

This commit was SVN r30368.
2014-01-22 21:01:09 +00:00
Ralph Castain
bee8bf5d8f Update the sensor framework to report stats back to the HNP if requested by including the data in heartbeats.
This commit was SVN r27748.
2013-01-05 06:30:20 +00:00
Ralph Castain
b9893aacc5 Add a sensor framework to ORTE that monitors applications and notifies the errmgr when they exceed specified boundaries. Two modules are included here:
1. file activity - can monitor file size, access and modification times. If these fail to change over a specified number of sampling iterations (rate is an mca param), then the errmgr is notified.

2. memory usage - checks amount of memory used by a process. Limit and sampling rate can be set.

This support must be enabled by configuring --enable-sensors.

ompi_info and orte-info have been updated to include the new framework.

Also includes some initial steps toward restoring the recovery capability. Most notably, the ODLS API has been extended to include a "restart_proc" entry for restarting a local process, and organizes the various ERRMGR framework globals into a single struct as we do in the other ORTE frameworks. Fix an oversight in the ERRMGR framework where a pointer array was constructed, but not initialized.

Implementation continues.

This commit was SVN r23043.
2010-04-26 22:15:57 +00:00
Ralph Castain
e38a0eab9f Remove the fddp and sensor frameworks - relocated to new cluster mgr project
This commit was SVN r22240.
2009-11-27 22:14:47 +00:00
Ralph Castain
c3c642aa0d Add two new frameworks for sensing and predicting faults. This is just the bare-bones plumbing for now - will instantiate soon.
No ess modules reference these frameworks yet, so they are completely inactive.

This commit was SVN r21847.
2009-08-20 04:27:16 +00:00