Commit graph

48 commits

Author SHA1 Message Date
Jeff Squyres
5bebdb97fa Need these header files for NetBSD. Thanks for the heads-up from
Aleksej Saushev.

This commit was SVN r23343.
2010-07-02 17:38:57 +00:00
Shiqing Fan
857f1669e2 Solve a few compilation problems on Windows.
This commit was SVN r23193.
2010-05-21 14:30:15 +00:00
Abhishek Kulkarni
abe13d802c Silence warnings by commenting out unused functions in the "hnp" notifier component.
This commit was SVN r23181.
2010-05-19 22:46:05 +00:00
Abhishek Kulkarni
118ce0e166 OMPI FTB component updates
* register FTB events from an event schema file
  * define more FTB events
  * minor fixes

This commit was SVN r23180.
2010-05-19 22:05:06 +00:00
Abhishek Kulkarni
9c5860706f Merge improvements to the "notifier" framework from the OPAL SOS and the ORTE WDC mercurial branches into the SVN trunk.
A brief description of the improvements can be found at
https://svn.open-mpi.org/trac/ompi/wiki/ORTEWDC#ChangesdonetotheORTEnotifier

This commit was SVN r23157.
2010-05-17 22:48:05 +00:00
Abhishek Kulkarni
f5b9bc4ff1 Add a new "HNP" component to the notifier framework.
This component proxies notification messages up to the HNP.  This
component runs in both the HNP and non-HNP processes for ease of
selection (e.g., so you can use "--mca notifier hnp" vs. "--mca
notifier hnp,non_hnp").  It auto-detects where it is running and
does the Right Thing -- if it's in the HNP process, it sets up to
receive incoming proxied messages.  If it's not in the HNP, then it
proxies all messages to the HNP.

This commit was SVN r23156.
2010-05-17 22:43:43 +00:00
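The role auto-detection described in commit f5b9bc4ff1 can be illustrated with a minimal sketch. This is not the component's code: the class `HnpNotifier`, its `is_hnp` flag, and the `send_to_hnp` callback are hypothetical names, and the real component uses ORTE messaging rather than a direct function call.

```python
class HnpNotifier:
    """Hypothetical stand-in for the "hnp" notifier component."""

    def __init__(self, is_hnp, send_to_hnp=None):
        self.is_hnp = is_hnp              # assumption: the role is known at init time
        self.send_to_hnp = send_to_hnp    # assumption: transport is a plain callable
        self.received = []                # messages that ended up at the HNP

    def on_proxied_message(self, message):
        """HNP side: accept a message proxied from another process."""
        if not self.is_hnp:
            raise RuntimeError("only the HNP receives proxied messages")
        self.received.append(message)

    def log(self, severity, text):
        """Handle locally in the HNP, otherwise proxy up to the HNP."""
        message = (severity, text)
        if self.is_hnp:
            self.received.append(message)
        else:
            self.send_to_hnp(message)


hnp = HnpNotifier(is_hnp=True)
daemon = HnpNotifier(is_hnp=False, send_to_hnp=hnp.on_proxied_message)
daemon.log("critical", "link down")
```

Both processes instantiate the same class, which mirrors the point of the commit: one component name can be selected everywhere, and the behavior follows from where it runs.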
Abhishek Kulkarni
197ec7586d Add a new "file" component to the notifier framework.
When this component is selected, the notification messages are sent to a file. 
The file can be a plain file or stdout or stderr.

The MCA parameter "notifier_file_name" can be used to specify the suffix of the file the notification messages should be sent to.
The default suffix is "wdc" and the file full name is "output-wdc".

This commit was SVN r23155.
2010-05-17 22:39:52 +00:00
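The naming rule in commit 197ec7586d (suffix "wdc" giving "output-wdc") can be written as a small sketch. The function name is hypothetical, and how the component chooses stdout or stderr is not stated in the commit message, so that part is not modeled here.

```python
def notifier_output_filename(suffix="wdc"):
    """Build the output file name from the value of "notifier_file_name".

    Assumption: the full name is always "output-" followed by the suffix,
    as the default example in the commit message suggests.
    """
    return "output-" + suffix
```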
Ralph Castain
efbb5c9b7c Revamp the errmgr framework to provide a greater range of optional behaviors, including different behaviors for daemons, and remove several looping messages across the code base:
* add hnp and orted modules to the errmgr framework. The HNP module contains much of the code that was in the errmgr base since that code could only be executed by the HNP anyway.

* update the odls to report process states directly into the active errmgr module, thus removing the need to send messages looped back into the odls cmd processor. Let the active errmgr module decide what to do at various states.

* remove the code to track application state progress from the plm_base_launch_support.c code. Update the plm modules to call the errmgr directly when a launch fails.

* update the plm_base_receive.c code to call the errmgr with state updates from remote daemons

* update the routed modules to reflect that process state is updated in the errmgr

* ensure that the orted's open the errmgr and select their appropriate module

* add new pretty-print utilities to print process and job state. Move the pretty-print of time info to a globally-accessible place

* define a global orte_comm function to send messages from orted's to the HNP so that others can overlay the standard RML methods, if desired.

* update the orterun help output to reflect that the "term w/o sync" error message can result from three, not two, scenarios

This commit was SVN r23023.
2010-04-23 04:44:41 +00:00
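The idea in commit efbb5c9b7c, that process states are reported straight into the active errmgr module and the module decides what to do, can be sketched roughly as follows. All class names, state strings, and returned actions are hypothetical illustrations, not the ORTE API.

```python
FAILURE_STATES = {"failed_to_start", "aborted"}   # hypothetical state names


class ErrmgrModule:
    """Base for hypothetical errmgr modules; records every reported state."""

    def __init__(self):
        self.history = []

    def update_state(self, proc, state):
        self.history.append((proc, state))
        return self.decide(state)

    def decide(self, state):
        raise NotImplementedError


class HnpErrmgr(ErrmgrModule):
    def decide(self, state):
        # Assumption: the HNP is the place where job-wide decisions are made.
        return "abort_job" if state in FAILURE_STATES else "continue"


class OrtedErrmgr(ErrmgrModule):
    def decide(self, state):
        # Assumption: a daemon only forwards failures to the HNP.
        return "report_to_hnp" if state in FAILURE_STATES else "continue"


def select_errmgr(is_hnp):
    """Pick the module appropriate for the current process."""
    return HnpErrmgr() if is_hnp else OrtedErrmgr()


def odls_report(errmgr, proc, state):
    """Stand-in for the odls: report directly, with no looped-back message."""
    return errmgr.update_state(proc, state)
```

The launcher code no longer needs to know the policy; it reports a state and acts on whatever the selected module returns.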
Jeff Squyres
f65eebf53d More changes for NetBSD. Thanks to Aleksej Saushev for this patch.
This commit was SVN r22680.
2010-02-22 15:05:09 +00:00
Abhishek Kulkarni
2af7657db1 A few changes to the FTB notifier interface:
- add an orte ftb notifier help file for more verbose error messages
- check if we can connect to the FTB during component->query and close
  the component, if we cannot.
- make the ftb component interface methods static.
- add mca parameters to set override the default subscription style and
  priority.

This commit was SVN r22011.
2009-09-24 23:56:41 +00:00
George Bosilca
3e971e61f3 The system headers are supposed to be protected by #ifdef and not by #if.
This commit was SVN r21700.
2009-07-16 18:27:33 +00:00
George Bosilca
52d013baae Add a missing header.
This commit was SVN r21694.
2009-07-16 17:21:37 +00:00
Ralph Castain
007d14f238 Add a threshold reporting level to the orte notifier framework. This takes a string value:
"critical" - any error at or above the critical severity will be reported (i.e., only critical errors)
"warning" - any error at or above the warning severity will be reported (i.e., warning and critical errors)
"notice" - pretty much everything will be reported

Default to "critical" to keep down the chatter.

Obviously, only places that call orte_notifier will be affected - all other error reporting (e.g., via opal_output calls) is unaffected.

This commit was SVN r21693.
2009-07-16 13:31:23 +00:00
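The threshold rule from commit 007d14f238 amounts to an "at or above" comparison on an ordered severity scale. A minimal sketch, modeling only the three levels named in the commit message (the numeric ranks and the function name are assumptions for illustration):

```python
# Higher rank means more severe; only the three named levels are modeled.
SEVERITY_RANK = {"notice": 0, "warning": 1, "critical": 2}


def should_report(severity, threshold="critical"):
    """Return True if an event at `severity` passes the reporting threshold."""
    return SEVERITY_RANK[severity] >= SEVERITY_RANK[threshold]
```

With the default threshold only critical events pass; setting the threshold to "warning" lets warning and critical events through, and "notice" lets everything through.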
Jeff Squyres
2a5813ac2d Silence a compiler warning.
This commit was SVN r21459.
2009-06-17 12:26:38 +00:00
Rolf vandeVaart
633b996a0f Add sys/wait.h so we can compile on Solaris.
This commit was SVN r21451.
2009-06-16 19:48:43 +00:00
Rainer Keller
5c80033aa2 - Eliminate icc warning w/ regard to __attribute__((__format__)) on
function pointers... Needed checking in opal_check_attributes.m4

This commit was SVN r21254.
2009-05-20 00:39:22 +00:00
Rainer Keller
73fd329cbd - Add the proper __opal_attribute_format__(__printf__...) to
declarations.

This commit was SVN r21226.
2009-05-14 00:10:59 +00:00
Greg Koenig
60485ff95f This is a very large change to rename several #define values from
OMPI_* to OPAL_*.  This allows the opal layer to be used more independently
of the rest of ompi.

NOTE: 9 "svn mv" operations immediately follow this commit.

This commit was SVN r21180.
2009-05-06 20:11:28 +00:00
Rainer Keller
c32516c9a3 - Include errno.h, to get MTT for sun to run through
This commit was SVN r21143.
2009-05-04 09:13:16 +00:00
Ralph Castain
adc73b0fe0 Continue replacing missing headers
This commit was SVN r21108.
2009-04-29 13:34:41 +00:00
Rainer Keller
4ffec30e94 - Guys, we don't have a configure-test for errno.h (should we?)
Anyway, keeps us from compiling on Jaguar.

This commit was SVN r21100.
2009-04-29 03:18:51 +00:00
Rainer Keller
221fb9dbca ... Delayed due to notifier commits earlier this day ...
- Delete unnecessary header files using
   contrib/check_unnecessary_headers.sh after applying
   patches, that include headers, being "lost" due to
   inclusion in one of the now deleted headers...

   In total 817 files are touched.
   In ompi/mpi/c/ header files are moved up into the actual c-file,
   where necessary (these are the only additional #include),
   otherwise it is only deletions of #include (apart from the above
   additions required due to notifier...)

 - To get different MCAs (OpenIB, TM, ALPS), an earlier version was
   successfully compiled (yesterday) on:
   Linux locally using intel-11, gcc-4.3.2 and gcc-SVN + warnings enabled
   Smoky cluster (x86-64 running Linux) using PGI-8.0.2 + warnings enabled
   Lens cluster (x86-64 running Linux) using Pathscale-3.2 + warnings enabled

This commit was SVN r21096.
2009-04-29 01:32:14 +00:00
Ralph Castain
a405f04f1b Add missing header file
This commit was SVN r21088.
2009-04-28 14:30:52 +00:00
Jeff Squyres
fd1e9a6313 Minor fix -- receive 3 ints, not 2.
This commit was SVN r21077.
2009-04-27 14:10:48 +00:00
Jeff Squyres
b661f160ba Add new "command" notifier component. This component allows forking
any arbitrary command as a notifier, potentially allowing just about
anything to be a notifier.  This component forks a child during
orte_init() to avoid forking problems with some OS-bypass networks.

The following MCA parameters are available:

notifier_command_cmd:
  Default: /sbin/initlog -f $s -n "Open MPI" -s "$S: $m (errorcode: $e)"
  Command to execute, with substitution.  $s = integer severity; $S =
  string severity; $e = integer error code; $m = string message

notifier_command_timeout:
  Default: 30
  Timeout (in seconds) of the command

This commit was SVN r21076.
2009-04-27 13:40:36 +00:00
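The substitution scheme for notifier_command_cmd in commit b661f160ba ($s, $S, $e, $m) can be sketched as a single-pass template expansion. The function name is hypothetical, and the actual component may handle quoting and escaping differently; forking the child and the timeout are not modeled.

```python
import re

# Default template as quoted in the commit message.
DEFAULT_CMD = '/sbin/initlog -f $s -n "Open MPI" -s "$S: $m (errorcode: $e)"'


def expand_command(template, severity_int, severity_str, errcode, message):
    """Expand $s, $S, $e and $m in one pass over the template.

    A single regex pass ensures that text inserted for one token (for
    example a message that happens to contain "$e") is not expanded again.
    """
    values = {
        "s": str(severity_int),   # integer severity
        "S": severity_str,        # string severity
        "e": str(errcode),        # integer error code
        "m": message,             # string message
    }
    return re.sub(r"\$([sSem])", lambda match: values[match.group(1)], template)
```

Replacing the tokens one after another with plain string replacement would be simpler, but it could rewrite tokens that appear inside the message itself, which is why the sketch uses one pass.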
Jeff Squyres
e5103e1f3d Actually, make that an enum instead of a #define
This commit was SVN r21075.
2009-04-27 12:50:53 +00:00
Jeff Squyres
40990c1982 Add a NOTICE notifier severity
This commit was SVN r21074.
2009-04-27 12:47:54 +00:00
Jeff Squyres
df54a00b1e Minor comment fix
This commit was SVN r21073.
2009-04-27 12:47:30 +00:00
Rainer Keller
d8cf4c0fec - Get pgcc on XT to complain less:
In case we use memcmp, strlen, strdup and friends, include <string.h>
   Also several constants.h are not included directly
 - Let's have mca_topo_base_cart_create  return ompi-errors in
   ompi/mca/topo/base/topo_base_cart_create.c

This commit was SVN r20773.
2009-03-13 02:10:32 +00:00
Rolf vandeVaart
2b365d7d90 Fix so it builds on Solaris.
This commit was SVN r20758.
2009-03-10 18:38:42 +00:00
Jeff Squyres
4e53885f73 Fix a compiler warning and ensure that "sent" is initialized to 0.
This commit was SVN r20756.
2009-03-09 15:37:04 +00:00
Jeff Squyres
8b5e6c0425 Because I could. :-)
Relevant MCA params:

 * notifier_twitter_username: Twitter username
 * notifier_twitter_password: Twitter password

This commit was SVN r20750.
2009-03-06 22:02:17 +00:00
Jeff Squyres
2373bc36e2 Add the "smtp" notifier component. It uses libesmtp
(http://www.stafford.uklinux.net/libesmtp/) via the --with-esmtp(=DIR)
configure option.  Several MCA parameters must be set in order to use
this component:

 * notifier_smtp_server: SMTP server IP address or name; must be supplied
 * notifier_smtp_port: port to talk to on the server; defaults to 25
 * notifier_smtp_to: comma-delimited list of email addresses to send
   the mail to; must be supplied
 * notifier_smtp_from_name: free-form "name" who the mail is from;
   defaults to "Open MPI Notifier"
 * notifier_smtp_from_addr: email address the mail is from; must
   be supplied
 * notifier_smtp_subject: subject of the mail; defaults to "Open MPI
   notifier"
 * notifier_smtp_body_prefix: prefix of the body of the mail; defaults
   to a sensible value
 * notifier_smtp_body_suffix: suffix of the body of the mail; defaults
   to a sensible value

Although libesmtp supports SMTP AUTH protocols, this component does not.
If people want/need those kinds of features, they're relatively easy
to add -- I just didn't bother [yet] before I knew if anyone cared.

This commit was SVN r20749.
2009-03-06 21:59:19 +00:00
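The parameters listed in commit 2373bc36e2 map naturally onto a small configuration record with three required fields and documented defaults. The sketch below uses Python's standard `email` package rather than libesmtp; the class and function names are hypothetical, and the empty body prefix and suffix are placeholders because the commit message does not give the real defaults.

```python
from dataclasses import dataclass
from email.message import EmailMessage
from email.utils import formataddr


@dataclass
class SmtpNotifierConfig:
    server: str                            # notifier_smtp_server (required)
    to: str                                # notifier_smtp_to, comma-delimited (required)
    from_addr: str                         # notifier_smtp_from_addr (required)
    port: int = 25                         # notifier_smtp_port
    from_name: str = "Open MPI Notifier"   # notifier_smtp_from_name
    subject: str = "Open MPI notifier"     # notifier_smtp_subject
    body_prefix: str = ""                  # placeholder, real default unknown
    body_suffix: str = ""                  # placeholder, real default unknown


def build_mail(cfg, text):
    """Assemble the notification mail; sending is left to the caller."""
    recipients = [addr.strip() for addr in cfg.to.split(",") if addr.strip()]
    msg = EmailMessage()
    msg["From"] = formataddr((cfg.from_name, cfg.from_addr))
    msg["To"] = ", ".join(recipients)
    msg["Subject"] = cfg.subject
    msg.set_content(cfg.body_prefix + text + cfg.body_suffix)
    return msg
```

In this sketch the message could be delivered with `smtplib.SMTP(cfg.server, cfg.port).send_message(msg)`, without authentication, which matches the commit's note that SMTP AUTH is not supported by the component.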
Jeff Squyres
c17616c332 Change the ordering slightly; don't save anything until we know all
went well.

This commit was SVN r20748.
2009-03-06 21:49:38 +00:00
Rainer Keller
ec0ed48718 - Revert r20739
This commit was SVN r20742.

The following SVN revision numbers were found above:
  r20739 --> open-mpi/ompi@781caee0b6
2009-03-05 21:56:03 +00:00
Rainer Keller
a94438343b - Revert r20740
This commit was SVN r20741.

The following SVN revision numbers were found above:
  r20740 --> open-mpi/ompi@2a70618a77
2009-03-05 21:50:47 +00:00
Rainer Keller
2a70618a77 - Second patch, as discussed in Louisville.
Replace short macros in orte/util/name_fns.h
   to the actual fct. call.

 - Compiles on linux/x86-64

This commit was SVN r20740.
2009-03-05 21:14:18 +00:00
Rainer Keller
781caee0b6 - First of two or three patches, in orte/util/proc_info.h:
Adapt orte_process_info to orte_proc_info, and
   change orte_proc_info() to orte_proc_info_init().
 - Compiled on linux-x86-64
 - Discussed with Ralph

This commit was SVN r20739.
2009-03-05 20:36:44 +00:00
Ralph Castain
c2ff8dc5ce Fix notifier base functions to match revised notifier.h framework APIs
This commit was SVN r20663.
2009-02-28 23:46:18 +00:00
Ralph Castain
11979c100a Silence pointless compiler warning
This commit was SVN r20661.
2009-02-28 15:35:48 +00:00
Tim Mattox
57be80c983 First pass at integrating the CIFTS/FTB support as
a notifier module.
The Notifier framework was extended slightly to
convey more information about each event notice.
This works with the FTB v0.5 API.

To compile with FTB support, use --with-ftb=/path/to/ftb/install

CIFTS == Coordinated Infrastructure for Fault Tolerant Systems
FTB == Fault Tolerance Backplane
see http://wiki.mcs.anl.gov/cifts/index.php

This commit was SVN r20655.
2009-02-27 22:53:43 +00:00
Rainer Keller
d81443cc5a - On the way to get the BTLs split out and lessen dependency on orte:
Often, orte/util/show_help.h is included, although no functionality
   is required -- instead, most often opal_output.h or
   orte/mca/rml/rml_types.h is what is needed.
   Please see orte_show_help_replacement.sh committed next.

 - Local compilation (Linux/x86_64) w/ -Wimplicit-function-declaration
   actually showed two *missing* #include "orte/util/show_help.h"
   in orte/mca/odls/base/odls_base_default_fns.c and
   in orte/tools/orte-top/orte-top.c
   Manually added these.

   Let's let MTT have the last word.

This commit was SVN r20557.
2009-02-14 02:26:12 +00:00
Ralph Castain
b8ae4604ed Correct the notifier default module to include the newly added API
This commit was SVN r19993.
2008-11-13 18:03:41 +00:00
Ralph Castain
26cd1c1955 Fix a typo and some formatting
This commit was SVN r19990.
2008-11-12 22:01:40 +00:00
Ralph Castain
ce26e3a2fb Update the notifier framework in prep for move to v1.3. Add an API to handle the case where error messages have been expressed via "show_help" so they can look similar to what was presented to users. Add three key calls in the openib btl to drop messages into syslog.
This will sit in trunk for a few days - would like to actually see some errors reported to syslog before moving the code to 1.3

This commit was SVN r19986.
2008-11-12 18:03:51 +00:00
Jeff Squyres
c078ab6b09 Minor fix for a trivial compiler warning.
This commit was SVN r19809.
2008-10-27 14:18:49 +00:00
Shiqing Fan
94a2147e3d - make sure that the system has the header files.
This commit was SVN r19400.
2008-08-25 13:56:10 +00:00
Ralph Castain
c9e53fd0d4 Add capability to notify system admins of potential problems in system communication networks and/or other system elements that are detected by Open MPI during operation. For example, failures in connections that may be indicative of connectivity problems can be reported to sys admins in addition to our current error message to the user, thus allowing more rapid correction of the problem.
This system is "off" by default and only operates upon specific directive for selection of a notifier component. At the moment, the only available component will write an error message to the syslog.

This commit was SVN r19209.
2008-08-06 21:59:21 +00:00
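The "off by default" behavior described in commit c9e53fd0d4 can be sketched as a notifier that does nothing unless a component is explicitly selected. The `Notifier` class and the component registry are hypothetical, and the "syslog" entry here is a list-backed stand-in rather than a real syslog call.

```python
class Notifier:
    """Hypothetical front end: inactive unless a component is selected."""

    def __init__(self, components, selected=None):
        # No selection means the framework stays off.
        self._sink = components[selected] if selected else None

    def log(self, severity, message):
        """Forward to the selected component; return whether anything was sent."""
        if self._sink is None:
            return False
        self._sink(severity, message)
        return True


syslog_lines = []   # stand-in for the system log
components = {"syslog": lambda sev, msg: syslog_lines.append(f"{sev}: {msg}")}

Notifier(components).log("critical", "ignored")             # off by default
Notifier(components, "syslog").log("critical", "link down")
```

On Unix systems the stand-in could be replaced by a call to Python's `syslog.syslog()`; the user-facing error message path is separate and unaffected either way.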