Commit Graph

2140 Commits

Author SHA1 Message Date
Jeff Squyres
8b5e6c0425 Because I could. :-)
Relevant MCA params:

 * notifier_twitter_username: Twitter username
 * notifier_twitter_password: Twitter password

This commit was SVN r20750.
2009-03-06 22:02:17 +00:00
Jeff Squyres
2373bc36e2 Add the "smtp" notifier component. It uses libesmtp
(http://www.stafford.uklinux.net/libesmtp/) via the --with-esmtp(=DIR)
configure option.  Several MCA parameters must be set in order to use
this component:

 * notifier_smtp_server: SMTP server IP address or name; must be supplied
 * notifier_smtp_port: port to talk to on the server; defaults to 25
 * notifier_smtp_to: comma-delimited list of email addresses to send
   the mail to; must be supplied
 * notifier_smtp_from_name: free-form "name" who the mail is from;
   defaults to "Open MPI Notifier"
 * notifier_smtp_from_addr: email address the mail is from; must
   be supplied
 * notifier_smtp_subject: subject of the mail; defaults to "Open MPI
   notifier"
 * notifier_smtp_body_prefix: prefix of the body of the mail; defaults
   to a sensible value
 * notifier_smtp_body_suffix: suffix of the body of the mail; defaults
   to a sensible value

Although libesmtp supports SMTP AUTH protocols, this component does not.
If people want/need those kinds of features, they're relatively easy
to add -- I just didn't bother [yet] before I knew if anyone cared.

This commit was SVN r20749.
2009-03-06 21:59:19 +00:00
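
The MCA parameters listed for the "smtp" notifier in commit 2373bc36e2 could be passed on the `mpirun` command line roughly as follows. This is a minimal sketch: the parameter names come from the commit message, but the host names, addresses, application name, and the `notifier` selection entry are assumptions made for illustration.

```python
import shlex


def mca_args(params):
    """Flatten a dict of MCA parameters into mpirun '--mca name value' arguments."""
    args = []
    for name, value in params.items():
        args += ["--mca", name, str(value)]
    return args


# Hypothetical values; only the notifier_smtp_* names come from the commit message.
smtp_params = {
    "notifier": "smtp",  # assumption: selects the component via the framework name
    "notifier_smtp_server": "mail.example.com",                 # must be supplied
    "notifier_smtp_to": "admin@example.com,ops@example.com",    # must be supplied
    "notifier_smtp_from_addr": "ompi@example.com",              # must be supplied
    "notifier_smtp_port": 25,                                   # documented default
}

# "./my_app" is a placeholder application name.
command = ["mpirun", "-np", "4"] + mca_args(smtp_params) + ["./my_app"]
print(shlex.join(command))
```

Parameters with documented defaults (port, from_name, subject, body prefix and suffix) can be left out of the dictionary.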
Jeff Squyres
c17616c332 Change the ordering slightly; don't save anything until we know all
went well.

This commit was SVN r20748.
2009-03-06 21:49:38 +00:00
Shiqing Fan
ddc82f3831 Correct the output global variables.
This commit was SVN r20745.
2009-03-06 15:31:12 +00:00
Shiqing Fan
a8cb7d2ab1 Fix an array overflow, detected by compiling with C++ compilers. This fix is mainly for the Windows build.
This commit was SVN r20744.
2009-03-06 13:27:38 +00:00
Rainer Keller
ec0ed48718 - Revert r20739
This commit was SVN r20742.

The following SVN revision numbers were found above:
  r20739 --> open-mpi/ompi@781caee0b6
2009-03-05 21:56:03 +00:00
Rainer Keller
a94438343b - Revert r20740
This commit was SVN r20741.

The following SVN revision numbers were found above:
  r20740 --> open-mpi/ompi@2a70618a77
2009-03-05 21:50:47 +00:00
Rainer Keller
2a70618a77 - Second patch, as discussed in Louisville.
Replace short macros in orte/util/name_fns.h
   with the actual function calls.

 - Compiles on linux/x86-64

This commit was SVN r20740.
2009-03-05 21:14:18 +00:00
Rainer Keller
781caee0b6 - First of two or three patches, in orte/util/proc_info.h:
Adapt orte_process_info to orte_proc_info, and
   change orte_proc_info() to orte_proc_info_init().
 - Compiled on linux-x86-64
 - Discussed with Ralph

This commit was SVN r20739.
2009-03-05 20:36:44 +00:00
Rainer Keller
fd28b392bf - An intrusive commit yet again (sorry): with the separation we
get bitten by headers that depend on having already included
   the corresponding [opal|orte|ompi]_config.h header.
   When separating, things like [OPAL|ORTE|OMPI]_DECLSPEC
   are missed.

   Script to add the corresponding header in front of all following
   (taking care of possible #ifdef HAVE_...)

 - Including some minor cleanups to
   - ompi/group/group.h -- include _after_ #ifndef OMPI_GROUP_H
   - ompi/mca/btl/btl.h -- include _after_ #ifndef MCA_BTL_H
   - ompi/mca/crcp/bkmrk/crcp_bkmrk_btl.c -- still no need for
     orte/util/output.h
   - ompi/mca/pml/dr/pml_dr_recvreq.c -- no need for mpool.h
   - ompi/mca/btl/btl.h -- reorder to fit
   - ompi/mca/bml/bml.h -- reorder to fit
   - ompi/runtime/ompi_mpi_finalize.c -- reorder to fit
   - ompi/request/request.h -- additionally need ompi/constants.h

 - Tested on linux/x86-64

This commit was SVN r20720.
2009-03-04 15:35:54 +00:00
Rainer Keller
84408f2fb7 - A follow-up to the commit:
As orte/mca/routed/base/base.h does not require opal_bitmap.h
   Include it in the C-files, based on routed/base/base.h...

This commit was SVN r20709.
2009-03-03 22:36:58 +00:00
George Bosilca
af9c2e10a3 Really cycle when we have several IP addresses.
This commit was SVN r20705.
2009-03-03 19:29:03 +00:00
Ralph Castain
f11931306a Modify the accounting system to recycle jobids. Properly recover resources from nodes and jobs upon completion. Adjustments in several places were required to deal with sparsely populated job, node, and proc arrays as a result of this change.
Correct an error wrt how jobids were being computed. Needed to ensure that the job family field was not overrun as we increment jobids for comm_spawn.

Update the slurm plm module so it uses the new slurm termination procedure (brings trunk back into alignment with 1.3 branch).

Update the slurmd ess component so it doesn't get selected if we are running a singleton inside of a slurm allocation.

Cleanup HNP init by moving some code that had been in orte_globals.c for historical reasons into the ess hnp module, and removing the call to that code from the ess_base_std_prolog


NOTE: this change allows orte to support an infinite aggregate number of comm_spawn's, with up to 64k being alive at any one instant. HOWEVER, the MPI layer currently does -not- support re-use of jobids. I did some prototype coding to revise the ompi_proc_t structures, but the BTLs are caching their own data, and there was no readily apparent way to update it. Thus, attempts to spawn more than the 64k limit will abort to avoid causing the MPI layer to hang.

This commit was SVN r20700.
2009-03-03 16:39:13 +00:00
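
The jobid recycling described in commit f11931306a can be sketched as follows. This is an illustration only, assuming a 32-bit jobid split into a 16-bit job family and a 16-bit local jobid (which would match the 64k limit mentioned in the note); the class and function names are hypothetical and are not taken from the ORTE sources.

```python
LOCAL_BITS = 16                      # assumed width of the local jobid field
LOCAL_MASK = (1 << LOCAL_BITS) - 1   # 0xFFFF


def make_jobid(family, local):
    """Combine a job family and a local jobid without overrunning the family field."""
    if not 0 <= local <= LOCAL_MASK:
        raise OverflowError("local jobid would overrun the job family field")
    return ((family & LOCAL_MASK) << LOCAL_BITS) | local


class JobidPool:
    """Hands out jobids for spawned jobs and reuses the ids of completed jobs."""

    def __init__(self, family):
        self.family = family
        self.next_local = 1   # assumption: local jobid 0 is reserved for the launcher
        self.free = []        # local jobids released by completed jobs

    def acquire(self):
        if self.free:
            local = self.free.pop()          # recycle a completed job's id first
        elif self.next_local <= LOCAL_MASK:
            local = self.next_local
            self.next_local += 1
        else:
            raise RuntimeError("all local jobids are in use")
        return make_jobid(self.family, local)

    def release(self, jobid):
        self.free.append(jobid & LOCAL_MASK)


pool = JobidPool(family=0x1234)
first = pool.acquire()
pool.release(first)
reused = pool.acquire()   # same value as `first`
```

With recycling, the total number of spawns is unbounded and only the number of simultaneously live jobs is capped by the field width.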
Ralph Castain
fb1ecb7a45 Fix orted termination so we get the #@# relay out before we exit ourselves.
Minor change in the way we respond to job info requests - needed for coming change.

This commit was SVN r20698.
2009-03-03 13:38:29 +00:00
Jeff Squyres
d5eddc7541 Some minor fixups / patches from Bert Wesarg.
This commit was SVN r20697.
2009-03-03 13:09:19 +00:00
Jeff Squyres
f81d357c53 Free a little memory. Thanks for the patch from Bert Wesarg.
This commit was SVN r20694.
2009-03-03 12:33:43 +00:00
Jeff Squyres
f8daa60b1b Fix typo noted by Bert Wesarg.
This commit was SVN r20693.
2009-03-03 12:16:57 +00:00
George Bosilca
02de7846f8 Correctly tag the help message.
This commit was SVN r20683.
2009-03-02 22:10:45 +00:00
Josh Hursey
6d79a0398d Fix a bounds check that prevented some vpid resolution in certain launch scenarios.
Traced back to r20629.

This commit was SVN r20675.

The following SVN revision numbers were found above:
  r20629 --> open-mpi/ompi@dcff523244
2009-03-02 18:26:48 +00:00
Ralph Castain
c7fda41d2a Only remove children from the local child list when the job completes so we update the status on all procs in the job and can properly terminate the job.
Correct an error in a debugging output

This commit was SVN r20669.
2009-03-01 20:12:20 +00:00
Ralph Castain
47cfccbb49 Update a couple of tests
This commit was SVN r20668.
2009-03-01 15:32:32 +00:00
Ralph Castain
15171e4ba8 Remove completed children from the local list of child processes so that we properly track our number of children. Otherwise, we can artificially believe we have exceeded system limits on the number of local children.
This commit was SVN r20667.
2009-03-01 15:31:27 +00:00
Ralph Castain
f0fcaf8b32 For some reason, the buffer gets trashed, so for now, let's process and then relay...until I can figure out the race condition that is causing the problem.
This commit was SVN r20665.
2009-03-01 01:24:02 +00:00
Ralph Castain
c2ff8dc5ce Fix notifier base functions to match revised notifier.h framework APIs
This commit was SVN r20663.
2009-02-28 23:46:18 +00:00
Ralph Castain
11979c100a Silence pointless compiler warning
This commit was SVN r20661.
2009-02-28 15:35:48 +00:00
Tim Mattox
57be80c983 First pass at integrating the CIFTS/FTB support as
a notifier module.
The Notifier framework was extended slightly to
convey more information about each event notice.
This works with the FTB v0.5 API.

To compile with FTB support, use --with-ftb=/path/to/ftb/install

CIFTS == Coordinated Infrastructure for Fault Tolerant Systems
FTB == Fault Tolerance Backplane
see http://wiki.mcs.anl.gov/cifts/index.php

This commit was SVN r20655.
2009-02-27 22:53:43 +00:00
Ralph Castain
7e5dc8f2be Ensure that we turn off stdin read event when ctrl-c terminating a program
This commit was SVN r20654.
2009-02-27 15:01:28 +00:00
Ralph Castain
b8ffa302da Separate abnormal job termination from abnormal orted termination so we can continue to use xcast for orted cmds, but can know to turn off reading of stdin as the job is being terminated.
This commit was SVN r20650.
2009-02-27 10:16:25 +00:00
Ralph Castain
4f75f6e443 Fix a bug where we were not stopping the read event on stdin if the write to stdin of the target process was backing up.
Ensure we stop reading stdin if we are abnormally terminating - no point in doing so since the job is being terminated.

This commit was SVN r20649.
2009-02-27 09:31:34 +00:00
Rainer Keller
1745895d09 - Sorry to come back to this, but revert r20643...
Headers should be included in the .c directly.

This commit was SVN r20645.

The following SVN revision numbers were found above:
  r20643 --> open-mpi/ompi@e46c512ee7
2009-02-26 22:01:01 +00:00
Josh Hursey
e46c512ee7 Fix a couple of missing headers resulting from recent cleanup
This commit was SVN r20643.
2009-02-26 16:56:56 +00:00
Shiqing Fan
4d3f801dbd Try to find an installed flex on the current Windows system first; if it's not there, just use the one that comes along with the source.
This commit was SVN r20642.
2009-02-26 13:03:53 +00:00
Rainer Keller
4c0e8e1e69 - Header orte/mca/oob/base/base.h is probably the wrong one to include
anyhow -- if oob functionality is needed, then orte/mca/oob/oob.h should be included.

   Nevertheless compiles fine with -Wimplicit-function-declaration   

This commit was SVN r20641.
2009-02-26 04:20:03 +00:00
Rainer Keller
04567d3af0 - Header orte/mca/errmgr/errmgr.h is not needed.
Once again compiles fine with -Wimplicit-function-declaration   

This commit was SVN r20640.
2009-02-26 04:05:30 +00:00
Rainer Keller
96e1b9b747 - Header orte/mca/rml/rml.h is not needed if there is no occurrence of orte_rml
or ORTE_RML.
   Like the others, compiles fine with -Wimplicit-function-declaration

This commit was SVN r20639.
2009-02-26 03:52:31 +00:00
Rainer Keller
bcac113b13 - Header orte/mca/ess/ess.h not being used
This commit was SVN r20638.
2009-02-26 03:28:59 +00:00
Shiqing Fan
2326f14be5 Remove the unnecessary PROJECT command, I somehow misunderstood how it should be used on Windows....
This commit was SVN r20634.
2009-02-25 16:07:43 +00:00
Ralph Castain
f3ffe48edd Remove debug output
This commit was SVN r20632.
2009-02-25 04:01:09 +00:00
Rainer Keller
b356e90fa1 - Get rid of include orte/util/proc_info.h, if not needed
Only proc_info.h-internal include file is opal/dss/dss_types.h
 - In one case (orte/util/hnp_contact.c) had to add proc_info.h again.
 - Local compilation (Linux/x86_64) w/ -Wimplicit-function-declaration
   works fine, no errors.

   Again, let's let MTT have the last word.

This commit was SVN r20631.
2009-02-25 03:38:00 +00:00
Ralph Castain
85a9a2e6d8 Ensure that signals are de-trapped before exiting to stop the $#@@#$ event library from "asserting"
This commit was SVN r20630.
2009-02-25 03:10:21 +00:00
Ralph Castain
dcff523244 Fix a race condition that causes corruption of a buffer in mpirun while trying to process launch_local_proc cmds.
Cleanup the pidmap handling by changing from value to pointer arrays.

This commit was SVN r20629.
2009-02-25 02:43:22 +00:00
Shiqing Fan
3656a38a03 Fix a few type casts for windows.
This commit was SVN r20622.
2009-02-23 14:09:07 +00:00
Ralph Castain
1e5aa40e3f Ensure that this component is not selected by tools, or anything other than an MPI proc
This commit was SVN r20608.
2009-02-20 15:01:58 +00:00
Rainer Keller
02599446d0 - Occurrences of ORTE_PROC_MY_NAME require orte/runtime/orte_globals.h
This commit was SVN r20607.
2009-02-20 03:16:13 +00:00
Ralph Castain
5dc4a2b1e0 Add missing include file
This commit was SVN r20603.
2009-02-19 21:40:31 +00:00
Ralph Castain
ca97f315fe Enable direct launch of applications under SLURM. Compute all required nidmap and mpidmap info based on publicly available SLURM environmental variables so that no linkage to SLURM libraries is required.
Note: this requires that nodes not be shared by jobs/users. SLURM developers are working on an enhancement to remove this constraint.


Note 2: yes, the direct routed module returned! However, it is vastly different than the old one and has zero support for such things as comm_spawn. It is solely to support non-daemon, direct-launch environments.

This commit was SVN r20601.
2009-02-19 21:39:54 +00:00
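
Commit ca97f315fe computes the node and process maps from SLURM environment variables without linking against SLURM. The commit does not say which variables it reads, so the sketch below is an assumption: it expands a `SLURM_TASKS_PER_NODE`-style string (for example `2(x3),1`) and assigns ranks to nodes in consecutive blocks. The block distribution and the function names are hypothetical.

```python
import re


def expand_tasks_per_node(spec):
    """Expand a string such as '2(x3),1' into per-node task counts [2, 2, 2, 1]."""
    counts = []
    for item in spec.split(","):
        match = re.fullmatch(r"(\d+)(?:\(x(\d+)\))?", item.strip())
        if match is None:
            raise ValueError(f"unrecognized entry: {item!r}")
        tasks = int(match.group(1))
        repeat = int(match.group(2) or 1)   # '(xN)' means N consecutive nodes
        counts += [tasks] * repeat
    return counts


def block_map(counts):
    """Assign consecutive ranks to each node in order (block distribution, assumed)."""
    mapping = []
    rank = 0
    for tasks in counts:
        mapping.append(list(range(rank, rank + tasks)))
        rank += tasks
    return mapping


counts = expand_tasks_per_node("2(x3),1")
ranks_by_node = block_map(counts)
```

In a real job the string would come from the process environment, for example `os.environ["SLURM_TASKS_PER_NODE"]`, and every process would derive the same map independently.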
Ralph Castain
76fc406b08 Modify envars passed to support new proc_info and hier expectations
This commit was SVN r20600.
2009-02-19 21:36:30 +00:00
Ralph Castain
8359477387 Modify the base collective algorithms to take an array of arbitrary vpids instead of assuming everything is ordered in a particular way. Modify the hier grpcomm module to support arbitrary mappings
This commit was SVN r20599.
2009-02-19 21:35:20 +00:00
Ralph Castain
6151f7b60c Enable static ports for application procs during self-bootstrap for non-daemon environments by letting them select what port to use based on node rank and attempting to connect to the peer on that port
Note that this assumes non-shared nodes...but only takes effect if there is no prior knowledge of how to talk to the specified peer. Thus, all daemon-based environments are unaffected.

This commit was SVN r20598.
2009-02-19 21:33:46 +00:00
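
The static port selection in commit 6151f7b60c, where each application process chooses a port from its node rank, might look like the sketch below. The commit does not give the formula; adding the node rank to a base port is an assumption, and the function names and base port value are hypothetical.

```python
def static_port(base_port, node_rank):
    """Derive a listening port from a process's rank on its node (assumed formula)."""
    port = base_port + node_rank
    if not 0 < port <= 65535:
        raise ValueError(f"port {port} is outside the valid TCP range")
    return port


def peer_endpoint(hostname, base_port, peer_node_rank):
    """Guess where a peer listens when no contact info for it is known yet."""
    return (hostname, static_port(base_port, peer_node_rank))


# Hypothetical values: base port 12000, peer is the third process on node "n01".
endpoint = peer_endpoint("n01", 12000, 2)
```

Because two jobs on the same node would compute the same ports, a scheme like this only works when nodes are not shared, which matches the caveat in the commit message.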
Ralph Castain
9c2c17beb0 Split out the nidmap init function that adds entries for the local node and proc so these can be separate functions
This commit was SVN r20597.
2009-02-19 21:28:58 +00:00