Commit graph

12760 commits

Author SHA1 Message Date
Jeff Squyres
ffc5d8877f Fix a problem where we're accidentally initializing the wrong
errhandler (should be initializing _errors_throw_exceptions, not
_are_fatal).  This bug was not a huge tragedy because the only real
problem is that _are_fatal has the wrong string name with it (because
MPI::Init fixes up the _errors_throw_exceptions later).

This commit was SVN r20458.
2009-02-05 21:36:10 +00:00
Jeff Squyres
50b1fd1392 Per the big discussion on the OpenFabrics list a while ago, some
versions of the NE driver will report the OUI while others will report
the PCI ID.  We'll put in the Intel values when we get them (may not
be for a few more weeks).

This commit was SVN r20457.
2009-02-05 21:19:45 +00:00
Jeff Squyres
66d0a02f90 Fix a problem for some iWARP drivers that don't handle RDMA CM REJECT
properly at all.  NetEffect's current driver (OFED 1.4.0) will return
a CONNECT_ERROR event to the initiator rather than the REJECTED event.
Doh!  Additionally -- unfortunately -- NetEffect's vendor_id and
vendor_part_id are reported as 0 in OFED 1.4.0, so we can't
automatically detect these cards and work around the problem.  So all
we can do is add a new MCA parameter
(btl_openib_connect_rdmacm_ignore_connect_errors -- yes, it's long on
purpose ;-) ) that says that if we get a CONNECT_ERROR, basically
treat it exactly as a REJECT for the WRONG_DIRECTION reason (which is
a "good" reject).  This allows OMPI to function with NetEffect/Intel
cards on OFED 1.4.0.

Note that NetEffect has been bought by Intel; I'm waiting for
information from them to update the ini file for their new OUI/PCI
ID's and/or new vendor_part_id values.

This commit was SVN r20454.
2009-02-05 18:45:59 +00:00
Shiqing Fan
8086bb1a1b SIGSTOP and SIGTSTP are not supported on Windows. But they have to be defined anyway, although they are not used for Windows.
This commit was SVN r20453.
2009-02-05 17:02:34 +00:00
Jeff Squyres
08c35ca135 Somehow this mca param registration code got duplicated; remove one of
them

This commit was SVN r20452.
2009-02-05 16:52:30 +00:00
George Bosilca
36d496066b Correctly deal with the whole array.
This commit was SVN r20451.
2009-02-05 16:44:43 +00:00
Shiqing Fan
ff7ca43dd1 Update two configuration files for windows build.
This commit was SVN r20450.
2009-02-05 16:39:40 +00:00
Shiqing Fan
a20254c8a5 A few type casts, making the MS compiler silent.
This commit was SVN r20449.
2009-02-05 16:37:44 +00:00
Shiqing Fan
7d2d6b16b1 A fix for windows mainly, adding BEGIN/END_C_DECLS pairs.
This commit was SVN r20448.
2009-02-05 16:35:58 +00:00
George Bosilca
2c00133fdc Silence a possible casting warning.
This commit was SVN r20447.
2009-02-05 16:18:39 +00:00
Jeff Squyres
90c28810f4 Fix CID 1122: comm->c_name is a char array (not a pointer), so
comparing it to NULL is not useful.

This commit was SVN r20444.
2009-02-05 15:31:10 +00:00
Jeff Squyres
67a5374a61 Re CID 1180: Actually, it would be better to also print something in
the case of an error, too...

This commit was SVN r20443.
2009-02-05 15:26:44 +00:00
Jeff Squyres
598e530de9 Fix CID 1180: ensure to check the output from snprintf, since we pass
it to write().

This commit was SVN r20442.
2009-02-05 15:24:48 +00:00
George Bosilca
ee6ff2372e Fix the compilation for Windows.
This commit was SVN r20441.
2009-02-05 13:55:26 +00:00
Jeff Squyres
eaeed0402c Can just use the built-in _SCRIPTS suffix.
This commit was SVN r20440.
2009-02-05 12:21:56 +00:00
Lenny Verkhovsky
5ae9fd9865 fixing r20436
This commit was SVN r20439.

The following SVN revision numbers were found above:
  r20436 --> open-mpi/ompi@0d447511a5
2009-02-05 09:45:34 +00:00
Ralph Castain
6292b797e9 Add a new ESS module for use by local slave processes - only active when specifically selected
This commit was SVN r20438.
2009-02-05 06:07:48 +00:00
Ralph Castain
b7e6bafada Add a new routed module for local slave processes to use - only active when specifically selected
This commit was SVN r20437.
2009-02-05 06:07:04 +00:00
Ralph Castain
0d447511a5 Add a shellscript for daemon-less launch of local slave processes. No manpage as this is totally for internal use only.
This commit was SVN r20436.
2009-02-05 06:05:28 +00:00
Jeff Squyres
73ea7a9aa5 Fix CIDs 1211, 1212, 1214: fix error checking in MPI_REDUCE_LOCAL.
This commit was SVN r20435.
2009-02-05 02:18:03 +00:00
Jeff Squyres
a58d0d1a27 Fix CID 1219: ensure that "found" is initialized.
This commit was SVN r20434.
2009-02-05 01:57:20 +00:00
George Bosilca
4804ee60a7 It barely compiles ...
This commit was SVN r20433.
2009-02-05 00:14:28 +00:00
Jeff Squyres
7a3b011f45 Really fix the quoting this time. Really.
This commit was SVN r20430.
2009-02-04 23:04:21 +00:00
Ralph Castain
df3446faf1 Procs don't need to check for other job families to update routes - now that the direct routing module is gone, they always route through their daemons anyway, so save a couple of unnecessary steps.
This commit was SVN r20429.
2009-02-04 22:49:57 +00:00
Ralph Castain
dbba261451 Commit missing change in #define so r20427 doesn't break trunk
This commit was SVN r20428.

The following SVN revision numbers were found above:
  r20427 --> open-mpi/ompi@b100513022
2009-02-04 22:37:24 +00:00
Ralph Castain
b100513022 Add a few new MPI_Info options to the dpm - documentation to follow.
Fix a mistake in the dpm that hardcoded the update of routes to the HNP. This needs to be done by the individual routing modules so they can take whatever action is required - which will usually include updating the HNP, but might not...and might include additional steps. New routing modules are coming that violated this assumption, so it had to be moved back into init_routes.

All current routed modules know what to do - anyone with routed modules not in the current trunk may need to adjust them (see any of the current routed modules for examples of what to do).

This commit was SVN r20427.
2009-02-04 22:30:23 +00:00
Ralph Castain
e694c0dac6 Get the various grpcomm modules to all inter-operate cleanly with the "hier" module
This commit was SVN r20426.
2009-02-04 22:26:35 +00:00
George Bosilca
c359762c2d We're supposed to read a string and not an int ...
This commit was SVN r20421.
2009-02-04 15:51:31 +00:00
George Bosilca
745cec03e2 Fix two problems with the way we handle the lvalue in the case the Fortran and C integers
have different sizes:
1. Do not modify the read-only parameter of the Fortran MPI interface (i.e., be
    standard compliant).
2. When Fortran integers are 64 bits long, don't generate unlawful code.

Thanks to Christoph van Wullen for the bug report.

This commit was SVN r20420.
2009-02-04 15:41:55 +00:00
Ralph Castain
c534757b59 Correct use of the return code from opal_pointer_array_add
This commit was SVN r20417.
2009-02-04 14:02:51 +00:00
Jeff Squyres
9c2a6da128 Remove errant '>'. How on earth did that work at all?
This commit was SVN r20416.
2009-02-03 23:21:34 +00:00
Ralph Castain
f36b9332ab Pass along the new output-filename and xterm cmd line options to the orteds - otherwise, they won't work in ssh environments.
Modify the rsh launcher to add -X to ssh if xterm option was selected.

This commit was SVN r20407.
2009-02-03 20:06:05 +00:00
Ralph Castain
645f4c1f20 Silence compiler warnings about variables used before init
This commit was SVN r20406.
2009-02-03 20:04:27 +00:00
Ralph Castain
7282be4287 Silence compiler warnings about variables used before init
This commit was SVN r20405.
2009-02-03 20:04:01 +00:00
Ralph Castain
aa2abc8cac Fix xgrid plm by changing orte_pointer_array calls to opal_pointer_array
This commit was SVN r20404.
2009-02-03 18:43:00 +00:00
Shiqing Fan
eab19af55c Include the missing header that is used by the fix commit r20402, and use the correct reference for the parameter of the orte_odls_base_notify_iof_complete function call. Thanks Ralph for r20402.
This commit was SVN r20403.

The following SVN revision numbers were found above:
  r20402 --> open-mpi/ompi@f1084d6b84
2009-02-03 18:14:43 +00:00
Ralph Castain
f1084d6b84 Under Windows, tell the orted that the proc has met its IOF termination conditions when launched since Windows does its own IO forwarding.
This commit was SVN r20402.
2009-02-03 16:41:07 +00:00
Ralph Castain
c0e2ccb591 Enable filem
This commit was SVN r20401.
2009-02-02 17:13:48 +00:00
Ralph Castain
104a0539e3 Fix a format statement to be compatible with all gcc compiler versions
This commit was SVN r20400.
2009-02-02 15:47:07 +00:00
Ralph Castain
92bb86753a Update some ignore properties
This commit was SVN r20399.
2009-02-02 15:12:13 +00:00
Ralph Castain
9d381a4ebf Add a '!' option to the xterm iof option to invoke the -hold feature of xterm.
Correct the orte-show-help file when a rank is out of bounds, and do that test where a wildcard doesn't get incorrectly flagged as out-of-bounds.

This commit was SVN r20398.
2009-02-02 15:06:23 +00:00
Ralph Castain
0597fdd778 Ensure that orte-iof barks when given an unrecognized cmd line option
This commit was SVN r20397.
2009-02-02 14:10:54 +00:00
Ralph Castain
b19dc2a4fa Update mpirun's man page for report-pid and report-uri options
This commit was SVN r20396.
2009-02-02 13:49:07 +00:00
Ralph Castain
d207c17adf Fix a segv when an application isn't found - ensure we properly terminate.
This commit was SVN r20395.
2009-02-02 13:44:08 +00:00
Ralph Castain
c3261e1a05 Fix optimized builds
This commit was SVN r20394.
2009-02-01 20:58:17 +00:00
Ralph Castain
debf128e53 Ensure the static port array is correctly checked for size
This commit was SVN r20393.
2009-01-31 03:46:42 +00:00
Ralph Castain
2966206f58 Fix a race condition in the IOF and add some new user-requested features:
1. fix a race condition whereby a proc's output could trigger an event prior to the other outputs being setup, thus causing the IOF to declare the proc "terminated" too early. This was really rare, but could happen.

2. add a new "timestamp-output" option that timestamps each line of output

3. add a new "output-filename" option that redirects each proc's output to a separate rank-named file.

4. add a new "xterm" option that redirects the output of the specified ranks to a separate xterm window.

This commit was SVN r20392.
2009-01-30 22:47:30 +00:00
Rolf vandeVaart
0704b98668 Add the ability to forward SIGTSTP (converted to SIGSTOP) and
SIGCONT to the a.outs.  By default, they are not forwarded and
the behavior remains as it has always been.  However, if one
runs with --mca orte_forward_job_control 1, then mpirun will
catch those two signals and forward them to the orteds which
will deliver them to the a.outs.  We have had requests for
this feature.

This commit was SVN r20391.
2009-01-30 18:50:10 +00:00
Ralph Castain
5e6d3ba289 Initial implementation of static ports. Provide an mca param to specify static port ranges to the OOB - can provide any combination of comma-separated values and ranges. Daemons will use the first port in the range, MPI procs will use the other ports in the range assuming that they know their node rank in time and enough ports were specified.

NOTE: this capability only works under specific conditions. I will outline more about this in a note to devel as the remainder of the implementation progresses. For now, the only environment where this works is slurm. The linear routed module has also been adjusted to work with static ports so that all messaging flows strictly through the topology, including the initial daemon callback - thus limiting the number of sockets opened by mpirun.

This commit was SVN r20390.
2009-01-30 18:31:43 +00:00
Jeff Squyres
e6398979ef Add note about btl_openib_warn_no_device_params_found.
This commit was SVN r20387.
2009-01-30 11:38:37 +00:00
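
Illustrative note on r20390 (static ports): that entry describes a parameter accepting "any combination of comma-separated values and ranges", with the daemon taking the first port and MPI procs taking the others by node rank. The following is a minimal Python sketch of that idea only. It is not Open MPI's C implementation; the function names parse_port_ranges and assign_ports are hypothetical, and the mapping "node rank r uses the (r+1)-th listed port" is an assumption made for illustration.

```python
def parse_port_ranges(spec):
    """Expand a spec such as "5000,5002-5004" into a list of ports.

    Ports are returned in the order given, with duplicates dropped.
    (Hypothetical helper; the real parameter parsing lives in the OOB code.)
    """
    ports = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate stray commas
        if "-" in item:
            low_text, high_text = item.split("-", 1)
            low, high = int(low_text), int(high_text)
            if low > high:
                raise ValueError(f"descending range: {item!r}")
            candidates = range(low, high + 1)
        else:
            candidates = [int(item)]
        for port in candidates:
            if not 1 <= port <= 65535:
                raise ValueError(f"port out of range: {port}")
            if port not in ports:
                ports.append(port)
    return ports


def assign_ports(ports, num_local_procs):
    """Give the daemon the first port and each local proc one of the rest.

    Assumption for illustration: the proc with node rank r gets ports[1 + r].
    Raises ValueError when too few ports were specified.
    """
    if len(ports) < 1 + num_local_procs:
        raise ValueError("not enough static ports for daemon plus local procs")
    daemon_port = ports[0]
    proc_ports = {rank: ports[1 + rank] for rank in range(num_local_procs)}
    return daemon_port, proc_ports


if __name__ == "__main__":
    ports = parse_port_ranges("5000,5002-5004")
    print(ports)
    print(assign_ports(ports, 2))
```

The sketch fails loudly when the range is too small, mirroring the entry's caveat that procs can only use static ports if "enough ports were specified".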