4253 commits

Author SHA1 Message Date
Tim Mattox
9b83df22ec Fix some "is proc on local node?" logic that got accidentally flipped
by r20496 for the sm BTL, openib BTL on iWarp, and the sm & sm2 coll modules.

This commit was SVN r20515.

The following SVN revision numbers were found above:
  r20496 --> open-mpi/ompi@4cdf91a8d4
2009-02-11 15:02:38 +00:00
Jeff Squyres
c596a1bcb3 Fix MPI_File_c2f -- ensure that if you invoke
MPI_File_c2f(MPI_FILE_NULL), you actually get 0, not -1.  Thanks to
Lisandro Dalcin for the bug report.

This commit was SVN r20511.
2009-02-11 00:48:12 +00:00
Shiqing Fan
2f1461419c Add a new feature for checking mca subdirectories, i.e. detecting if there is an exclude file list which indicates the files that shouldn't be added to the source list. By default, the CMake build system will simply add all source files in the required sub folders, without knowing which files have to be excluded. The first use of it is in plm/base/.windows.
And clean up the nested variable names, in order to make it readable.

This commit was SVN r20498.
2009-02-10 17:20:13 +00:00
Ralph Castain
4cdf91a8d4 Per the RFC, extend the current use of the ompi_proc_t flags field (without changing the field itself).
The prior ompi_proc_t structure had a uint8_t flag field in it, where only one
bit was used to flag that a proc was "local". In that context, "local" was
constrained to mean "local to this node".

This commit provides a greater degree of granularity on the term "local", to include tests
to see if the proc is on the same socket, PC board, node, switch, CU (computing
unit), and cluster.

Add #define's to designate which bits stand for which local condition. This
was added to the OPAL layer to avoid conflicting with the proposed movement of
the BTLs. To make it easier to use, a set of macros has been defined - e.g.,
OPAL_PROC_ON_LOCAL_SOCKET - that test the specific bit. These can be used in
the code base to clearly indicate which sense of locality is being considered.

All locations in the code base that looked at the current proc_t field have
been changed to use the new macros.

Also modify the orte_ess modules so that each returns a uint8_t (to match the
ompi_proc_t field) that contains a complete description of the locality of this
proc. Obviously, not all environments will be capable of providing such detailed
info. Thus, getting a "false" from a test for "on_local_socket" may simply
indicate a lack of knowledge.

This commit was SVN r20496.
2009-02-10 02:20:16 +00:00
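The bit-flag scheme described in r20496 can be sketched roughly as below. Only the macro name OPAL_PROC_ON_LOCAL_SOCKET and the uint8_t width come from the commit message. The `SKETCH_*` names, the bit assignments, and the helper function are assumptions made for illustration and may not match the real OPAL header.

```c
#include <stdint.h>

/* Hypothetical stand-in for the uint8_t flags field of ompi_proc_t. */
typedef uint8_t sketch_locality_t;

/* Assumed bit assignments, one bit per sense of "local". */
#define SKETCH_ON_CLUSTER 0x01u
#define SKETCH_ON_CU      0x02u
#define SKETCH_ON_SWITCH  0x04u
#define SKETCH_ON_NODE    0x08u
#define SKETCH_ON_BOARD   0x10u
#define SKETCH_ON_SOCKET  0x20u

/* Test macros in the spirit of OPAL_PROC_ON_LOCAL_SOCKET. */
#define SKETCH_PROC_ON_LOCAL_NODE(flags)   (((flags) & SKETCH_ON_NODE) != 0)
#define SKETCH_PROC_ON_LOCAL_SOCKET(flags) (((flags) & SKETCH_ON_SOCKET) != 0)

/* An environment that only knows node-level placement sets the node bit
 * and leaves the finer-grained bits at zero. */
sketch_locality_t sketch_node_only_locality(void)
{
    return (sketch_locality_t)SKETCH_ON_NODE;
}
```

With such a value, `SKETCH_PROC_ON_LOCAL_SOCKET` evaluates to false even if the two procs do share a socket, which is the "lack of knowledge" case the commit message warns about.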
Ralph Castain
f0af389910 Enable comm_spawn of slave processes, currently only active for the rsh, slurm, and tm environments. Establish support for local rsh environments in the plm/base so that rsh of local slaves can be done by any environment that supports it. Create new orte_rsh_agent param so users can specify rsh agent from outside of rsh plm, and sym link that to the old plm_rsh_agent and pls_rsh_agent options.
Modify the orte-bootproxy to pass prefix for the remote slave to support hetero/hybrid scenarios

This commit was SVN r20492.
2009-02-09 20:44:44 +00:00
Ralph Castain
eaa57e29b6 Revert r20480 as this breaks the trunk. The dpm.h include file has defines for OMPI_RML tags that are required for wireup.
This commit was SVN r20482.

The following SVN revision numbers were found above:
  r20480 --> open-mpi/ompi@62282fefe5
2009-02-09 14:14:45 +00:00
Rainer Keller
62282fefe5 - Get rid of #include "ompi/mca/dpm/dpm.h"
This commit was SVN r20480.
2009-02-09 02:56:10 +00:00
Jeff Squyres
f68d2b00d8 Fix one more place where the old name was left over.
This commit was SVN r20473.
2009-02-06 19:21:50 +00:00
Terry Dontje
64ace9ec12 convert bzero calls to memset to remove warnings.
This commit was SVN r20471.
2009-02-06 19:08:22 +00:00
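The kind of change made in r20471 probably looks like the sketch below. `bzero()` is a legacy BSD function, while `memset()` is standard C. The struct and function names are hypothetical and do not come from the Open MPI sources.

```c
#include <string.h>

/* Hypothetical structure, for illustration only. */
struct sketch_header {
    int  tag;
    char name[16];
};

void sketch_clear_header(struct sketch_header *hdr)
{
    /* Before: bzero(hdr, sizeof(*hdr)); */
    memset(hdr, 0, sizeof(*hdr));
}
```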
Jeff Squyres
aae930e58b s/__n/converted_n/ -- according to C99, symbols that begin with "__"
are the domain of the compiler.

This commit was SVN r20462.
2009-02-06 01:04:50 +00:00
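A small sketch of the rename in r20462 follows. The surrounding function is hypothetical, since the commit message does not show where `__n` was used. Only the old and new identifier names come from the commit.

```c
#include <stddef.h>

/* Hypothetical helper: converts a possibly negative count to size_t. */
size_t sketch_clamp_to_size(long n)
{
    /* Before the rename this local would have been spelled __n, a name
     * that C reserves for the implementation. */
    size_t converted_n = (n < 0) ? 0u : (size_t)n;
    return converted_n;
}
```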
Jeff Squyres
dfb2d92b37 s/ID/id/ - both work, but if I don't make this change, I'll wonder if
we remembered to use strcasecmp() every time I see this entry in the
file... (we did, but I just don't want to have to keep remembering
that ;-) )

This commit was SVN r20461.
2009-02-06 01:02:25 +00:00
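The case-insensitive comparison mentioned in r20461 can be sketched with POSIX `strcasecmp()`. The function name and the key strings in the usage note are hypothetical, because the commit message does not name the file entry that was changed.

```c
#include <stdbool.h>
#include <strings.h>  /* strcasecmp() is declared here on POSIX systems */

/* Returns true when key equals expected, ignoring letter case. */
bool sketch_key_matches(const char *key, const char *expected)
{
    return strcasecmp(key, expected) == 0;
}
```

For example, `sketch_key_matches("vendor_ID", "vendor_id")` is true, so both spellings of an entry would be accepted by a parser written this way.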
Jeff Squyres
656d8578d0 * Rename (new) MCA parameter to
btl_openib_connect_rdmacm_reject_causes_connect_error (yes, it's
   still long -- on purpose :-) )
 * Add INI file parameter rdmacm_reject_causes_connect_error
 * Now only treat CONNECT_ERROR events as a REJECT if:
   * It's on a connection where we were expecting a REJECT, ''and''
   * The MCA parameter is true ''or'' the INI parameter for this
     device is true
 * Set the INI parameter to true for the NE020

This commit was SVN r20459.
2009-02-06 00:51:04 +00:00
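The condition listed in r20459 reduces to a single boolean expression. The sketch below uses hypothetical function and parameter names. Only the logic (expecting a REJECT, and either the MCA parameter or the per-device INI parameter is true) is taken from the commit message.

```c
#include <stdbool.h>

/* Decides whether a CONNECT_ERROR event should be handled as a REJECT. */
bool sketch_treat_connect_error_as_reject(bool expecting_reject,
                                          bool mca_param_true,
                                          bool ini_param_true)
{
    return expecting_reject && (mca_param_true || ini_param_true);
}
```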
Jeff Squyres
ffc5d8877f Fix a problem where we're accidentally initializing the wrong
errhandler (should be initializing _errors_throw_exceptions, not
_are_fatal).  This bug was not a huge tragedy because the only real
problem is that _are_fatal has the wrong string name with it (because
MPI::Init fixes up the _errors_throw_exceptions later).

This commit was SVN r20458.
2009-02-05 21:36:10 +00:00
Jeff Squyres
50b1fd1392 Per the big discussion on the OpenFabrics list a while ago, some
versions of the NE driver will report the OUI while others will report
the PCI ID.  We'll put in the Intel values when we get them (may not
be for a few more weeks).

This commit was SVN r20457.
2009-02-05 21:19:45 +00:00
Jeff Squyres
66d0a02f90 Fix a problem for some iWARP drivers that don't handle RDMA CM REJECT
properly at all.  NetEffect's current driver (OFED 1.4.0) will return
a CONNECT_ERROR event to the initiator rather than the REJECTED event.
Doh!  Additionally -- unfortunately -- NetEffect's vendor_id and
vendor_part_id are reported as 0 in OFED 1.4.0, so we can't
automatically detect these cards and work around the problem.  So all
we can do is add a new MCA parameter
(btl_openib_connect_rdmacm_ignore_connect_errors -- yes, it's long on
purpose ;-) ) that says that if we get a CONNECT_ERROR, basically
treat it exactly as a REJECT for the WRONG_DIRECTION reason (which is
a "good" reject).  This allows OMPI to function with NetEffect/Intel
cards on OFED 1.4.0.

Note that NetEffect has been bought by Intel; I'm waiting for
information from them to update the ini file for their new OUI/PCI
ID's and/or new vendor_part_id values.

This commit was SVN r20454.
2009-02-05 18:45:59 +00:00
Jeff Squyres
08c35ca135 Somehow this mca param registration code got duplicated; remove one of
them

This commit was SVN r20452.
2009-02-05 16:52:30 +00:00
George Bosilca
36d496066b Correctly deal with the whole array.
This commit was SVN r20451.
2009-02-05 16:44:43 +00:00
George Bosilca
2c00133fdc Silence a possible casting warning.
This commit was SVN r20447.
2009-02-05 16:18:39 +00:00
Jeff Squyres
90c28810f4 Fix CID 1122: comm->c_name is a char array (not a pointer), so
comparing it to NULL is not useful.

This commit was SVN r20444.
2009-02-05 15:31:10 +00:00
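The issue behind CID 1122 (r20444) is that an array member always has a non-NULL address, so a NULL test on it is always true. The sketch below shows one possible replacement check. The struct, the array size, and the function are hypothetical. The commit message does not say whether the actual fix removed the test or replaced it.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_NAME 64  /* assumed size, for illustration */

struct sketch_comm {
    char c_name[SKETCH_MAX_NAME];  /* an array embedded in the struct, not a pointer */
};

bool sketch_comm_has_name(const struct sketch_comm *comm)
{
    /* comm->c_name != NULL would always hold; test the first byte instead. */
    return comm != NULL && comm->c_name[0] != '\0';
}
```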
George Bosilca
ee6ff2372e Fix the compilation for Windows.
This commit was SVN r20441.
2009-02-05 13:55:26 +00:00
Jeff Squyres
73ea7a9aa5 Fix CIDs 1211, 1212, 1214: fix error checking in MPI_REDUCE_LOCAL.
This commit was SVN r20435.
2009-02-05 02:18:03 +00:00
Ralph Castain
b100513022 Add a few new MPI_Info options to the dpm - documentation to follow.
Fix a mistake in the dpm that hardcoded the update of routes to the HNP. This needs to be done by the individual routing modules so they can take whatever action is required - which will usually include updating the HNP, but might not...and might include additional steps. New routing modules are coming that violated this assumption, so it had to be moved back into init_routes.

All current routed modules know what to do - anyone with routed modules not in the current trunk may need to adjust them (see any of the current routed modules for examples of what to do).

This commit was SVN r20427.
2009-02-04 22:30:23 +00:00
George Bosilca
745cec03e2 Fix two problems with the way we handle the lvalue in the case the Fortran and C integers
have different sizes:
1. Do not modify the read only parameter of the Fortran MPI interface (i.e., be
    standard compliant).
2. When Fortran integers are 64 bits long, don't generate unlawful code.

Thanks to Christoph van Wullen for the bug report.

This commit was SVN r20420.
2009-02-04 15:41:55 +00:00
Jeff Squyres
2cafa5d640 Re-add missing assignment of component variable from MCA param that
somehow must have gotten deleted along the way...

This commit was SVN r20386.
2009-01-30 11:36:14 +00:00
George Bosilca
04a3b29b76 Silence some compiler warnings, and reindent the code.
This commit was SVN r20385.
2009-01-29 18:04:54 +00:00
Jeff Squyres
35c5e28a8e Up to SVN r20383
This commit was SVN r20384.

The following SVN revision numbers were found above:
  r20383 --> open-mpi/ompi@e0638c84c8
2009-01-29 17:59:04 +00:00
George Bosilca
d0a05e90ba Remove the dependency on datatype_pack.h from the convertor_raw file.
Revert r20381 as two header files are "special".

This commit was SVN r20382.

The following SVN revision numbers were found above:
  r20381 --> open-mpi/ompi@25b25aef41
2009-01-28 21:50:01 +00:00
Ralph Castain
25b25aef41 Fix the trunk so it will compile.
Note: this does -not- fix the compiler warnings, but just fixes the missing includes so the trunk will build again.

This commit was SVN r20381.
2009-01-28 21:26:42 +00:00
George Bosilca
2d4a668540 Don't write more iovec than expected.
This commit was SVN r20375.
2009-01-28 16:32:56 +00:00
George Bosilca
0513e018b1 Fix the length of the line.
This commit was SVN r20373.
2009-01-28 15:40:59 +00:00
George Bosilca
321ac99814 Add a function to allow extraction of the iovec covering
the memory layout of the convertor.

This commit was SVN r20372.
2009-01-28 15:40:15 +00:00
Rainer Keller
fb0e0b854a - Again, no need for #include "orte/util/show_help.h"
- Use BEGIN_C_DECLS and END_C_DECLS

This commit was SVN r20358.
2009-01-27 19:19:04 +00:00
Rainer Keller
9825e087b8 - In rb/rcache_rb.c, the reg->flags should only be operated under the
lock -- therefore move the OPAL_THREAD_UNLOCK after
   the if-OMPI_ERR_TEMP_OUT_OF_RESOURCE block.

 - As mca_rcache_rb_mru_delete is the only setter of rc, move the
   error-check right after mca_rcache_rb_mru_delete.

 - Removed a few nitty ompi/info/info.h and orte/util/show_help.h

This commit was SVN r20355.
2009-01-27 19:00:03 +00:00
Rainer Keller
de4c123ca2 - No dependency on orte/util/show_help.h, so get rid of #include
This commit was SVN r20354.
2009-01-27 16:30:21 +00:00
Rainer Keller
340d72a166 - There is no dependency on mpool -- so no need to include
This commit was SVN r20353.
2009-01-27 16:18:56 +00:00
Jeff Squyres
ca0f7d77e9 Fix a help message regarding the btl_openib_receive_queues MCA
parameter.

This commit was SVN r20350.
2009-01-26 18:57:07 +00:00
Jeff Squyres
f9c5adb86f Fix to enable the --disable-mpi-io configure option.
This commit was SVN r20330.
2009-01-23 14:15:51 +00:00
Matthias Jurenz
7a2a081670 Updated VT version to 5.4.7
This commit was SVN r20318.
2009-01-22 13:20:09 +00:00
Matthias Jurenz
1288c662ea - bugfix: select cycle counter timer only on i*86, x86, IA64, and PPC platforms
- minor cleanups

This commit was SVN r20317.
2009-01-22 12:29:10 +00:00
Jeff Squyres
207a61e8d9 Fixes trac:1072: allow MPI C++ constants to be used as array sizes, such
as:

  char name[MPI::MAX_PORT_NAME];

This commit was SVN r20310.

The following Trac tickets were found above:
  Ticket 1072 --> https://svn.open-mpi.org/trac/ompi/ticket/1072
2009-01-21 23:02:51 +00:00
Jeff Squyres
90e69ac6ff Fix some man page nits noticed by the Debian OMPI maintainers. Thanks
Dirk!

This commit was SVN r20307.
2009-01-21 18:38:37 +00:00
Ralph Castain
5d9de3326c Check for valid local/node ranks before using the returned values
This commit was SVN r20304.
2009-01-21 00:54:50 +00:00
Jeff Squyres
1573aaceb7 Add missing header file.
This commit was SVN r20290.
2009-01-17 12:21:42 +00:00
Jeff Squyres
6bde41c785 Forgot this #define -- ooops.
This commit was SVN r20288.
2009-01-16 19:15:17 +00:00
Jeff Squyres
84a3f84fdf Possible fix for random openib segv.
This commit was SVN r20282.
2009-01-15 17:10:18 +00:00
Jeff Squyres
8483c3c66e It is not an error if there are no op components found; we'll just
fall back to the base functions.

This commit was SVN r20281.
2009-01-15 02:01:32 +00:00
Jeff Squyres
4d8a187450 Two major things in this commit:
* New "op" MPI layer framework
 * Addition of the MPI_REDUCE_LOCAL proposed function (for MPI-2.2)

= Op framework =

Add new "op" framework in the ompi layer.  This framework replaces the
hard-coded MPI_Op back-end functions for (MPI_Op, MPI_Datatype) tuples
for pre-defined MPI_Ops, allowing components and modules to provide
the back-end functions.  The intent is that components can be written
to take advantage of hardware acceleration (GPU, FPGA, specialized CPU
instructions, etc.).  Similar to other frameworks, components are
intended to be able to discover at run-time if they can be used, and
if so, elect themselves to be selected (or disqualify themselves from
selection if they cannot run).  If specialized hardware is not
available, there is a default set of functions that will automatically
be used.

This framework is ''not'' used for user-defined MPI_Ops.

The new op framework is similar to the existing coll framework, in
that the final set of function pointers that are used on any given
intrinsic MPI_Op can be a mixed bag of function pointers, potentially
coming from multiple different op modules.  This allows for hardware
that only supports some of the operations, not all of them (e.g., a
GPU that only supports single-precision operations).

All the hard-coded back-end MPI_Op functions for (MPI_Op,
MPI_Datatype) tuples still exist, but unlike coll, they're in the
framework base (vs. being in a separate "basic" component) and are
automatically used if no component is found at runtime that provides a
module with the necessary function pointers.

There is an "example" op component that will hopefully be useful to
those writing meaningful op components.  It is currently
.ompi_ignore'd so that it doesn't impinge on other developers (it's
somewhat chatty in terms of opal_output() so that you can tell when
its functions have been invoked).  See the README file in the example
op component directory.  Developers of new op components are
encouraged to look at the following wiki pages:

  https://svn.open-mpi.org/trac/ompi/wiki/devel/Autogen
  https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateComponent
  https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateFramework

= MPI_REDUCE_LOCAL =

Part of the MPI-2.2 proposal listed here:

    https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/24

is to add a new function named MPI_REDUCE_LOCAL.  It is very easy to
implement, so I added it (also because it makes testing the op
framework pretty easy -- you can do it in serial rather than via
parallel reductions).  There's even a man page!

This commit was SVN r20280.
2009-01-14 23:44:31 +00:00
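The selection rule described in r20280 (a mixed table of per-(op, datatype) function pointers, with base functions as the fallback) can be sketched in plain C as below. Every name here is hypothetical and none of it is the real op framework interface. The "accel" module stands in for a component that supports only one operation, and `sketch_reduce_local` mimics the local reduction of MPI_REDUCE_LOCAL without calling MPI.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_op   { SKETCH_OP_SUM, SKETCH_OP_MAX, SKETCH_OP_COUNT };
enum sketch_type { SKETCH_TYPE_INT, SKETCH_TYPE_FLOAT, SKETCH_TYPE_COUNT };

/* Combines count elements in place: inout[i] = in[i] (op) inout[i]. */
typedef void (*sketch_op_fn)(const void *in, void *inout, size_t count);

/* Base functions: always available, used when no module provides one. */
static void base_sum_int(const void *in, void *inout, size_t count)
{
    const int *a = in; int *b = inout;
    for (size_t i = 0; i < count; ++i) b[i] = a[i] + b[i];
}
static void base_max_int(const void *in, void *inout, size_t count)
{
    const int *a = in; int *b = inout;
    for (size_t i = 0; i < count; ++i) if (a[i] > b[i]) b[i] = a[i];
}
static void base_sum_float(const void *in, void *inout, size_t count)
{
    const float *a = in; float *b = inout;
    for (size_t i = 0; i < count; ++i) b[i] = a[i] + b[i];
}
static void base_max_float(const void *in, void *inout, size_t count)
{
    const float *a = in; float *b = inout;
    for (size_t i = 0; i < count; ++i) if (a[i] > b[i]) b[i] = a[i];
}

static const sketch_op_fn sketch_base[SKETCH_OP_COUNT][SKETCH_TYPE_COUNT] = {
    [SKETCH_OP_SUM] = { [SKETCH_TYPE_INT] = base_sum_int, [SKETCH_TYPE_FLOAT] = base_sum_float },
    [SKETCH_OP_MAX] = { [SKETCH_TYPE_INT] = base_max_int, [SKETCH_TYPE_FLOAT] = base_max_float },
};

/* A module may fill in only some entries; NULL means "not provided". */
typedef struct {
    sketch_op_fn fns[SKETCH_OP_COUNT][SKETCH_TYPE_COUNT];
} sketch_op_module_t;

/* Pretend accelerated module that supports only single-precision sums. */
int sketch_accel_calls = 0;
static void accel_sum_float(const void *in, void *inout, size_t count)
{
    ++sketch_accel_calls;           /* lets a caller see it was selected */
    base_sum_float(in, inout, count);
}
const sketch_op_module_t sketch_accel_module = {
    .fns = { [SKETCH_OP_SUM] = { [SKETCH_TYPE_FLOAT] = accel_sum_float } },
};

/* Prefer the module's function, otherwise fall back to the base table. */
sketch_op_fn sketch_select_fn(const sketch_op_module_t *module,
                              enum sketch_op op, enum sketch_type type)
{
    if (module != NULL && module->fns[op][type] != NULL) {
        return module->fns[op][type];
    }
    return sketch_base[op][type];
}

/* Serial reduction of two local buffers, in the spirit of MPI_REDUCE_LOCAL. */
void sketch_reduce_local(const sketch_op_module_t *module, enum sketch_op op,
                         enum sketch_type type, const void *in, void *inout,
                         size_t count)
{
    sketch_op_fn fn = sketch_select_fn(module, op, type);
    assert(fn != NULL);
    fn(in, inout, count);
}
```

Reducing two float buffers with `SKETCH_OP_SUM` goes through the module, while every other (op, datatype) pair silently uses the base table, which is the "mixed bag" behaviour the commit message describes.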
Brian Barrett
cfc400eb57 * Enable eager sending for Accumulate
* If the accumulate is local, make it short-circuit the request path.  Accumulate requires local
  ops due to its window rules, so this is likely to help a bunch (on the codes I'm messing
  with at least)
* Do a better job at flushing everything that can go out on the wire in a resource constrained problem
* Move some debugging values around to make large problems somewhat easier to deal with

This commit was SVN r20277.
2009-01-14 20:15:15 +00:00
Edgar Gabriel
1072812bcf Not every element in the pointer array list contains a valid entry. Thus, do not try to free elements if the list returns NULL.
This commit was SVN r20275.
2009-01-14 19:11:30 +00:00
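The guard described in r20275 can be sketched as a loop that skips empty slots before touching an entry. The types and function below are hypothetical stand-ins and are not the real pointer array API.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical element type that owns a heap-allocated payload. */
typedef struct {
    char *payload;
} sketch_item_t;

/* Releases every valid entry in the table and returns how many were freed. */
size_t sketch_release_all(sketch_item_t **table, size_t size)
{
    size_t released = 0;
    for (size_t i = 0; i < size; ++i) {
        sketch_item_t *item = table[i];
        if (item == NULL) {
            continue;            /* empty slot: nothing to dereference or free */
        }
        free(item->payload);     /* safe only because item is non-NULL */
        free(item);
        table[i] = NULL;
        ++released;
    }
    return released;
}
```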
Jeff Squyres
895edd04f8 Fix CID 468: remove some dead code. r_proc_list was set to NULL but
never used.

This commit was SVN r20272.
2009-01-14 18:15:17 +00:00