5050 Commits

Author SHA1 Message Date
Shiqing Fan
b904d4826f Get rid of a warning of using void pointer in arithmetic.
This commit was SVN r23393.
2010-07-13 21:43:38 +00:00
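The warning addressed in b904d4826f typically comes from doing arithmetic directly on a `void *`, which is a compiler extension rather than ISO C. A minimal sketch of the usual remedy follows; the helper name `advance_bytes` is hypothetical and is not taken from the commit itself.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: advance a generic buffer pointer by `offset` bytes.
 * Arithmetic on void* is not defined by ISO C, so the pointer is cast to
 * char* (whose element size is 1) before the offset is added. */
static void *advance_bytes(void *base, size_t offset)
{
    assert(base != NULL);
    return (char *)base + offset;
}
```

Casting to `char *` keeps the byte-offset semantics that the extension provided while staying portable across compilers.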
Rolf vandeVaart
e27f953fa4 Fix casting in assert.
This commit was SVN r23388.
2010-07-13 12:02:12 +00:00
Rolf vandeVaart
fb19872806 Two new flag definitions needed by the new PML.
This commit was SVN r23386.
2010-07-13 11:30:43 +00:00
Rolf vandeVaart
19d007a6fc New PML to support failover between openib BTLs.
openib BTL changes coming soon.

This commit was SVN r23385.
2010-07-13 10:46:20 +00:00
Ralph Castain
570d19106b Allow singletons to use ompi-server for rendezvous via pubsub as well as comm_spawn without starting their own local daemons
This commit was SVN r23384.
2010-07-13 06:33:07 +00:00
Rolf vandeVaart
b4af9c0efc Fix casts so trunk compiles
This commit was SVN r23381.
2010-07-13 01:52:22 +00:00
Ralph Castain
4a94ea53d3 Minor cleanup - if any jobid in the remote group is different from the local group, then flag disconnect
This commit was SVN r23379.
2010-07-12 21:39:56 +00:00
Ralph Castain
84d63a46cd Remove a hard-coded limit of 64 independent jobs that could connect/accept together
This commit was SVN r23378.
2010-07-12 18:34:33 +00:00
Shiqing Fan
8de5654bf9 Add new files into the tarball.
This commit was SVN r23377.
2010-07-12 16:21:46 +00:00
Shiqing Fan
cdc7e0bec9 Mainly type casts.
Get rid of pthread and other unnecessary stuff for Windows.

This commit was SVN r23376.
2010-07-12 16:17:56 +00:00
Shiqing Fan
e3be90ff22 Update CMake modules, adding initial support for openib.
This commit was SVN r23373.
2010-07-12 15:28:37 +00:00
Shiqing Fan
c51c262e67 Relevant Windows fixes for r23360.
This commit was SVN r23363.

The following SVN revision numbers were found above:
  r23360 --> open-mpi/ompi@31295e8dc2
2010-07-07 16:58:16 +00:00
Jeff Squyres
87e17a41da Ensure that the com_rules[] array entries are initialized to NULL in
case individual entries aren't used, but dynamic rules are enabled
(i.e., at least one or more of them are not NULL, meaning that they'll
all be assumed to be either NULL or a valid value).

This commit was SVN r23361.
2010-07-07 14:04:18 +00:00
Ralph Castain
31295e8dc2 As discussed on today's telecon, reorganize the debugger attachment code in orte to better support efforts within the tool community aimed at exploring alternative methods. Move the debugger attachment code from the orterun directory to a new debugger framework. Organize the existing standard support code into an "mpir" component. Organize the current extensions for co-spawning debugger daemons into a separate "mpirx" component.
Since the MPIR symbols are now included in the ORTE library, remove duplicate declarations in OMPI and replace them with extern references to their ORTE instantiations.

This commit was SVN r23360.
2010-07-06 23:35:42 +00:00
Jeff Squyres
1802325a39 Rename "libtrace" to be "libompitrace" so as not to conflict with an
already-existing "libtrace" on some BSD distros.

This commit was SVN r23357.
2010-07-06 21:48:15 +00:00
Jeff Squyres
c8bb7537e7 Remove include/opal/sys/cache.h -- its only purpose in life was to
#define CACHE_LINE_SIZE to 128.  This name has a conflict on NetBSD,
and it seems kinda odd to have a header file that ''only'' defines a
single value.  Also, we'll soon be raising hwloc to be a first-class
item, so having this file around seemed kinda weird.

Therefore, I replaced CACHE_LINE_SIZE with opal_cache_line_size, an
int (in opal/runtime/opal_init.c and opal/runtime/opal.h) on the
rationale that we can fill this in at runtime with hwloc info (trunk
and v1.5/beyond, only).  The only place we ''needed'' a compile-time
CACHE_LINE_SIZE was in the BTL SM (for struct padding), so I made a
new BTL_SM_ preprocessor macro with the old CACHE_LINE_SIZE value
(128).  That use isn't suitable for run-time hwloc information,
anyway.

This commit was SVN r23349.
2010-07-06 14:33:36 +00:00
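The split described in c8bb7537e7 (a run-time variable for general use, a compile-time macro only where struct padding needs it) might look roughly like the sketch below. All names here are hypothetical stand-ins: the commit does not give the full name of the new BTL_SM_ macro, and `example_cache_line_size` merely imitates the role of opal_cache_line_size.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical compile-time constant, standing in for the BTL_SM_ macro
 * mentioned in the commit message (128 is the value stated there). */
#define EXAMPLE_SM_CACHE_LINE_SIZE 128

/* Run-time value; 128 is only the default here and could be overwritten
 * later with information discovered from the hardware. */
static int example_cache_line_size = EXAMPLE_SM_CACHE_LINE_SIZE;

/* Struct padding must be known at compile time, so it uses the macro. */
struct example_padded_counter {
    volatile long value;
    char pad[EXAMPLE_SM_CACHE_LINE_SIZE - sizeof(long)];
};

/* Round a length up to a multiple of the run-time cache line size. */
static size_t example_round_up(size_t len)
{
    size_t line = (size_t)example_cache_line_size;
    assert(line > 0);
    return ((len + line - 1) / line) * line;
}
```

The struct cannot use the run-time variable because array sizes in a struct definition must be constant expressions, which matches the commit's remark that the shared memory use is not suitable for run-time information.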
Jeff Squyres
6d77118254 Fixes for FT code that came from recent shared memory updates.
This commit was SVN r23348.
2010-07-06 12:58:48 +00:00
Jeff Squyres
8fef296b8a Updates about thread support levels.
This commit was SVN r23341.
2010-07-02 13:14:09 +00:00
Ralph Castain
b4422e012c Fix a typo that breaks ompi_info if --enable-sensors
This commit was SVN r23338.
2010-07-02 02:38:55 +00:00
Jeff Squyres
222c4c8dd8 Reformat the verbatim sections of these man pages for narrower (80
char) displays. 

This commit was SVN r23325.
2010-07-01 18:52:45 +00:00
Jeff Squyres
e82e7f896e These compile warnings have been around forever; I finally got
inspired to fix them.

This commit was SVN r23316.
2010-06-28 17:26:38 +00:00
Jeff Squyres
1fad51776d Also add <stdlib.h> for exit().
This commit was SVN r23308.
2010-06-28 15:17:42 +00:00
Jeff Squyres
f9d4426c19 OS X / Absoft needs <string.h>
This commit was SVN r23307.
2010-06-28 15:15:06 +00:00
Nadia Derbey
c22e6b3613 openib btl unsafe in case of extremely low srq settings
This commit was SVN r23301.
2010-06-24 09:59:45 +00:00
Jeff Squyres
ea05c73cfc Use the right number of characters for the strncmp. Thanks to Brad
for catching that!

This commit was SVN r23281.
2010-06-18 15:45:38 +00:00
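ea05c73cfc does not say which strings were involved, so the following is only an illustration of how a wrong strncmp length can go unnoticed, and of one way to avoid it by deriving the length from the prefix. Both function names and the sample strings in the comments are assumptions.

```c
#include <assert.h>
#include <string.h>

/* Illustrative bug: a hard-coded count that is too short. Only "dl" is
 * compared, so a line starting with "dlopen" also matches. */
static int example_has_dlname_buggy(const char *line)
{
    assert(line != NULL);
    return 0 == strncmp(line, "dlname", 2);
}

/* Safer variant: the count is computed from the prefix itself, so it
 * cannot drift out of sync with the string being compared. */
static int example_has_prefix(const char *line, const char *prefix)
{
    assert(line != NULL && prefix != NULL);
    return 0 == strncmp(line, prefix, strlen(prefix));
}
```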
Jeff Squyres
cdc5541cb0 Search for "dlname", not "dlopen". This value will be filled in if
there is a DSO to open.

This commit was SVN r23280.
2010-06-18 15:13:34 +00:00
Matthias Jurenz
1467f2db52 Added workaround for PGI compiler bug (see http://www.pgroup.com/support/release_tprs_90.htm TPR 4337):
Disable OpenMP if compiler version is less than 9.0-3.

This commit was SVN r23274.
2010-06-15 07:16:13 +00:00
Jeff Squyres
b620e63bdc Add in 2 cases for where this test may be skipped:
1. If opal wasn't built with libltdl support
2. If opal was built statically (i.e., dlopen='' in the .la file)

This commit was SVN r23270.
2010-06-14 16:06:43 +00:00
Shiqing Fan
d391c57b0f A more proper fix for the HANDLE definition.
This commit was SVN r23269.
2010-06-14 14:17:07 +00:00
Samuel Gutierrez
2fb7c344fc Added a new System V (sysv) shared memory component for Open MPI.
Configure Option:
--enable-sysv

MCA Parameter:
mpi_common_sm

mpi_common_sm accepts a comma delimited list of: [sysv],mmap (order
dependent).  The first component that is successfully selected is used. For
example, -mca mpi_common_sm sysv,mmap will first try sysv. If sysv is not
successfully selected, then mmap will be used.  mmap will be used if 
mpi_common_sm is not provided.

Notes:
Please make certain that your system's shmmax limit, or equivalent, is larger
than mpool_sm_min_size.  Otherwise, shmget may fail.

This commit was SVN r23260.
2010-06-09 16:58:52 +00:00
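The order-dependent selection described in 2fb7c344fc (try each listed component in turn, use the first that is successfully selected, fall back to mmap when no list is given) can be sketched as below. This is not the Open MPI selection code; the component table, the query functions, and `example_select` are all hypothetical, and sysv is simply pretended to be unavailable.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*example_query_fn)(void);   /* returns 1 if the component is usable */

struct example_component {
    const char *name;
    example_query_fn query;
};

static int example_sysv_query(void) { return 0; }  /* pretend sysv cannot be used */
static int example_mmap_query(void) { return 1; }

static const struct example_component example_components[] = {
    { "sysv", example_sysv_query },
    { "mmap", example_mmap_query },
};

/* Walk a comma-delimited list in order and return the name of the first
 * component whose query succeeds. A NULL list means "mmap" (the default).
 * Returns NULL if nothing in the list could be selected. */
static const char *example_select(const char *list)
{
    const char *p = (list != NULL) ? list : "mmap";
    size_t ncomp = sizeof(example_components) / sizeof(example_components[0]);

    while (*p != '\0') {
        size_t len = strcspn(p, ",");   /* length of the current list entry */
        size_t i;
        for (i = 0; i < ncomp; ++i) {
            const struct example_component *c = &example_components[i];
            if (strlen(c->name) == len && 0 == strncmp(p, c->name, len) &&
                c->query()) {
                return c->name;
            }
        }
        p += len;
        if (*p == ',') {
            ++p;                        /* skip the delimiter */
        }
    }
    return NULL;
}
```

With sysv stubbed out as unusable, `example_select("sysv,mmap")` falls through to mmap, mirroring the `-mca mpi_common_sm sysv,mmap` example in the commit message.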
George Bosilca
c8ee150c95 If we fail to correctly initialize the MX device, don't mark it as initialized.
This commit was SVN r23238.
2010-06-02 15:00:42 +00:00
Jeff Squyres
e45be29f0d This function shouldn't have an ibv_ prefix -- it's not part of
verbs (it's just a static convenience function here in this file).  

This commit was SVN r23237.
2010-06-02 12:54:56 +00:00
Jeff Squyres
7676d5adda Change "intra-communicator" to "inter-communicator". Thanks to
Simon/Number Cruncher for reporting the typo.

This commit was SVN r23236.
2010-06-02 12:35:53 +00:00
Christopher Yeoh
712907affa Removing memory barriers which are not needed because of
the extra memory barriers which were added in r22880. This 
reverts all of r22879

This commit was SVN r23234.

The following SVN revision numbers were found above:
  r22879 --> open-mpi/ompi@768ea2bab0
  r22880 --> open-mpi/ompi@cd5294944b
2010-06-02 00:38:47 +00:00
Jeff Squyres
5d386fc678 Per #2420, string handling of the Fortran array_of_argv argument to
MPI_COMM_SPAWN_MULTIPLE was just wrong.  This commit renames a few
variables to make their meaning a bit more clear and fixes up all
known issues with converting a 2D array of Fortran strings to a set of
C-style argv vectors.

Fixes trac:2420.

This commit was SVN r23217.

The following Trac tickets were found above:
  Ticket 2420 --> https://svn.open-mpi.org/trac/ompi/ticket/2420
2010-05-28 12:40:42 +00:00
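To make the conversion in 5d386fc678 concrete, here is a minimal sketch of turning one command's row of a 2D Fortran string array into a C-style argv. It assumes the usual Fortran conventions (column-major storage, fixed-length blank-padded strings, an all-blank string ending each argument list); the function names are hypothetical and this is not the code from the commit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Length of a blank-padded Fortran string once trailing blanks are dropped. */
static size_t example_trimmed_len(const char *f, size_t len)
{
    while (len > 0 && f[len - 1] == ' ') {
        --len;
    }
    return len;
}

/* Build a NULL-terminated argv for command `cmd` from a column-major
 * array(count, *) of strings, each `len` characters long. Element (i, j)
 * starts at byte offset (j * count + i) * len. Returns NULL on allocation
 * failure; the caller frees each string and then the vector. */
static char **example_row_to_argv(const char *array, size_t count,
                                  size_t len, size_t cmd)
{
    size_t n = 0, j;
    char **argv;

    assert(array != NULL && cmd < count);

    /* Count arguments: the row ends at the first all-blank entry. */
    while (example_trimmed_len(array + (n * count + cmd) * len, len) > 0) {
        ++n;
    }

    argv = malloc((n + 1) * sizeof(*argv));
    if (argv == NULL) {
        return NULL;
    }
    for (j = 0; j < n; ++j) {
        const char *src = array + (j * count + cmd) * len;
        size_t slen = example_trimmed_len(src, len);
        argv[j] = malloc(slen + 1);
        if (argv[j] == NULL) {
            while (j > 0) {
                free(argv[--j]);
            }
            free(argv);
            return NULL;
        }
        memcpy(argv[j], src, slen);
        argv[j][slen] = '\0';
    }
    argv[n] = NULL;
    return argv;
}
```

The stride of `count * len` bytes between consecutive arguments of the same command is the detail that is easy to get wrong when the array is treated as if it were laid out row by row.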
Jeff Squyres
620c0eb160 Be a little more verbose about argv / array_of_argv parameters to
MPI_Comm_spawn / MPI_Comm_spawn_multiple, particularly the Fortran
variants.

This commit was SVN r23216.
2010-05-28 11:57:45 +00:00
Jeff Squyres
0061f2170d ompi/mpi/c/request_get_status.c (MPI_Request_get_status): If
opal_progress is called then check the status of the request before
returning. opal_progress is called only once.  This logic parallels
MPI_Test (ompi_request_default_test).

Thanks to Shaun Jackman for submitting the patch.

This commit was SVN r23215.
2010-05-27 21:37:11 +00:00
Jeff Squyres
464bd8c56e Fix typo
This commit was SVN r23212.
2010-05-27 21:19:38 +00:00
Rolf vandeVaart
27f070a575 Start setting a flag when a port error is detected on the openib BTL.
At this point, it is just cleared (and ignored) so default behavior has not changed.
However, future failover support can take advantage of this flag.
Reviewed by Pasha Shamis.

This commit was SVN r23204.
2010-05-24 18:57:55 +00:00
Jeff Squyres
fec7918eea Some paffinity functions had their return status overloaded:
* If < 0, it's an OPAL_ERR_* value
* If >= 0, it's the actual output value of the function

This is problematic for the OPAL_SOS stuff.  This commit changes those
functions to always return OPAL_* statuses and send the output value
back through output parameters (like 95% of the rest of the code
base).  This avoids the confusion with OPAL_SOS stuff and makes
paffinity work again (e.g., mpirun --bind-to-core ...).

I updated all paffinity modules for the new function signatures, and
bumped the paffinity API version up to 2.0.1.  I don't think the
version change will matter, though, because we'll be introducing
support for hardware threads soon, which will either bump the
paffinity version again or we'll replace paffinity with 
a new framework.

This commit was SVN r23197.
2010-05-21 16:55:28 +00:00
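The signature change in fec7918eea, from an overloaded return value to a status code plus an output parameter, follows a common C pattern. The sketch below contrasts the two styles; the constants, the socket/core arithmetic, and both function names are hypothetical and do not reproduce the paffinity API.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_SUCCESS            0
#define EXAMPLE_ERR_BAD_PARAM    (-5)   /* hypothetical error value */
#define EXAMPLE_NUM_SOCKETS        2
#define EXAMPLE_CORES_PER_SOCKET   4

/* Overloaded style: a negative return is an error code, anything >= 0 is
 * the answer. Callers cannot compare the result against a status value. */
static int example_first_core_overloaded(int socket)
{
    if (socket < 0 || socket >= EXAMPLE_NUM_SOCKETS) {
        return EXAMPLE_ERR_BAD_PARAM;
    }
    return socket * EXAMPLE_CORES_PER_SOCKET;
}

/* Revised style: the return value is always a status; the answer goes
 * back through an output parameter that is untouched on error. */
static int example_first_core(int socket, int *core)
{
    assert(core != NULL);
    if (socket < 0 || socket >= EXAMPLE_NUM_SOCKETS) {
        return EXAMPLE_ERR_BAD_PARAM;
    }
    *core = socket * EXAMPLE_CORES_PER_SOCKET;
    return EXAMPLE_SUCCESS;
}
```

With the revised style a caller can test `EXAMPLE_SUCCESS != rc` uniformly, even if the status is later wrapped or encoded by another layer.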
Shiqing Fan
857f1669e2 Solve a few compilation problems on Windows.
This commit was SVN r23193.
2010-05-21 14:30:15 +00:00
Edgar Gabriel
f6598138ba Fix some instances where we might have allocated 0 bytes. Also, for
allgather, make sure that we do not call coll_gather and coll_bcast in those
same instances, since some collective (intra) modules do not seem to handle
being called with scount or rcount equal to zero (for regular
intra-communicator operations, this is handled at the MPI API layer).

Fixes trac:2405

This commit was SVN r23188.

The following Trac tickets were found above:
  Ticket 2405 --> https://svn.open-mpi.org/trac/ompi/ticket/2405
2010-05-20 22:23:44 +00:00
Edgar Gabriel
5881719d84 checks for sendcount and recvcount(s) being zero have slightly different
consequences depending on whether the communicator is an intra or an inter
communicator. 

fixes trac:2415

This commit was SVN r23187.

The following Trac tickets were found above:
  Ticket 2415 --> https://svn.open-mpi.org/trac/ompi/ticket/2415
2010-05-20 22:21:26 +00:00
George Bosilca
b56ab33ff6 Indent and fix some uninitialized variables.
This commit was SVN r23179.
2010-05-19 21:20:33 +00:00
George Bosilca
c51932c250 Don't forget to initialize "line" in all cases.
This commit was SVN r23178.
2010-05-19 21:19:45 +00:00
Rolf vandeVaart
03b3e75f86 Add two arguments to the PML error callback function. This
allows the BTL to specify a specific ompi_proc_t that had an
error.  Also add an optional descriptive string.  Currently, arguments
are not used but will be by future failover PML. 
Changes based on RFC.  Reviewed by George Bosilca.

This commit was SVN r23174.
2010-05-19 11:55:45 +00:00
Abhishek Kulkarni
c63c4d6892 Fix bugs where (OMPI_ERROR == *) checks cannot be converted to (OMPI_SUCCESS != *) since the return codes are overloaded to return an "index" on success.
The fix is to just check if the return value is positive or not, since all the SOS encoded errors are *always* negative.

The real fix (as Ralph points out) is to change these functions (opal_pointer_array_add and mca_base_param*) to return the index as a pointer.

This commit was SVN r23173.
2010-05-18 20:54:11 +00:00
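The problem described in c63c4d6892 arises whenever a function returns an index on success and a negative code on failure. A small sketch follows, using a hypothetical fixed-size array rather than opal_pointer_array_add; note that index 0 is a valid result, so the check is for a non-negative value.

```c
#include <assert.h>
#include <stddef.h>

#define EX_ARRAY_CAPACITY        4
#define EX_SUCCESS               0
#define EX_ERR_OUT_OF_RESOURCE (-2)   /* hypothetical error value */

struct ex_array {
    void *items[EX_ARRAY_CAPACITY];
    int used;
};

/* Returns the index of the new element (>= 0) or a negative error code. */
static int ex_array_add(struct ex_array *a, void *item)
{
    assert(a != NULL);
    if (a->used >= EX_ARRAY_CAPACITY) {
        return EX_ERR_OUT_OF_RESOURCE;
    }
    a->items[a->used] = item;
    return a->used++;
}

/* Correct caller-side check: any non-negative value is a valid index.
 * Testing (EX_SUCCESS != idx) would wrongly treat index 1, 2, ... as errors. */
static int ex_add_succeeded(struct ex_array *a, void *item)
{
    int idx = ex_array_add(a, item);
    return idx >= 0;
}
```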
Josh Hursey
f57e73d4e5 add a few more missing SOS includes
This commit was SVN r23168.
2010-05-18 15:00:07 +00:00
Abhishek Kulkarni
afbe3e99c6 * Wrap all the direct error-code checks of the form (OMPI_ERR_* == ret) with
  (OMPI_ERR_* == OPAL_SOS_GET_ERR_CODE(ret)), since the return value could be an
  SOS-encoded error. OPAL_SOS_GET_ERR_CODE() takes an SOS error and returns
  the native error code.

* Since OPAL_SUCCESS is preserved by SOS, also change all calls of the form
  (OPAL_ERROR == ret) to (OPAL_SUCCESS != ret). We thus avoid having to
  decode 'ret' to get the native error code.

This commit was SVN r23162.
2010-05-17 23:08:56 +00:00
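The two rules in afbe3e99c6 can be illustrated with a toy encoding. The real SOS bit layout is not described in the commit message, so the encoding below is purely an assumption: extra information is packed above the low 8 bits of a negative value, and success (0) is left unchanged. The `ex_` names are hypothetical stand-ins, with `ex_sos_get_err_code` playing the role of OPAL_SOS_GET_ERR_CODE().

```c
#include <assert.h>

#define EX_SUCCESS               0
#define EX_ERR_OUT_OF_RESOURCE (-2)   /* hypothetical native error code */

/* Toy encoder: keeps success as-is, otherwise packs `info` above the low
 * 8 bits and keeps the result negative. */
static int ex_sos_encode(int native, int info)
{
    assert(native <= 0 && native > -256 && info >= 0 && info < (1 << 16));
    if (native == EX_SUCCESS) {
        return EX_SUCCESS;
    }
    return -((info << 8) | (-native));
}

/* Toy decoder: recovers the native error code from an encoded value. */
static int ex_sos_get_err_code(int ret)
{
    return (ret >= 0) ? ret : -((-ret) & 0xFF);
}

/* Rule 1: compare a specific error against the decoded value. */
static int ex_is_out_of_resource(int ret)
{
    return EX_ERR_OUT_OF_RESOURCE == ex_sos_get_err_code(ret);
}

/* Rule 2: success is preserved, so a generic failure test needs no decoding. */
static int ex_failed(int ret)
{
    return EX_SUCCESS != ret;
}
```

A direct comparison such as `EX_ERR_OUT_OF_RESOURCE == ret` fails on an encoded value, which is the class of bug the commit set out to remove.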
Jeff Squyres
91507e595f Fix bug reported on user list; set the errhandler type properly.
This commit was SVN r23145.
2010-05-15 13:04:32 +00:00