7860 commits

Author SHA1 Message Date
Jeff Squyres
520147f209 Clean up the Fortran MPI sentinel values per problem reported on the
users mailing list:

  http://www.open-mpi.org/community/lists/users/2006/07/1680.php

Warning: this log message is not for the weak.  Read at your own
risk.

The problem was that we had several variables in Fortran common blocks
of various types, but their C counterparts were all of a type
equivalent to a fortran double complex.  This didn't seem to matter
for the compilers that we tested, but we never tested static builds
(which is where this problem seems to occur, at least with the Intel
compiler: the linker complains that the variable in the common block
in the user's .o file was of one size/alignment but the one in the C
library was a different size/alignment).

So this patch fixes the sizes/types of the Fortran common block
variables and their corresponding C instantiations to be of the same
sizes/types. 

But wait, there's more.

We recently introduced a fix for the OSX linker where some C versions
of the fortran common block variables (e.g.,
_ompi_fortran_status_ignore) were not being found when linking
ompi_info (!).  Further research shows that the code path for
ompi_info to require ompi_fortran_status_ignore is, unfortunately,
necessary (a quirk of how various components pull in different
portions of the code base -- nothing in ompi_info itself requires
fortran or MPI knowledge, of course).

Hence, the real problem was that there was no code path from ompi_info
to the portion of the code base where the C globals corresponding to
the Fortran common block variables were instantiated.  This is because
the OSX linker does not automatically pull in .o files that only
contain uninitialized global variables; the OSX linker typically only
pulls in a .o file from a library if it either has a function that is
used or has a global variable that is initialized (that's the short
version; lots of details and corner cases omitted).  Hence, we changed
the global C variables corresponding to the fortran common blocks to
be initialized, thereby causing the OSX linker to pull them in
automatically -- problem solved.  At the same time, we moved the
constants to another .c file with a function, just for good measure.

However, this didn't really solve the problem:

1. The function in the file with the C versions of the fortran common
   block variables (ompi/mpi/f77/test_constants_f.c) did not have a
   code path that was reachable from ompi_info, so the only reason
   that the constants were found (on OSX) was because they were
   initialized in the global scope (i.e., causing the OSX linker to
   pull in that .o file).

2. Initializing these variables in the global scope causes problems for
   some linkers where -- once all the size/type problems mentioned
   above were fixed -- the alignments of fortran common blocks and C
   global variables do not match (even though the types of the Fortran
   and C variables match -- wow!).  Hence, initializing the C
   variables would not necessarily match the alignment of what Fortran
   expected, and the linker would issue a warning (i.e., the alignment
   warnings referenced in the original post).

The solution is two-fold:

1. Move the Fortran variables from test_constants_f.c to
   ompi/mpi/runtime/ompi_mpi_init.c where there are other global
   constants that *are* initialized (that had nothing to do with
   fortran, so the alignment issues described above are not a factor),
   and therefore all linkers (including the OSX linker) will pull in
   this .o file and find all the symbols that it needs.

2. Do not initialize the C variables corresponding to the Fortran
   common blocks in the global scope.  Indeed, never initialize them
   at all (because we never need their *values* - we only check for
   their *locations*).  Since nothing is ever written to these
   variables (particularly in the global scope), the linker does not
   see any alignment differences during initialization, but does make
   both the C and Fortran variables have the same addresses (this
   method has been working in LAM/MPI for over a decade).
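
The "locations, not values" idea in point 2 can be sketched in C as follows. This is a minimal illustration, not the Open MPI source: `example_status_ignore` and `is_status_ignore` are hypothetical names, the array length of 4 is an arbitrary assumption, and the Fortran common block side and its symbol-name mangling are not shown.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sentinel. It is deliberately left without an initializer,
 * as the message recommends, because only its address is ever examined. */
int example_status_ignore[4];

/* Returns true when the caller passed the sentinel instead of a real
 * status array. The contents of the sentinel are never read or written. */
bool is_status_ignore(const int *status)
{
    return status == example_status_ignore;
}
```

A caller that passes its own status array gets `false`; passing `example_status_ignore` itself gets `true`, regardless of what bytes the sentinel happens to contain.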

There were some comments here in the OMPI code base and in the LAM
code base that stated/implied that C variables corresponding to
Fortran common blocks had to have the same alignment as the Fortran
common blocks (i.e., 16).  There were attempts in both code bases to
ensure that this was true.  However, the attempts were wrong (in both
code bases), and I have now read enough Fortran compiler documentation
to convince myself that matching alignments is not required (indeed,
it's beyond our control).  As long as C variables corresponding to
Fortran common blocks are not initialized in the global scope, the
linker will "figure it out" and adjust the alignment to whatever is
required (i.e., the greater of the alignments).  Specifically (to
counter comments that no longer exist in the OMPI code base but still
exist in the LAM code base):

- there is no need to make attempts to specially align C variables
  corresponding to Fortran common blocks
- the types and sizes of C variables corresponding to Fortran common
  blocks should match, but do not need to be on any particular
  alignment 

Finally, as a side effect of this effort, I found a bunch of
inconsistencies with the intent of status/array_of_statuses
parameters.  For all the functions that I modified they should be
"out" (not inout).

This commit was SVN r11057.
2006-07-31 15:07:09 +00:00
Galen Shipman
c9e0eda190 Initialize the completion queue to a reasonable size based on maximum number
of send/receives outstanding.

Use ibv_cq_resize if available after initial creation of completion queue if
cq_size is too small (based on number of peers). 

This commit was SVN r11053.
2006-07-30 00:58:40 +00:00
Jeff Squyres
a181a0ee1e Move dr back up to 1.2
This commit was SVN r11049.
2006-07-28 18:44:47 +00:00
Josh Hursey
d1e1a68645 This commit contains the necessary changes to get "mpirun a.out" working
correctly with MPI_Comm_spawn.

The problem with MPI_Comm_spawn was that the 'parent' process was
rmgr.create'ing and then rmgr.launch'ing the children via the rmgr proxy
component. The HNP saw these commands and processed them normally, but
since we never went through the HNP's rmgr (urm component) spawn() 
logic the triggers and key/value pairs were never created. So the
children were launched correctly, but since the HNP did not
have any triggers set up, it never triggered the xcast for the
children to finish orte_init().

This fix puts the trigger and key/value pair initialization in 
rmgr_urm_spawn() for the 'mpirun a.out' case, *and* in the 
rmgr_base_unpack routine that deals with the creation of the
job for the child as requested by the proxy component. This
will allow the triggers to be registered for the proxy's request
which only happens during MPI_Comm_spawn*

Small change for a lot of debugging. Notice that this reverts r11037
to its previous version, and adds a newline to handle the spawn
cases.

This commit was SVN r11046.

The following SVN revision numbers were found above:
  r11037 --> open-mpi/ompi@5813fb7d2a
2006-07-28 17:17:31 +00:00
Jeff Squyres
7f372b4e1f No functional changes -- only re-indent some portions of the code to
make it consistent with the indenting in the rest of the file
(otherwise it was quite difficult to understand -- saw this while I
was reviewing 11039).

This commit was SVN r11042.
2006-07-28 15:47:16 +00:00
Jeff Squyres
6918a3d6f5 Added bullet about host info key to MPI_COMM_SPAWN[_MULTIPLE]
This commit was SVN r11041.
2006-07-28 14:52:52 +00:00
Donald Kerr
2e5e01a8df Remove dependency on known port range and allow udapl to provide the port number.
This commit was SVN r11040.
2006-07-28 13:58:21 +00:00
David Daniel
45894aecee Adding support for MPI_Comm_spawn() to use the 'host' key in an MPI_Info
object if provided.

The associated value is a comma-separated list of hosts -- which must be
in the initial allocation -- and is used to populate the application
context map.
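
A rough sketch of how a comma-separated 'host' value could be split into individual host names is shown below. It is an illustration only: `split_hosts` and `free_hosts` are hypothetical helpers, not functions from the Open MPI code base, and the check that each host is part of the initial allocation is not shown.

```c
#include <stdlib.h>
#include <string.h>

/* Frees a NULL-terminated array produced by split_hosts(). */
void free_hosts(char **hosts)
{
    if (hosts == NULL) return;
    for (size_t i = 0; hosts[i] != NULL; ++i) free(hosts[i]);
    free(hosts);
}

/* Splits a value such as "node1,node2" into a NULL-terminated array of
 * newly allocated strings and stores the number of hosts in *count.
 * Empty entries are skipped. Returns NULL on allocation failure. */
char **split_hosts(const char *value, size_t *count)
{
    /* At most (commas + 1) entries, plus one slot for the final NULL. */
    size_t capacity = 2;
    for (const char *p = value; *p != '\0'; ++p) {
        if (*p == ',') ++capacity;
    }
    char **hosts = calloc(capacity, sizeof *hosts);
    if (hosts == NULL) return NULL;

    size_t n = 0;
    const char *start = value;
    for (;;) {
        const char *end = strchr(start, ',');
        size_t len = (end != NULL) ? (size_t)(end - start) : strlen(start);
        if (len > 0) {
            hosts[n] = malloc(len + 1);
            if (hosts[n] == NULL) { free_hosts(hosts); return NULL; }
            memcpy(hosts[n], start, len);
            hosts[n][len] = '\0';
            ++n;
        }
        if (end == NULL) break;
        start = end + 1;
    }
    *count = n;
    return hosts;
}
```

Skipping empty entries is a design choice of this sketch; the commit message does not say how the real code treats them.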

This commit was SVN r11039.
2006-07-27 23:45:33 +00:00
Josh Hursey
5813fb7d2a It seems that MPI_Comm_spawn{_multiple} has been broken since r10708.
Reverting this file (changeset from commit r10708) to its previous
version fixes the problem.

This should be moved to the v1.1 branch where it is also broken.

This commit was SVN r11037.

The following SVN revision numbers were found above:
  r10708 --> open-mpi/ompi@febc143d8c
2006-07-27 21:21:10 +00:00
Brian Barrett
63f0a5f52d * Don reports that /dev/poll is borked with his version of Solaris. For
now, don't use it and go back to poll/select like everyone else.

This commit was SVN r11034.
2006-07-27 16:56:09 +00:00
Donald Kerr
fcb932a6d9 Workaround for bug in Solaris udapl library where dat_evd_dequeue does not dequeue DAT_CONNECTION_REQUEST_EVENT.
This commit was SVN r11032.
2006-07-27 16:13:46 +00:00
Terry Dontje
9c070dafef Added missing include files and fixed some compilation errors
in the original code.

This commit was SVN r11031.
2006-07-27 14:44:54 +00:00
Sven Stork
456c872f52 - one more fix for the dist target
This commit was SVN r11029.
2006-07-27 14:20:37 +00:00
Gleb Natapov
72575d81d2 Create separate pool for control messages. It is unlimited, but the maximum number of elements that are allocated from it is limited by the number of connections.
This commit was SVN r11028.
2006-07-27 14:09:30 +00:00
Jeff Squyres
00c2b62321 Update svn:ignore
This commit was SVN r11027.
2006-07-27 13:36:10 +00:00
Sven Stork
8c10ea1ceb - fix dist target
This commit was SVN r11026.
2006-07-27 13:25:22 +00:00
Brian Barrett
6b00c8ed99 * add listing of which backtrace component got compiled in
This commit was SVN r11025.
2006-07-27 03:48:12 +00:00
Brian Barrett
0e6588c9a4 * ignore some things we should ignore
This commit was SVN r11024.
2006-07-27 03:20:09 +00:00
Brian Barrett
aaf31c6ade * Make the backtrace printing functionality a framework
* Copy Linux and Solaris backtrace support from util/stacktrace.c
* Added backtrace support for Mac OS X.

This commit was SVN r11023.
2006-07-27 02:56:02 +00:00
Brian Barrett
7ea33eac02 Merge in rest of event library update branch, updating the event library to
libevent-1.1a.

svn merge -r10917:11006 https://svn.open-mpi.org/svn/ompi/tmp/libevent-update

This commit was SVN r11022.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r10917
  r11006
2006-07-27 01:51:18 +00:00
Brian Barrett
7c2ca73a1b * merge in tmp/libevent-update branch, r10914 - 10917. Rest coming soon.
This commit was SVN r11021.

The following SVN revisions from the original message are invalid or
inconsistent and therefore were not cross-referenced:
  r10914
2006-07-27 01:48:09 +00:00
Jeff Squyres
7c0d95a0c3 Sync with 1.1.1 NEWS.
This commit was SVN r11016.
2006-07-26 21:50:35 +00:00
Brian Barrett
07514ccf42 * don't install $(headers) and $(nodist_headers) by default, and definitely
not in include_HEADERS.  Fixes bug #222.

This commit was SVN r11014.
2006-07-26 21:20:41 +00:00
Rolf vandeVaart
45719b7de9 Submitted by: Rolf vandeVaart
Reviewed by: Jeff Squyres

Fix for ticket #220.  Missing a few C++ methods.
 MPI::Datatype::Create_indexed_block
 MPI::Datatype::Create_resize
 MPI::Datatype::Get_true_extent

This commit was SVN r11010.
2006-07-26 20:27:14 +00:00
Jeff Squyres
95851353e9 Add note about OSX F77 issue.
This commit was SVN r11004.
2006-07-26 15:48:17 +00:00
Jeff Squyres
9c0a2d8916 Fix a note in the file to attribute the help correctly.
This commit was SVN r11003.
2006-07-26 15:47:51 +00:00
Brian Barrett
4ac0867ef3 * update NEWS file with some features that have been added recently
This commit was SVN r10996.
2006-07-26 14:18:34 +00:00
Brian Barrett
c744f650ba * really didn't mean for this patch (the threaded accept() code) to come in with
r10841, so revert it (and its fixes) out.  Will bring back once cleaned up from
  the code used in the tbird experiment

This commit was SVN r10991.

The following SVN revision numbers were found above:
  r10841 --> open-mpi/ompi@dfa1221c3b
2006-07-25 22:32:01 +00:00
Jeff Squyres
77e0c7b383 Remove compiler warning. Remove this when CM cancel is fully implemented.
This commit was SVN r10986.
2006-07-25 21:46:04 +00:00
Jeff Squyres
c2d4dfce78 Remove unused variable
This commit was SVN r10985.
2006-07-25 21:43:21 +00:00
Jeff Squyres
bdab8d744c Send a pointer to the data, not the data itself. Otherwise, we could
get a segv in some cases.

This commit was SVN r10984.
2006-07-25 21:42:44 +00:00
Rainer Keller
ee27f7e2c7 - According to MPI-1.2, sec 3.2.5, p22, single request
functions MPI_Test, MPI_Testany, MPI_Wait, MPI_Waitany
   should not reset the status.MPI_ERROR as passed by user.
 - This needed implementing the MPI_Waitsome and MPI_Testsome.

This commit was SVN r10980.
2006-07-25 15:29:37 +00:00
Rainer Keller
31c66d92aa Minor fixes to match standard -- and run strict test of mpi_test_suite:
- bsend_init: use *request after error-checking
 - Always reset the status->cancelled
 - cancel, wait: need to check *request for MPI_REQUEST_NULL, not
   NULL...
   (actually ompi_request_wait handles MPI_PROC_NULL, so no need
   to check&set of status_empty in wait.c)

This commit was SVN r10972.
2006-07-24 16:59:01 +00:00
Terry Dontje
215542ad38 Added code to pull in ucontext.h to provide the prototype for
the printstack function for Solaris systems.

This commit was SVN r10967.
2006-07-24 12:14:24 +00:00
Gleb Natapov
4b605295b3 remove unused field.
This commit was SVN r10965.
2006-07-24 06:12:16 +00:00
Gleb Natapov
3b34dc8df8 remove MCA_BTL_IB_FRAG_ALIGN. Alignment is handled in free_list_t.
This commit was SVN r10945.
2006-07-23 12:33:49 +00:00
Sven Stork
f9fd98449c - add missing prefetch header to fix dist target
This commit was SVN r10934.
2006-07-21 08:20:42 +00:00
Ralph Castain
65acc9325a Fix a bug that crept in during the last change to support "mpirun a.out" operations. Since we now reserve a range of vpids for each app_context, we no longer need to track the rank and offset the starting vpid each time through the mapper - the name service automatically accounts for the offset when allocating the next starting vpid for the job.
This should be shifted to v1.1.

This commit was SVN r10916.
2006-07-20 21:06:15 +00:00
Jeff Squyres
0c102e6e5b Fix OSX linker problems with the Fortran bindings:
- ensure to initialize the values that we use for fortran constants
  (even though their *values* don't matter -- only their *addresses* do,
  but initializing them or not has implications for the OSX linker)
- move the fortran constants to a file with functions in it, because
  the OSX linker sometimes does not import global variables from
  object files that do not have functions (I'm not even going to
  pretend to get all the subtle details about the OSX linker right
  here -- it's just "better" to have global variables in object files
  with functions that otherwise get pulled in during linker
  resolution).

This commit was SVN r10908.
2006-07-20 19:48:03 +00:00
Gleb Natapov
91f48f9a79 Merge with gleb-pml branch. Add out of resource handling support to PML layer.
If a resource is not available, the request is added to one of the pending lists and retried later.

This commit was SVN r10900.
2006-07-20 14:44:35 +00:00
Gleb Natapov
383694c68d Add support to get aligned buffers from free_list_t. Convert openib BTL to new interface.
This commit was SVN r10899.
2006-07-20 14:39:05 +00:00
Jeff Squyres
7899057d4e Add a check for now that invokes an MPI exception if you try to
SPAWN[_MULTIPLE] from a singleton (and displays a pretty help message
explaining that you need to use mpirun).  This can be removed when
fixes for ORTE come over that allow SPAWN[_MULTIPLE] from singletons. 

This commit was SVN r10898.
2006-07-20 14:27:13 +00:00
Gleb Natapov
90fc0c5cc7 don't lookup registration in the empty cache.
This commit was SVN r10897.
2006-07-20 14:01:57 +00:00
Brian Barrett
4c101c6394 * rename the collectives sm bootstrap area to be consistent with other
shared memory segments
* make sure to properly unlink the collectives sm bootstrap area at
  shutdown
* Add missing / in the path for the mpool shared memory segment
* make sure to release the common_mmap structure in the SM btl
  after unlinking the file during shutdown

This commit was SVN r10886.
2006-07-19 20:55:29 +00:00
Jeff Squyres
32c1f38976 Be a little smarter when checking the MCA parameters that specify what
components to load:

- only allow the ^ to be the first character of the value
- if we find ^ elsewhere in the value, print an error and fail  
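
The two rules above could be checked along these lines. This is a sketch under stated assumptions: `component_list_is_valid` is a hypothetical name, the error text is invented for illustration, and the real parameter parsing in the code base is not reproduced here.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Accepts a component-list value only if '^' is absent or appears solely
 * as the first character. Prints an error and returns false otherwise. */
bool component_list_is_valid(const char *value)
{
    /* Skip a leading '^', then reject any further occurrence. */
    const char *rest = (value[0] == '^') ? value + 1 : value;
    if (strchr(rest, '^') != NULL) {
        fprintf(stderr,
                "error: '^' may only be the first character of \"%s\"\n",
                value);
        return false;
    }
    return true;
}
```

With this check, values such as `^tcp,sm` and `tcp,sm` pass, while `tcp,^sm` is rejected.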

This commit was SVN r10880.
2006-07-19 14:19:44 +00:00
Brian Barrett
0b15943a7a * return the MPI_ERROR field of the status as the return code for
MPI_WAIT, MPI_TEST, MPI_WAITANY, and MPI_TESTANY.  It isn't really
  clear what the standard wants as the return code for these functions, 
  and this is what Sun MPI, LAM/MPI, and MPICH2 all do.

  Fixes trac:172

This commit was SVN r10872.

The following Trac tickets were found above:
  Ticket 172 --> https://svn.open-mpi.org/trac/ompi/ticket/172
2006-07-18 21:28:45 +00:00
George Bosilca
0c4f18b397 As this object was created using the OBJ_NEW it should be destroyed using OBJ_RELEASE.
This commit was SVN r10869.
2006-07-18 18:42:30 +00:00
Rainer Keller
ac58e85c83 - Add the missing collective (and other) functions to mpi.f03
- Correct intent(out) to inout for various recvbufs to match
   the standard's possibility for MPI_IN_PLACE.

This commit was SVN r10868.
2006-07-18 18:12:09 +00:00
George Bosilca
d34b51b8ec Correctly compute the gaps inside the datatype. They depend on the shape of the
final datatype not on the shape of the added datatype. The gaps exist if the
extent of the final datatype is not equal to its size.
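
The gap test described above (extent not equal to size) can be written as a small predicate. This is a simplified sketch: `struct example_datatype` and `has_gaps` are hypothetical and are not the datatype engine's real structures, and the extent is assumed here to be the upper bound minus the lower bound.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified description of a finished datatype. */
struct example_datatype {
    size_t    size; /* bytes of actual data */
    ptrdiff_t lb;   /* lower bound, in bytes */
    ptrdiff_t ub;   /* upper bound, in bytes */
};

/* A datatype has gaps when its extent (ub - lb) differs from its size. */
bool has_gaps(const struct example_datatype *type)
{
    return (size_t)(type->ub - type->lb) != type->size;
}
```

For example, two 4-byte integers stored at offsets 0 and 8 give a size of 8 and an extent of 12, so the predicate reports gaps; the same two integers stored back to back give a size and extent of 8, so it does not.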

This commit was SVN r10867.
2006-07-18 15:47:12 +00:00
Ralph Castain
8bec270f90 Fix a bug noted by Jeff - we were no longer accurately recording in the registry that a process had been terminated when the user initiated the "kill" process (via ctrl-c).
Added another system-level test function for ORTE that just spins until terminated by a ctrl-c signal.

Modified orterun - added a couple of newlines to the output when abnormally terminating so the prompt always is on a new line.

This commit was SVN r10866.
2006-07-18 14:42:27 +00:00