Commit Graph

11530 Commits

Author SHA1 Message Date
Jeff Squyres
ace1717ca7 Patch from Brian to add in proper linker libraries
This commit was SVN r17919.
2008-03-21 23:00:54 +00:00
Jeff Squyres
05a7b1ed55 Remove svn:executable from these files.
This commit was SVN r17918.
2008-03-21 21:16:11 +00:00
Brian Barrett
f176c67cd2 Set the nodeid to something somewhat sane if we're not using modex, and
don't set the LOCAL flag just because both procs have an invalid nodeid.

This commit was SVN r17917.
2008-03-21 20:20:00 +00:00
Brian Barrett
5a7ebf5f25 Do not try to update the local process with modex information (from the local
process) as it stomps on information if the modex doesn't exist for the
current platform

This commit was SVN r17916.
2008-03-21 19:20:47 +00:00
Jeff Squyres
a4ec8a9d53 Spring cleaning -- no one is using this stuff; remove it from the tree.
This commit was SVN r17913.
2008-03-21 17:14:42 +00:00
Ralph Castain
f8642e9390 Add debug to tell us when we opened a socket and to whom
This commit was SVN r17911.
2008-03-21 15:47:47 +00:00
Ralph Castain
19ffdfef42 Add some debugging output to tell us what interfaces were considered and used by OOB
This commit was SVN r17909.
2008-03-21 15:35:40 +00:00
Jeff Squyres
4fbcb75ce8 With 5 commits over a 16 hour period and 3 broken tarball builds and a
still-broken trunk build on common platforms (e.g., 64 bit Linux
RHEL4U4), I think it's clear that this code is not ready for
prime-time.

I'm backing out all the commits in the trunk/ompi/op tree from r17901
onwards.  This code can be re-committed when it compiles and runs on
common platforms.

cd ompi/op
svn merge -r 17907:17900 https://svn.open-mpi.org/svn/ompi/trunk/ompi/op .

This commit was SVN r17908.

The following SVN revision numbers were found above:
  r17901 --> open-mpi/ompi@b9520e61dc
2008-03-21 14:47:01 +00:00
Jeff Squyres
8284f64af1 With r17906, this commit should make the trunk compile again.
This commit was SVN r17907.

The following SVN revision numbers were found above:
  r17906 --> open-mpi/ompi@df4a6c3fc5
2008-03-21 13:49:23 +00:00
Rich Graham
df4a6c3fc5 fix function prototypes for new 3 buffer routines.
This commit was SVN r17906.
2008-03-21 13:44:15 +00:00
Ralph Castain
b2655ab585 Per Brian's suggestion, remove unnecessary library dependency - libtool automagically picks up the other libraries when we include libmpi
This commit was SVN r17905.
2008-03-21 12:47:04 +00:00
Rich Graham
0974160e29 correct several of the new macros.
This commit was SVN r17904.
2008-03-21 03:45:43 +00:00
Rich Graham
a7c836a2b0 fix location of the restrict key word.
Make the tag in the fan-in/fan-out algorithm be fragment based.

This commit was SVN r17903.
2008-03-21 01:40:36 +00:00
Rich Graham
2c66d396b7 take care of some bit-rot with the fanin-fanout method.
This commit was SVN r17902.
2008-03-21 01:08:49 +00:00
Rich Graham
b9520e61dc get the sm optimized allreduce working for all but user defined
operations.  Added to the reduction operations a set of reduction
functions that take 2 input buffers and one output buffer to avoid
some extra memory copies.  These can't be used with user defined
operations.  The intel c collective suite passes both original, and
new (new, not the user defined operations).

This commit was SVN r17901.
2008-03-20 23:51:16 +00:00
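
The three-buffer idea described in the commit above can be illustrated with a minimal Python sketch. The names `sum_2buf` and `sum_3buf` are hypothetical and are not Open MPI's actual routines. The sketch only contrasts an in-place two-buffer reduction, which needs a copy when the second input must be preserved, with a variant that takes two inputs and a separate output.

```python
def sum_2buf(src, inout):
    """Two-buffer form: combine src into inout in place (inout = src + inout)."""
    for i, value in enumerate(src):
        inout[i] = value + inout[i]


def sum_3buf(in1, in2, out):
    """Three-buffer form: out = in1 + in2, leaving both inputs untouched."""
    for i in range(len(out)):
        out[i] = in1[i] + in2[i]


a = [1, 2, 3]
b = [10, 20, 30]

# Two-buffer path: to keep b intact, the caller copies it first.
result_2 = list(b)  # the extra memory copy
sum_2buf(a, result_2)

# Three-buffer path: the result goes straight to a separate output buffer.
result_3 = [0] * len(a)
sum_3buf(a, b, result_3)
```

Both paths produce the same values; the three-buffer form simply skips the intermediate copy.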
Galen Shipman
dcac824f59 Fix problem in releasing fragments during GET_END event (didn't check that
portals btl has ownership and therefore didn't free the frag as it should); this
causes leakage and hangs in MPI_Finalize.

Also added a bit more debugging. 

This commit was SVN r17900.
2008-03-20 22:46:32 +00:00
Ralph Castain
c2fd5dd416 Clarify method used to translate application proc termination codes to exit status codes
This commit was SVN r17899.
2008-03-20 18:50:05 +00:00
Brian Barrett
2bf4784893 Set a meaningful orte_system_info.nodeid on Catamount
This commit was SVN r17898.
2008-03-20 16:55:57 +00:00
Ralph Castain
f8a10dfb93 Complete the fix of the orted vs mpirun race condition for finalizing. The darned mpirun is just too fast! Rather than try to slow it down, we set the orte_finalizing flag -prior- to telling mpirun the orted is leaving. This ensures we don't mistakenly declare the lifeline lost when mpirun leaves in a hurry.
This commit was SVN r17897.
2008-03-20 16:55:24 +00:00
Ralph Castain
6bb139e4f2 One more correction to mpirun exit codes - clean up the application proc's exit codes in the orted so that non-zero exit codes generated by mpirun itself don't get "munged".
Modify the multi_abort function so they all return different exit codes - allows us to tell which one was being reported.

This commit was SVN r17895.
2008-03-20 13:54:11 +00:00
Ralph Castain
27a73ad9ee Fix a race condition between the orteds and HNP that can cause the orteds to output the "lost lifeline" message.
This has been a long-time problem. I tried to reduce the problem by having the orteds tell the HNP they were finalizing, and having the HNP wait until all orteds had reported or we timed out.

What was observed was that all the orteds were correctly reporting that they are leaving, but the HNP is able to exit before the orteds, thus closing the orteds' lifeline socket and generating the error output. This is caused by the fact that the orteds have to whack all remaining session directories, which includes that blasted monster shared memory file! Cleaning up the SM file can take quite a while.

The HNP doesn't have that problem as there is no SM file there! So it gets out first.

What we had done in the past to resolve that problem was put a little test in the OOB that checks to see if we are finalizing. If we are, then we ignore the lifeline connection being lost. That check was still in the code - however, we had lost the line in orte_finalize that set the flag!!

This commit was SVN r17893.
2008-03-20 13:30:51 +00:00
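
The fix described in the commit above comes down to an ordering rule: mark the daemon as finalizing before doing anything that lets the peer close the lifeline. A minimal Python sketch follows. The `Daemon` class and its method names are hypothetical stand-ins, not the real ORTE data structures or OOB code.

```python
class Daemon:
    """Hypothetical stand-in for an orted; not the real ORTE structures."""

    def __init__(self):
        self.finalizing = False
        self.errors = []

    def finalize(self):
        # Set the flag first, before any step that may let the peer exit.
        self.finalizing = True
        self.cleanup_session_dirs()

    def cleanup_session_dirs(self):
        # Placeholder for slow cleanup (e.g., removing the shared memory file).
        pass

    def on_lifeline_closed(self):
        # A closed lifeline during finalize is expected, so it is ignored.
        if self.finalizing:
            return
        self.errors.append("lost lifeline")


# A daemon that is still running reports the lost lifeline.
running = Daemon()
running.on_lifeline_closed()

# A daemon that has begun finalizing stays quiet when the peer leaves first.
leaving = Daemon()
leaving.finalize()
leaving.on_lifeline_closed()
```

If `finalize` never sets the flag, the check in `on_lifeline_closed` can never succeed, which matches the symptom the commit message describes.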
Ralph Castain
8ee26a55ca Just turn these off for now - will revisit later
This commit was SVN r17891.
2008-03-20 13:25:35 +00:00
Jeff Squyres
e0fb3957cb Patch from Brian:
* The opal_sys_timer_get_cycles() call was implemented for
   Sparc v9 using inline assembly, but not in the assembly files.
   This would only currently matter on Linux Sparc systems using
   a compiler that didn't support inline assembly (not many of
   those), but it should be there for completion.
 * The linux timer component would always build on non-Alpha
   platforms, rather than only building on platforms where
   opal_sys_timer_get_cycles() was implemented.  This would
   only matter on a very narrow set of platforms that we don't
   really support, but still, it could be more right.  We now
   only build the component on platforms where we have the
   assembly call to get the cycle counter.
 * Added a comment to opal/sys/timer.h to note that the linux
   timer component needed to be updated if another platform was
   added.

This should be harmless to commit.  It will only really change
behaviors on platforms we don't have assembly support for, which
currently won't make it through configure.  It really only matters
when (if?) we support atomic operations through libatomic_ops.

This commit was SVN r17887.
2008-03-20 00:29:36 +00:00
Ralph Castain
67a2cc8a8e Fix a bug noted by Tim P where we would report the incorrect app_context as "not found". If you gave us the command line:
mpirun -n 1 hostname : -n 1 bogus

we would erroneously report that hostname had not been found instead of bogus.

This commit was SVN r17886.
2008-03-19 21:13:13 +00:00
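
To make the bug in the commit above concrete, here is a minimal Python sketch of splitting an mpirun-style command line into app contexts and reporting the one whose executable is missing. All function names are hypothetical, the parser assumes only `-n <count>` options precede the program name, and the `exists` lookup is injected so the example does not depend on the local PATH. It is not how mpirun itself parses arguments.

```python
def split_app_contexts(argv):
    """Split an mpirun-style argument list on ':' into per-app argument lists."""
    contexts, current = [], []
    for arg in argv:
        if arg == ":":
            contexts.append(current)
            current = []
        else:
            current.append(arg)
    contexts.append(current)
    return contexts


def executable_of(context):
    """Return the program name, assuming only '-n <count>' options precede it."""
    i = 0
    while i < len(context) and context[i] == "-n":
        i += 2
    return context[i]


def first_missing(argv, exists):
    """Return (index, name) of the first app whose executable is not found."""
    for index, context in enumerate(split_app_contexts(argv)):
        name = executable_of(context)
        if not exists(name):
            return index, name
    return None


# Pretend only 'hostname' is installed.
known = {"hostname"}
missing = first_missing("-n 1 hostname : -n 1 bogus".split(), known.__contains__)
```

Here `missing` identifies the second app context (`bogus`), which is the one the error message should name.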
Tim Mattox
715b05d663 Update the NEWS for 1.2.6.
This commit was SVN r17885.
2008-03-19 21:04:54 +00:00
Ralph Castain
ec64bf3da8 Clarify the error output so we can understand if it was a daemon or process that lost its lifeline
This commit was SVN r17880.
2008-03-19 19:06:52 +00:00
Ralph Castain
2ed0e60321 Bring some sanity to the exit code returned by mpirun. Ensure that we provide a non-zero code if something goes wrong, including someone exiting after calling mpi_init without calling mpi_finalize.
Jeff is preparing an (undoubtedly lengthy) explanation/matrix of how these codes are determined for the OMPI FAQ.

This commit was SVN r17879.
2008-03-19 19:00:51 +00:00
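
The commit above does not spell out the exact rules, so the following Python sketch is only one plausible policy, written to illustrate the stated goal of returning a non-zero code whenever something goes wrong. The function name, the record fields, the "first non-zero code wins" rule, and the value `1` for a missing finalize are all assumptions.

```python
def job_exit_status(procs):
    """Derive a job-level exit status from per-process records.

    Each record is a dict with 'exit_code', 'called_init' and
    'called_finalize' keys (a hypothetical layout for this sketch).
    """
    # Assumed rule: the first non-zero process exit code is reported.
    for proc in procs:
        if proc["exit_code"] != 0:
            return proc["exit_code"]
    # A process that initialized but never finalized counts as a failure,
    # even if it exited with status 0. The value 1 is an assumption.
    for proc in procs:
        if proc["called_init"] and not proc["called_finalize"]:
            return 1
    return 0


clean = [{"exit_code": 0, "called_init": True, "called_finalize": True}]
no_finalize = [{"exit_code": 0, "called_init": True, "called_finalize": False}]
failed = clean + [{"exit_code": 3, "called_init": True, "called_finalize": True}]
```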
Galen Shipman
80ac7c87cd don't forget command file..
This commit was SVN r17878.
2008-03-19 16:24:29 +00:00
Galen Shipman
77c8532cc9 do things in a less hacky way..
This commit was SVN r17877.
2008-03-19 16:23:56 +00:00
Lenny Verkhovsky
13ff2a0f34 local declaration instead of using global variable
This commit was SVN r17876.
2008-03-19 13:04:40 +00:00
Jeff Squyres
4314609a00 * Remove a meaningless clause (it could never be true)
* Fix an error message to correctly display if we were before
   MPI_INIT or after MPI_FINALIZE (refs trac:1243)

This commit was SVN r17873.

The following Trac tickets were found above:
  Ticket 1243 --> https://svn.open-mpi.org/trac/ompi/ticket/1243
2008-03-18 22:26:43 +00:00
George Bosilca
efa89bfa3f Revert r17857. The context should be set in one case ... when we call prepare_{src|dst}
without calling a get or put. So, just keep it here until a better solution is
found.

This commit was SVN r17872.

The following SVN revision numbers were found above:
  r17857 --> open-mpi/ompi@d460ccfbf9
2008-03-18 19:01:27 +00:00
Ralph Castain
f39ce707b5 Remove an ORTE debug flag from an MPI function
This commit was SVN r17871.
2008-03-18 18:25:45 +00:00
Jeff Squyres
3a5084da94 Add config.h.in to svn:ignore
This commit was SVN r17868.
2008-03-18 17:10:27 +00:00
Jeff Squyres
a9028d21dd This file is generated; it should not be in SVN.
This commit was SVN r17867.
2008-03-18 16:46:53 +00:00
Jeff Squyres
ac2e329353 Oops! That should not have been removed...
This commit was SVN r17865.
2008-03-18 14:42:30 +00:00
Jeff Squyres
bd92720d41 More fixes to make it compile and play nice on OS X. Still more fixes
are required; sending mail to devel shortly...

This commit was SVN r17864.
2008-03-18 14:38:52 +00:00
Ralph Castain
32a82349df More fixes to cleanup compiler warnings for rank_file code
This commit was SVN r17863.
2008-03-18 13:21:38 +00:00
Ralph Castain
8f31a62600 Fix compilation errors so this will compile, remove unused variables
This commit was SVN r17862.
2008-03-18 13:01:26 +00:00
Lenny Verkhovsky
647bce6d3e Support for new RMAPS rank mapping component
This commit was SVN r17860.
2008-03-18 09:39:07 +00:00
Lenny Verkhovsky
14c32f87d5 Added new RMAPS component for rank mapping
This commit was SVN r17859.
2008-03-18 09:33:49 +00:00
George Bosilca
8943ae0b4e Cleanup plus some typos.
This commit was SVN r17858.
2008-03-18 03:03:33 +00:00
George Bosilca
d460ccfbf9 No need to check for NULL there. The bml_btl is set correctly
on the upper level.

This commit was SVN r17857.
2008-03-18 03:02:31 +00:00
George Bosilca
3997639ec6 Hide what should be hidden, and expose the others. Plus some indentation.
This commit was SVN r17856.
2008-03-18 03:00:08 +00:00
George Bosilca
39353ebb44 Cleanup.
This commit was SVN r17855.
2008-03-18 02:56:50 +00:00
George Bosilca
76deec135e The .h file is not used anymore (it contained the descriptor cache). Update the
Makefile.am file as well.

This commit was SVN r17854.
2008-03-18 02:50:24 +00:00
George Bosilca
1d04ec4ded Correct the connection logic for TCP. Now we have not only a cleaner
connection, but a more thread safe one. Thanks to Pierre for his
help on this.

This commit was SVN r17853.
2008-03-18 02:42:16 +00:00
Jeff Squyres
61290c0e51 Remove a useless file.
This commit was SVN r17852.
2008-03-18 01:50:47 +00:00
Ralph Castain
be7d0a8a4d Fix a problem introduced by the conversion of orte_pointer_array to opal_pointer_array. We used to derive the app context's index from the returned index of the orte_pointer_array_add function - this parameter was lost in the transition to opal_pointer_array_add. As a result, we no longer knew the index of the app_context, so everything is launched with app0.
This commit was SVN r17851.
2008-03-17 23:48:10 +00:00
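
The failure mode in the commit above, losing the index that an "add" operation reports, can be shown with a small Python sketch. `PointerArray` and `AppContext` here are toy classes invented for illustration; they do not reproduce the `opal_pointer_array` or `orte_pointer_array` APIs.

```python
class PointerArray:
    """Toy growable array whose add() returns the slot index it used."""

    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)
        return len(self._items) - 1


class AppContext:
    """Toy app context that needs to know its own position in the array."""

    def __init__(self, command):
        self.command = command
        self.idx = None


apps = PointerArray()
contexts = [AppContext("hostname"), AppContext("uptime")]
for ctx in contexts:
    # Keep the returned index. Discarding it would leave every context
    # without a distinct index, so they would all look like the same app.
    ctx.idx = apps.add(ctx)
```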
Jeff Squyres
12426b64ea Per MPI-2 ballot 3, the definition of MPI::BOTTOM has changed. w00t!
Fixes trac:1175.

This commit was SVN r17850.

The following Trac tickets were found above:
  Ticket 1175 --> https://svn.open-mpi.org/trac/ompi/ticket/1175
2008-03-17 21:42:27 +00:00