Commit Graph

13585 Commits

Author SHA1 Message Date
Jeff Squyres
814a8f5e0f * Fix #1916: endian problems in iwarp wireup on big endian machines
(now works on both big and little endian machines)
 * Be a little more flexible when looking for active devices in
   btl_openib_component.c
 * Add device name and port number to lots of verbose and help
   messages
 * Add a bunch of verbose messages to give insight into what is
   occurring during all the CPC wireups

This commit was SVN r21418.
2009-06-11 17:30:30 +00:00
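A minimal sketch of the kind of byte-order handling behind an endian fix like the one above. The commit does not say which wireup fields were affected, so the `wireup_msg` structure and the `wireup_pack`/`wireup_unpack` helpers are hypothetical and are not Open MPI code. The idea is only that values exchanged between machines are converted to network byte order before sending and back to host byte order after receiving.

```c
#include <arpa/inet.h>  /* htonl, ntohl, htons, ntohs */
#include <stdint.h>
#include <string.h>

/* Hypothetical wireup message (an assumption, not an Open MPI structure). */
struct wireup_msg {
    uint32_t ipaddr;  /* IPv4 address, host byte order while in memory */
    uint16_t port;    /* port number, host byte order while in memory */
};

/* Serialize into a 6-byte buffer in network (big endian) byte order. */
void wireup_pack(const struct wireup_msg *msg, unsigned char buf[6])
{
    uint32_t addr_be = htonl(msg->ipaddr);
    uint16_t port_be = htons(msg->port);

    memcpy(buf, &addr_be, sizeof(addr_be));
    memcpy(buf + 4, &port_be, sizeof(port_be));
}

/* Deserialize from the wire format back into host byte order. */
void wireup_unpack(const unsigned char buf[6], struct wireup_msg *msg)
{
    uint32_t addr_be;
    uint16_t port_be;

    memcpy(&addr_be, buf, sizeof(addr_be));
    memcpy(&port_be, buf + 4, sizeof(port_be));
    msg->ipaddr = ntohl(addr_be);
    msg->port = ntohs(port_be);
}
```

Because both sides go through the same conversion, the bytes on the wire are identical whether the sender is big or little endian.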
Ralph Castain
4881cd0df3 Back out the prior change from the individual .h files - the problem was in the Makefile.am files, which caused make dist to fail.
This commit was SVN r21414.
2009-06-11 03:15:47 +00:00
Ralph Castain
91ab2b3e4f Specify complete path to included header files so it compiles in all environments
This commit was SVN r21412.
2009-06-11 02:46:30 +00:00
Jeff Squyres
6777f01380 Also look for /dev/ipath
This commit was SVN r21410.
2009-06-11 00:35:21 +00:00
Avneesh Pant
295c0c36bc Add new QLogic adapter to device parameter file.
This commit was SVN r21409.
2009-06-10 23:37:38 +00:00
Ralph Castain
70b8c89b44 Fix slave spawn, which was hanging because the local daemon never saw the slave job report - it doesn't do it in the normal way, and so the slave launch system itself has to "fake it".
Also complete the implementation to print out app_context objects so we see all the fields.

This commit was SVN r21408.
2009-06-10 19:01:08 +00:00
Ralph Castain
f24cefe3d2 Shift the check for adequate file descriptors to after we check if this proc is part of the job to be launched - no point in doing the check more often than absolutely required
This commit was SVN r21406.
2009-06-10 15:18:48 +00:00
Ralph Castain
87d7d693f0 Add a notifier call when the oob retries are exceeded so sys admins are aware of the problem
This commit was SVN r21405.
2009-06-10 15:17:16 +00:00
Terry Dontje
fa9f356c8c This commit fixes trac:1931 by separating out the structures and defines
used by the debugger plugins into files suffixed by _dbg.h.

This commit was SVN r21404.

The following Trac tickets were found above:
  Ticket 1931 --> https://svn.open-mpi.org/trac/ompi/ticket/1931
2009-06-10 14:57:28 +00:00
Ralph Castain
f966d9f972 Fix visibility issues with opal_graph functions.
Fix the carto test so it can compile - need to update input file so it can run

This commit was SVN r21403.
2009-06-09 15:02:57 +00:00
Ralph Castain
c3c1ab1337 Correct a comment in paffinity.h about what paffinity_get returns - it was inaccurate.
Revamp the affinity detection/set procedure in mpi_init to correctly detect when we have already been bound to processors, given the revised understanding of paffinity_get. Add a new paffinity macro to make checking for already bound a little nicer.

This commit was SVN r21402.
2009-06-09 14:33:35 +00:00
George Bosilca
f9a510fd8a There is no need for an atomic read if we are not in a threaded case.
This commit was SVN r21394.
2009-06-08 23:55:52 +00:00
George Bosilca
77a6f27d44 Update the call to orte_plm_base_create_jobid based on the new interface.
This commit was SVN r21393.
2009-06-08 23:53:53 +00:00
Ralph Castain
86d55d7ebf Fix tight loops over comm_spawn by checking to see if the system has enough child procs and file descriptors available before attempting to launch. If not, introduce a 1sec delay and then test again. This provides a chance for the orted to complete processing of proc terminations from other children, hopefully creating room for the new proc(s).
Update the loop_spawn test to remove a sleep so that it runs at max speed, letting the new code catch when we overrun ourselves and wait for room to be cleared for the next comm_spawn.

This commit was SVN r21390.
2009-06-08 18:28:26 +00:00
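The check-then-delay loop described in the commit above can be sketched as follows. This is an illustration only: `wait_for_room`, `room_check_fn`, and `room_after_n_checks` are hypothetical names, and the `max_tries` bound is an assumption added here so the sketch terminates (the commit message does not mention a retry limit).

```c
#include <unistd.h>  /* sleep */

/* Hypothetical predicate: returns nonzero when enough child procs and
 * file descriptors are available to launch. */
typedef int (*room_check_fn)(void *ctx);

/* Poll have_room() up to max_tries times, sleeping delay_sec seconds
 * between failed checks. Returns 0 once there is room, -1 otherwise. */
int wait_for_room(room_check_fn have_room, void *ctx,
                  int max_tries, unsigned int delay_sec)
{
    for (int attempt = 0; attempt < max_tries; ++attempt) {
        if (have_room(ctx)) {
            return 0;
        }
        /* No point in sleeping after the final failed attempt. */
        if (attempt + 1 < max_tries) {
            sleep(delay_sec);
        }
    }
    return -1;
}

/* Stand-in predicate for experimenting: ctx points to an int holding the
 * number of checks that will still fail before room becomes available. */
int room_after_n_checks(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        --*remaining;
        return 0;
    }
    return 1;
}
```

With a 1-second delay, a call such as `wait_for_room(room_after_n_checks, &n, 5, 1)` gives other work a chance to free resources between checks instead of spinning in a tight loop.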
Shiqing Fan
5a90b3068e Two type casts.
This commit was SVN r21388.
2009-06-07 12:51:46 +00:00
Ralph Castain
0a67bcb653 Minor cleanups
This commit was SVN r21387.
2009-06-06 15:44:00 +00:00
Ralph Castain
ccf6b2cb8c Implement the ability to register callbacks when specified error states occur
This commit was SVN r21386.
2009-06-06 01:15:31 +00:00
Ralph Castain
0336460b0a Continue implementation of resilient operations by supporting reuse of jobids for restarted procs. Ensure that restarted processes have valid node and local ranks, and that node rank values are passed to direct-launched processes.
This commit was SVN r21385.
2009-06-06 01:08:47 +00:00
Josh Hursey
70333b9441 Some components were still using OMPI_*_VERSION instead of OPAL_*_VERSION, so convert them over (Jeff is taking care of PLPA, so that is not included here).
This commit was SVN r21384.
2009-06-05 15:34:59 +00:00
Matthias Jurenz
fac893838f Added configure option '--with-bfd-objects' to include the object files from the static BFD library into the VT libraries
(this is strongly recommended for RPM builds of OMPI to avoid BFD incompatibilities)

This commit was SVN r21383.
2009-06-05 11:28:55 +00:00
Rainer Keller
499f0850ae - Update from yesterday's patch -- just check once and for all, getting
rid of optimization flags beforehand...

This commit was SVN r21370.
2009-06-05 01:00:59 +00:00
Rainer Keller
1b8d0cf146 - AC_TRY_RUN does not work in case of cross-compilation environments.
Tests based on preprocessor CPP output won't help either, as they
   don't spit out nice computed numerical values, but rather the
   #define itself.

This commit was SVN r21364.
2009-06-04 13:00:16 +00:00
Rainer Keller
aadf1add6c - In case of Intel compiler using FFLAGS=-ipo (included in -fast), the
produced object file does not contain external symbols -- bad for the
   configure tests to find out external symbols.

   The only way to check for Fortran naming convention is to get rid of
     -ipo and -fast in case of ifort...

   Thanks to Michel Devel for bringing this up.

This commit was SVN r21363.
2009-06-03 15:13:07 +00:00
Ralph Castain
3815bfbba6 Provide a better error message when the oob cannot send a message after exhausting retries, and then have the proc abort so the job doesn't just hang forever.
Since it could be a daemon that needs to abort, cleanup the abort sequence so the daemon can exit as cleanly as possible.

This commit was SVN r21361.
2009-06-02 23:57:12 +00:00
Ralph Castain
882b40182b Add --show-progress option to mpirun
This commit was SVN r21360.
2009-06-02 23:52:59 +00:00
Ralph Castain
30a357bd8d Provide a "progress meter" for launch that outputs progress as we are launching, especially on large jobs. Also, provide a timeout mechanism so that we cleanly abort if we don't get a response from the next daemon in a specified time.
This commit was SVN r21359.
2009-06-02 23:52:02 +00:00
Ralph Castain
4d3aa5a8a4 Once again, into the breach!
Yes, friends, our favorite PCIE BTL has resurfaced as mgmt vacillates over its existence. This is an updated version that actually mostly works, in its final stages of debugging.

Some generalization still remains to be done...

This commit was SVN r21358.
2009-06-02 22:26:36 +00:00
Jeff Squyres
b363fc2a6b Per the discussion yesterday, remove all use of MPI_Flogical -- it's
not an official MPI type, so it shouldn't be named that way.  Instead,
use ompi_fortran_logical_t.

This commit was SVN r21353.
2009-06-02 12:04:17 +00:00
Ralph Castain
303e3a1d39 Add a resilient mapping capability - currently maps by fault groups (if provided), still need to add the remapping capability for failed procs.
This commit was SVN r21350.
2009-06-02 03:23:20 +00:00
Shiqing Fan
70158790e5 First commit to support MPI Extended Interface on Windows.
This commit was SVN r21344.
2009-06-01 19:24:39 +00:00
Shiqing Fan
e46bf10efd Correctly include win32 util header.
This commit was SVN r21343.
2009-06-01 19:16:00 +00:00
Rainer Keller
b572dc3591 - As discussed, revert r21330: Fortran-configure info should
not end up in OPAL
 - Will post an updated patch for the OMPI_ALIGNMENT_ parts (within C).

This commit was SVN r21342.

The following SVN revision numbers were found above:
  r21330 --> open-mpi/ompi@95596d1814
2009-06-01 19:02:34 +00:00
Jeff Squyres
2f810a1e5e Fix a compiler warning and do some slightly-smarter unsigned int
checks.

This commit was SVN r21341.
2009-06-01 14:12:46 +00:00
Jeff Squyres
df7c387155 I'm (temporarily?) removing this entry because there's no .window file
in this directory and it's causing "make dist" to break.

Shiqing -- is there a missing file in this directory?  If so, please
add it and restore the EXTRA_DIST line I just removed.  Thanks!

This commit was SVN r21340.
2009-06-01 14:07:08 +00:00
Jeff Squyres
055b50213b A slightly better way to fix the problem noted in r21137.
This commit was SVN r21339.

The following SVN revision numbers were found above:
  r21137 --> open-mpi/ompi@f6da7d86a2
2009-06-01 13:58:37 +00:00
Josh Hursey
1fef44085b As a result of r21308, if we are building with "--disable-mpi-io", make sure that MPI_MAX_DATAREP_STRING is defined in mpi.h
MTT caught builds failing with the following error which flagged the problem:
{{{
param.cc:802: error: ‘MPI_MAX_DATAREP_STRING’ was not declared in this scope
}}}

This commit was SVN r21337.

The following SVN revision numbers were found above:
  r21308 --> open-mpi/ompi@2f9765926e
2009-06-01 12:44:53 +00:00
Ralph Castain
137104b786 Initial support for CMs - needs to be pruned as CM support develops
This commit was SVN r21335.
2009-05-30 20:57:23 +00:00
Ralph Castain
4a31c65126 Add initial support for CMs
This commit was SVN r21334.
2009-05-30 20:55:55 +00:00
Ralph Castain
a1005a716f Allow CM's to select the default errmgr component. Add support for error function callbacks
This commit was SVN r21333.
2009-05-30 20:43:42 +00:00
Ralph Castain
a95731fc68 Minor update to let apps set their own component selections if desired, while preserving slave behavior
This commit was SVN r21332.
2009-05-30 20:42:23 +00:00
Rainer Keller
95596d1814 - Move alignment and size output generated by configure-tests
into the OPAL namespace, eliminating cases like opal/util/arch.c
   testing for ompi_fortran_logical_t.
   As this is processor- and compiler-related information
   (e.g. does the compiler/architecture support REAL*16)
   this should have been on the OPAL layer.
 - Unifies f77 code using MPI_Flogical instead of opal_fortran_logical_t

 - Tested locally (Linux/x86-64) with mpich and intel testsuite
   but would like to get this weekend's MTT output


 - PLEASE NOTE: configure-internal macro-names and
   ompi_cv_ variables have not been changed, so that
   external platform (not in contrib/) files still work.

This commit was SVN r21330.
2009-05-30 15:54:29 +00:00
Ralph Castain
4e0223a638 Add the ability to directly launch procs via rsh/ssh. Collect common functions in plm/base. Create a new global param to set assume_same_shell, alias'd back to plm_rsh_assume_same_shell (not deprecated).
This commit was SVN r21328.
2009-05-30 01:10:25 +00:00
Nysal Jan
85773de539 Move 'multi-frag receive' fixes to csum PML
This commit was SVN r21323.
2009-05-29 08:07:33 +00:00
Rainer Keller
9ef87898e0 (originally, did not want to do this during business hours, but
well..)
 - As Jeff suggested, for m4 macros, don't use _ OPAL, but
   rather OPAL_ prefix
 - Set the variable before AC_SUBST, so that replacement happens
   in f77 header-file, too.

This commit was SVN r21316.
2009-05-28 20:28:43 +00:00
Jeff Squyres
07c7a7a255 Make ompi_info behave a bit better if STL decides not to play with 0
length string constructors.  Print a developer warning if we have a
too-long field (in developer builds only).

This commit was SVN r21315.
2009-05-28 16:15:06 +00:00
Jeff Squyres
590fbcff6c Shorten the labels so that we don't throw C++ exceptions for 0-length
strings (true fix for that coming after this).  Also rename them so that
they are names that MPI users will care about.

This commit was SVN r21314.
2009-05-28 15:51:46 +00:00
Jeff Squyres
97fa83d24f Fix a compiler warning: string::size_type is unsigned, so checking for
>=0 is meaningless.

This commit was SVN r21313.
2009-05-28 15:44:11 +00:00
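The warning fixed above comes from C++ (`string::size_type`), but the same pitfall exists in C with `size_t`. The sketch below is an analogy under that assumption, not the code touched by the commit: `find_char`, `contains_char`, and `NOT_FOUND` are hypothetical names, with `NOT_FOUND` playing the role that `std::string::npos` plays in C++.

```c
#include <stddef.h>  /* size_t */

/* Sentinel for "no match": the largest value size_t can hold. */
#define NOT_FOUND ((size_t)-1)

/* Return the index of the first occurrence of c in s, or NOT_FOUND. */
size_t find_char(const char *s, char c)
{
    for (size_t i = 0; s[i] != '\0'; ++i) {
        if (s[i] == c) {
            return i;
        }
    }
    return NOT_FOUND;
}

/* A check written as "find_char(s, c) >= 0" would always be true,
 * because an unsigned value can never be negative. The meaningful
 * test compares against the sentinel instead. */
int contains_char(const char *s, char c)
{
    return find_char(s, c) != NOT_FOUND;
}
```

Compilers flag the `>= 0` form precisely because it can never be false, so the comparison silently accepts the "not found" case.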
Jeff Squyres
f960f2d944 Fix compiler warning
This commit was SVN r21312.
2009-05-28 13:34:48 +00:00
Jeff Squyres
5ea1b776f7 Remove a compiler warning about an empty format string. The proper
way to have no abort message is to pass NULL (the errmanager is smart
enough to handle this case and not emit any extra message).

This commit was SVN r21311.
2009-05-28 13:32:37 +00:00
Jeff Squyres
1834fc4ac6 Nysal noticed some repeated header files; removed.
This commit was SVN r21310.
2009-05-28 12:05:42 +00:00