Abhishek Kulkarni
63845591a0
This fix adds a missing topic (cant-open-logfile) to help-orte-top.txt
and changes another incorrectly titled topic from pid-required
to no-contact-given.
This commit was SVN r21431.
2009-06-13 23:05:44 +00:00
Ralph Castain
b087336c6c
Refinements to platform file
This commit was SVN r21429.
2009-06-12 19:48:30 +00:00
Ralph Castain
c0c56e30c9
Add a missing function to the resilient mapper so it defines daemons in case they are needed
This commit was SVN r21428.
2009-06-12 19:48:13 +00:00
Ralph Castain
44bb265a52
Add a new MPI_Info key to preposition OMPI libraries - implementation underway, but this just defines and passes the new key
This commit was SVN r21425.
2009-06-12 17:53:13 +00:00
Ralph Castain
170327e575
Reorg the rmaps components to collect shared code for byslot and bynode mapping in the base so we quit duplicating it in every mapper
This commit was SVN r21424.
2009-06-12 17:52:17 +00:00
Ralph Castain
1ee4acb247
Clean up how we handle pointer arrays in the odls base fns to avoid potential segfaults
This commit was SVN r21423.
2009-06-12 17:51:23 +00:00
Ralph Castain
89b7e20a8b
Add some Cisco-related platform files
This commit was SVN r21422.
2009-06-12 17:50:14 +00:00
Jeff Squyres
814a8f5e0f
* Fix #1916 : endian problems in iwarp wireup on big endian machines
(now works on both big and little endian machines)
* Be a little more flexible when looking for active devices in
btl_openib_component.c
* Add device name and port number to lots of verbose and help
messages
* Add a bunch of verbose messages to give insight into what is
occurring during all the CPC wireups
This commit was SVN r21418.
2009-06-11 17:30:30 +00:00
Ralph Castain
4881cd0df3
Revert the prior change out from the individual .h files - the problem was in the Makefile.am's, causing the make dist to fail.
This commit was SVN r21414.
2009-06-11 03:15:47 +00:00
Ralph Castain
91ab2b3e4f
Specify complete path to included header files so it compiles in all environments
This commit was SVN r21412.
2009-06-11 02:46:30 +00:00
Jeff Squyres
6777f01380
Also look for /dev/ipath
This commit was SVN r21410.
2009-06-11 00:35:21 +00:00
Avneesh Pant
295c0c36bc
Add new QLogic adapter to device parameter file.
This commit was SVN r21409.
2009-06-10 23:37:38 +00:00
Ralph Castain
70b8c89b44
Fix slave spawn, which was hanging because the local daemon never saw the slave job report - it doesn't do it in the normal way, and so the slave launch system itself has to "fake it".
Also complete implementation to print out app_context objects so we see all the fields.
This commit was SVN r21408.
2009-06-10 19:01:08 +00:00
Ralph Castain
f24cefe3d2
Shift the check for adequate file descriptors to after we check if this proc is part of the job to be launched - no point in doing the check more often than absolutely required
This commit was SVN r21406.
2009-06-10 15:18:48 +00:00
Ralph Castain
87d7d693f0
Add a notifier call when the oob retries are exceeded so sys admins are aware of the problem
This commit was SVN r21405.
2009-06-10 15:17:16 +00:00
Terry Dontje
fa9f356c8c
This commit fixes trac:1931 by separating out structures and defines used by
the debugger plugins into files suffixed by _dbg.h.
This commit was SVN r21404.
The following Trac tickets were found above:
Ticket 1931 --> https://svn.open-mpi.org/trac/ompi/ticket/1931
2009-06-10 14:57:28 +00:00
Ralph Castain
f966d9f972
Fix visibility issues with opal_graph functions.
Fix the carto test so it can compile - need to update input file so it can run
This commit was SVN r21403.
2009-06-09 15:02:57 +00:00
Ralph Castain
c3c1ab1337
Correct a comment in paffinity.h about what paffinity_get returns - it was inaccurate.
Revamp the affinity detection/set procedure in mpi_init to correctly detect when we have already been bound to processors, given the revised understanding of paffinity_get. Add a new paffinity macro to make checking for already bound a little nicer.
This commit was SVN r21402.
2009-06-09 14:33:35 +00:00
George Bosilca
f9a510fd8a
There is no need for an atomic read if we are not in a threaded case.
This commit was SVN r21394.
2009-06-08 23:55:52 +00:00
George Bosilca
77a6f27d44
Update the call to orte_plm_base_create_jobid based on the new interface.
This commit was SVN r21393.
2009-06-08 23:53:53 +00:00
Ralph Castain
86d55d7ebf
Fix tight loops over comm_spawn by checking to see if the system has enough child procs and file descriptors available before attempting to launch. If not, introduce a 1-second delay and then test again. This provides a chance for the orted to complete processing of proc terminations from other children, hopefully creating room for the new proc(s).
Update the loop_spawn test to remove a sleep so that it runs at max speed, letting the new code catch when we overrun ourselves and wait for room to be cleared for the next comm_spawn.
This commit was SVN r21390.
2009-06-08 18:28:26 +00:00
Shiqing Fan
5a90b3068e
Two type casts.
This commit was SVN r21388.
2009-06-07 12:51:46 +00:00
Ralph Castain
0a67bcb653
Minor cleanups
This commit was SVN r21387.
2009-06-06 15:44:00 +00:00
Ralph Castain
ccf6b2cb8c
Implement the ability to register callbacks when specified error states occur
This commit was SVN r21386.
2009-06-06 01:15:31 +00:00
Ralph Castain
0336460b0a
Continue implementation of resilient operations by supporting reuse of jobids for restarted procs. Ensure that restarted processes have valid node and local ranks, and that node rank values are passed to direct-launched processes.
This commit was SVN r21385.
2009-06-06 01:08:47 +00:00
Josh Hursey
70333b9441
Some components were still using OMPI_*_VERSION instead of OPAL_*_VERSION, so convert them over (Jeff is taking care of PLPA, so that is not included here).
This commit was SVN r21384.
2009-06-05 15:34:59 +00:00
Matthias Jurenz
fac893838f
Added configure option '--with-bfd-objects' to include the object files from the static BFD library into the VT libraries
(this is strongly recommended for RPM builds of OMPI to avoid BFD incompatibilities)
This commit was SVN r21383.
2009-06-05 11:28:55 +00:00
Rainer Keller
499f0850ae
- Update from yesterday's patch -- just check once and for all, getting
rid of optimization flags beforehand...
This commit was SVN r21370.
2009-06-05 01:00:59 +00:00
Rainer Keller
1b8d0cf146
- AC_TRY_RUN does not work in case of cross-compilation environments.
Tests based on preprocessor (CPP) output won't help either, as they
don't spit out nice computed numerical values, but rather the
#define itself.
This commit was SVN r21364.
2009-06-04 13:00:16 +00:00
Rainer Keller
aadf1add6c
- In case of Intel compiler using FFLAGS=-ipo (included in -fast), the
produced object file does not contain external symbols -- bad for the
configure tests to find out external symbols.
The only way to check for Fortran naming convention is to get rid of
-ipo and -fast in case of ifort...
Thanks to Michel Devel for bringing this up.
This commit was SVN r21363.
2009-06-03 15:13:07 +00:00
Ralph Castain
3815bfbba6
Provide a better error message when the oob cannot send a message after exhausting retries, and then have the proc abort so the job doesn't just hang forever.
Since it could be a daemon that needs to abort, cleanup the abort sequence so the daemon can exit as cleanly as possible.
This commit was SVN r21361.
2009-06-02 23:57:12 +00:00
Ralph Castain
882b40182b
Add --show-progress option to mpirun
This commit was SVN r21360.
2009-06-02 23:52:59 +00:00
Ralph Castain
30a357bd8d
Provide a "progress meter" for launch that outputs progress as we are launching, especially on large jobs. Also, provide a timeout mechanism so that we cleanly abort if we don't get a response from the next daemon in a specified time.
This commit was SVN r21359.
2009-06-02 23:52:02 +00:00
Ralph Castain
4d3aa5a8a4
Once again, into the breach!
Yes, friends, our favorite PCIE BTL has resurfaced as mgmt vacillates over its existence. This is an updated version that actually mostly works, in its final stages of debugging.
Some generalization still remains to be done...
This commit was SVN r21358.
2009-06-02 22:26:36 +00:00
Jeff Squyres
b363fc2a6b
Per the discussion yesterday, remove all use of MPI_Flogical -- it's
not an official MPI type, so it shouldn't be named that way. Instead,
use ompi_fortran_logical_t.
This commit was SVN r21353.
2009-06-02 12:04:17 +00:00
Ralph Castain
303e3a1d39
Add a resilient mapping capability - currently maps by fault groups (if provided), still need to add the remapping capability for failed procs.
This commit was SVN r21350.
2009-06-02 03:23:20 +00:00
Shiqing Fan
70158790e5
First commit to support MPI Extended Interface on Windows.
This commit was SVN r21344.
2009-06-01 19:24:39 +00:00
Shiqing Fan
e46bf10efd
Correctly include win32 util header.
This commit was SVN r21343.
2009-06-01 19:16:00 +00:00
Rainer Keller
b572dc3591
- As discussed, revert r21330; Fortran-configure info should
not end up in OPAL
- Will post an updated patch for the OMPI_ALIGNMENT_ parts (within C).
This commit was SVN r21342.
The following SVN revision numbers were found above:
r21330 --> open-mpi/ompi@95596d1814
2009-06-01 19:02:34 +00:00
Jeff Squyres
2f810a1e5e
Fix a compiler warning and do some slightly-smarter unsigned int
checks.
This commit was SVN r21341.
2009-06-01 14:12:46 +00:00
Jeff Squyres
df7c387155
I'm (temporarily?) removing this entry because there's no .window file
in this directory and it's causing "make dist" to break.
Shiqing -- is there a missing file in this directory? If so, please
add it and restore the EXTRA_DIST line I just removed. Thanks!
This commit was SVN r21340.
2009-06-01 14:07:08 +00:00
Jeff Squyres
055b50213b
A slightly better way to fix the problem noted in r21137.
This commit was SVN r21339.
The following SVN revision numbers were found above:
r21137 --> open-mpi/ompi@f6da7d86a2
2009-06-01 13:58:37 +00:00
Josh Hursey
1fef44085b
As a result of r21308, if we are building with "--disable-mpi-io", make sure that MPI_MAX_DATAREP_STRING is defined in mpi.h
MTT caught builds failing with the following error which flagged the problem:
{{{
param.cc:802: error: ‘MPI_MAX_DATAREP_STRING’ was not declared in this scope
}}}
This commit was SVN r21337.
The following SVN revision numbers were found above:
r21308 --> open-mpi/ompi@2f9765926e
2009-06-01 12:44:53 +00:00
Ralph Castain
137104b786
Initial support for CMs - needs to be pruned as CM support develops
This commit was SVN r21335.
2009-05-30 20:57:23 +00:00
Ralph Castain
4a31c65126
Add initial support for CMs
This commit was SVN r21334.
2009-05-30 20:55:55 +00:00
Ralph Castain
a1005a716f
Allow CMs to select the default errmgr component. Add support for error function callbacks
This commit was SVN r21333.
2009-05-30 20:43:42 +00:00
Ralph Castain
a95731fc68
Minor update to let apps set their own component selections if desired, while preserving slave behavior
This commit was SVN r21332.
2009-05-30 20:42:23 +00:00
Rainer Keller
95596d1814
- Move alignment and size output generated by configure-tests
into the OPAL namespace, eliminating cases like opal/util/arch.c
testing for ompi_fortran_logical_t.
As this is processor- and compiler-related information
(e.g. does the compiler/architecture support REAL*16)
this should have been on the OPAL layer.
- Unifies f77 code using MPI_Flogical instead of opal_fortran_logical_t
- Tested locally (Linux/x86-64) with mpich and intel testsuite
but would like to get this weekend's MTT output
- PLEASE NOTE: configure-internal macro-names and
ompi_cv_ variables have not been changed, so that
external platform (not in contrib/) files still work.
This commit was SVN r21330.
2009-05-30 15:54:29 +00:00
Ralph Castain
4e0223a638
Add the ability to directly launch procs via rsh/ssh. Collect common functions in plm/base. Create a new global param to set assume_same_shell, alias'd back to plm_rsh_assume_same_shell (not deprecated).
This commit was SVN r21328.
2009-05-30 01:10:25 +00:00
Nysal Jan
85773de539
Move 'multi-frag receive' fixes to csum PML
This commit was SVN r21323.
2009-05-29 08:07:33 +00:00