
5338 commits

Author SHA1 Message Date
Ralph Castain
6b59614bf1 Overlooked vprotocol framework
This commit was SVN r23917.
2010-10-22 03:21:10 +00:00
Jeff Squyres
26d28ea2a4 Change the ordering -- gotta initialize the cset before we can examine it!
This commit was SVN r23916.
2010-10-21 15:17:54 +00:00
Jeff Squyres
5b0ffb7d1e Update some m4 usage as suggested by Eric Blake.
This commit was SVN r23915.
2010-10-19 22:46:06 +00:00
Jeff Squyres
4322a78e60 Update wrapper compiler scripts to search for perl during configure, per request from BSD maintainers.
This commit was SVN r23914.
2010-10-19 22:45:54 +00:00
Rolf vandeVaart
70fe48698c Change some of the bfo code to be more like the
ob1 code.  Create some new macros and functions to handle
some differences.

This commit was SVN r23913.
2010-10-19 17:46:51 +00:00
Rolf vandeVaart
364fcd8975 Should not overwrite des_context. Leftover from debugging.
This commit was SVN r23912.
2010-10-19 12:40:30 +00:00
Jeff Squyres
5e88527117 Typo corrections from Jed Brown.
This commit was SVN r23911.
2010-10-18 17:59:00 +00:00
Jeff Squyres
2dc1452e5b Remove blank line that confuses the man-page-to-html software.
This commit was SVN r23910.
2010-10-18 13:28:00 +00:00
Jeff Squyres
3ee3b2ef4b Fix typos found by Jeremiah Willcock.
This commit was SVN r23909.
2010-10-18 13:14:15 +00:00
Matthias Jurenz
a23afc1307 fixed typo
This commit was SVN r23905.
2010-10-18 12:11:32 +00:00
Matthias Jurenz
9ce76256f9 - fixed compile error on Red Hat 5.x which occurred if using GNU compiler with -D_FORTIFY_SOURCE=2
(wrap __fprintf_chk() instead of fprintf())
- incremented VT version number in docu

This commit was SVN r23898.
2010-10-18 10:59:00 +00:00
Brian Barrett
9febaa475e * Add shell of functionality required for supporting Portals4
* Update places where orte-free builds have failed

This commit was SVN r23891.
2010-10-14 22:49:09 +00:00
Jeff Squyres
cc78a714ea These macros do not appear to be used anywhere.
This commit was SVN r23888.
2010-10-14 14:52:07 +00:00
Rolf vandeVaart
24e5e38dce Remove a variable that is not needed. Just piggy
back on a pointer value.

This commit was SVN r23887.
2010-10-13 22:01:23 +00:00
Rolf vandeVaart
20c5e6e0d6 Fix a few more cases where we are using a function
as an argument to a macro which could result in it
being called twice.  I did not observe any issues,
but it should be fixed.  Also did some minor refactoring
for clarity and following code convention.

This commit was SVN r23886.
2010-10-12 20:11:48 +00:00
Jeff Squyres
0b8691e950 Remove clauses that make no sense.
This commit was SVN r23885.
2010-10-12 18:58:57 +00:00
Rolf vandeVaart
44d7006f34 Just some more refactoring and cleanup of bfo PML.
This commit was SVN r23884.
2010-10-12 13:34:35 +00:00
Rolf vandeVaart
e9a7fea42d Fix up some of the failover code in the openib BTL.
Need to use MCA_BTL_IB_FAILED state to signal failure,
not MCA_BTL_IB_CLOSED.

This commit was SVN r23883.
2010-10-11 17:38:27 +00:00
Jeff Squyres
eb117c65ec Fix copyright macro
This commit was SVN r23878.
2010-10-08 18:01:14 +00:00
Jeff Squyres
c891ed34e2 More verbatim escaping.
This commit was SVN r23873.
2010-10-07 22:26:51 +00:00
Jeff Squyres
1f8d14aea0 More verbatim escaping
This commit was SVN r23872.
2010-10-07 22:24:03 +00:00
Jeff Squyres
b7d48fce0c Properly terminate verbatim nroff sequences (so that webified man
pages are rendered properly!).

This commit was SVN r23865.
2010-10-07 21:13:11 +00:00
Jeff Squyres
21a5f855e5 Fix more verbatim mistakes
This commit was SVN r23864.
2010-10-07 21:04:27 +00:00
Jeff Squyres
2db4c2617e Remove erroneous .nf
This commit was SVN r23863.
2010-10-07 20:59:37 +00:00
Jeff Squyres
d30d66c8b7 Silence compiler warning.
This commit was SVN r23859.
2010-10-07 13:42:52 +00:00
Mike Dubman
f9bebe53f9 - fix fca support for MPI_IN_PLACE in allgather and allgatherv collectives
This commit was SVN r23841.
2010-10-06 19:09:02 +00:00
Mike Dubman
f525245498 - support for MPI_IN_PLACE during gather ops
- fix ABI check and message

This commit was SVN r23840.
2010-10-06 16:27:45 +00:00
Josh Hursey
ee42c673fe Fix formatting in group and communicator code (- No functionality changes -)
Mostly TAB to spaces changes, though a couple style fixes were included as well.

The tab/space issue was causing problems with off-trunk branch merging.

This commit was SVN r23827.
2010-10-04 14:54:58 +00:00
Rolf vandeVaart
a91bd44463 Do not hand a function into this macro as the
function will get called twice.

This commit was SVN r23824.
2010-10-01 18:59:15 +00:00
Ralph Castain
94ccc84d85 Not sure why I chased this one down with Jeff as nobody really seems to care...
This commit was SVN r23820.
2010-09-30 18:33:08 +00:00
Rolf vandeVaart
59e3fa8ed3 Some more formatting fixes and code refactoring. All
these changes are in the bfo so this has no effect on ob1.

This commit was SVN r23815.
2010-09-29 13:46:45 +00:00
Rolf vandeVaart
f808dd2881 Cosmetic changes to fix spaces. No code change.
This commit was SVN r23803.
2010-09-27 21:01:49 +00:00
Jeff Squyres
73bcc4a36b Fix mistake that came in via the ompi-agen tree in r23764. The mistake wasn't part of the core autogen upgrade; it was an additional 'bonus' cleanup. Oops. The mistake will always create a set of directories under installdir, even if you do not --with-devel-headers. The set of directories will be empty, but still -- they should not be there at all. This commit fixes that -- the directories are not created at all if you do not --with-devel-headers
This commit was SVN r23801.

The following SVN revision numbers were found above:
  r23764 --> open-mpi/ompi@40a2bfa238
2010-09-24 22:53:28 +00:00
Rolf vandeVaart
3cc1fa45bf Fix a few more extraneous spaces. Also update csum
priority logic to match ob1.

This commit was SVN r23798.
2010-09-24 13:14:18 +00:00
Jeff Squyres
7ef20f60f3 Autoconf updates to make us compatible with AC 2.68. Thanks to Ralf W. for the patch!
This commit was SVN r23797.
2010-09-23 22:37:52 +00:00
Samuel Gutierrez
90a132b0a2 disable system v shared memory support when checkpoint/restart is enabled. this combo could presumably work properly someday.
This commit was SVN r23792.
2010-09-22 22:05:07 +00:00
Steve Wise
9862132836 Add T4 device IDs to openib btl params ini file.
This commit was SVN r23791.
2010-09-22 18:16:53 +00:00
Rolf vandeVaart
0331889495 Some more spaces, tabs, include file ordering changes.
No real code changes here.  

This commit was SVN r23789.
2010-09-22 13:48:22 +00:00
Ralph Castain
3631e4e936 Revert remaining svn cruft from r23764
This commit was SVN r23786.

The following SVN revision numbers were found above:
  r23764 --> open-mpi/ompi@40a2bfa238
2010-09-22 01:11:40 +00:00
Shiqing Fan
a4c2ed7a87 Fix a few things for Windows build - type cast, modified variable names and unresolved symbols.
This commit was SVN r23783.
2010-09-21 09:40:26 +00:00
Matthias Jurenz
8e8c407616 revert r23764 in ompi/contrib/vt/vt
This commit was SVN r23782.

The following SVN revision numbers were found above:
  r23764 --> open-mpi/ompi@40a2bfa238
2010-09-21 07:09:24 +00:00
Samuel Gutierrez
1c8f3e1add fix common sm segf when used with cr - thanks to Ananda for finding this issue.
This commit was SVN r23781.
2010-09-20 22:20:43 +00:00
Rolf vandeVaart
77560269f2 More fixes of spaces, tabs, and ordering of include files
to make the 3 PMLs the same where they are the same.  No
real code changes.

This commit was SVN r23779.
2010-09-20 21:22:33 +00:00
Mike Dubman
58aa7fd161 enabling *gather*
This commit was SVN r23773.
2010-09-20 06:29:54 +00:00
Mike Dubman
f754bde8eb fixing r23764 leftovers, adopting Jeff's note
This commit was SVN r23772.

The following SVN revision numbers were found above:
  r23764 --> open-mpi/ompi@40a2bfa238
2010-09-20 06:27:43 +00:00
Mike Dubman
bd9a1f28a3 revert r23764 in ompi/mca/coll/fca
This commit was SVN r23771.

The following SVN revision numbers were found above:
  r23764 --> open-mpi/ompi@40a2bfa238
2010-09-20 06:06:45 +00:00
Jeff Squyres
099816e59e Somehow this file got missed.
This commit was SVN r23766.
2010-09-18 04:37:37 +00:00
Ralph Castain
40a2bfa238 WARNING: Work on the temp branch being merged here encountered problems with bugs in subversion. Considerable effort has gone into validating the branch. However, not all conditions can be checked, so users are cautioned that it may be advisable to not update from the trunk for a few days to allow MTT to identify platform-specific issues.
This merges the branch containing the revamped build system based around converting autogen from a bash script to a Perl program. Jeff has provided emails explaining the features contained in the change.

Please note that configure requirements on components HAVE CHANGED. For example, a configure.params file is no longer required in each component directory. See Jeff's emails for an explanation.

This commit was SVN r23764.
2010-09-17 23:04:06 +00:00
Rolf vandeVaart
91c1ee86d7 Fix for fix of fix for handling misalignment when sending
onesided multifrag.

This fixes trac:2532.

This commit was SVN r23760.

The following Trac tickets were found above:
  Ticket 2532 --> https://svn.open-mpi.org/trac/ompi/ticket/2532
2010-09-16 18:58:11 +00:00
Rolf vandeVaart
65e8277add Mostly fixes for tabs, spaces and indentations.
Also, some other changes to bring the csum PML up
to date with changes that happened in ob1 over the
last two years. This includes a few bug
fixes and some minor refactoring.  

This commit was SVN r23757.
2010-09-15 18:48:06 +00:00
Matthias Jurenz
33e29247b6 Do not use default options from config/defaults if configuring inside Open MPI
This commit was SVN r23753.
2010-09-14 06:51:14 +00:00
Rolf vandeVaart
31a168695e Some more cleanup of extraneous spaces and tabs. Also
some changes to script to run diffs between PMLs.

This commit was SVN r23749.
2010-09-13 14:58:00 +00:00
Matthias Jurenz
68e6745250 Renamed macros
   omp_get_thread_num -> MY_THREAD
   omp_get_num_threads -> THREAD_NUM
to avoid conflicts with the 'omp.h' of the PGI compiler version 10.0.x

This commit was SVN r23748.
2010-09-13 08:48:45 +00:00
Ralph Castain
e96b5f486f Reorganize the opal interface code in opal/util/if.c per prior emails and telecon discussions. Move the interface discovery code into a framework so that configuration logic can separate it out (instead of the prior #if-#else confusion).
All interface APIs for accessing the info remain unchanged in opal/util/if.c.

This has been tested on Mac, Linux, and NetBSD. Nobody else seemed interested in testing it, so there may be some future problems revealed as people try it on other OSs.

This commit was SVN r23743.
2010-09-13 01:58:51 +00:00
Jeff Squyres
ea13687547 Minor change to VT: use the "foreign" keyword with AM_INIT_AUTOMAKE so
that Automake doesn't assume that these are GNU standard packages
that conform to the GNU coding standards.  Without this keyword,
Automake can replace the VT INSTALL files with the GNU standard
INSTALL text file.

This change is ''necessary'' for the autogen improvements that are
coming shortly (see
http://www.open-mpi.org/community/lists/devel/2010/09/8478.php); I'm
committing this change ahead of time so that I can pass it upstream to
the VT developers.

This commit was SVN r23740.
2010-09-12 08:44:41 +00:00
Rolf vandeVaart
3bb587937a Just fix up some trailing spaces, tabs instead of spaces,
missing periods on copyrights, extraneous spaces on blank
lines.  No actual code change.

This commit was SVN r23739.
2010-09-10 21:01:52 +00:00
George Bosilca
8e9d9e136d Update the GM bandwidth.
This commit was SVN r23734.
2010-09-08 21:50:56 +00:00
Rolf vandeVaart
c8d6672453 Set default udapl bandwidth to more realistic value.
This commit was SVN r23728.
2010-09-08 14:38:16 +00:00
Matthias Jurenz
a0c061a7ec Temporarily disabled OpenMP support completely if the PGI compiler is used, to work around some unresolved compile errors
This commit was SVN r23719.
2010-09-06 11:57:50 +00:00
Mike Dubman
104d57f69a * Support allgatherv, convert displs and rcounts arrays to bytes.
* change comm_init API - no need to pass local rank groups, fca calculates that on its own.
* remove local rank list from module - libfca maintains that now.
* in fca_bcast and fca_reduce - pass root rank index and let libfca figure out the local rank index.

This commit was SVN r23716.
2010-09-05 09:49:59 +00:00
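The conversion of the displs and rcounts arrays to bytes mentioned in the commit above can be sketched as follows. This is a minimal illustration, assuming the counts and displacements are expressed in elements of a datatype with a known extent in bytes; the helper name `to_bytes` and the sample values are hypothetical and are not taken from the FCA component.

```python
def to_bytes(rcounts, displs, extent):
    """Scale per-rank element counts and displacements by the datatype extent (in bytes)."""
    byte_counts = [count * extent for count in rcounts]
    byte_displs = [displ * extent for displ in displs]
    return byte_counts, byte_displs

# Hypothetical example: three ranks contributing 2, 3 and 1 elements of an 8-byte type.
byte_counts, byte_displs = to_bytes([2, 3, 1], [0, 2, 5], extent=8)
```

Both arrays are scaled by the same extent, so the relative layout of each rank's contribution in the receive buffer is unchanged.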
Nadia Derbey
e265dc51e5 Added Bull vendor id for ConnectX card
This commit was SVN r23715.
2010-09-03 14:13:19 +00:00
Jeff Squyres
b9ac24eadd Based on
http://www.open-mpi.org/community/lists/devel/2010/09/8455.php, revert
this patch.  George, Brice, and Scott can decide what they want to do
here.  

This commit was SVN r23714.
2010-09-03 13:48:36 +00:00
Abhishek Kulkarni
c3a653ebb3 Fix MPI segfaults during MPI_Init() with the MX BTL and MTL.
Thanks to Scott Atchley for the patch.

This commit was SVN r23713.
2010-09-03 12:38:14 +00:00
Jeff Squyres
2b2b29a6d4 For some reason, the MX btl sets btl_bandwidth in megabits/s instead
of megabytes/s. So we get crazy btl_weights in case of heterogeneous
multirail. And --mca btl_mx_bandwidth <width> cannot work around the
problem (it probably doesn't help because it's overridden by the
runtime link width detection anyway?).

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>

This commit was SVN r23712.
2010-09-03 12:03:06 +00:00
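To see why a unit mix-up skews weights across heterogeneous rails, here is a minimal sketch. It assumes that each rail's weight is its share of the total reported bandwidth; that rule, the function name `rail_weights`, and the bandwidth figures are assumptions for illustration, not the actual Open MPI code.

```python
def rail_weights(bandwidths):
    """Return each rail's share of the total reported bandwidth (assumed weighting rule)."""
    total = sum(bandwidths.values())
    return {name: bw / total for name, bw in bandwidths.items()}

# Hypothetical: two rails that each really deliver 1000 megabytes/s.
consistent = rail_weights({"rail_a": 1000, "rail_b": 1000})

# Same hardware, but rail_b reports its bandwidth in megabits/s (a number 8x larger).
mixed = rail_weights({"rail_a": 1000, "rail_b": 8000})
```

With consistent units both rails get an equal share; with mixed units the rail reporting megabits/s receives eight ninths of the weight even though the hardware is identical.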
Rolf vandeVaart
14e7bcc383 Create new entries in the wrapper data files so the
administrator can specify compiler flags that get
inserted into the command before the user's flags.
These flags can be specified at configure time.
Reviewed by Jeff Squyres.

This fixes ticket #2474.

This commit was SVN r23709.
2010-09-02 10:47:55 +00:00
Mike Dubman
48274c1c77 better control for enable/disable specific coll APIs
This commit was SVN r23708.
2010-09-02 09:22:24 +00:00
Rolf vandeVaart
47940f2aa0 Fix the fix (r23649) for ticket 2532. We were neglecting to
update the remain_len field for the buffer.

This really fixes ticket #2532.

This commit was SVN r23706.

The following SVN revision numbers were found above:
  r23649 --> open-mpi/ompi@f42c2a737f
2010-09-01 14:12:08 +00:00
Mike Dubman
8ef56bf258 * drop support for FCA v1.2
* add support for FCA ABI
* add support for allgather

This commit was SVN r23705.
2010-09-01 11:29:10 +00:00
Jeff Squyres
c0685fc673 Fix problem noted by Sebastian Andrzej Siewior; we should not be using AS_VAR_GET. Per advice from Ralf, change them all to AS_VAR_IF and AS_VAR_COPY. CMR:v1.5. A separate patch has to be created for v1.4 because files have moved around.
This commit was SVN r23681.
2010-08-27 22:48:57 +00:00
Jeff Squyres
ce91a8572d Twice the code for half the price! :-)
Somehow, there's an entire 2nd (identical) copy of the sm btl
configure.m4 in here -- this commit removes the duplicate copy,
leaving only 1 copy of each relevant m4 macro.

Thanks to Ralph for spotting it!

This commit was SVN r23675.
2010-08-27 01:24:55 +00:00
Jeff Squyres
60dacba04e This stuff has been outdated for years -- might as well remove it (it
isn't included in the tarball and was only used to generate the
initial f90 scripts -- we've moved well beyond this XML by updating
the scripts without also updating the corresponding XML).

This commit was SVN r23670.
2010-08-26 11:38:38 +00:00
Ralph Castain
2e223abe33 Restore the auto-poll method for detecting debugger attachment, but only in the mpirx debugger module and only if the corresponding rate mca param is set.
Guess we missed it before, but add the debugger framework to the orte-info and ompi_info tools

This commit was SVN r23667.
2010-08-25 22:52:33 +00:00
Jeff Squyres
2c52096976 Several EXTRA_STATE parameter types were erroneously "INTEGER" (they
should be "INTEGER(kind=MPI_ADDRESS_KIND)").  This has been wrong for
''years''.  Apparently no one who uses the F90 bindings also uses MPI
attributes.  Sigh.

This commit was SVN r23664.
2010-08-25 16:46:36 +00:00
Shiqing Fan
7a1bdd2327 Get rid of a warning of "pointer of type ‘void *’ used in arithmetic" on Linux, which is also an error on Windows.
This commit was SVN r23660.
2010-08-25 08:26:11 +00:00
Ethan Mallove
f42c2a737f Fixes trac:2532 - "MPI_Put can result in SIGBUS on SPARC"
Reviewed by Rolf V and Brian B

This commit was SVN r23649.

The following Trac tickets were found above:
  Ticket 2532 --> https://svn.open-mpi.org/trac/ompi/ticket/2532
2010-08-24 18:10:43 +00:00
Samuel Gutierrez
3b572e14ce Fix build issues on Windows. Thanks to Shiqing for pointing this out.
This commit was SVN r23646.
2010-08-24 14:01:05 +00:00
Mike Dubman
fca50c4a09 comply with code-style: no c++ style comments
This commit was SVN r23645.
2010-08-24 13:42:21 +00:00
Mike Dubman
9cb2e0490b removed #if 0
This commit was SVN r23643.
2010-08-24 13:32:28 +00:00
Shiqing Fan
a987eafc90 Add another sm definition for ignoring posix sm on Windows, and exclude those source files.
This commit was SVN r23640.
2010-08-24 09:28:56 +00:00
Samuel Gutierrez
3b162593e6 New POSIX shared memory component and other common sm enhancements.
NOTE: mmap is still the default.

Some highlights:
o Silent component failover.
o The sysv component will only be queried for selection if it is placed before
  the mmap component (for example, -mca mpi_common_sm sysv,posix,mmap).  In the
  default case, sysv will never be queried/selected.
o Per some on-list discussion, now unlinking mmaped file in both mmap and posix
  components (see: "System V Shared Memory for Open MPI: Request for Community
  Input and Testing" thread).
o Assuming local process homogeneity with respect to all utilized shared
  memory facilities. That is, if one local process deems a particular shared
  memory facility acceptable, then ALL local processes should be able to
  utilize that facility. As it stands, this is an important point because one
  process dictates to all other local processes which common sm component will
  be selected based on its own, local run-time test.
o Addressed some of George's code reuse concerns.

This commit was SVN r23633.
2010-08-23 16:04:13 +00:00
Shiqing Fan
7a301bc417 Add support for ompi_ext on Windows.
This commit was SVN r23632.
2010-08-23 13:16:30 +00:00
Shiqing Fan
c110edbf44 Use exclude lists for non-ordinary sub directories check.
This commit was SVN r23631.
2010-08-23 09:43:05 +00:00
Brian Barrett
94be1e043d Fix make distcheck issue with mpiextensions framework
This commit was SVN r23625.
2010-08-18 17:18:20 +00:00
Rainer Keller
33f2b9398e - This warning is no longer supported. Using it generates
   a warning itself (when another warning is generated within the file),
   which can be rather annoying.
   Therefore check for output regarding this unrecognized warning.

This commit was SVN r23624.
2010-08-18 06:01:23 +00:00
Rainer Keller
104afe39e4 - ompi_ext.m4: For VPATH builds, create the subdirectories first
- mpiext.h: For OMPI_DECLSPEC, include the ompi_config.h

This commit was SVN r23623.
2010-08-17 22:40:22 +00:00
Brian Barrett
13c827dda8 Make trunk compile on Red Storm again
This commit was SVN r23622.
2010-08-17 21:51:38 +00:00
Brian Barrett
6ae9790d19 * Add option of init/fini hooks for MPI extensions to be called at the end of
MPI_INIT and start of MPI_FINALIZE.
* Clean up MPI Extensions build system to acknowledge that OMPI's the only
  project with extensions, as well as remove some build artifacts necessary
  for more general components.

This commit was SVN r23616.
2010-08-17 04:44:22 +00:00
Mike Dubman
a036c24253 revert fix to comply with #2534
- use op->o_name directly
- cosmetic prints

This commit was SVN r23614.
2010-08-15 11:04:34 +00:00
Jeff Squyres
e6f0422f7c r20280 introduced the op framework and changed all the back-end op
string names from MPI_<foo> to MPI_OP_<foo>.  While these names are
OMPI-internal-only (i.e., not exposed to MPI applications), this
change is a difference between the released 1.3/1.4 series.

The Voltaire FCA library uses these strings for its own internal
purposes; since the names changed between the 1.3/1.4 series and the
upcoming 1.5 series, it caused a problem for the FCA library.  They
volunteered to put in a hot fix in FCA, but it seems to me that we
shouldn't change the names to begin with -- there was no real reason
to change them to MPI_OP_<foo>.  So this commit changes them back to
MPI_<foo>. 

This commit was SVN r23600.

The following SVN revision numbers were found above:
  r20280 --> open-mpi/ompi@4d8a187450
2010-08-12 13:56:01 +00:00
Shiqing Fan
330999e36c Some fixes for C/R enhancement on Windows. Add the option and fix some type casts, just let it compile.
This commit was SVN r23599.
2010-08-12 13:31:37 +00:00
Mike Dubman
16d7169680 refactoring:
* split fca_open() into fca_register() and fca_open()

This commit was SVN r23598.
2010-08-12 12:05:23 +00:00
Mike Dubman
ba5bc9b674 fixes:
* fix up lookup of supported ops by name:
  in ompi 1.5.x the op string representations were changed from MPI_XXX to MPI_OP_XXX (relative to OMPI 1.4.x)
* keep compat between diff versions of FCA
* better error handling (return error if symbol not found)
* register to opal_progress and call fca_progress API

This commit was SVN r23597.
2010-08-12 08:15:55 +00:00
Rolf vandeVaart
5e59de9ce6 Update comment to explain why macro should only
be used for 64-bit SPARC because of performance
implications.  Also added minor optimization to
macro.

This fixes trac:2526. 

This commit was SVN r23590.

The following Trac tickets were found above:
  Ticket 2526 --> https://svn.open-mpi.org/trac/ompi/ticket/2526
2010-08-10 21:13:27 +00:00
Rainer Keller
7c85144ac6 - Hmm, these mca parameters indeed are registered twice ,-]
Thanks, Jeff!

   This should be added to CMR:v1.5:#2527

This commit was SVN r23589.
2010-08-10 21:11:59 +00:00
Josh Hursey
e12ca48cd9 A number of C/R enhancements per RFC below:
http://www.open-mpi.org/community/lists/devel/2010/07/8240.php

Documentation:
  http://osl.iu.edu/research/ft/

Major Changes: 
-------------- 
 * Added C/R-enabled Debugging support. 
   Enabled with the --enable-crdebug flag. See the following website for more information: 
   http://osl.iu.edu/research/ft/crdebug/ 
 * Added Stable Storage (SStore) framework for checkpoint storage 
   * 'central' component does a direct to central storage save 
   * 'stage' component stages checkpoints to central storage while the application continues execution. 
     * 'stage' supports offline compression of checkpoints before moving (sstore_stage_compress) 
     * 'stage' supports local caching of checkpoints to improve automatic recovery (sstore_stage_caching) 
 * Added Compression (compress) framework to support 
 * Add two new ErrMgr recovery policies 
   * {{{crmig}}} C/R Process Migration 
   * {{{autor}}} C/R Automatic Recovery 
 * Added the {{{ompi-migrate}}} command line tool to support the {{{crmig}}} ErrMgr component 
 * Added CR MPI Ext functions (enable them with {{{--enable-mpi-ext=cr}}} configure option) 
   * {{{OMPI_CR_Checkpoint}}} (Fixes trac:2342) 
   * {{{OMPI_CR_Restart}}} 
   * {{{OMPI_CR_Migrate}}} (may need some more work for mapping rules) 
   * {{{OMPI_CR_INC_register_callback}}} (Fixes trac:2192) 
   * {{{OMPI_CR_Quiesce_start}}} 
   * {{{OMPI_CR_Quiesce_checkpoint}}} 
   * {{{OMPI_CR_Quiesce_end}}} 
   * {{{OMPI_CR_self_register_checkpoint_callback}}} 
   * {{{OMPI_CR_self_register_restart_callback}}} 
   * {{{OMPI_CR_self_register_continue_callback}}} 
 * The ErrMgr predicted_fault() interface has been changed to take an opal_list_t of ErrMgr defined types. This will allow us to better support a wider range of fault prediction services in the future. 
 * Add a progress meter to: 
   * FileM rsh (filem_rsh_process_meter) 
   * SnapC full (snapc_full_progress_meter) 
   * SStore stage (sstore_stage_progress_meter) 
 * Added 2 new command line options to ompi-restart 
   * --showme : Display the full command line that would have been exec'ed. 
   * --mpirun_opts : Command line options to pass directly to mpirun. (Fixes trac:2413) 
 * Deprecated some MCA params: 
   * crs_base_snapshot_dir deprecated, use sstore_stage_local_snapshot_dir 
   * snapc_base_global_snapshot_dir deprecated, use sstore_base_global_snapshot_dir 
   * snapc_base_global_shared deprecated, use sstore_stage_global_is_shared 
   * snapc_base_store_in_place deprecated, replaced with different components of SStore 
   * snapc_base_global_snapshot_ref deprecated, use sstore_base_global_snapshot_ref 
   * snapc_base_establish_global_snapshot_dir deprecated, never well supported 
   * snapc_full_skip_filem deprecated, use sstore_stage_skip_filem 

Minor Changes: 
-------------- 
 * Fixes trac:1924 : {{{ompi-restart}}} now recognizes path prefixed checkpoint handles and does the right thing. 
 * Fixes trac:2097 : {{{ompi-info}}} should now report all available CRS components 
 * Fixes trac:2161 : Manual checkpoint movement. A user can 'mv' a checkpoint directory from the original location to another and still restart from it. 
 * Fixes trac:2208 : Honor various TMPDIR variables instead of forcing {{{/tmp}}}
 * Move {{{ompi_cr_continue_like_restart}}} to {{{orte_cr_continue_like_restart}}} to be more flexible in where this should be set. 
 * opal_crs_base_metadata_write* functions have been moved to SStore to support a wider range of metadata handling functionality. 
 * Cleanup the CRS framework and components to work with the SStore framework. 
 * Cleanup the SnapC framework and components to work with the SStore framework (cleans up these code paths considerably). 
 * Add 'quiesce' hook to CRCP for a future enhancement. 
 * We now require a BLCR version that supports {{{cr_request_file()}}} or {{{cr_request_checkpoint()}}} in order to make the code more maintainable. Note that {{{cr_request_file}}} has been deprecated since 0.7.0, so we prefer to use {{{cr_request_checkpoint()}}}. 
 * Add optional application level INC callbacks (registered through the CR MPI Ext interface). 
 * Increase the {{{opal_cr_thread_sleep_wait}}} parameter to 1000 microseconds to make the C/R thread less aggressive. 
 * {{{opal-restart}}} now looks for cache directories before falling back on stable storage when asked. 
 * {{{opal-restart}}} also supports local decompression before restarting
 * {{{orte-checkpoint}}} now uses the SStore framework to work with the metadata 
 * {{{orte-restart}}} now uses the SStore framework to work with the metadata 
 * Remove the {{{orte-restart}}} preload option. This was removed since the user only needs to select the 'stage' component in order to support this functionality. 
 * Since the '-am' parameter is saved in the metadata, {{{ompi-restart}}} no longer hard codes {{{-am ft-enable-cr}}}. 
 * Fix {{{hnp}}} ErrMgr so that if a previous component in the stack has 'fixed' the problem, then it should be skipped. 
 * Make sure to decrement the number of 'num_local_procs' in the orted when one goes away. 
 * odls now checks the SStore framework to see if it needs to load any checkpoint files before launching (to support 'stage'). This separates the SStore logic from the --preload-[binary|files] options. 
 * Add unique IDs to the named pipes established between the orted and the app in SnapC. This is to better support migration and automatic recovery activities. 
 * Improve the checks for 'already checkpointing' error path. 
 * Add a recovery output timer, to show how long it takes to restart a job
 * Do a better job of cleaning up the old session directory on restart. 
 * Add a local module to the autor and crmig ErrMgr components. These small modules prevent the 'orted' component from attempting a local recovery (Which does not work for MPI apps at the moment) 
 * Add a fix for bounding the checkpointable region between MPI_Init and MPI_Finalize. 

This commit was SVN r23587.

The following Trac tickets were found above:
  Ticket 1924 --> https://svn.open-mpi.org/trac/ompi/ticket/1924
  Ticket 2097 --> https://svn.open-mpi.org/trac/ompi/ticket/2097
  Ticket 2161 --> https://svn.open-mpi.org/trac/ompi/ticket/2161
  Ticket 2192 --> https://svn.open-mpi.org/trac/ompi/ticket/2192
  Ticket 2208 --> https://svn.open-mpi.org/trac/ompi/ticket/2208
  Ticket 2342 --> https://svn.open-mpi.org/trac/ompi/ticket/2342
  Ticket 2413 --> https://svn.open-mpi.org/trac/ompi/ticket/2413
2010-08-10 20:51:11 +00:00
Rainer Keller
9fff01704f - Add on to r23580: we do check for F90's DOUBLE COMPLEX, but do not do
so for F77. The DDT-engine is taken care of, it maps to C's dblcplx
   accordingly.

   Manually added to CMR:

This commit was SVN r23586.

The following SVN revision numbers were found above:
  r23580 --> open-mpi/ompi@16bf3c2f30
2010-08-10 20:33:50 +00:00
Jeff Squyres
16bf3c2f30 Fix an issue with ompi_info reporting the wrong sizes/alignments for
some Fortran types.  Thanks to Gus Correa and others for helping
identify this issue.

This commit was SVN r23580.
2010-08-09 16:56:32 +00:00
Rainer Keller
2ee01042c9 - Spelling fixes and line breaks in the parameter descriptions.
Please cmr:v1.5

This commit was SVN r23578.
2010-08-09 16:10:31 +00:00
Matthias Jurenz
8f940bd53b Fixed typo
This commit was SVN r23572.
2010-08-09 08:52:05 +00:00
Mike Dubman
7d1a8a154d fca:
- keep compat to fca v1.2 and fca 2.0
- fix segv
- keep compat to ompi 1.4.x

This commit was SVN r23569.
2010-08-08 13:28:41 +00:00
Matthias Jurenz
e0844f3a40 - enforced creating of event/summary files even if a process/thread doesn't produce trace data
(reworked r23550)
- append "vampirtrace" to ${datarootdir} and ${includedir} even if the options '--includedir' and '--datarootdir' are specified
  (this is meaningful for the creation of the Open MPI distribution packages)
- disable OpenMP support in otfprofile if the PGI compiler is used to work around the following errors:

	compiler version  compiler error
	< 9.0-3           PGCC-S-0000-Internal compiler error. calc_dw_tag:no tag
	(see Technical Problem Report 4337 at http://www.pgroup.com/support/release_tprs_90.htm)

	10.1 - 10.6       this kind of pragma may not be used here
	                        #pragma omp barrier

This commit was SVN r23564.

The following SVN revision numbers were found above:
  r23550 --> open-mpi/ompi@3ef374478f
2010-08-06 12:47:40 +00:00
Rolf vandeVaart
0324fdb407 Created two new macros that are used when filling in either the
status structure or the _ucount field in the status structure.
On 64-bit sparc, the macros resolve into integer array assignments.
For all others, they are just simple assignments.  This fixes 
possible BUS errors seen when running on the SPARC processor.
This bug was introduced when the _count field changed from an int
into a size_t.  See the changes to request.h for additional details.

This commit fixes trac:2514.

This commit was SVN r23554.

The following Trac tickets were found above:
  Ticket 2514 --> https://svn.open-mpi.org/trac/ompi/ticket/2514
2010-08-04 19:36:40 +00:00
Mike Dubman
2914d11793 fix datatype API
This commit was SVN r23552.
2010-08-04 14:01:54 +00:00
Matthias Jurenz
3ef374478f Do only write active process/thread ids to the OTF master control file (*.otf).
Vampir 7.2 is unable to load trace files where processes/threads are defined which didn't produce event records.

This commit was SVN r23550.
2010-08-04 07:12:39 +00:00
Ralph Castain
586f5b8bf5 Add missing includes per Greg Koenig
This commit was SVN r23546.
2010-08-03 17:30:59 +00:00
Shiqing Fan
a096cc9082 it's not a component for Windows, so get rid of the Windows support files.
This commit was SVN r23543.
2010-08-02 17:12:40 +00:00
Mike Dubman
7cbe9b43c2 initial release of Voltaire FCA (fabric collective accelerator) collective component
- compatible with FCA v1.2

This commit was SVN r23539.
2010-08-02 11:25:53 +00:00
Josh Hursey
ba7e94dd89 Some relatively minor C/R related cleanup
 * Fix a configure warning for checking --enable-ft-thread
 * In hnp and orted ErrMgr components check to see if other components have already recovered this process before trying to recover it again.
 * Fix 'npernode' for restarting using the resilient rmaps component
 * export ompi_info_set, so that internal functionality can use it.

This commit was SVN r23535.
2010-07-30 18:59:34 +00:00
Rolf vandeVaart
3d9b05ba2b Fix bug introduced by r23463. We now handle positive
error codes correctly again.  Also fix a typo.
Reviewed by Jeff Squyres. 

This commit was SVN r23531.

The following SVN revision numbers were found above:
  r23463 --> open-mpi/ompi@2af3e6e5ae
2010-07-28 19:19:27 +00:00
Jeff Squyres
c59743d7e3 Move the predefined gap test to ompi/debuggers (we already have the
dlopen_test there, so why not put the other debugger test there with
it?).

This commit was SVN r23527.
2010-07-28 16:22:10 +00:00
Jeff Squyres
49b8008986 Remove the peruse test from any possibility of being run during "make
check" (it's been deactivated for 2+ years now, anyway).  It needs to
be launched via "mpirun" and needs >= 2 processes, so it wasn't a good
candidate for "make check", anyway.

The test itself has moved to OMPI's internal testing suites.

This commit was SVN r23526.
2010-07-28 16:04:18 +00:00
Nadia Derbey
8974d5cc9e Fixed a potential memory leak in mpi_type_create_struct_f
This commit was SVN r23484.
2010-07-23 12:41:21 +00:00
Jeff Squyres
ee3b22e4b7 Oops -- use the right function name, otherwise you get compile/link
errors when you configure with --enable-heterogeneous.

This is why we have MTT.  :-)

This commit was SVN r23481.
2010-07-23 01:30:01 +00:00
Jeff Squyres
51a051b072 This commit, along with r23467, r23468, r23470, r23471 should fix #2241.
This commit:

 * Adds the configury to figure out how many Fortran INTEGERs are 
   necessary to represent the C MPI_Status (which now includes a size_t
   member).
 * Sets MPI_STATUS_SIZE to this value in mpif-config.h.in.
 * Adds a big comment in status_c2f.c explaining why no changes 
   were necessary to how we copy statuses between Fortran and C.

This commit was SVN r23472.

The following SVN revision numbers were found above:
  r23467 --> open-mpi/ompi@733d25a8a3
  r23468 --> open-mpi/ompi@963fcb13a5
  r23470 --> open-mpi/ompi@418b989781
  r23471 --> open-mpi/ompi@bc74a446ac
2010-07-22 02:23:47 +00:00
Jeff Squyres
bc74a446ac Add some comments to reinforce the fact that MPI applications should not be using the non-public members of ompi_status_public_t. Refs trac:2241.
This commit was SVN r23471.

The following Trac tickets were found above:
  Ticket 2241 --> https://svn.open-mpi.org/trac/ompi/ticket/2241
2010-07-22 01:59:33 +00:00
Jeff Squyres
418b989781 Divide by size, not status->_count. Gives a much better answer. :-)
This commit was SVN r23470.
2010-07-22 01:53:01 +00:00
Jeff Squyres
62fe827bdf s/MPI:Exception/MPI::Exception/g. Think of all the poor users who,
for years, were probably tremendously confused by this typo -- trying
to code their applications by catching MPI:Exception instances, but
failing to compile them.  "Why, cruel world, why?!"

Now we have fixed the error; all is right with the world again.

This commit was SVN r23469.
2010-07-22 01:24:12 +00:00
Jeff Squyres
963fcb13a5 If the value to be returned is larger than what can be represented in
the count parameter, then invoke MPI_ERR_TRUNCATE.

This commit was SVN r23468.
2010-07-22 01:15:46 +00:00
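Illustration for 963fcb13a5 (r23468) and 418b989781 (r23470): a minimal sketch of deriving an int count from a size_t byte total, failing when the result does not fit. `demo_get_count` and the `DEMO_*` codes are hypothetical; the sketch returns an error code where the real code invokes the MPI error handler, and it ignores the case of a byte total that is not a whole multiple of the type size.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch only. */
#define DEMO_SUCCESS 0
#define DEMO_ERR_TRUNCATE (-1)

/* Divide the received byte total by the datatype size (not by the stored
 * count), and refuse to return a value that an int cannot represent. */
static int demo_get_count(size_t received_bytes, size_t type_size, int *count)
{
    size_t n;

    assert(count != NULL);
    assert(type_size > 0);

    n = received_bytes / type_size;
    if (n > (size_t)INT_MAX) {
        return DEMO_ERR_TRUNCATE;   /* *count is left untouched */
    }
    *count = (int)n;
    return DEMO_SUCCESS;
}
```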
George Bosilca
733d25a8a3 First step toward fixing the MPI_Get_count issues from the ticket #2241. Next
step is the configure and Fortran mojo that Jeff will put in. Until then I
guess the Fortran interface is broken (at least all functions using the hidden
count field in the MPI_Status).

This commit was SVN r23467.
2010-07-21 20:07:00 +00:00
Jeff Squyres
2af3e6e5ae Minor updates:
* ompi_errcode_get_mpi_code() already checks for >0 error codes; the
   checks in OMPI_ERRHANDLER_INVOKE, OMPI_ERRHANDLER_CHECK, and
   OMPI_ERRHANDLER_RETURN were superfluous.
 * Ensure to use/return an OPAL_SOS-decoded value in
   ompi_errcode_get_mpi_code().  
 * Symbols beginning with __ technically belong in the compiler
   namespace; we shouldn't be using those.  
 * Other minor style updates in ompi_errcode_get_mpi_code().

This commit was SVN r23463.
2010-07-21 16:27:08 +00:00
Rolf vandeVaart
45019a3abf Correctly handle zero-length match fragment.
This commit was SVN r23459.
2010-07-21 15:27:06 +00:00
Jeff Squyres
3031b59cfe Change to use the new opal_fd_*() functions.
This commit was SVN r23451.
2010-07-20 19:54:17 +00:00
Jeff Squyres
35690ecad5 Fixes trac:2472. Use large integers to hold displacements for one-sided
operations, not ints. 

Sorry for the mid-day configure.ac change, folks...

This commit was SVN r23449.

The following Trac tickets were found above:
  Ticket 2472 --> https://svn.open-mpi.org/trac/ompi/ticket/2472
2010-07-20 18:45:48 +00:00
Jeff Squyres
64cb8f5d7f Another round of man page cleanups from Debian maintainer Manuel
Prinz.  Many thanks!

This commit was SVN r23445.
2010-07-20 14:07:18 +00:00
Jeff Squyres
e736281adf Add an extra pair of (), just for defensive programming.
This commit was SVN r23444.
2010-07-20 12:23:00 +00:00
Nadia Derbey
837fb29fab Wrong event_type value passed in to show_help when getting xrc async events
This commit was SVN r23442.
2010-07-20 06:37:17 +00:00
Christopher Yeoh
cfea0db3a2 removes spurious compilation warning
This commit was SVN r23441.
2010-07-20 06:32:36 +00:00
Ralph Castain
248320b91a Enable connect_accept between multiple singleton jobs without the presence of an external rendezvous agent (e.g., ompi-server). This also enables connect_accept between processes in more than two jobs regardless of how they were started.
Create an ability to store the contact info for multiple HNPs being used to route between different job families. Modify the dpm orte module to pass the resulting store during the connect_accept procedure so that all jobs involved in the resulting communicator know how to route OOB messages between them.

Add a test provided by Philippe that tests this ability.

This commit was SVN r23438.
2010-07-20 04:22:45 +00:00
George Bosilca
519bbf6b6b Remove my patch (r23238) and push Scott Atchley patch. Thanks Scott.
This commit was SVN r23435.

The following SVN revision numbers were found above:
  r23238 --> open-mpi/ompi@c8ee150c95
2010-07-19 20:46:12 +00:00
Jeff Squyres
a8f69c9e3b This is no longer necessary; the orte DPM does the necessary
opal_progress_increment() and opal_progress_decrement().

This commit was SVN r23434.
2010-07-19 19:34:10 +00:00
Donald Kerr
f79c89e0e9 help maintain order established, and defined, during mca_btl_openib_add_procs()
This commit was SVN r23425.
2010-07-16 13:13:37 +00:00
Rolf vandeVaart
3abb5556a6 Fix bug pointed out by George Bosilca. Also
remove unneeded temp variable.

This commit was SVN r23424.
2010-07-15 19:32:31 +00:00
Shiqing Fan
5184208fca Correct the CMake temporary path.
This commit was SVN r23414.
2010-07-14 13:33:35 +00:00
Shiqing Fan
30c9f9c097 A few more files need to be excluded from windows build source.
This commit was SVN r23413.
2010-07-14 11:25:30 +00:00
Rolf vandeVaart
b7a27ab36a Add support for openib BTL failover to be used with bfo PML.
By default, feature is configured out so no effect on 
normal operation.

This commit was SVN r23412.
2010-07-14 10:08:19 +00:00
Shiqing Fan
5b37e2922c Use semicolon as the separator for Windows, as colon is normally part of the windows path.
This commit was SVN r23411.
2010-07-14 09:12:10 +00:00
Shiqing Fan
fb5a0ecdc0 Fix rcache for Windows.
This commit was SVN r23409.
2010-07-14 09:04:34 +00:00
Jeff Squyres
bfe6c95dce Remove .windows because it doesn't exist in this directory.
This commit was SVN r23406.
2010-07-14 02:14:32 +00:00
Shiqing Fan
b904d4826f Get rid of a warning of using void pointer in arithmetic.
This commit was SVN r23393.
2010-07-13 21:43:38 +00:00
Rolf vandeVaart
e27f953fa4 Fix casting in assert.
This commit was SVN r23388.
2010-07-13 12:02:12 +00:00
Rolf vandeVaart
fb19872806 Two new flag definitions needed by the new PML.
This commit was SVN r23386.
2010-07-13 11:30:43 +00:00
Rolf vandeVaart
19d007a6fc New PML to support failover between openib BTLs.
openib BTL changes coming soon.

This commit was SVN r23385.
2010-07-13 10:46:20 +00:00
Ralph Castain
570d19106b Allow singletons to use ompi-server for rendezvous via pubsub as well as comm_spawn without starting their own local daemons
This commit was SVN r23384.
2010-07-13 06:33:07 +00:00
Rolf vandeVaart
b4af9c0efc Fix casts so trunk compiles
This commit was SVN r23381.
2010-07-13 01:52:22 +00:00
Ralph Castain
4a94ea53d3 Minor cleanup - if any jobid in the remote group is different from the local group, then flag disconnect
This commit was SVN r23379.
2010-07-12 21:39:56 +00:00
Ralph Castain
84d63a46cd Remove a hard-coded limit of 64 independent jobs that could connect/accept together
This commit was SVN r23378.
2010-07-12 18:34:33 +00:00
Shiqing Fan
8de5654bf9 Add new files into the tarball.
This commit was SVN r23377.
2010-07-12 16:21:46 +00:00
Shiqing Fan
cdc7e0bec9 Mainly type casts.
Get rid of pthread and other unnecessary stuff for Windows.

This commit was SVN r23376.
2010-07-12 16:17:56 +00:00
Shiqing Fan
e3be90ff22 Update CMake modules, adding initial support for openib.
This commit was SVN r23373.
2010-07-12 15:28:37 +00:00
Shiqing Fan
c51c262e67 Relevant Windows fixes for r23360.
This commit was SVN r23363.

The following SVN revision numbers were found above:
  r23360 --> open-mpi/ompi@31295e8dc2
2010-07-07 16:58:16 +00:00
Jeff Squyres
87e17a41da Ensure that the com_rules[] array entries are initialized to NULL in
case individual entries aren't used, but dynamic rules are enabled
(i.e., at least one or more of them are not NULL, meaning that they'll
all be assumed to be either NULL or a valid value).

This commit was SVN r23361.
2010-07-07 14:04:18 +00:00
Ralph Castain
31295e8dc2 As discussed on today's telecon, reorganize the debugger attachment code in orte to better support efforts within the tool community aimed at exploring alternative methods. Move the debugger attachment code from the orterun directory to a new debugger framework. Organize the existing standard support code into an "mpir" component. Organize the current extensions for co-spawning debugger daemons into a separate "mpirx" component.
Since the MPIR symbols are now included in the ORTE library, remove duplicate declarations in OMPI and replace them with extern references to their ORTE instantiations.

This commit was SVN r23360.
2010-07-06 23:35:42 +00:00
Jeff Squyres
1802325a39 Rename "libtrace" to be "libompitrace" so as not to conflict with an
already-existing "libtrace" on some BSD distros.

This commit was SVN r23357.
2010-07-06 21:48:15 +00:00
Jeff Squyres
c8bb7537e7 Remove include/opal/sys/cache.h -- its only purpose in life was to
#define CACHE_LINE_SIZE to 128.  This name has a conflict on NetBSD,
and it seems kinda odd to have a header file that ''only'' defines a
single value.  Also, we'll soon be raising hwloc to be a first-class
item, so having this file around seemed kinda weird.

Therefore, I replaced CACHE_LINE_SIZE with opal_cache_line_size, an
int (in opal/runtime/opal_init.c and opal/runtime/opal.h) on the
rationale that we can fill this in at runtime with hwloc info (trunk
and v1.5/beyond, only).  The only place we ''needed'' a compile-time
CACHE_LINE_SIZE was in the BTL SM (for struct padding), so I made a
new BTL_SM_ preprocessor macro with the old CACHE_LINE_SIZE value
(128).  That use isn't suitable for run-time hwloc information,
anyway.

This commit was SVN r23349.
2010-07-06 14:33:36 +00:00
Jeff Squyres
6d77118254 Fixes for FT code that came from recent shared memory updates.
This commit was SVN r23348.
2010-07-06 12:58:48 +00:00
Jeff Squyres
8fef296b8a Updates about thread support levels.
This commit was SVN r23341.
2010-07-02 13:14:09 +00:00
Ralph Castain
b4422e012c Fix a typo that breaks ompi_info if --enable-sensors
This commit was SVN r23338.
2010-07-02 02:38:55 +00:00
Jeff Squyres
222c4c8dd8 Reformat the verbatim sections of these man pages for narrower (80
char) displays. 

This commit was SVN r23325.
2010-07-01 18:52:45 +00:00
Jeff Squyres
e82e7f896e These compile warnings have been around forever; I finally got inspired to
fix them.

This commit was SVN r23316.
2010-06-28 17:26:38 +00:00
Jeff Squyres
1fad51776d Also add <stdlib.h> for exit().
This commit was SVN r23308.
2010-06-28 15:17:42 +00:00
Jeff Squyres
f9d4426c19 OS X / Absoft needs <string.h>
This commit was SVN r23307.
2010-06-28 15:15:06 +00:00
Nadia Derbey
c22e6b3613 openib btl unsafe in case of extremely low srq settings
This commit was SVN r23301.
2010-06-24 09:59:45 +00:00
Jeff Squyres
ea05c73cfc Use the right number of characters for the strncmp. Thanks to Brad
for catching that!

This commit was SVN r23281.
2010-06-18 15:45:38 +00:00
Jeff Squyres
cdc5541cb0 Search for "dlname", not "dlopen". This value will be filled in if
there is a DSO to open.

This commit was SVN r23280.
2010-06-18 15:13:34 +00:00
Matthias Jurenz
1467f2db52 Added workaround for PGI compiler bug (see http://www.pgroup.com/support/release_tprs_90.htm TPR 4337):
Disable OpenMP if compiler version is less than 9.0-3.

This commit was SVN r23274.
2010-06-15 07:16:13 +00:00
Jeff Squyres
b620e63bdc Add in 2 cases for where this test may be skipped:
1. If opal wasn't built with libltdl support
2. If opal was built statically (i.e., dlopen='' in the .la file)

This commit was SVN r23270.
2010-06-14 16:06:43 +00:00
Shiqing Fan
d391c57b0f A more proper fix for the HANDLE definition.
This commit was SVN r23269.
2010-06-14 14:17:07 +00:00
Samuel Gutierrez
2fb7c344fc Added a new System V (sysv) shared memory component for Open MPI.
Configure Option:
--enable-sysv

MCA Parameter:
mpi_common_sm

mpi_common_sm accepts a comma delimited list of: [sysv],mmap (order
dependent).  The first component that is successfully selected is used. For
example, -mca mpi_common_sm sysv,mmap will first try sysv. If sysv is not
successfully selected, then mmap will be used.  mmap will be used if 
mpi_common_sm is not provided.

Notes:
Please make certain that your system's shmmax limit, or equivalent, is larger
than mpool_sm_min_size.  Otherwise, shmget may fail.

This commit was SVN r23260.
2010-06-09 16:58:52 +00:00
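Illustration for 2fb7c344fc (r23260): a minimal sketch of "first component in the comma-delimited list that can be selected wins". `demo_select_component` and the query callbacks are hypothetical and do not reflect the MCA selection code; they only model the ordering rule described in the commit message.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical callback: returns nonzero if the named component is usable. */
typedef int (*demo_query_fn)(const char *name);

/* Walk a comma-delimited list in order and copy the first usable name
 * into out. Returns 0 on success, -1 if no listed component is usable. */
static int demo_select_component(const char *list, demo_query_fn available,
                                 char *out, size_t out_len)
{
    const char *p = list;

    assert(list != NULL && available != NULL && out != NULL && out_len > 0);
    while (*p != '\0') {
        size_t len = strcspn(p, ",");          /* length of this entry */
        if (len > 0 && len < out_len) {
            memcpy(out, p, len);
            out[len] = '\0';
            if (available(out)) {
                return 0;
            }
        }
        p += len;
        if (*p == ',') {
            ++p;                               /* skip the separator */
        }
    }
    out[0] = '\0';
    return -1;
}

/* Example callbacks: a system without sysv support, and one with everything. */
static int demo_only_mmap(const char *name) { return 0 == strcmp(name, "mmap"); }
static int demo_everything(const char *name) { (void)name; return 1; }
```

With the list "sysv,mmap", a system modelled by `demo_only_mmap` falls through to mmap, while `demo_everything` selects sysv.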
George Bosilca
c8ee150c95 If we fail to correctly initialize the MX device, don't mark it as initialized.
This commit was SVN r23238.
2010-06-02 15:00:42 +00:00
Jeff Squyres
e45be29f0d This function shouldn't have an ibv_ prefix -- it's not part of
verbs (it's just a static convenience function here in this file).  

This commit was SVN r23237.
2010-06-02 12:54:56 +00:00
Jeff Squyres
7676d5adda Change "intra-communicator" to "inter-communicator". Thanks to
Simon/Number Cruncher for reporting the typo.

This commit was SVN r23236.
2010-06-02 12:35:53 +00:00
Christopher Yeoh
712907affa Removing memory barriers which are not needed because of
the extra memory barriers which were added in r22880. This 
reverts all of r22879

This commit was SVN r23234.

The following SVN revision numbers were found above:
  r22879 --> open-mpi/ompi@768ea2bab0
  r22880 --> open-mpi/ompi@cd5294944b
2010-06-02 00:38:47 +00:00
Jeff Squyres
5d386fc678 Per #2420, string handling of the Fortran array_of_argv argument to
MPI_COMM_SPAWN_MULTIPLE was just wrong.  This commit renames a few
variables to make their meaning a bit more clear and fixes up all
known issues with converting a 2D array of Fortran strings to a set of
C-style argv vectors.

Fixes trac:2420.

This commit was SVN r23217.

The following Trac tickets were found above:
  Ticket 2420 --> https://svn.open-mpi.org/trac/ompi/ticket/2420
2010-05-28 12:40:42 +00:00
Jeff Squyres
620c0eb160 Be a little more verbose about argv / array_of_argv parameters to
MPI_Comm_spawn / MPI_Comm_spawn_multiple, particularly the Fortran
variants.

This commit was SVN r23216.
2010-05-28 11:57:45 +00:00
Jeff Squyres
0061f2170d ompi/mpi/c/request_get_status.c (MPI_Request_get_status): If
opal_progress is called then check the status of the request before
returning. opal_progress is called only once.  This logic parallels
MPI_Test (ompi_request_default_test).

Thanks to Shaun Jackman for submitting the patch.

This commit was SVN r23215.
2010-05-27 21:37:11 +00:00
Jeff Squyres
464bd8c56e Fix typo
This commit was SVN r23212.
2010-05-27 21:19:38 +00:00
Rolf vandeVaart
27f070a575 Start setting a flag when a port error is detected on the openib BTL.
At this point, it is just cleared (and ignored) so default behavior has not changed.
However, future failover support can take advantage of this flag.
Reviewed by Pasha Shamis.

This commit was SVN r23204.
2010-05-24 18:57:55 +00:00
Jeff Squyres
fec7918eea Some paffinity functions had their return status overloaded:
* If < 0, it's an OPAL_ERR_* value
 * If >= 0, it's the actual output value of the function

This is problematic for the OPAL_SOS stuff.  This commit changes those
functions to always return OPAL_* statuses and send the output value
back through output parameters (like 95% of the rest of the code
base).  This avoids the confusion with OPAL_SOS stuff and makes
paffinity work again (e.g., mpirun --bind-to-core ...).

I updated all paffinity modules for the new function signatures, and
bumped the paffinity API version up to 2.0.1.  I don't think the
version change will matter, though, because we'll be introducing
support for hardware threads soon, which will either bump the
paffinity version again or we'll replace paffinity with 
a new framework.

This commit was SVN r23197.
2010-05-21 16:55:28 +00:00
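Illustration for fec7918eea (r23197): a minimal sketch contrasting an overloaded return value with a status return plus output parameter. The functions, the `DEMO_*` constants, and the socket/core arithmetic are hypothetical and are not the paffinity API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical constants for this sketch only. */
#define DEMO_SUCCESS 0
#define DEMO_ERR_BAD_PARAM (-5)
#define DEMO_CORES_PER_SOCKET 4

/* Old style: a negative return is an error, anything else is the answer.
 * Callers cannot compare the result against a single "success" value. */
static int demo_core_id_overloaded(int socket, int core)
{
    if (socket < 0 || core < 0 || core >= DEMO_CORES_PER_SOCKET) {
        return DEMO_ERR_BAD_PARAM;
    }
    return socket * DEMO_CORES_PER_SOCKET + core;
}

/* New style: the return value is always a status; the answer travels
 * through an output parameter and is only written on success. */
static int demo_core_id(int socket, int core, int *core_id)
{
    assert(core_id != NULL);
    if (socket < 0 || core < 0 || core >= DEMO_CORES_PER_SOCKET) {
        return DEMO_ERR_BAD_PARAM;
    }
    *core_id = socket * DEMO_CORES_PER_SOCKET + core;
    return DEMO_SUCCESS;
}
```

The second form lets every caller use the same `DEMO_SUCCESS != ret` test regardless of how the error value itself is encoded.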
Shiqing Fan
857f1669e2 Solve a few compilation problems on Windows.
This commit was SVN r23193.
2010-05-21 14:30:15 +00:00
Edgar Gabriel
f6598138ba fix some instances, where we might have allocated 0 bytes. Also, for allgather
make sure that we do not call coll_gather and coll_bcast in the very same
instances, since some collective (intra) modules do not seem to like the fact
if they are called for scount or rcount being zero (for regular
intra-communicator operations, this is handled on the MPI API layer).

Fixes trac:2405

This commit was SVN r23188.

The following Trac tickets were found above:
  Ticket 2405 --> https://svn.open-mpi.org/trac/ompi/ticket/2405
2010-05-20 22:23:44 +00:00
Edgar Gabriel
5881719d84 checks for sendcount and recvcount(s) being zero have slightly different
consequences depending on whether the communicator is an intra or an inter
communicator. 

fixes trac:2415

This commit was SVN r23187.

The following Trac tickets were found above:
  Ticket 2415 --> https://svn.open-mpi.org/trac/ompi/ticket/2415
2010-05-20 22:21:26 +00:00
George Bosilca
b56ab33ff6 Indent and fix some uninitialized variables.
This commit was SVN r23179.
2010-05-19 21:20:33 +00:00
George Bosilca
c51932c250 Don't forget to initialize "line" in all cases.
This commit was SVN r23178.
2010-05-19 21:19:45 +00:00
Rolf vandeVaart
03b3e75f86 Add two arguments to the PML error callback function. This
allows the BTL to specify a specific ompi_proc_t that had an
error.  Also add an optional descriptive string.  Currently, arguments
are not used but will be by future failover PML. 
Changes based on RFC.  Reviewed by George Bosilca.

This commit was SVN r23174.
2010-05-19 11:55:45 +00:00
Abhishek Kulkarni
c63c4d6892 Fix bugs where (OMPI_ERROR == *) checks cannot be converted to (OMPI_SUCCESS != *) since the return codes are overloaded to return an "index" on success.
The fix is to just check if the return value is positive or not, since all the SOS encoded errors are *always* negative.

The real fix (as Ralph points out) is to change these functions (opal_pointer_array_add and mca_base_param*) to return the index as a pointer.

This commit was SVN r23173.
2010-05-18 20:54:11 +00:00
Josh Hursey
f57e73d4e5 add a few more missing SOS includes
This commit was SVN r23168.
2010-05-18 15:00:07 +00:00
Abhishek Kulkarni
afbe3e99c6 * Wrap all the direct error-code checks of the form (OMPI_ERR_* == ret) with
(OMPI_ERR_* == OPAL_SOS_GET_ERR_CODE(ret)), since the return value could be a
 SOS-encoded error. The OPAL_SOS_GET_ERR_CODE() takes in a SOS error and returns
 back the native error code.

* Since OPAL_SUCCESS is preserved by SOS, also change all calls of the form
  (OPAL_ERROR == ret) to (OPAL_SUCCESS != ret). We thus avoid having to
  decode 'ret' to get the native error code.

This commit was SVN r23162.
2010-05-17 23:08:56 +00:00
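Illustration for afbe3e99c6 (r23162): a minimal sketch of why a direct comparison against a native error code fails once extra information is packed into the return value. The bit layout below is a toy encoding invented for this example, not the real OPAL_SOS format, and the `demo_*` names are hypothetical.

```c
#include <assert.h>

/* Hypothetical native codes for this sketch only. */
#define DEMO_SUCCESS 0
#define DEMO_ERR_OUT_OF_RESOURCE (-2)

/* Toy encoding: keep the magnitude of the native code in the low 8 bits,
 * pack extra info above it, and keep the result negative.
 * Success is preserved unchanged. */
static int demo_sos_encode(int native, int info)
{
    assert(native <= 0 && native > -256);
    assert(info >= 0 && info < 1024);
    if (DEMO_SUCCESS == native) {
        return DEMO_SUCCESS;
    }
    return -((info << 8) | -native);
}

/* Recover the native error code from a possibly encoded return value. */
static int demo_sos_get_err_code(int ret)
{
    if (ret >= 0) {
        return ret;
    }
    return -((-ret) & 0xff);
}
```

In this model an encoded error never equals the native code directly, yet it still differs from `DEMO_SUCCESS`, which is why both rewrites described in the commit message are safe.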
Jeff Squyres
91507e595f Fix bug reported on user list; set the errhandler type properly.
This commit was SVN r23145.
2010-05-15 13:04:32 +00:00
Rolf vandeVaart
9e300703ec Add reference to trac ticket as requested by code review.
This commit was SVN r23123.
2010-05-13 13:55:54 +00:00
Jeff Squyres
c7c3de87f5 Add ummunotify support to Open MPI. See
http://marc.info/?l=linux-mm-commits&m=127352503417787&w=2 for more
details.

 * Remove the ptmalloc memory component; replace it with a new "linux"
   memory component.
 * The linux memory component will conditionally compile in support
   for ummunotify.  At run-time, if it has ummunotify support and
   finds run-time support for ummunotify (i.e., /dev/ummunotify), it
   uses it.  If not, it tries to use ptmalloc via the glibc memory
   hooks. 
 * Add some more API functions to the memory framework to accommodate
   the ummunotify model (i.e., poll to see if memory has "changed").
 * Add appropriate calls in the rcache to the new memory APIs to see
   if memory has changed, and to react accordingly.
 * Add a few comments in the openib BTL to indicate why we don't need
   to notify the OPAL memory framework about specific instances of
   registered memory.
 * Add dummy API calls in the solaris malloc component (since it
   doesn't have polling/"did memory change" support).

This commit was SVN r23113.
2010-05-11 21:43:19 +00:00
Jeff Squyres
359d7e122e Fix a problem noted by Paul Kapinos: MPI_COMM_SET_ERRHANDLER and
MPI_WIN_SET_ERRHANDLER had their MPI handle parameters marked as INOUT
instead of IN, thereby disallowing passing pre-defined handles through
because they are constants (e.g., MPI_COMM_WORLD).  This wasn't really
a problem for MPI_WIN_SET_ERRHANDLER since there are no predefined
windows, but it wasn't right.

This commit was SVN r23098.
2010-05-04 20:05:35 +00:00
Ralph Castain
9dfb5c7c62 Rename the orte state framework to be "db", which more accurately reflects its overall capabilities since it can store any kind of data (not just state, although that will be its primary purpose). Update tools and tests accordingly. Add a daemon module for storing data on the daemons - requires --enable-multicast, so it won't build unless that is set
This commit was SVN r23082.
2010-05-03 04:11:03 +00:00
Matthias Jurenz
29f02d88c6 - fixed buffer overflow which occurred if marker or event comment records generated
- fixed bug in MPI-I/O tracing: tracking MPI file handles even if MPI_File_open isn't recorded
- fixed compiler warnings in the PAPI component
- incremented version number to 5.8.2

This commit was SVN r23074.
2010-04-30 10:04:26 +00:00
Jeff Squyres
b6e401a512 Fix minor typo.
This commit was SVN r23067.
2010-04-29 11:45:25 +00:00
Jeff Squyres
28046f601e Fix typos.
This commit was SVN r23055.
2010-04-28 00:07:04 +00:00
Ralph Castain
b9893aacc5 Add a sensor framework to ORTE that monitors applications and notifies the errmgr when they exceed specified boundaries. Two modules are included here:
1. file activity - can monitor file size, access and modification times. If these fail to change over a specified number of sampling iterations (rate is an mca param), then the errmgr is notified.

2. memory usage - checks amount of memory used by a process. Limit and sampling rate can be set.

This support must be enabled by configuring --enable-sensors.

ompi_info and orte-info have been updated to include the new framework.

Also includes some initial steps toward restoring the recovery capability. Most notably, the ODLS API has been extended to include a "restart_proc" entry for restarting a local process, and organizes the various ERRMGR framework globals into a single struct as we do in the other ORTE frameworks. Fix an oversight in the ERRMGR framework where a pointer array was constructed, but not initialized.

Implementation continues.

This commit was SVN r23043.
2010-04-26 22:15:57 +00:00
George Bosilca
321213e779 Fix segmentation fault on heterogeneous architectures. Don't mess with the
ompi_ptr_t by translating into void*. Instead keep it as an ompi_ptr_t all
the way. Thanks to Timur Magomedov for helping to track down this issue and
test the patch.

cmr:v1.4
cmr:v1.5

This commit was SVN r23030.
2010-04-23 15:14:55 +00:00
Shiqing Fan
077f6e6398 Type casts for building dynamic Fortran libraries.
And export correct function names.

This commit was SVN r23020.
2010-04-22 15:48:27 +00:00
Jeff Squyres
359464a144 Add an "affinity" Open MPI extension (also describe the
--enable-mpi-ext configure switch in the top-level README file).

See Josh's excellent wiki page about OMPI extensions:

    https://svn.open-mpi.org/trac/ompi/wiki/MPIExtensions

This extension exposes a new API to MPI applications: 

{{{
int OMPI_Affinity_str(char ompi_bound[OMPI_AFFINITY_STRING_MAX],
                      char current_binding[OMPI_AFFINITY_STRING_MAX],
                      char exists[OMPI_AFFINITY_STRING_MAX]);
}}}

It returns 3 things.  Each is a prettyprint string describing sets of
processors in terms of sockets and cores:

 1. What Open MPI bound this process to.  If Open MPI didn't bind this
    process, the prettyprint string says so.
 2. What this process is currently bound to.  If the process is
    unbound, the prettyprint string says so.  This string is a
    separate OUT parameter to detect the case where some other entity
    bound the process (potentially after Open MPI bound it).
 3. What processors are available in the system, mainly for reference.

This commit was SVN r23018.
2010-04-21 17:28:08 +00:00
Shiqing Fan
d1e66bdd01 Use variables instead of hard-coded compiler flags, in order to support various C/C++ compilers on Windows.
This commit was SVN r23016.
2010-04-21 12:45:00 +00:00
Shiqing Fan
e539322807 Move definitions to the main config file.
This commit was SVN r23015.
2010-04-21 09:17:10 +00:00
Matthias Jurenz
d92819826b - fixed detection of older PGI compilers on CrayXT platforms
- added detection for Intel compilers on CrayXT platforms

This commit was SVN r23011.
2010-04-20 10:33:02 +00:00
Matthias Jurenz
1ae62f6fb6 Fixed the OpenMP barrier for the progress report which had a deadlock
This commit was SVN r22991.
2010-04-19 14:49:14 +00:00
Matthias Jurenz
8441b4f7e0 Improved configure tests for CrayXT platforms:
- added default option file
- added detection of the compiler loaded by the programming environment

This commit was SVN r22988.
2010-04-19 13:46:56 +00:00
Ralph Castain
4d06125a33 Establish a method by which a process knows if it has been bound by mpirun. This helps resolve a problem where a process gets "bound" to all available resources, which looks to the opal paffinity system as "not bound". This can cause mpi_init to attempt to "bind" the process itself, causing unintended behavior.
This commit was SVN r22985.
2010-04-17 01:58:26 +00:00
Ralph Castain
41428e6b61 Issue a warning if a requested binding operation results in processes being bound to all available processors, which is the equivalent of not being bound at all.
See the following email thread for further details:

http://www.open-mpi.org/community/lists/devel/2010/04/7745.php

This commit was SVN r22984.
2010-04-17 01:02:41 +00:00
Samuel Gutierrez
7654b39349 Fix segfault in two error paths.
This commit was SVN r22978.
2010-04-15 15:51:57 +00:00
Jeff Squyres
181331d65e Very minor nits/updates.
This commit was SVN r22977.
2010-04-15 14:44:55 +00:00
Rolf vandeVaart
892091c77d After fix 22669 was applied which allowed for more than 8 interfaces, it was discovered that the connection algorithm did not scale. Therefore, switch to a simpler algorithm in the extremely rare case when one has more than 8 interfaces. This commit fixes trac:2301.
This commit was SVN r22976.

The following Trac tickets were found above:
  Ticket 2301 --> https://svn.open-mpi.org/trac/ompi/ticket/2301
2010-04-14 14:18:35 +00:00
Matthias Jurenz
15a2260ca9 Do not build MPI Correctness Checking support inside Open MPI
This commit was SVN r22967.
2010-04-13 08:56:28 +00:00
Matthias Jurenz
175fd07de4 VT enhancements:
- extended support for BlueGene/P:
	- building shared VT libraries
	- tracing 3rd-party libraries (e.g. libc I/O)
	- tracing multi-threaded applications 
VT configure fixes:
- fixed detection on CTool for 3rd-party library tracing
VT fixes:
- reduced memory overhead by using the trace buffer for string/array elements of some records
- do not shutdown call-stack if max. number of buffer flushes reached, because the additional function leaves suggest a wrong application flow
- vtunify-mpi:
	- fixed conversion of VTUnify_MPI_Aint arrays 
- vtwrapper:
	- if an OPARI modified object file (*.mod.o) cannot be renamed, abort only if the compiler wrapper runs in "only-compile" mode (-c) 
OTF fixes:
- otfinfo:
	- fixed and enhanced calculation of trace file size
	- changed unit of timer resolution (s -> Hz) 
- otfprofile:
	- fixed progress
	- kill '_' and '\' in process names to make LaTeX happier

This commit was SVN r22963.
2010-04-13 07:20:56 +00:00
Rainer Keller
a48a11821b - mca_base_param_reg_string_name allocates default_pml.
As it is strdup, just free(default_pml).

   cmr:v1.5

This commit was SVN r22955.
2010-04-12 19:54:07 +00:00
Pavel Shamis
fc077a2102 Fix a minor bug in the error flow of check_if_device_support_modify_srq
Signed-off-by: Ishai Rabinovitz <ishai@mellanox.co.il> 

This commit was SVN r22953.
2010-04-12 11:28:44 +00:00
Rolf vandeVaart
0adb570693 Add pml_ob1_verbose flag. Fix the current location it is being used
This commit was SVN r22939.
2010-04-07 13:51:42 +00:00
Jeff Squyres
eaed49594c Fix typo (I'm assuming this was a copy-n-paste error :-) ).
This commit was SVN r22902.
2010-03-29 21:54:02 +00:00
Ralph Castain
24c3b4f849 Add the sysinfo framework to the "info" tools, especially since the odls_base_open function calls it!
This commit was SVN r22901.
2010-03-29 20:47:29 +00:00
Ralph Castain
522a23d6a3 A few changes to the FT-related configure options:
1. fix a bug that caused an infinite loop in configure when specifying want-ft but not want-ft-thread by removing a stale reference to the opal-progress-thread option

2. add want-ft=orcm so we can build the orcm errmgr component

3. cleanup the use of "ompi_want_ft_xxx" and replace it with "opal_want_ft_xxx" so that naming conventions are preserved

This commit was SVN r22885.
2010-03-25 22:53:48 +00:00
Christopher Yeoh
a6175bbefc Adds copyright notice that should have gone in with r22700
This commit was SVN r22881.

The following SVN revision numbers were found above:
  r22700 --> open-mpi/ompi@774a7a58b0
2010-03-25 04:03:52 +00:00
Christopher Yeoh
768ea2bab0 fixes trac:2351 - race in use of ompi free lists
Adds memory barriers which are definitely needed on powerpc

This commit was SVN r22879.

The following Trac tickets were found above:
  Ticket 2351 --> https://svn.open-mpi.org/trac/ompi/ticket/2351
2010-03-25 03:38:14 +00:00
Christopher Yeoh
81e06a2baf fixes trac:2340 - race in mca_mpool_base_free
This commit was SVN r22878.

The following Trac tickets were found above:
  Ticket 2340 --> https://svn.open-mpi.org/trac/ompi/ticket/2340
2010-03-25 03:29:27 +00:00
Christopher Yeoh
0b93c87c2c Correct year for copyright notices
This commit was SVN r22877.
2010-03-25 03:14:21 +00:00
George Bosilca
c0ff44b9fe Don't let ROMIO mishandle the displacement for contiguous data with a non-zero
true_lb. Thanks to Pascal Deveze for the patch.

This commit was SVN r22864.
2010-03-23 01:23:45 +00:00
Matthias Jurenz
f01f70b8f6 - added missing header include
- moved warning message into an ifdef of OTF_VERBOSE

This commit was SVN r22860.
2010-03-22 13:08:13 +00:00
Shiqing Fan
66f1f1a69a Need to export this function for MPI C++ library on Windows.
This commit was SVN r22856.
2010-03-22 09:09:49 +00:00
Ralph Castain
e291fc2c69 With Jeff's help, get the libraries to link as required.
Update ompi_info and orte-info to include the new framework.

Fix some selection logic and a typo'd variable name

Still remains ompi_ignored until we complete testing

This commit was SVN r22848.
2010-03-18 02:12:59 +00:00
Matthias Jurenz
9aec91838b Fixed segfault which might occur if the application uses fork's
This commit was SVN r22847.
2010-03-17 11:51:04 +00:00
George Bosilca
1ed7fe5057 The mpool should take the same route as the rest of the pcie modules.
This commit was SVN r22844.
2010-03-17 04:16:23 +00:00
Ralph Castain
b400b84162 Merge in the modified thread configure option branch per today's telecon.
Remove the --enable-progress-threads option as this is no longer functional, and hardcode OPAL_ENABLE_PROGRESS_THREADS to 0.

Replace the --enable-mpi-threads option with --enable-mpi-thread-multiple as this is clearer as to meaning. This option automatically turns "on" opal thread support if it wasn't already so specified. If the user specifies --disable-opal-multi-threads --enable-mpi-thread-multiple, we will error out with a message

Add a new --enable-opal-multi-threads option that turns "on" opal thread support without doing anything wrt mpi-thread-multiple

This commit was SVN r22841.
2010-03-16 23:10:50 +00:00
Ralph Castain
4990cc41b6 Remove stale config file - the pcie btl has already been removed
This commit was SVN r22840.
2010-03-16 23:06:38 +00:00
Jeff Squyres
7b3ac4fb73 Refs trac:2273
After talking to both Brian and George, the consensus was to just
remove the flag and the test function.  Begone, evil spirits, BEGONE!

This commit was SVN r22831.

The following Trac tickets were found above:
  Ticket 2273 --> https://svn.open-mpi.org/trac/ompi/ticket/2273
2010-03-16 00:47:10 +00:00
Rainer Keller
814fb9399f - Further patches for support on NetBSD (and DragonFly) by
Aleksej Saushev.
   Don't use bash or bashisms in shell scripts
   We should use Posix' setpgid(0,0), which is equivalent to setpgrp().
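
   A minimal sketch of the POSIX call mentioned above. The helper name
   become_group_leader is hypothetical and is not taken from the Open MPI
   sources. setpgid(0, 0) makes the calling process the leader of a new
   process group whose ID equals its own PID, and it has the same
   signature on every POSIX system, whereas setpgrp() takes no arguments
   on System V-style systems and two arguments on BSD-style ones.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put the calling process into its own process
 * group. Returns 0 on success, -1 on an unexpected error. */
int become_group_leader(void)
{
    if (setpgid(0, 0) != 0) {
        /* With arguments (0, 0), EPERM means the caller is a session
         * leader, which already leads its own process group. */
        if (errno != EPERM) {
            return -1;
        }
    }
    /* Either way, the process group ID now equals our PID. */
    assert(getpgrp() == getpid());
    return 0;
}
```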

This commit was SVN r22829.
2010-03-15 05:33:42 +00:00
Jeff Squyres
bb314911b3 If we get OMPI_ERR_UNREACH from the PML, print a slightly more
specific error.  Suggested by Nick Edmonds: 

    http://www.open-mpi.org/community/lists/users/2010/03/12339.php

This commit was SVN r22828.
2010-03-14 00:09:55 +00:00
Rainer Keller
f6e4694d67 - Print the name correctly when a certain sync module is disabled
This should be cmr'd to v1.5 and v1.4.2 (but the svn post hook won't
   let me at the moment).

This commit was SVN r22827.
2010-03-13 21:07:34 +00:00
Josh Hursey
e9b5162d79 Fix the configure logic for --with-ft so that it properly takes a comma separated list.
Many of the OPAL_ENABLE_FT should be OPAL_ENABLE_FT_CR, so fix those.

The OPAL Layer INC should call opal_output on restart so that it can refresh the string it prints to reflect the current pid/hostname which may have changed.

This commit was SVN r22824.
2010-03-12 23:57:50 +00:00
Matthias Jurenz
86e58eb6d3 VT configure fixes:
- skip test for libdl on BlueGene and CrayXT platforms (particularly on CrayXT this library can be linked but it isn't suitable)
- set cache variables for functions 'PMPI_Win_test', 'PMPI_Win_lock', 'PMPI_Win_unlock', and 'MPI_Register_datarep', if VT is configuring for Open MPI
- added test for pthread functions 'pthread_condattr_<set|get>pshared' and 'pthread_mutexattr_<set|get>pshared', because they are not available on some platforms
VT fixes:
- cut 'nm' collected symbol names at '??'
- vtunify:
        - fixed unsafe usage of some strncpy's
        - fixed potential segmentation fault in vtunify-mpi which might occur on 32bit platforms using MPICH2
OTF general:
- updated date in copyright header of each source file
OTF fixes:
- minor code cleanups (indentation, nicer error messages, more correct free's)
- otfaux:
        - fix to place final statistics after the very last record instead of right before
        - changed fatal error to a warning when a file is closed twice (or unexpectedly)

This commit was SVN r22820.
2010-03-12 11:03:45 +00:00
Josh Hursey
3db01f0795 Add the process name to the error message resulting from a failed mmap(), open(), or ftruncate() so that it is slightly easier to figure out which process in the system caused the problem with sm.
This commit was SVN r22803.
2010-03-10 00:18:04 +00:00
Samuel Gutierrez
15f9f35a49 Another small typo fix.
This commit was SVN r22802.
2010-03-09 21:23:21 +00:00
Samuel Gutierrez
dcb5a2331f Fixed some typos in comments.
This commit was SVN r22801.
2010-03-09 20:41:25 +00:00
Rainer Keller
0feb158aaf - Since r22727 orte_app_idx_t was introduced, being a uint32_t (was
previously an orte_std_cntr_t, which is int32_t).
   Comparisons with < 0 don't make any sense here.
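
   To illustrate the point, here is a small sketch. The typedef app_idx_t
   stands in for orte_app_idx_t as described above (a uint32_t), and the
   helper app_idx_is_valid is hypothetical. For an unsigned type,
   "idx < 0" is always false, and a value that used to be -1 wraps around
   to the largest representable value, so an upper-bound check is the
   only meaningful range test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for orte_app_idx_t, which the message describes as uint32_t. */
typedef uint32_t app_idx_t;

/* Hypothetical helper: true if idx refers to one of num_apps entries. */
bool app_idx_is_valid(app_idx_t idx, app_idx_t num_apps)
{
    /* Sanity check for this sketch: the index type really is unsigned,
     * so -1 converts to a large positive value. */
    assert((app_idx_t)-1 > 0);

    /* No "idx < 0" test: it could never be true. A wrapped-around -1
     * is caught by the upper bound instead. */
    return idx < num_apps;
}
```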

This commit was SVN r22799.

The following SVN revision numbers were found above:
  r22727 --> open-mpi/ompi@2541aa98ab
2010-03-08 22:56:33 +00:00
Rainer Keller
06f5ba1c19 - Reverse the logic (OPAL_LIKELY -> OPAL_UNLIKELY)
This commit was SVN r22796.
2010-03-08 14:00:59 +00:00
Jeff Squyres
95d7e08a66 More discussion and testing have occurred off-ticket.
Short version: there is a bug in OS X/Snow Leopard, but there is also
a bug in Open MPI.  Fixing the bug in Open MPI is both trivial (a
1-line change) and avoids the bug in OS X.  We'll file an OS X bug
report upstream with Apple, but it should no longer affect us here in
OMPI.

Fixes trac:2039.

More details:

Some background first: 

 1. IPv4 sockets can only accept incoming IPv4 connections.  However,
    IPv6 sockets can be configured to accept ''only'' incoming IPv6
    connections, or ''both'' incoming IPv4 and IPv6 connections.  An
    IPv6 socket attribute sets which listening behavior is used.
 2. IPv4 and IPv6 have different port namespaces.  Hence, it is
    permissible to bind a v4 socket to port X ''and'' also bind a v6
    socket to that same port X on the same interface (assuming that
    the v6 socket is only accepting incoming v6 connections).
    Incoming v4 connections to port X on the interface should get
    matched to the listening v4 socket; incoming v6 connections should
    get matched to the listening v6 socket.
 3. When a v6 socket accepts ''both'' incoming v4 and v6 connections, it
    should claim port X in both namespaces.
 4. Linux's default behavior is to only allow one listening socket to
    be bound to a given port (i.e., ''either'' a v6 or v4 socket can be
    bound to a single port X -- not both).  A v6 socket can listen for
    both v4 and v6 incoming connections on that port, but still --
    only one socket will be bound to that port.
 5. Snow Leopard's default behavior is to share ports -- i.e., let
    both a v4 and a v6 listening socket be bound to port X
    (assuming that the v6 socket is only accepting incoming v6
    connections).

The TCP BTL creates a listening socket for each address family.
Hence, it creates a v4 listening socket on INADDR_ANY and a v6
listening socket on the v6 equivalent of INADDR_ANY.  OMPI then
iteratively tries to find ports to listen on within the range of
[mca_btl_tcp_port_min, mca_btl_tcp_port_min + mca_btl_tcp_port_range).
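
The port search can be sketched roughly as follows. This is an
illustration only, not the TCP BTL code: the function name
bind_in_port_range and its parameters are hypothetical, and only the
IPv4 case is shown. It walks the half-open range
[port_min, port_min + port_range) and returns the first port on which
bind() succeeds.

```c
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: bind the IPv4 socket sd to INADDR_ANY on the
 * first free port in [port_min, port_min + port_range).
 * Returns the port that was bound, or -1 if none could be bound. */
int bind_in_port_range(int sd, int port_min, int port_range)
{
    assert(port_min > 0 && port_range > 0);

    for (int port = port_min;
         port < port_min + port_range && port <= 65535; port++) {
        struct sockaddr_in addr;
        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons((uint16_t)port);

        if (bind(sd, (struct sockaddr *)&addr, sizeof(addr)) == 0) {
            return port;   /* this port is now ours */
        }
        /* Otherwise (typically EADDRINUSE) try the next port. */
    }
    return -1;
}
```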

On Linux, the v4 socket will be bound to port X and the v6 socket will
likely be bound to port Y (where X != Y).  On Snow Leopard, the v4
socket will be bound to port X and the v6 socket may ''also'' be bound
to port X.  Since the namespaces are separate, this shouldn't be a
problem.

However, Open MPI was accidentally setting the v6 listening behavior
to accept ''both'' v4 and v6 incoming connections.  This is a trivial
thing to fix -- change a 0 to a 1 in the code.  On Linux, this issue
didn't matter because the v4 and v6 sockets were on different ports.
So even though the v6 socket ''would'' have accepted incoming v4
connections, that never happened because OMPI would direct v4
connections to the v4 port.

But on Snow Leopard, the v4 and v6 listening ports could end up
sharing the same port number.  As mentioned above, this ''shouldn't''
have been a problem, but it looks like Snow Leopard has the following
bugs:

 * If a v4 socket is already bound to port X, we're pretty sure that a
   v6 socket with the "accept both v4 and v6 incoming connections"
   listening behavior should not be able to claim port X (because
   there's already a v4 socket listening on X).  However, Snow Leopard
   would allow binding a v4 socket to port X, and then allow a v6
   socket configured to allow incoming v4 and v6 connections to
   ''also'' be bound to port X.
 * After binding the v6 socket to port X, Snow Leopard then lets
   ''another'' v4 socket ''also'' get bound to port X.  Hence, there's
   now '''three''' sockets all listening on port X.

This obviously led to mis-matched TCP connections, and things went
downhill from there.

That being said, Snow Leopard doesn't exhibit this behavior if v6
sockets only allow incoming v6 connections.  And technically, that is
exactly the behavior we want (we want v6 sockets to only accept
incoming v6 connections).  So if we just change the flag to make our
v6 listening socket use this behavior, the problem on OS X goes away.

That's what this commit does -- it changes a 0 to a 1, indicating
"only let this v6 socket allow incoming v6 connections."

That was simple, wasn't it?
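
For readers unfamiliar with the flag, here is a minimal sketch. It
assumes the socket attribute in question is the standard IPV6_V6ONLY
option (the message above does not name it), and the helper
open_v6_socket is hypothetical rather than taken from the TCP BTL.
Passing 1 restricts the socket to incoming v6 connections; passing 0
lets it accept both v4 and v6.

```c
#include <assert.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create an IPv6 TCP socket and set its
 * listening behavior.
 *   v6only = 1: accept only incoming IPv6 connections
 *   v6only = 0: accept both IPv4 and IPv6 connections
 * Returns the socket descriptor, or -1 on failure. */
int open_v6_socket(int v6only)
{
    assert(v6only == 0 || v6only == 1);

    int sd = socket(AF_INET6, SOCK_STREAM, 0);
    if (sd < 0) {
        return -1;
    }
    if (setsockopt(sd, IPPROTO_IPV6, IPV6_V6ONLY,
                   &v6only, sizeof(v6only)) < 0) {
        close(sd);
        return -1;
    }
    return sd;
}
```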

This commit was SVN r22788.

The following Trac tickets were found above:
  Ticket 2039 --> https://svn.open-mpi.org/trac/ompi/ticket/2039
2010-03-05 17:37:57 +00:00
Matthias Jurenz
75d71239d1 Fixed bug in parsing nm-file:
Do not trigger a parse error if address is out of range. Ignore symbol instead.

This commit was SVN r22778.
2010-03-04 16:03:53 +00:00
Shiqing Fan
4c1fc87502 Set the compile flags for F77 on Windows more correctly.
This commit was SVN r22774.
2010-03-04 11:41:42 +00:00
Matthias Jurenz
5b9515225d - fixed stack shutdown if maximum number of buffer flushes was reached
- fixed potential stack underflow in vtfilter which might cause a segmentation fault

This commit was SVN r22773.
2010-03-04 08:08:20 +00:00
Iain Bason
18d9e96301 Fixed two problems:
1. The code that looks at btl_tcp_if_exclude before doing a
   modex_send uses strcmp rather than strncmp. That means that
   "lo0" gets sent even though "lo" is excluded.

2. The code that determines whether a particular local TCP
   interface can connect to a particular remote interface doesn't
   check for loopback interfaces. With this fix, users can now
   enable "lo" and be assured that it will only be used for intra-
   node communication.
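
   The first problem can be illustrated with a small sketch. The helper
   if_is_excluded is hypothetical and is not the actual TCP BTL code. An
   exact strcmp() of "lo0" against the exclude entry "lo" does not match,
   while strncmp() limited to the length of the entry treats the entry as
   a prefix and does.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: true if if_name begins with exclude_entry,
 * so that an entry of "lo" also covers "lo0". */
bool if_is_excluded(const char *if_name, const char *exclude_entry)
{
    assert(if_name != NULL && exclude_entry != NULL);
    return strncmp(if_name, exclude_entry, strlen(exclude_entry)) == 0;
}
```

   Note that prefix matching is deliberately loose: an entry of "lo"
   would exclude any interface whose name starts with those two letters.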

This commit was SVN r22762.
2010-03-03 15:51:15 +00:00
George Bosilca
ec7fcf3f91 While building the profiling interface, ignore the
I/O functions if support for I/O is not requested.

This commit was SVN r22761.
2010-03-02 21:05:04 +00:00
Ralph Castain
c88fe1ea54 Create a new mca parameter to control creation of session directories. Defaults to true so that the current behavior of always creating them is preserved. If set to false (0), then don't create session directories. Helps in those environments where session directories are a problem.
Tell the sm btl that it cannot run if no session directories were created.

This commit was SVN r22756.
2010-03-02 15:18:33 +00:00
Matthias Jurenz
5f368a094f Restored support for Automake's silent rules
This commit was SVN r22741.
2010-03-01 13:10:27 +00:00
Matthias Jurenz
157942809c Use more portable 'nm' command instead of the BFD library to collect symbol information for instrumentation with the GNU, Intel, and PathScale compilers
This commit was SVN r22737.
2010-03-01 12:20:41 +00:00
Ralph Castain
f4c3cceb5e Get the function prototypes to match so we eliminate an annoying warning
This commit was SVN r22726.
2010-02-27 16:41:16 +00:00
Jeff Squyres
b0eaebf46f Add Intel's OUI.
This commit was SVN r22723.
2010-02-26 19:54:16 +00:00
Rolf vandeVaart
2715141f6d Fix minor bug in the way we handle btl_tcp_if_include list.
This commit was SVN r22722.
2010-02-26 18:08:04 +00:00
Shiqing Fan
e1c009932b Add a few more fortran compile flags, and enable dynamic build for f77 library now.
This commit was SVN r22720.
2010-02-26 07:55:32 +00:00
Jeff Squyres
2e91de0bdd This has bugged me for a long, long time: rename btl_openib_iwarp.* ->
btl_openib_ip.*.  The routines in these files are not specific to
iwarp -- they are specific to IP interfaces used with IBV devices
(even IB or IBoE/RoCEE/whatever devices).

This commit was SVN r22718.
2010-02-25 21:04:09 +00:00
Jeff Squyres
a4a81698c2 Mostly a patch from Vasily/Mellanox to fix multi-port and 32/64 bit
issues with iwarp.c.  These fixes are needed for IBoE / ROCEE /
whateveritscalledtoday.  I added a few minor changes to his base
patch.

This commit was SVN r22717.
2010-02-25 20:57:05 +00:00
Eugene Loh
316892b49f Fix spelling of "degradation".
This commit was SVN r22714.
2010-02-25 19:41:59 +00:00
Pavel Shamis
9fbfe6b1c0 This fix resolves bug #2307. QP creation may fail because the calculation of _reserved_ does not check the QP type. As a result, max_recv_wr may get the wrong value. Needs to go to both cmr:v1.4.2 and cmr:v1.5.0
This commit was SVN r22713.
2010-02-25 11:15:20 +00:00
Ralph Castain
18c7aaff08 Update the grpcomm framework to be more thread-friendly.
Modify the orte configure options to specify --enable-multicast such that it directs components to build or not instead of littering the code base with #if's. Remove those #if's where they used to occur.

Add a new grpcomm "mcast" module to support multicast operations. Still some work required to properly perform daemon collectives for comm_spawn operations. New module only builds when --enable-multicast is provided, and when specifically selected.

This commit was SVN r22709.
2010-02-25 01:11:29 +00:00
Jeff Squyres
dd4945c194 New part ID's from Chelsio and Intel. May still get more from
Chelsio. 

This commit was SVN r22708.
2010-02-24 20:39:40 +00:00
Jeff Squyres
af6f1f4b00 Add pkg-config(1) config files to Open MPI. Additionally, fix a minor
bug: libmpi_f90 had libmpi.la in its LIBADD instead of libmpi_f77.la.

Fixes trac:2244.

This commit was SVN r22704.

The following Trac tickets were found above:
  Ticket 2244 --> https://svn.open-mpi.org/trac/ompi/ticket/2244
2010-02-24 18:46:06 +00:00
Pavel Shamis
99ee62771d This fix resolves bug #2292. We may call prepare_device_for_use() only after adding the btl to mca_btl_openib_component.openib_btls. Needs to go to both cmr:v1.4.2 and cmr:v1.5.0
This commit was SVN r22702.
2010-02-24 10:13:06 +00:00
Christopher Yeoh
774a7a58b0 Fixes case where there is unprotected access to
mca_osc_rdma_component.c_modules in ompi_osc_rdma_windx_to_module

This commit was SVN r22700.
2010-02-24 01:28:37 +00:00
Jeff Squyres
d9b6b5af0c This commit converts us to the "one big libmpi" scheme that has been
discussed extensively.  See
https://svn.open-mpi.org/trac/ompi/ticket/2092 and the RFC thread
http://www.open-mpi.org/community/lists/devel/2010/02/7447.php.

Specifically:

 * Create LT convenience libraries for OPAL and ORTE if the layer
   above them is being created (use the already-defined
   AM_CONDITIONALs to know if the project above us is being built).
 * ORTE slurps in the LT convenience library for OPAL; OMPI slurps in
   the LT convenience library for ORTE.
 * Wrapper compilers now only -l one library (e.g., ortecc only does
   -lopen-rte, and mpicc only does -lmpi).

This commit was SVN r22691.
2010-02-23 22:20:01 +00:00
Jeff Squyres
5ec2d8764b Amendment to r22671: change the name of the new communicator flag from
INTERNAL to EXTRA_RETAIN, because not all "internal" communicators
have this flag set (only internal communicators with CIDs less than
their parent).  Hence, what this flag ''really'' means is that there
was an extra RETAIN performed on it.  So name the flag just that --
EXTRA_RETAIN -- indicating that an extra RETAIN has occurred.

This commit was SVN r22690.

The following SVN revision numbers were found above:
  r22671 --> open-mpi/ompi@61dee816db
2010-02-23 21:24:07 +00:00
Jeff Squyres
583394e30b This help message got a little jumbled.
This commit was SVN r22689.
2010-02-23 21:09:16 +00:00
Christopher Yeoh
f79263550c This fixes trac:2265 removing a race in the openib btl endpoint when
increasing sequence numbers. cmr:v1.4

This commit was SVN r22684.

The following Trac tickets were found above:
  Ticket 2265 --> https://svn.open-mpi.org/trac/ompi/ticket/2265
2010-02-23 12:46:06 +00:00
Christopher Yeoh
c1dcf1c164 The release of memory used by registration lists in rcaches must be delayed until the rcache lock is not held or deadlock
can occur ( fixes trac:2111 ).
Should not deregister memory with the rcache lock held otherwise a deadlock can occur as the lower
level infiniband libraries can free memory ( fixes trac:2110 )
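
The "delay the release until the lock is not held" pattern can be
sketched as follows. All names here (reg_t, cache_t, cache_flush) are
hypothetical stand-ins, not the rcache data structures. The list is
detached while the lock is held, and the memory is freed only after the
lock has been dropped, so a free() that re-enters the cache cannot
deadlock on the same lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical registration record and cache. */
typedef struct reg {
    struct reg *next;
    void *buf;
} reg_t;

typedef struct {
    pthread_mutex_t lock;
    reg_t *active;          /* singly linked list of registrations */
} cache_t;

/* Remove every registration from the cache. Returns how many were
 * released. */
size_t cache_flush(cache_t *c)
{
    assert(c != NULL);

    /* Step 1: detach the whole list while holding the lock. */
    pthread_mutex_lock(&c->lock);
    reg_t *doomed = c->active;
    c->active = NULL;
    pthread_mutex_unlock(&c->lock);

    /* Step 2: release memory only after the lock has been dropped. */
    size_t released = 0;
    while (doomed != NULL) {
        reg_t *next = doomed->next;
        free(doomed->buf);
        free(doomed);
        doomed = next;
        released++;
    }
    return released;
}
```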

cmr:v1.4

This commit was SVN r22683.

The following Trac tickets were found above:
  Ticket 2110 --> https://svn.open-mpi.org/trac/ompi/ticket/2110
  Ticket 2111 --> https://svn.open-mpi.org/trac/ompi/ticket/2111
2010-02-23 11:31:58 +00:00
Christopher Yeoh
322e73d8c4 The ib_procs list in the openib btl is accessed without the ib lock in some cases. This causes races when running multithreaded. This patch adds protection of the ib_procs list with the ib_lock.
fixes trac:2149 cmr:v1.4

This commit was SVN r22682.

The following Trac tickets were found above:
  Ticket 2149 --> https://svn.open-mpi.org/trac/ompi/ticket/2149
2010-02-23 05:19:03 +00:00
Christopher Yeoh
a0b8f061a6 Avoid destroying an rcache vma while the rcache lock is held,
as this can result in a low-level free of memory, which
can require the rcache lock, resulting in a deadlock

This fixes trac:2107 
cmr:v1.4

This commit was SVN r22679.

The following Trac tickets were found above:
  Ticket 2107 --> https://svn.open-mpi.org/trac/ompi/ticket/2107
2010-02-22 11:19:15 +00:00
Christopher Yeoh
11500e3267 Fixes bug where the wrong lock is taken in mca_btl_openib_alloc
when protecting the no_wqe_pending_frags list.

fixes trac:2118 add cmr:v1.4

This commit was SVN r22678.

The following Trac tickets were found above:
  Ticket 2118 --> https://svn.open-mpi.org/trac/ompi/ticket/2118
2010-02-22 08:14:45 +00:00
Christopher Yeoh
a14a5dc3c6 This fixes a bug where sometimes the rcache lock would be dropped when it wasn't actually held.
Also includes some minor copyright header additions that were missed in previous checkins
fixes trac:2101 added cmr:v1.4

This commit was SVN r22676.

The following Trac tickets were found above:
  Ticket 2101 --> https://svn.open-mpi.org/trac/ompi/ticket/2101
2010-02-22 07:40:42 +00:00
Shiqing Fan
fa6a050b80 Set the correct install source path.
This commit was SVN r22673.
2010-02-20 13:40:15 +00:00
Shiqing Fan
e0bfd9f836 A type cast.
This commit was SVN r22672.
2010-02-20 10:47:37 +00:00
Edgar Gabriel
61dee816db This commit fixes a bug in how we deal with the case where a 'dependent'
communicator that we created has a lower CID than the parent comm. This can
happen when using the hierarch collective communication module or for
inter-communicators (since we make a duplicate of the original communicator).
This is not a problem as long as the user calls MPI_Comm_free on the parent 
communicator.  However, if the communicators are not freed by the user but
released by Open MPI in MPI_Finalize, we walk through the list of still
available communicators and free them one by one. Thus, local_comm is freed
before the actual inter-communicator. However, the local_comm pointer in the
inter communicator will still contain the 'previous' address of the local_comm
and thus this will lead to a segmentation violation. In order to prevent that
from happening, we increase the reference counter local_comm by one if its CID
is lower than the parent. We cannot increase however its reference counter if
the CID of local_comm is larger than the CID of the inter communicators, since
a regular MPI_Comm_free would in that case leave the local_comm hanging
around and thus we would not recycle CIDs properly, which was the reason and
the cause for this trouble.

This commit fixes tickets 2094 and 2166. Note however, that I want to close
them manually, since a slightly different patch is required for the 1.4
series. This commit will have to be applied for the 1.5 series. And I will
need a volunteer to review it.

This commit was SVN r22671.
2010-02-19 23:45:30 +00:00
Rainer Keller
548d6f7c61 - Incorporated a rewording proposal by Jeff.
This commit was SVN r22670.
2010-02-19 14:37:09 +00:00
George Bosilca
7eff2cdf85 Unrestricted number of interfaces.
This commit was SVN r22669.
2010-02-19 07:10:32 +00:00
Matthias Jurenz
111a424dac - removed hard-coded directory paths in vt_dyninst.c
- temporarily disabled wrapper for 'fcntl' in vt_iowrap.c, due to curious behaviour on some platforms (e.g. segfault)

This commit was SVN r22659.
2010-02-18 10:36:20 +00:00
Pavel Shamis
a124f6b10b Adding a hash table for managing dependencies between SRQs and their BTL modules.
This commit was SVN r22653.
2010-02-18 09:48:16 +00:00
Ralph Castain
40be3d896c Ensure we set an error code when leaving, correctly check for slot_list_set return status
This commit was SVN r22643.
2010-02-17 22:59:19 +00:00
Jeff Squyres
c23e6f3d56 Add an opal_attribute_unused in here since we're no longer using this
parameter (I just discovered while researching for v1.4 that v1.4 has
effectively this same function definition: it just always returns
true!).

This commit was SVN r22642.
2010-02-17 21:12:49 +00:00
Jeff Squyres
898eedd78f Fixes trac:2233.
This commit adds a lengthy comment in ompi_datatype.h that explains
why a one-sided datatype check was removed.  The short version is that
we do have to allow some datatypes that may be unwise to use (e.g.,
"h" types of datatypes that have offsets in bytes -- MPI says it's ok
to use these), and our DDT engine can't currently detect datatypes
with absolute offsets, which MPI says it's ''not'' ok to use with
one-sided operations.  Hence, we don't check for some datatypes that
are invalid to use with one-sided operations, and erroneous programs
may crash and burn.  Life is hard.

The main point of this commit is that we now do allow datatypes for
one-sided operations that are supposed to be allowed.

This commit was SVN r22641.

The following Trac tickets were found above:
  Ticket 2233 --> https://svn.open-mpi.org/trac/ompi/ticket/2233
2010-02-17 20:16:55 +00:00
George Bosilca
3bceb20b1c Only get the receive datatype extent on the root process, as every
other process should ignore this value. Thanks to Michael Hofmann
for investigating this issue.

This commit closes trac:2268.

This commit was SVN r22639.

The following Trac tickets were found above:
  Ticket 2268 --> https://svn.open-mpi.org/trac/ompi/ticket/2268
2010-02-17 16:01:50 +00:00
Matthias Jurenz
1ce37bc5ce VT general:
- Updated date in copyright header of each source file
VT configure fixes:
- fixed configure's version detection for PAPI to support version 4.x
- added configure tests to detect Bull MPICH2
VT new features:
- added support for relocating an existing VampirTrace installation without rebuilding it from source (fixes OMPI's ticket #1990)
- added support for tracing functions in shared libraries instrumented by the GNU, Intel, Pathscale, or PGI 9 compiler
- added support for PAPI-C counters which belong to different components
- extended usability of environment variable VT_METRICS for PAPI counters to specify whether a counter provides increasing or absolute values

This commit was SVN r22637.
2010-02-17 14:38:11 +00:00
Shiqing Fan
3a3018deef Convert the line endings for the added header files. They were changed automatically by Windows when adding new files.
This commit was SVN r22634.
2010-02-16 17:24:44 +00:00
Shiqing Fan
08ffdbe987 Changes for portable platform headers. Commit it on behalf of Ralph.
This commit was SVN r22619.
2010-02-15 22:14:59 +00:00
Pavel Shamis
9d0ae097c1 Updating vendor part ids for some mellanox devices
This commit was SVN r22617.
2010-02-15 09:45:34 +00:00
Jeff Squyres
dafc0c914b Restoring the build for now.
This commit was SVN r22611.
2010-02-12 12:03:17 +00:00
Rainer Keller
48254c78c9 - When svn version string becomes too long (>72 columns) some Fortran
compilers get confused. Continue on the next line.
   Thanks to Richard Tran Mills for noticing that.

This commit was SVN r22609.
2010-02-11 23:23:36 +00:00
Jeff Squyres
6c5f666890 Add a comment to the loopback check to explain why it is there. Also
slightly correct one other comment.

This commit was SVN r22606.
2010-02-11 14:59:04 +00:00
Rainer Keller
ea4de16561 - Check whether file is opened on network file-system.
If file does not exist, check the directory it lives in...
   Maybe used by caller, trying to open mmap() on NFS, Lustre or
   Panasas (thanks Sam).
   For now, this is used to warn about the usage of mmap on such FS.

   Please note, that Ralph mentioned the orte_no_session_dir parameter.
   The help message includes a reference to this.

   Tested on NFS and Lustre on Linux on
     smoky: mpirun --mca orte_tmpdir_base $HOME/tmp -np 2 ./mpi_stub
     jaguar: mpirun ... --mca orte_tmpdir_base /tmp/work/$USER ...

   Fixes trac:1354

   This should   cmr:v1.5   once it has soaked and is shown to work on
   Solaris

This commit was SVN r22604.

The following Trac tickets were found above:
  Ticket 1354 --> https://svn.open-mpi.org/trac/ompi/ticket/1354
2010-02-10 23:18:29 +00:00
Jeff Squyres
8f7edf6e3e After a '''lot''' of discussion and testing, this commit fixes some
long-standing bugs (see trac ticket list below).  They're currently
somewhat obscure bugs, but are becoming much more relevant in a world
where OpenFabrics devices fail and you replace them with a newer model
(i.e., the cluster is homogeneous... ''except'' for where you had to
replace one or two OpenFabrics devices, and the same model is no
longer available).

This commit includes a '''lengthy''' comment (that we spent a lot of
time writing!) about what exactly it does and does not do.  The
previous code was rather short and '''incredibly''' subtle.  The new
code is slightly longer, but is both much more explicit and much more
painstakingly documented.

This commit fixes multiple trac tickets.  The real one that we fix is
#1707; the others are fixed as a side-effect.  In short: fixing #1707
prevents Bad Things from happening later in the startup sequence.

Fixes trac:1707, #2164, #1574.

cmr:v1.4.2:reviewer=pasha
cmr:v1.5:reviewer=pasha

This commit was SVN r22592.

The following Trac tickets were found above:
  Ticket 1707 --> https://svn.open-mpi.org/trac/ompi/ticket/1707
2010-02-10 16:53:26 +00:00
Rainer Keller
3ca8adb540 - The only differences in the underlying header file between
GASNet-1.12.0  and  GASNet-1.14.0.

This commit was SVN r22591.
2010-02-10 14:07:44 +00:00
Nysal Jan
97d66bce78 This fixes trac:2154 - CSUM PML false positive. Needs to go to both cmr:v1.4.2 and cmr:v1.5
This commit was SVN r22590.

The following Trac tickets were found above:
  Ticket 2154 --> https://svn.open-mpi.org/trac/ompi/ticket/2154
2010-02-10 10:24:16 +00:00
Steve Wise
d40d2165c0 Never advertise a loopback address (127/8) to your peers.
This commit was SVN r22589.
2010-02-09 19:07:33 +00:00
Ralph Castain
ab5ceb3d5f Ensure we return the error code when something fails.
Thanks to Guillaume Thouvenin for finding it.

This commit was SVN r22588.
2010-02-09 16:48:55 +00:00
George Bosilca
144143a3ff Remove an unused local variable.
This commit was SVN r22566.
2010-02-05 22:27:24 +00:00
Josh Hursey
a3583b8f57 Fix --bynode option to remember for subsequent jobs where it left off last time.
Add a ''map_bynode'' info key to determine if the job to be started by comm_spawn* should be mapped by node or by slot. Default is to map according to the default policy set when the parent job was started.

cmr:v1.5.1

This commit was SVN r22564.
2010-02-05 15:37:49 +00:00
Shiqing Fan
84ecb6a81a Set up the correct compiler executables in the right place.
This commit was SVN r22560.
2010-02-04 23:02:17 +00:00
Brian Barrett
50e3a5c349 AC_CHECK_FUNCS. Removes an annoying warning during application link on
Catamount.

Should go to both cmr:v1.4:reviewer=jsquyres and cmr:v1.5:reviewer=jsquyres

This commit was SVN r22547.
2010-02-04 04:42:36 +00:00
Brian Barrett
8b4825ff37 Updates to make trunk run on Catamount again:
* Don't build the pstat component if all defines needed aren't there.
 * Update platform file to work better
 * Work around two places that depended on modex being operational

This commit was SVN r22536.
2010-02-03 05:07:40 +00:00