Commit Graph

4654 Commits

Author SHA1 Message Date
George Bosilca
527540aeb1 Rename req_bytes_delivered to req_bytes_expected for the receive
requests to really reflect what this field means.

This commit was SVN r20971.
2009-04-10 16:36:20 +00:00
George Bosilca
c148d33eb5 Play nicely with the reference count on the ompi_proc structure.
This commit was SVN r20970.
2009-04-10 16:32:02 +00:00
Nysal Jan
1decf8bf36 Move ob1 FIN/ACK fixes to csum PML
This commit was SVN r20954.
2009-04-08 10:43:35 +00:00
George Bosilca
dfc7cea329 Fix the deadlock issues on the osu_bw. The problem is that the PML is
event driven, and if there are no events generated by the BTLs ... well
nothing happens (i.e., there is no progress at the PML level and all
pending fragments remain pending). By forcing the BTL to trigger the
callbacks for all ACK and FIN, we give more opportunities to the PML
to do real progress, but we pay this in terms of performance.

This commit was SVN r20953.
2009-04-07 16:56:37 +00:00
George Bosilca
44ce610b8b Add a comment to highlight the fact that this function re-appends the
FIN message to the pending list when the send fails. Therefore, any
upper level function is not required to add it.
Make sure we don't send the FIN twice.

This commit was SVN r20952.
2009-04-07 16:48:58 +00:00
Iain Bason
73af921c22 Update keyval_create functions.
The fix for #1864 in r20926 caused gcc to emit some warnings. Also,
Jeff Squyres pointed out parallel bugs in mpi_type_create_keyval and
mpi_win_create_keyval.

This commit was SVN r20950.

The following SVN revision numbers were found above:
  r20926 --> open-mpi/ompi@0a24eadaad
2009-04-07 13:47:52 +00:00
George Bosilca
ccb79b963f This is the other half of the commit r20946, as I mixed them up between
two of my testing machines. The fix requires both commits!

This commit was SVN r20947.

The following SVN revision numbers were found above:
  r20946 --> open-mpi/ompi@e2bb4c9b8f
2009-04-06 21:49:52 +00:00
George Bosilca
e2bb4c9b8f Correct the handling of the pckt_pending list. The problem was that
we returned the pckt before copying the values out. With this change
it seems to work at least on two architectures (even with the 
mpool size set back to 0).

This commit was SVN r20946.
2009-04-06 21:45:08 +00:00
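
The ordering problem described in e2bb4c9b8f can be sketched in isolation: once an item goes back to a pool, its fields may be overwritten, so the values have to be copied out first. This is a minimal illustration, not the ob1 code; `pckt_t`, `pool_return`, and `drain_one` are hypothetical names introduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical pending-packet record; not the real ob1 structure. */
typedef struct pckt {
    int peer;
    int order;
    struct pckt *next;   /* free-list link */
} pckt_t;

pckt_t *free_list = NULL;

/* Returning an item hands ownership back to the pool, which may
 * overwrite its fields. The sketch scrubs them so that a stale
 * read would be obvious. */
void pool_return(pckt_t *p)
{
    p->peer = -1;
    p->order = -1;
    p->next = free_list;
    free_list = p;
}

/* Copy the values out first, then return the packet. */
void drain_one(pckt_t *p, int *peer_out, int *order_out)
{
    assert(p != NULL);
    *peer_out = p->peer;     /* copy before the pool can reuse p */
    *order_out = p->order;
    pool_return(p);
}
```

Swapping the two steps in `drain_one` would read the scrubbed values, which is the kind of failure the commit message describes.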
Eugene Loh
c4adfd1806 Increase mpool_sm_min_size default to 64M so osu_bw will run.
This commit was SVN r20944.
2009-04-06 18:37:00 +00:00
Ralph Castain
547fd635d1 Update the Makefile for ortetools so the trunk can compile
This commit was SVN r20943.
2009-04-06 14:04:00 +00:00
George Bosilca
045b0e8871 Remove two warnings about "cast from pointer to integer of different size".
This commit was SVN r20941.
2009-04-05 21:02:36 +00:00
Nysal Jan
5032f59edf Fix checksum computation in the buffered send code
This commit was SVN r20935.
2009-04-03 07:09:24 +00:00
Nysal Jan
4e001fb10a Fix a compiler warning
This commit was SVN r20931.
2009-04-02 14:48:27 +00:00
Ralph Castain
ba1a98c398 Fix a warning message by pointing to the correct header
This commit was SVN r20930.
2009-04-02 13:54:59 +00:00
Nysal Jan
e561a6c43a Add missing checksum calculation. This fixes a checksum mismatch failure while using TCP BTL
This commit was SVN r20927.
2009-04-01 20:48:35 +00:00
Iain Bason
0a24eadaad Fix Fortran bindings for MPI_KEYVAL_CREATE and MPI_COMM_CREATE_KEYVAL.
The EXTRA_STATE parameter is passed by reference, and thus should be
dereferenced before it is stored.  Similarly, the stored value should
be passed by reference to the copy and delete routines.

This fixes trac:1864.

This commit was SVN r20926.

The following Trac tickets were found above:
  Ticket 1864 --> https://svn.open-mpi.org/trac/ompi/ticket/1864
2009-04-01 19:31:46 +00:00
Jeff Squyres
fc8993ba87 mpool name is the first show_help param, not the last.
This commit was SVN r20925.
2009-04-01 18:42:01 +00:00
Jeff Squyres
0d52271cd6 Per http://www.open-mpi.org/community/lists/announce/2009/03/0029.php
and https://svn.open-mpi.org/trac/ompi/ticket/1853, mallopt() hints do
not always work -- it is possible for memory to be returned to the OS
and therefore OMPI's registration cache becomes invalid.

This commit removes all use of mallopt() and uses a different way to
integrate ptmalloc2 than we have done in the past.  In particular, we
use almost exactly the same technique as MX:

 * Remove all uses of mallopt, to include the opal/memory mallopt
   component.
 * Name-shift all of OMPI's internal ptmalloc2 public symbols (e.g.,
   malloc -> opal_memory_ptmalloc2_malloc).
 * At run-time, use the existing glibc allocator malloc hook function
   pointers to fully hijack the glibc allocator with our own
   name-shifted ptmalloc2.
 * Make the decision whether to hijack the glibc allocator ''at run
   time'' (vs. at link time, as previous ptmalloc2 integration
   attempts have done).  Look at the OMPI_MCA_mpi_leave_pinned
   and OMPI_MCA_mpi_leave_pinned_pipeline environment variables and
   the existence of /sys/class/infiniband to determine if we should
   install the hooks or not.
 * As an added bonus, we can now tell if libopen-pal is linked
   statically or dynamically, and if we're linked statically, we
   assume that munmap intercept support doesn't work.

See the opal/mca/memory/ptmalloc2/README-open-mpi.txt file for all the
gory details about the implementation.

Fixes trac:1853.

This commit was SVN r20921.

The following Trac tickets were found above:
  Ticket 1853 --> https://svn.open-mpi.org/trac/ompi/ticket/1853
2009-04-01 17:52:16 +00:00
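
The run-time decision described in 0d52271cd6 might look roughly like the sketch below. The commit names the two environment variables and the /sys/class/infiniband path, but not how they are combined; the policy here (any enabled variable or an existing path means "install") and the helper names `env_enabled`, `path_exists`, and `should_install_hooks` are assumptions for illustration only. The hook installation itself is not shown.

```c
#define _POSIX_C_SOURCE 200112L  /* for stat(), and setenv() in callers */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>

/* Returns 1 if the variable is set to something other than "0"
 * (assumed meaning of "enabled" for this sketch). */
int env_enabled(const char *name)
{
    const char *val = getenv(name);
    return val != NULL && strcmp(val, "0") != 0;
}

/* Returns 1 if the path exists. */
int path_exists(const char *path)
{
    struct stat st;
    return stat(path, &st) == 0;
}

/* Assumed policy: install the hooks if the user asked for
 * leave-pinned behavior or the host appears to have InfiniBand. */
int should_install_hooks(const char *sysfs_path)
{
    assert(sysfs_path != NULL);
    return env_enabled("OMPI_MCA_mpi_leave_pinned")
        || env_enabled("OMPI_MCA_mpi_leave_pinned_pipeline")
        || path_exists(sysfs_path);
}
```

A caller would pass "/sys/class/infiniband"; taking the path as a parameter keeps the check testable on machines without that directory.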
George Bosilca
b7c1ae4f76 Nothing important, just an indentation.
This commit was SVN r20919.
2009-04-01 15:27:16 +00:00
George Bosilca
8221975490 We don't need the opal_bitmap definitions here.
This commit was SVN r20918.
2009-04-01 15:25:10 +00:00
Terry Dontje
4b43911c6a Remove superfluous spaces in manpages that were causing catman to
generate mangled windex files.  Made ompi-top.1 and ompi-iof.1 build
by default.  Also, added the orte-top synonym to the ompi-top manpage.

This commit was SVN r20915.
2009-04-01 14:40:27 +00:00
Nysal Jan
aff903f39c Don't print this message by default
This commit was SVN r20914.
2009-04-01 14:31:21 +00:00
Matthias Jurenz
e66e5104e7 Added check whether the trace environment is already closed before creating a new event (avoids potential segfault)
This commit was SVN r20913.
2009-04-01 07:53:30 +00:00
Matthias Jurenz
fa3cf2d2ba Added MPI wrapper function for 'MPI_Init_thread'
This commit was SVN r20912.
2009-04-01 07:52:14 +00:00
George Bosilca
c5b1bdd57c Correctly deal with the error case. The problem is tricky: the MPI standard doesn't allow
MPI_ERR_IN_STATUS to be returned from any function that returns only one completed request
(a few exceptions here: wait_some and wait_all and the test versions). As we use a wait_all
in these send_receive functions we should convert the MPI_ERR_IN_STATUS to the real
error, i.e. the one coming from the MPI_ERROR field in the status corresponding to the
failed request.

This commit was SVN r20907.
2009-03-31 23:44:59 +00:00
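
The conversion described in c5b1bdd57c can be sketched as a small helper. To keep it compilable without mpi.h, the constants and the status type below are stand-ins with arbitrary values; `sk_status_t` and `sk_resolve_error` are hypothetical names, and the real code may pick the failed request differently.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the MPI constants, so the sketch compiles without
 * mpi.h. The numeric values are arbitrary. */
enum { SK_SUCCESS = 0, SK_ERR_TRUNCATE = 15, SK_ERR_IN_STATUS = 18 };

typedef struct {
    int error;   /* plays the role of MPI_ERROR in MPI_Status */
} sk_status_t;

/* Translate the aggregate code returned by a wait-all into the
 * error of the first failed request, which is what a call that
 * completes a single operation has to report. */
int sk_resolve_error(int rc, const sk_status_t *statuses, size_t count)
{
    size_t i;

    if (rc != SK_ERR_IN_STATUS) {
        return rc;              /* success, or already a specific error */
    }
    assert(statuses != NULL);
    for (i = 0; i < count; ++i) {
        if (statuses[i].error != SK_SUCCESS) {
            return statuses[i].error;
        }
    }
    return rc;                  /* no failed status found; keep rc */
}
```

In a send-receive built on two requests, the helper would be called with the two statuses filled in by the wait-all.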
George Bosilca
12ce14ec8c A possible patch for the SM problems. I moved the synchronization
after each process creates its FIFOs but before they access the
peer's FIFOs. Second, replace a one way synchronization by a real
barrier, so we know that every process is really where we expect
them to be.

This commit was SVN r20906.
2009-03-31 21:46:27 +00:00
Patrick Geoffray
278d508fa4 Missing send immediate function (NULL) in btl module definition.
Fixes trac:1859.

This commit was SVN r20905.

The following Trac tickets were found above:
  Ticket 1859 --> https://svn.open-mpi.org/trac/ompi/ticket/1859
2009-03-31 18:44:10 +00:00
George Bosilca
2b1804b0f4 Remove useless header files.
This commit was SVN r20897.
2009-03-30 23:09:18 +00:00
Jeff Squyres
b95a3d0eb9 * Remove an extraneous OBJ_CONSTRUCT
 * Ensure we don't try to do opal_list_get_next() on an item we just
   deleted
 * Set myaddrs = NULL when we're done with it, just for good measure

Once this is ported to OMPI v1.3 branch, it fixes
https://bugs.openfabrics.org/show_bug.cgi?id=1579.

This commit was SVN r20896.
2009-03-30 20:24:31 +00:00
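
The second bullet of b95a3d0eb9 (do not ask a deleted item for its successor) is a general list-traversal hazard. The sketch below shows the safe ordering on a plain singly linked list; it does not use opal_list_t, and `node_t`, `list_push`, `list_remove_all`, and `list_length` are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int value;
    struct node *next;
} node_t;

/* Push a heap-allocated node on the front of the list.
 * Sketch only: allocation failure is treated as fatal. */
node_t *list_push(node_t *head, int value)
{
    node_t *n = malloc(sizeof(*n));
    assert(n != NULL);
    n->value = value;
    n->next = head;
    return n;
}

/* Remove every node equal to value. The successor is read while
 * the node is still valid, never after it has been freed.
 * Returns the new head. */
node_t *list_remove_all(node_t *head, int value)
{
    node_t **link = &head;

    while (*link != NULL) {
        node_t *cur = *link;
        if (cur->value == value) {
            *link = cur->next;   /* fetch next before freeing cur */
            free(cur);
        } else {
            link = &cur->next;
        }
    }
    return head;
}

/* Count the nodes in the list. */
size_t list_length(const node_t *head)
{
    size_t n = 0;
    for (; head != NULL; head = head->next) {
        ++n;
    }
    return n;
}
```

Calling `free(cur)` first and then reading `cur->next` would be the same mistake as calling a get-next routine on an item that was just deleted.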
Ralph Castain
cba3708893 Cleanup debugging output, remove an unnecessary re-compute of the checksum
This commit was SVN r20895.
2009-03-30 17:09:32 +00:00
Ralph Castain
d5e6104035 Continue to cleanup the csum pml module. Some minor corrections and debug output added.
This commit was SVN r20894.
2009-03-29 23:27:06 +00:00
Jeff Squyres
bf8defc475 Shaun Jackson noted that MPI_STATUS_IGNORE is actually (effectively)
NULL, so testing for NULL as a bad status parameter here is a bad
idea.

This commit was SVN r20891.
2009-03-28 01:24:41 +00:00
Donald Kerr
47dc1bd493 fix #1828; rework the private data connection establishment process; reviewed by terry d.
This commit was SVN r20889.
2009-03-26 17:54:44 +00:00
Donald Kerr
27ed29a0a1 wrap linux specific steps in __linux__ define
This commit was SVN r20888.
2009-03-26 15:09:01 +00:00
Rainer Keller
6ad07dbffa - First have the _config.h header,
then include system headers based on what's defined.

This commit was SVN r20886.
2009-03-26 12:56:22 +00:00
George Bosilca
25d457e3d4 The "missing" header ompi_config.h should not be added here. This file will
get installed for external tools, and they don't need any dependency on our
internal ompi_config.h file. Moreover, this file is crafted in such a way
that there is no need for ompi_config.h.

This commit was SVN r20884.
2009-03-25 21:09:13 +00:00
Donald Kerr
a1ba2e2164 add Sun vendor_id to Tavor Infinihost settings
This commit was SVN r20883.
2009-03-25 19:13:55 +00:00
Donald Kerr
cd7940e208 add vendor_id
This commit was SVN r20882.
2009-03-25 17:57:24 +00:00
Pavel Shamis
d25b7203a2 Adding send_immediate (sendi) implementation to openib btl.
This commit was SVN r20881.
2009-03-25 16:53:26 +00:00
Matthias Jurenz
c2d8fae9a0 Replaced usage of PATH_MAX by VT_PATH_MAX to avoid compile errors on some platforms (i.e. by using Intel compiler version 10.1.021)
This commit was SVN r20873.
2009-03-25 14:40:49 +00:00
Pavel Shamis
8888c9831c Prevent segfault for case when we release SRQ before srq_create.
This commit was SVN r20872.
2009-03-25 14:18:05 +00:00
Ralph Castain
f72e3ba9f9 Update the PML base send init macro to take a converter_flag field (discussed with George).
Update the csum pml module - still not quite right, but closer.

Modify the LANL platform files to keep pace.

This commit was SVN r20859.
2009-03-24 19:12:53 +00:00
Shiqing Fan
8bb6bb97a4 Make the compiler wrapper find the correct version of libraries, i.e. debug or release version based on build type.
This commit was SVN r20852.
2009-03-24 10:42:37 +00:00
Ralph Castain
d88df53a86 A touch more cleanup. Also, bring over the peruse cleanups from r20844
This commit was SVN r20849.

The following SVN revision numbers were found above:
  r20844 --> open-mpi/ompi@daba352af4
2009-03-24 01:36:31 +00:00
Ralph Castain
78323fd6b2 Minor cleanups to compile without warnings
This commit was SVN r20848.
2009-03-24 00:54:16 +00:00
Ralph Castain
75ca19d1d1 Turn off a function that hasn't been added to the code base yet...
This commit was SVN r20847.
2009-03-23 23:56:11 +00:00
Ralph Castain
17f51a0389 Add a new PML module that acts as a "mini-dr" - when requested, it performs a dr-like checksum on messages for BTL's that require it, as specified by MCA params.
Add two new configure options that specify:

1. when to add padding to the openib control header - this *only* happens when the configure option is specified

2. when to use the dr-like checksum as opposed to the memcpy checksum. Not selectable at runtime - to eliminate performance impacts, this is a configure-only option

Also removed an unused checksum version from opal/util/crc.h.

The new component still needs a little cleanup and some sync with recent ob1 bug fixes. It was created as a separate module to avoid performance hits in ob1 itself, though most of the code is duplicative. The component is only selectable by either specifying it directly, or configuring with the dr-like checksum -and- setting -mca pml_csum_enable_checksum 1.

Modify the LANL platform files to take advantage of the new module.

This commit was SVN r20846.
2009-03-23 23:52:05 +00:00
Ralph Castain
fb2b41d40a Give up on the pcie BTL and blow it away. The drivers for this initial implementation have been too customized by IBM - too hard to re-integrate the code.
Maybe someday, someone with enough interest/time can start over...

This commit was SVN r20845.
2009-03-23 23:27:57 +00:00
George Bosilca
daba352af4 As the request is not yet updated (i.e. _MATCHED cannot be called as we don't yet know the
expected length of the message) we should use the source and tag from the message header
instead of the value from the status structure attached to the request.

This commit was SVN r20844.
2009-03-23 20:25:53 +00:00
Rainer Keller
a3c3babe01 - Ewww, r20817 messed up PGI on Jaguar big time!
Now, while #include "ompi_config.h" is good and fine in order
   to have OMPI_DECLSPEC,
   here it led to stdint.h (with the uint8_t) being included early
   but INSIDE a namespace "MPI" {}.
   Of course it was not included anymore (think #define _STDINT_H), when
   it was required in opal/class/opal_hash_list.h
   NOT good.

 - opal/class/opal_object.h: Yeah, one can have nested extern "C" {}
   but it's not necessary. Instead just have the outer *_C_DECLS.

This commit was SVN r20837.

The following SVN revision numbers were found above:
  r20817 --> open-mpi/ompi@6f808d9b05
2009-03-21 01:37:38 +00:00
Rainer Keller
353e489be8 - We're using opal_list_t, so better include it here...
- ompi/contrib/vt/vt/acinclude.m4: The missing escape \` messed up output on Jaguar
   Mailed to Matthias Jurenz

This commit was SVN r20836.
2009-03-21 01:28:31 +00:00
Jeff Squyres
804eb94f5f Fix the MCA param name to use the non-deprecated name.
This commit was SVN r20832.
2009-03-20 01:44:40 +00:00
Shiqing Fan
22b5c536af Clean up some unnecessary compiler flags.
This commit was SVN r20827.
2009-03-18 16:55:34 +00:00
Aurelien Bouteiller
fa9b6e729b Fix missing file in Makefile.am and the "CREATE FAILURE".
This commit was SVN r20821.
2009-03-18 13:42:48 +00:00
Rainer Keller
bff1b2a22b - Finally add the missing opal/util/output.h
for the OPAL_OUTPUT_VERBOSE macro.
 - ompi/errhandler/errhandler_predefined.h:
   Well, just the missing fwd declarations...

This commit was SVN r20820.
2009-03-17 22:37:15 +00:00
Rainer Keller
6f808d9b05 Preparation work for another commit (after RFC):
- This patch solely _adds_ required headers and is rather localized
   The next patch (after RFC) heavily removes headers (based on script)
 - ompi/communicator/communicator.h: For sources that use
   ompi_mpi_comm_world, don't require them to include "mpi.h"
 - ompi/debuggers/ompi_common_dll.c: mca_topo_base_comm_1_0_0_t needs
   #include "ompi/mca/topo/topo.h"
 - ompi/errhandler/errhandler_predefined.h:
   ompi/communicator/communicator.h depends on this header file!
   To prevent recursion just have fwd declarations.
   #include "ompi/types.h" for fwd declarations of the main structs.
 - ompi/mca/btl/btl.h: #include "opal/types.h" for ompi_ptr_t 
 - ompi/mca/mpool/base/mpool_base_tree.c: We use ompi_free_list_t and
   ompi_rb_tree_t, so have the proper classes
 - ompi/mca/op/op.h:
   Op is pretty self-contained: Nobody up to now has done
   #include "opal/class/opal_object.h"
 - ompi/mca/osc/pt2pt/osc_pt2pt_replyreq.h:
   #include "opal/types.h" for ompi_ptr_t 
 - ompi/mca/pml/base/base.h:
   We use opal_lists  
 - ompi/mca/pml/dr/pml_dr_vfrag.h:
   #include "opal/types.h" for ompi_ptr_t
 - ompi/mca/pml/ob1/pml_ob1_hdr.h:
   #include "ompi/mca/btl/btl.h" for mca_btl_base_segment_t
 - opal/dss/dss_unpack.c:
   #include "opal/types.h"
 - opal/mca/base/base.h:
   #include "opal/util/cmd_line.h" for opal_cmd_line_t
 - orte/mca/oob/tcp/oob_tcp.c:
   #include "opal/types.h" for opal_socklen_t
 - orte/mca/oob/tcp/oob_tcp.h:
   #include "opal/threads/threads.h" for opal_thread_t
 - orte/mca/oob/tcp/oob_tcp_msg.c:
   #include "opal/types.h" 
 - orte/mca/oob/tcp/oob_tcp_peer.c:
   #include "opal/types.h"  for opal_socklen_t
 - orte/mca/oob/tcp/oob_tcp_send.c:
   #include "opal/types.h" 
 - orte/mca/plm/base/plm_base_proxy.c:
   #include "orte/util/name_fns.h" for ORTE_NAME_PRINT
 - orte/mca/rml/base/rml_base_receive.c:
   #include "opal/util/output.h" for OPAL_OUTPUT_VERBOSE
 - orte/mca/rml/oob/rml_oob_recv.c:
   #include "opal/types.h" for ompi_iov_base_ptr_t
 - orte/mca/rml/oob/rml_oob_send.c:
   #include "opal/types.h" for ompi_iov_base_ptr_t
 - orte/runtime/orte_data_server.c
   #include "opal/util/output.h" for OPAL_OUTPUT_VERBOSE
 - orte/runtime/orte_globals.h:
   #include "orte/util/name_fns.h" for ORTE_NAME_PRINT

 Tested on Linux/x86-64

This commit was SVN r20817.
2009-03-17 21:34:30 +00:00
Rainer Keller
b9b84a9c29 - ompi/mca/mpool/rdma/mpool_rdma_module.c: At this level
without mpi.h we have no notion of MPI_SUCCESS...
 - ompi/mca/btl/sm/btl_sm.h: ptrdiff_t needs stddef.h 
 - ompi/mca/mpool/base/: If we use opal_pointer_array_t,
   better include the class header.

This commit was SVN r20816.
2009-03-17 20:21:36 +00:00
Aurelien Bouteiller
3cd5a0d833 Support for the MPI event logger improving event logging perfs.
This commit was SVN r20804.
2009-03-17 17:35:28 +00:00
Pavel Shamis
c6d038a8e8 Adding vendor_error code to error report.
This commit was SVN r20803.
2009-03-17 15:47:34 +00:00
Pavel Shamis
5afa2988f1 Updating RNR/IB timeout for openib btl
This commit was SVN r20801.
2009-03-17 15:03:06 +00:00
Rainer Keller
f5b4e250cb - By including "mpi.h", tell the compiler to look in local include dirs
first.

This commit was SVN r20800.
2009-03-17 14:44:04 +00:00
Rainer Keller
7390b3476e - If we collect a return value, let the caller at least know about it...
This commit was SVN r20797.
2009-03-17 13:52:59 +00:00
Rainer Keller
6a72c0f4d1 - As long as a header declares _DECLSPEC functionality
it should include the corresponding _config.h header file.

   Tested on Linux/x86-64

This commit was SVN r20795.
2009-03-17 01:45:19 +00:00
Donald Kerr
d29a5e57c1 remove superfluous define
This commit was SVN r20785.
2009-03-16 02:24:01 +00:00
George Bosilca
a9be1b1dde Set the mem_node to a more meaningful value, as suggested by Ake Sandgren.
This commit was SVN r20780.
2009-03-14 22:08:26 +00:00
Eugene Loh
64f52b0168 Clean up in response to code review on CMR 1825:
minor changes in comments and edge-case handling.

This commit was SVN r20774.
2009-03-13 18:11:41 +00:00
Rainer Keller
d8cf4c0fec - Get pgcc on XT to complain less:
In case we use memcmp, strlen, strdup and friends include <string.h>
   Also several constants.h are not included directly
 - Let's have mca_topo_base_cart_create  return ompi-errors in
   ompi/mca/topo/base/topo_base_cart_create.c

This commit was SVN r20773.
2009-03-13 02:10:32 +00:00
Donald Kerr
ef55aae401 fix #1829 : udapl btl support for relaxed ordering
This commit was SVN r20772.
2009-03-13 01:01:00 +00:00
Rainer Keller
04585ed8e3 - Explicitly mark dependency
This commit was SVN r20771.
2009-03-12 22:48:41 +00:00
Rainer Keller
6fca443a71 - No, we don't want to have a notion of an MPI_Comm in this layer
We want ompi_communicator_t instead, rrrrrr.

This commit was SVN r20770.
2009-03-12 22:38:14 +00:00
George Bosilca
b29da4744f Amazing that there is only one compiler complaining about this ...
This commit was SVN r20768.
2009-03-12 21:30:08 +00:00
Rainer Keller
74b3acd4bd - No need to declare struct mca_mpool_base_resources_t;
Already in
   #include "ompi/mca/mpool/mpool.h"

This commit was SVN r20767.
2009-03-12 20:27:16 +00:00
Rainer Keller
296a6fb275 - So much fun along the way:
we normally don't do opal/include/opal/...
   Just use the std. opal/...

This commit was SVN r20766.
2009-03-12 19:21:11 +00:00
Rainer Keller
29b1b205fd - Remove two headers (and actually include rml.h) prior to test of
removal script...

This commit was SVN r20765.
2009-03-12 17:58:39 +00:00
Jeff Squyres
14ee1b7ba2 Refs trac:1826: remove barriers before all non-rooted collective ops.
This commit was SVN r20763.

The following Trac tickets were found above:
  Ticket 1826 --> https://svn.open-mpi.org/trac/ompi/ticket/1826
2009-03-12 02:23:08 +00:00
Ralph Castain
a8002c0f04 Remove missing files from Makefile.am so make dist will succeed
This commit was SVN r20743.
2009-03-06 02:57:51 +00:00
Rainer Keller
ec0ed48718 - Revert r20739
This commit was SVN r20742.

The following SVN revision numbers were found above:
  r20739 --> open-mpi/ompi@781caee0b6
2009-03-05 21:56:03 +00:00
Rainer Keller
a94438343b - Revert r20740
This commit was SVN r20741.

The following SVN revision numbers were found above:
  r20740 --> open-mpi/ompi@2a70618a77
2009-03-05 21:50:47 +00:00
Rainer Keller
2a70618a77 - Second patch, as discussed in Louisville.
Replace short macros in orte/util/name_fns.h
   to the actual fct. call.

 - Compiles on linux/x86-64

This commit was SVN r20740.
2009-03-05 21:14:18 +00:00
Rainer Keller
781caee0b6 - First of two or three patches, in orte/util/proc_info.h:
Adapt orte_process_info to orte_proc_info, and
   change orte_proc_info() to orte_proc_info_init().
 - Compiled on linux-x86-64
 - Discussed with Ralph

This commit was SVN r20739.
2009-03-05 20:36:44 +00:00
Shiqing Fan
99b415a7e0 On windows, the mca_common_* libraries should be installed in bin, otherwise the libraries that are dependent on them, e.g. shared build of mca_btl_sm, couldn't be loaded at runtime. This commit fixes the problem.
This commit was SVN r20735.
2009-03-05 14:57:35 +00:00
Ralph Castain
20b81ff634 Add the PCIE BTL. This won't actually work yet - still need to work through issues with system header files, generalize specification of resources, etc. - but it won't build unless specifically directed to do so. Meantime, any more changes that impact these areas of the code base can be reflected here rather than having to be dealt with later.
This commit was SVN r20734.
2009-03-05 02:40:25 +00:00
Terry Dontje
9215100ac4 Increase communicator padding to accommodate the larger ppc lock structures.
This commit was SVN r20728.
2009-03-04 19:54:58 +00:00
Rainer Keller
9dea63d63a - Last of intrusive commits (promised)... err for now.
Anyway, this is blocking the move: do not include pml.h
   if not really needed, aka none of the following used:
     mca_pml
     MCA_PML_CALL
     OMPI_ANY_TAG
     OMPI_ANY_SOURCE
     OMPI_PROC_NULL

 - Notable exceptions (deleting in one header->adding):
   - ompi/mca/mtl/psm/
   - ompi/mca/osc/rdma/
   - ompi/mca/btl/openib/btl_openib_endpoint.c depended on
     pml_base_sendreq.h

 - Tested on Linux/x86-64, this time including make check
   (thanks Jeff and Ralph)

This commit was SVN r20725.
2009-03-04 17:06:51 +00:00
Josh Hursey
b62bc63f76 Fix some compiler warnings. I was using the ompi_predefined_* types instead of the base classes.
This commit was SVN r20722.
2009-03-04 16:16:13 +00:00
Rainer Keller
fd28b392bf - An intrusive commit yet again (sorry): with the separation we
get bitten by header depending on having already included
   the corresponding [opal|orte|ompi]_config.h header.
   When separating, things like [OPAL|ORTE|OMPI]_DECLSPEC
   are missed.

   Script to add the corresponding header in front of all following
   (taking care of possible #ifdef HAVE_...)

 - Including some minor cleanups to
   - ompi/group/group.h -- include _after_ #ifndef OMPI_GROUP_H
   - ompi/mca/btl/btl.h -- include _after_ #ifndef MCA_BTL_H
   - ompi/mca/crcp/bkmrk/crcp_bkmrk_btl.c -- still no need for
     orte/util/output.h
   - ompi/mca/pml/dr/pml_dr_recvreq.c -- no need for mpool.h
   - ompi/mca/btl/btl.h -- reorder to fit
   - ompi/mca/bml/bml.h -- reorder to fit
   - ompi/runtime/ompi_mpi_finalize.c -- reorder to fit
   - ompi/request/request.h -- additionally need ompi/constants.h

 - Tested on linux/x86-64

This commit was SVN r20720.
2009-03-04 15:35:54 +00:00
Eugene Loh
efe8c3a283 Initialize reuse_old_request properly at the beginning of each loop iteration in pml_ob1_start.c.
This commit was SVN r20712.
2009-03-04 06:58:36 +00:00
Rainer Keller
8123363357 - Dough. Makefile.am was missing in r20710
This commit was SVN r20711.

The following SVN revision numbers were found above:
  r20710 --> open-mpi/ompi@d68a8a1904
2009-03-04 00:30:28 +00:00
Rainer Keller
d68a8a1904 - Now that we don't need it anymore, blast away
ompi/class/ompi_bitmap.[ch] -- may always be restored from svn
   again...

This commit was SVN r20710.
2009-03-04 00:28:58 +00:00
Rainer Keller
811f2bd9b4 - As discussed on RFC, move the ompi_bitmap to the
opal layer.
   Add a check against a maximum (actually get rid of ifs internally to
   opal_bitmap.c) -- the functionality to set the current maximum size
   opal_bitmap_set_max_size() is currently only used in attribute.c
   to set the maximum OMPI_FORTRAN_HANDLE_MAX...

   Tested on linux/x86-64 with intel-tests with all_tests_no_perf_f
   run with 6 procs.
   Let's look into MTT as well...

This commit was SVN r20708.
2009-03-03 22:25:13 +00:00
Shiqing Fan
317db0fe62 Fix up the compiler flags again.
This commit was SVN r20702.
2009-03-03 17:32:08 +00:00
Jeff Squyres
a8456b27d7 Update the code to match the latest proposal:
https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/MPI3Tools/dllapi

This commit was SVN r20681.
2009-03-02 21:29:52 +00:00
George Bosilca
8078bac53c Correct a case where the added datatype is considered as contiguous but
has gaps in the beginning. Thanks to Markus Blatt for the bug report.

This commit was SVN r20674.
2009-03-02 17:33:13 +00:00
Rich Graham
7ef1550267 add an index to indicate which socket group I belong to.
This commit was SVN r20672.
2009-03-02 14:39:54 +00:00
Rich Graham
daf7673aff gather socket information - not debugged.
This commit was SVN r20670.
2009-03-02 10:58:12 +00:00
Jeff Squyres
fd979b2278 Add support for the notifier framework into ompi_info
This commit was SVN r20664.
2009-03-01 01:01:39 +00:00
Rainer Keller
02416033ad - Get rid of warning on function declarations:
First "static inline", then the type

This commit was SVN r20657.
2009-02-28 14:15:34 +00:00
Jeff Squyres
2002c576fe Add a lengthy comment about correctness and features of MPI_FINALIZE,
per a lengthy discussion at the Louisville, Feb 2009 OMPI meeting.

This commit was SVN r20656.
2009-02-28 12:58:12 +00:00
Tim Mattox
57be80c983 First pass at integrating the CIFTS/FTB support as
a notifier module.
The Notifier framework was extended slightly to
convey more information about each event notice.
This works with the FTB v0.5 API.

To compile with FTB support, use --with-ftb=/path/to/ftb/install

CIFTS == Coordinated Infrastructure for Fault Tolerant Systems
FTB == Fault Tolerance Backplane
see http://wiki.mcs.anl.gov/cifts/index.php

This commit was SVN r20655.
2009-02-27 22:53:43 +00:00
Matthias Jurenz
dfb95c0cd7 Added missing header include of 'cctypes.h' for function 'tolower()'
This commit was SVN r20653.
2009-02-27 14:47:46 +00:00
George Bosilca
e181ba50c9 Stop valgrind from complaining about a few uninitialized bytes on the PML
headers. This feature is enabled only in debug mode when the heterogeneous
support is enabled.

This commit was SVN r20648.
2009-02-27 05:24:06 +00:00
Eugene Loh
ffb35a1b6c Exposed mca_btl_sm_sendi() to the PML so that it will be used. Reviewed the code.
Added a few comments and changed the return code after the FIFO write to be SUCCESS,
even if the FIFO write indicated an error.  Such an error would only mean that the
FIFO was full, but the FIFO-write operation would still be queued.  Therefore, the
PML should think of this as successful.

This commit was SVN r20644.
2009-02-26 18:10:50 +00:00
Josh Hursey
e46c512ee7 Fix a couple of missing headers resulting from recent cleanup
This commit was SVN r20643.
2009-02-26 16:56:56 +00:00
Rainer Keller
4c0e8e1e69 - Header orte/mca/oob/base/base.h is probably the wrong one to include
anyhow -- if oob functionality is needed, then orte/mca/oob/oob.h

   Nevertheless compiles fine with -Wimplicit-function-declaration   

This commit was SVN r20641.
2009-02-26 04:20:03 +00:00
Rainer Keller
04567d3af0 - Header orte/mca/errmgr/errmgr.h is not needed.
Once again compiles fine with -Wimplicit-function-declaration   

This commit was SVN r20640.
2009-02-26 04:05:30 +00:00
Rainer Keller
96e1b9b747 - Header orte/mca/rml/rml.h is not needed if no occurrence of orte_rml
or ORTE_RML.
   As the others compiles fine with -Wimplicit-function-declaration

This commit was SVN r20639.
2009-02-26 03:52:31 +00:00
Rainer Keller
224d89a353 - There sure is no local stdio.h header file.
Take the system header file...

This commit was SVN r20637.
2009-02-26 02:17:29 +00:00
Rainer Keller
b9f9cd8174 - Missed an occurrence of ompi/info/info.h
This commit was SVN r20636.
2009-02-26 02:15:40 +00:00
Rainer Keller
985648086d - Header ompi/info/info.h is not needed here.
This commit was SVN r20635.
2009-02-26 02:00:39 +00:00
Shiqing Fan
2326f14be5 Remove the unnecessary PROJECT command, I somehow misunderstood how it should be used on Windows....
This commit was SVN r20634.
2009-02-25 16:07:43 +00:00
Shiqing Fan
aa2804de75 Refresh mpi.h.cmake according to the changes to mpi.h.in.
Add a few compiler flags which were missing.

This commit was SVN r20633.
2009-02-25 12:51:29 +00:00
Rainer Keller
b356e90fa1 - Get rid of include orte/util/proc_info.h, if not needed
Only proc_info.h-internal include file is opal/dss/dss_types.h
 - In one case (orte/util/hnp_contact.c) had to add proc_info.h again.
 - Local compilation (Linux/x86_64) w/ -Wimplicit-function-declaration
   works fine, no errors.

   Again, let's let MTT have the last word.

This commit was SVN r20631.
2009-02-25 03:38:00 +00:00
Terry Dontje
0178b6c45f Added padding to predefined handle structures to maintain library version to
version compatibility.

This commit was SVN r20627.
2009-02-24 17:17:33 +00:00
Shiqing Fan
2148220ce4 Update the share libs dependency for windows build.
This commit was SVN r20625.
2009-02-23 17:49:46 +00:00
Shiqing Fan
65eac713bc Cast the pointer to the correct type, i.e. IOVBASE_TYPE.
This commit was SVN r20624.
2009-02-23 17:31:53 +00:00
Matthias Jurenz
a1608ecd60 bugfix: added configure check for header file 'asm/intrinsics.h' and definition of '_IA64_REG_AR_ITC', which are required to use the ITC timer on IA64/Linux
This commit was SVN r20621.
2009-02-23 12:41:22 +00:00
Josh Hursey
cde4ab5c32 Forgot another btl_base_close per r20617
Things should be working fine now with openib.

This commit was SVN r20618.

The following SVN revision numbers were found above:
  r20617 --> open-mpi/ompi@d460264c79
2009-02-22 15:24:38 +00:00
Josh Hursey
d460264c79 Fix C/R support in response to r20586. This commit changed the way that bml/r2 finalized, so the C/R support needed to be updated otherwise the BTLs were not properly handled on restart.
This commit was SVN r20617.

The following SVN revision numbers were found above:
  r2 --> open-mpi/ompi@58fdc18855
  r20586 --> open-mpi/ompi@14a83a6bbc
2009-02-21 13:42:17 +00:00
Jeff Squyres
f1a6d170dc Revert part of r20537: per lengthy discussion on the phone and the
devel list, it ''is'' within the spirit of MPI to allow
MPI_REQUEST_NULL to be passed to MPI_REQUEST_GET_STATUS.  I filed a
ticket proposal with MPI-2.2 to make this officially accepted:

  https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/137

Plus, r20537 didn't revert out all of the machinery for allowing
MPI_REQUEST_NULL or inactive requests, anyway.  So this commit simply
removes the parameter check that was added in r20537, and we're back
to where we were before this whole conversation.  :-)

This commit was SVN r20616.

The following SVN revision numbers were found above:
  r20537 --> open-mpi/ompi@38aab37bb3
2009-02-20 19:57:46 +00:00
Jeff Squyres
7e210fdaf8 Return MPI_ERR_COMM and MPI_ERR_WIN, respectively, for
MPI_COMM|WIN_SET|GET_ERRHANDLER if a bad MPI handle is passed.  Thanks
to Lisandro Dalcín for reporting the issue.

This commit was SVN r20615.
2009-02-20 19:53:48 +00:00
Eugene Loh
463f11f993 Improve shared-memory allocation:
* compute mmap-file size more wisely and pass requested size to allocator
* change MCA parameters:
  - get rid of mpool_sm_per_peer_size
  - get rid of mpool_sm_max_size
  - set default mpool_sm_min_size to 0
* no longer pad sm allocations to page boundaries
* have sm_btl_first_time_init check return codes on free-list creations

Have mca_btl_sm_prepare_src() check to see if it can allocate an EAGER fragment
rather than a MAX fragment if the smaller size works.

Remove ompi/class/ompi_[circular_buffer_]fifo.h and references thereto.

Remove opal/util/pow2.[c|h] and references thereto.

This commit was SVN r20614.
2009-02-20 19:51:57 +00:00
Rainer Keller
02599446d0 - Occurrences of ORTE_PROC_MY_NAME require orte/runtime/orte_globals.h
This commit was SVN r20607.
2009-02-20 03:16:13 +00:00
Rainer Keller
32b7189995 - Make usage of BTL_OUTPUT
This commit was SVN r20606.
2009-02-20 03:05:14 +00:00
Jeff Squyres
28f1c995ae Add a decrement to the loop, lest it loop forever.
This commit was SVN r20605.
2009-02-20 02:58:52 +00:00
George Bosilca
97a2296fdd Correct the GET protocol. Thanks to Mike Dubman for finding the problem and
testing my patch.

This commit was SVN r20591.
2009-02-19 16:00:15 +00:00
Jeff Squyres
b8259ba500 Remove unused variable. Thanks for the heads-up, Ralph!
This commit was SVN r20587.
2009-02-19 13:59:38 +00:00
Jeff Squyres
14a83a6bbc Clean up the BML shutdown. Reviewed by George.
This commit was SVN r20586.
2009-02-19 13:17:01 +00:00
Jeff Squyres
3742c3550c Add "sync" collective component. This component is totally
deactivated by default.  It is activated by setting either of the
following two MCA parameters to values greater than 0:

 * coll_sync_barrier_before
 * coll_sync_barrier_after

If coll_sync_barrier_before is >0, then the sync coll component will
insert itself before the underlying collective operations and invoke a
barrier before every Nth collective (N == coll_sync_barrier_before).
Similar for coll_sync_barrier_after.  Note that N is a _per
communicator_ value; not global to the MPI process.

If both are 0 (which is the default), this component returns NULL for
the comm query, meaning that it is not inserted into the coll module
stack.

The intent of this component is to provide a workaround for
applications with large numbers of collectives of short messages that
can cause unbounded unexpected messages.  Specifically, it is possible
for some iterative collective communication patterns to cause
unbounded unexpected messages.  Forcing a barrier before or after
every Nth collective operation would prevent that behavior by forcing
applications to synchronize (and thereby consume any outstanding
unexpected messages caused by collectives on the same communicator).

Open MPI still needs to bound unexpected messages resource consumption
at the receiver, but this is a viable workaround for at least some
symptoms of the problem.

Additionally, there has been anecdotal evidence of some applications
that "perform better" when they put barriers after other collective
operations.  This could be due to many factors -- including shortening
the unexpected message queue.  Putting this component in Open MPI
allows people to try this with their own applications and give real
world feedback on this kind of behavior.

This commit was SVN r20584.
2009-02-18 23:32:44 +00:00
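The every-Nth-collective rule described in 3742c3550c can be sketched as a per-communicator counter. This is a minimal illustration, not the component's code: the struct and function names are hypothetical, and only the parameter name coll_sync_barrier_before comes from the commit message. The fragment has no main(); it is meant to be dropped into a test program.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-communicator state; names are illustrative only. */
struct ex_sync_state {
    int barrier_before; /* N, as in coll_sync_barrier_before; 0 disables */
    int before_count;   /* collectives seen since the last forced barrier */
};

/* Returns true when a barrier should run before the current collective.
 * The counter lives in the per-communicator state, so N is counted per
 * communicator and not per process. */
bool ex_sync_need_barrier_before(struct ex_sync_state *s)
{
    assert(s != NULL);
    if (s->barrier_before <= 0) {
        return false; /* feature is off (the default) */
    }
    if (++s->before_count >= s->barrier_before) {
        s->before_count = 0; /* start counting toward the next barrier */
        return true;
    }
    return false;
}
```

With N set to 3, the first two collectives on a communicator proceed directly and the third is preceded by a barrier, after which the count restarts.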
Jeff Squyres
563e989b6d Use a bit more friendly language. :-)
This commit was SVN r20583.
2009-02-18 22:12:42 +00:00
George Bosilca
15b60941f3 Cast the req to an opal_list_item_t*
This commit was SVN r20581.
2009-02-18 02:33:37 +00:00
George Bosilca
21f8eba620 There was nothing in item to be added to any list. Instead add
the request that we just removed.

This commit was SVN r20580.
2009-02-18 02:15:57 +00:00
George Bosilca
1b1ed0da37 Always set the frag to NULL.
This commit was SVN r20579.
2009-02-18 02:15:09 +00:00
Eugene Loh
5bbf5ba7d7 First putback of some sm BTL latency optimizations:
* The main thing done here is to convert from multiple FIFOs/queues per
  receiver (each receiver has one FIFO for each sender) to a single FIFO/queue
  per receiver (all senders sharing the same FIFO for a given receiver).
* This requires rewriting the FIFO support, so that
  ompi/class/ompi_[circular_buffer_]fifo.h is no longer used and FIFO
  support is instead in btl_sm.h.
* The number of FIFOs per receiver is actually an MCA tunable parameter,
  but it appears that 1 or possibly 2 FIFOs (even for 112 local processes)
  per receiver is sufficient.

This commit was SVN r20578.
2009-02-17 15:58:15 +00:00
George Bosilca
a0afc9ee29 Always release the allocated memory.
This commit was SVN r20560.
2009-02-14 21:49:06 +00:00
Jeff Squyres
265ac096e8 Restore a few #include's
This commit was SVN r20559.
2009-02-14 15:21:28 +00:00
Rainer Keller
d81443cc5a - On the way to get the BTLs split out and lessen dependency on orte:
   Often, orte/util/show_help.h is included although none of its
   functionality is required; most often what is actually needed is
   opal_output.h or orte/mca/rml/rml_types.h.
   Please see orte_show_help_replacement.sh, committed next.

 - Local compilation (Linux/x86_64) w/ -Wimplicit-function-declaration
   actually showed two *missing* #include "orte/util/show_help.h"
   in orte/mca/odls/base/odls_base_default_fns.c and
   in orte/tools/orte-top/orte-top.c
   Manually added these.

   Let's let MTT have the last word.

This commit was SVN r20557.
2009-02-14 02:26:12 +00:00
Jeff Squyres
8b29e27ead Some minor valgrind-inspired cleanups: fix some memory leaks
This commit was SVN r20543.
2009-02-13 03:45:32 +00:00
Jeff Squyres
91415c2996 Some minor valgrind-inspired cleanups: fix some memory leaks
This commit was SVN r20542.
2009-02-13 03:45:11 +00:00
Jeff Squyres
c83ef674e3 Some minor valgrind-inspired cleanups: fix some memory leaks.
Also took the opportunity to convert the rdma mpool to use the MCA
register function.

This commit was SVN r20541.
2009-02-13 03:44:29 +00:00
Jeff Squyres
6a1a8311cd Some minor valgrind-inspired cleanups: fix some memory leaks
This commit was SVN r20540.
2009-02-13 03:43:29 +00:00
Jeff Squyres
661690c273 Some minor valgrind-inspired cleanups: fix some memory leaks
This commit was SVN r20539.
2009-02-13 03:40:53 +00:00
Jeff Squyres
44092c6a21 Don't allow freeing of predefined datatypes. Thanks to Lisandro
Dalcín for reporting the issue.

This commit was SVN r20538.
2009-02-13 00:00:55 +00:00
Jeff Squyres
38aab37bb3 Be a little tougher looking for MPI_*_NULL cases in some functions.
Thanks to Lisandro Dalcín for reporting the issue.

This commit was SVN r20537.
2009-02-12 23:57:41 +00:00
Jeff Squyres
bcdd3ddbde Ensure to zero out all the pointers in the op so that the destructor
knows what it can and cannot free (these pointers are largely unused
and therefore otherwise uninitialized in user-defined op's and
MPI_REPLACE).

This commit was SVN r20532.
2009-02-12 19:15:37 +00:00
George Bosilca
a0248f736c Move the if around the for loop.
Don't release memory that has not been allocated by the freelist.

This commit was SVN r20530.
2009-02-12 17:29:14 +00:00
Ralph Castain
62dd763a8f Add ability for local slave spawns to pre-position supporting files. Update comm_spawn and comm_spawn_multiple man pages to cover new info_keys.
This commit was SVN r20527.
2009-02-12 15:56:45 +00:00
Ralph Castain
62e08e7212 Add missing header file
This commit was SVN r20526.
2009-02-12 14:15:25 +00:00
George Bosilca
4747a4bb53 ompi_comm_all allocates memory and retains the objects. Therefore, after
each call to ompi_comm_all we should parse the communicator list and
release the objects ...

This commit was SVN r20525.
2009-02-11 21:48:11 +00:00
George Bosilca
3b68ae5ea7 As we do call opal_util_init before calling opal_init we should call
opal_finalize_util after calling the opal_finalize.

This commit was SVN r20523.
2009-02-11 21:01:56 +00:00
George Bosilca
db4a49e3b0 Correctly release the objects, and don't check for NULL.
This commit was SVN r20522.
2009-02-11 21:00:44 +00:00
George Bosilca
0dab6eb93d Release the memory on finalize.
This commit was SVN r20521.
2009-02-11 20:58:41 +00:00
Tim Mattox
9b83df22ec Fix some "is proc on local node?" logic that got accidentally flipped
by r20496 for the sm BTL, openib BTL on iWarp, and the sm & sm2 coll modules.

This commit was SVN r20515.

The following SVN revision numbers were found above:
  r20496 --> open-mpi/ompi@4cdf91a8d4
2009-02-11 15:02:38 +00:00
Jeff Squyres
c596a1bcb3 Fix MPI_File_c2f -- ensure that if you invoke
MPI_File_c2f(MPI_FILE_NULL), you actually get 0, not -1.  Thanks to
Lisandro Dalcin for the bug report.

This commit was SVN r20511.
2009-02-11 00:48:12 +00:00
Shiqing Fan
2f1461419c Add a new feature for checking mca subdirectories, i.e. detecting if there is an exclude file list which indicates the files that shouldn't be added to the source list. By default, the CMake build system will simply add all source files in the required sub folders, without knowing which files have to be excluded. The first use of it is in plm/base/.windows.
And clean up the nested variable names, in order to make it readable.

This commit was SVN r20498.
2009-02-10 17:20:13 +00:00
Ralph Castain
4cdf91a8d4 Per the RFC, extend the current use of the ompi_proc_t flags field (without changing the field itself).
The prior ompi_proc_t structure had a uint8_t flag field in it, where only one
bit was used to flag that a proc was "local". In that context, "local" was
constrained to mean "local to this node".

This commit provides a greater degree of granularity on the term "local", to include tests
to see if the proc is on the same socket, PC board, node, switch, CU (computing
unit), and cluster.

Add #define's to designate which bits stand for which local condition. This
was added to the OPAL layer to avoid conflicting with the proposed movement of
the BTLs. To make it easier to use, a set of macros have been defined - e.g.,
OPAL_PROC_ON_LOCAL_SOCKET - that test the specific bit. These can be used in
the code base to clearly indicate which sense of locality is being considered.

All locations in the code base that looked at the current proc_t field have
been changed to use the new macros.

Also modify the orte_ess modules so that each returns a uint8_t (to match the
ompi_proc_t field) that contains a complete description of the locality of this
proc. Obviously, not all environments will be capable of providing such detailed
info. Thus, getting a "false" from a test for "on_local_socket" may simply
indicate a lack of knowledge.

This commit was SVN r20496.
2009-02-10 02:20:16 +00:00
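The locality bits described in 4cdf91a8d4 can be pictured as one flag per level packed into a uint8_t, with a test macro per level. The sketch below is an assumption-laden illustration: the EX_ names and the bit values are hypothetical (the commit names only OPAL_PROC_ON_LOCAL_SOCKET as an example macro and does not give values). The fragment has no main().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit assignments; the real OPAL definitions may differ. */
#define EX_PROC_ON_CLUSTER 0x01
#define EX_PROC_ON_CU      0x02
#define EX_PROC_ON_SWITCH  0x04
#define EX_PROC_ON_NODE    0x08
#define EX_PROC_ON_BOARD   0x10
#define EX_PROC_ON_SOCKET  0x20

/* Test macros: each checks exactly one sense of "local". */
#define EX_PROC_ON_LOCAL_NODE(flags)   (((flags) & EX_PROC_ON_NODE) != 0)
#define EX_PROC_ON_LOCAL_SOCKET(flags) (((flags) & EX_PROC_ON_SOCKET) != 0)

/* Example consumer: a shared-memory transport only cares about
 * "same node", regardless of the finer-grained bits. */
bool ex_can_use_shared_memory(uint8_t flags)
{
    return EX_PROC_ON_LOCAL_NODE(flags);
}

/* Small usage demo. A cleared bit may mean "not local" or simply
 * "unknown", as the commit message points out. */
void ex_locality_demo(void)
{
    uint8_t peer = EX_PROC_ON_CLUSTER | EX_PROC_ON_NODE;
    assert(ex_can_use_shared_memory(peer));
    assert(!EX_PROC_ON_LOCAL_SOCKET(peer));
}
```

Keeping one macro per level lets each call site state which sense of locality it depends on, which is the readability goal the commit message describes.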
Ralph Castain
f0af389910 Enable comm_spawn of slave processes, currently only active for the rsh, slurm, and tm environments. Establish support for local rsh environments in the plm/base so that rsh of local slaves can be done by any environment that supports it. Create new orte_rsh_agent param so users can specify rsh agent from outside of rsh plm, and sym link that to the old plm_rsh_agent and pls_rsh_agent options.
Modify the orte-bootproxy to pass prefix for the remote slave to support hetero/hybrid scenarios

This commit was SVN r20492.
2009-02-09 20:44:44 +00:00
Ralph Castain
eaa57e29b6 Revert r20480 as this breaks the trunk. The dpm.h include file has defines for OMPI_RML tags that are required for wireup.
This commit was SVN r20482.

The following SVN revision numbers were found above:
  r20480 --> open-mpi/ompi@62282fefe5
2009-02-09 14:14:45 +00:00
Rainer Keller
62282fefe5 - Get rid of #include "ompi/mca/dpm/dpm.h"
This commit was SVN r20480.
2009-02-09 02:56:10 +00:00
Jeff Squyres
f68d2b00d8 Fix one more place where the old name was left over.
This commit was SVN r20473.
2009-02-06 19:21:50 +00:00
Terry Dontje
64ace9ec12 convert bzero calls to memset to remove warnings.
This commit was SVN r20471.
2009-02-06 19:08:22 +00:00
Jeff Squyres
aae930e58b s/__n/converted_n/ -- according to C99, symbols that begin with "__"
are the domain of the compiler.

This commit was SVN r20462.
2009-02-06 01:04:50 +00:00
Jeff Squyres
dfb2d92b37 s/ID/id/ - both work, but if I don't make this change, I'll wonder if
we remembered to use strcasecmp() every time I see this entry in the
file... (we did, but I just don't want to have to keep remembering
that ;-) )

This commit was SVN r20461.
2009-02-06 01:02:25 +00:00
Jeff Squyres
656d8578d0 * Rename (new) MCA parameter to
btl_openib_connect_rdmacm_reject_causes_connect_error (yes, it's
   still long -- on purpose :-) )
 * Add INI file parameter rdmacm_reject_causes_connect_error
 * Now only treat CONNECT_ERROR events as a REJECT if:
   * It's on a connection where we were expecting a REJECT, ''and''
   * The MCA parameter is true ''or'' the INI parameter for this
     device is true
 * Set the INI parameter to true for the NE020

This commit was SVN r20459.
2009-02-06 00:51:04 +00:00
Jeff Squyres
ffc5d8877f Fix a problem where we're accidentally initializing the wrong
errhandler (should be initializing _errors_throw_exceptions, not
_are_fatal).  This bug was not a huge tragedy because the only real
problem is that _are_fatal has the wrong string name with it (because
MPI::Init fixes up the _errors_throw_exceptions later).

This commit was SVN r20458.
2009-02-05 21:36:10 +00:00
Jeff Squyres
50b1fd1392 Per the big discussion on the OpenFabrics list a while ago, some
versions of the NE driver will report the OUI while others will report
the PCI ID.  We'll put in the Intel values when we get them (may not
be for a few more weeks).

This commit was SVN r20457.
2009-02-05 21:19:45 +00:00
Jeff Squyres
66d0a02f90 Fix a problem for some iWARP drivers that don't handle RDMA CM REJECT
properly at all.  NetEffect's current driver (OFED 1.4.0) will return
a CONNECT_ERROR event to the initiator rather than the REJECTED event.
Doh!  Additionally -- unfortunately -- NetEffect's vendor_id and
vendor_part_id are reported as 0 in OFED 1.4.0, so we can't
automatically detect these cards and work around the problem.  So all
we can do is add a new MCA parameter
(btl_openib_connect_rdmacm_ignore_connect_errors -- yes, it's long on
purpose ;-) ) that says that if we get a CONNECT_ERROR, basically
treat it exactly as a REJECT for the WRONG_DIRECTION reason (which is
a "good" reject).  This allows OMPI to function with NetEffect/Intel
cards on OFED 1.4.0.

Note that NetEffect has been bought by Intel; I'm waiting for
information from them to update the ini file for their new OUI/PCI
ID's and/or new vendor_part_id values.

This commit was SVN r20454.
2009-02-05 18:45:59 +00:00
Jeff Squyres
08c35ca135 Somehow this mca param registration code got duplicated; remove one of
them

This commit was SVN r20452.
2009-02-05 16:52:30 +00:00
George Bosilca
36d496066b Correctly deal with the whole array.
This commit was SVN r20451.
2009-02-05 16:44:43 +00:00
George Bosilca
2c00133fdc Silence a possible casting warning.
This commit was SVN r20447.
2009-02-05 16:18:39 +00:00
Jeff Squyres
90c28810f4 Fix CID 1122: comm->c_name is a char array (not a pointer), so
comparing it to NULL is not useful.

This commit was SVN r20444.
2009-02-05 15:31:10 +00:00
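The point behind 90c28810f4 is that an array member always has a non-NULL address, so a NULL comparison can never detect "no name". A minimal sketch of the usual alternative, checking the first character, is below. The struct, the size constant, and the helper name are hypothetical; only the field name c_name comes from the commit message. The fragment has no main().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define EX_MAX_NAME 64 /* hypothetical size, for illustration only */

struct ex_comm {
    /* Array member: its address is part of the struct and is never
     * NULL, so "comm->c_name != NULL" is always true. */
    char c_name[EX_MAX_NAME];
};

/* A name counts as "set" when the first character is not the
 * string terminator. */
bool ex_comm_has_name(const struct ex_comm *comm)
{
    assert(comm != NULL);
    return comm->c_name[0] != '\0';
}
```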
George Bosilca
ee6ff2372e Fix the compilation for Windows.
This commit was SVN r20441.
2009-02-05 13:55:26 +00:00
Jeff Squyres
73ea7a9aa5 Fix CIDs 1211, 1212, 1214: fix error checking in MPI_REDUCE_LOCAL.
This commit was SVN r20435.
2009-02-05 02:18:03 +00:00
Ralph Castain
b100513022 Add a few new MPI_Info options to the dpm - documentation to follow.
Fix a mistake in the dpm that hardcoded the update of routes to the HNP. This needs to be done by the individual routing modules so they can take whatever action is required - which will usually include updating the HNP, but might not...and might include additional steps. New routing modules are coming that violated this assumption, so it had to be moved back into init_routes.

All current routed modules know what to do - anyone with routed modules not in the current trunk may need to adjust them (see any of the current routed modules for examples of what to do).

This commit was SVN r20427.
2009-02-04 22:30:23 +00:00
George Bosilca
745cec03e2 Fix two problems with the way we handle the lvalue in the case the Fortran and C integers
have different sizes:
1. Do not modify the read only parameter of the Fortran MPI interface (i.e be
    standard compliant).
2. When Fortran integers are 64 bits long, don't generate unlawful code.

Thanks to Christoph van Wullen for the bug report.

This commit was SVN r20420.
2009-02-04 15:41:55 +00:00
Jeff Squyres
2cafa5d640 Re-add missing assignment of component variable from MCA param that
somehow must have gotten deleted along the way...

This commit was SVN r20386.
2009-01-30 11:36:14 +00:00
George Bosilca
04a3b29b76 Silence some compiler warnings, and reindent the code.
This commit was SVN r20385.
2009-01-29 18:04:54 +00:00
Jeff Squyres
35c5e28a8e Up to SVN r20383
This commit was SVN r20384.

The following SVN revision numbers were found above:
  r20383 --> open-mpi/ompi@e0638c84c8
2009-01-29 17:59:04 +00:00
George Bosilca
d0a05e90ba Remove the dependency on datatype_pack.h from the convertor_raw file.
Revert r20381 as two header files are "special".

This commit was SVN r20382.

The following SVN revision numbers were found above:
  r20381 --> open-mpi/ompi@25b25aef41
2009-01-28 21:50:01 +00:00
Ralph Castain
25b25aef41 Fix the trunk so it will compile.
Note: this does -not- fix the compiler warnings, but just fixes the missing includes so the trunk will build again.

This commit was SVN r20381.
2009-01-28 21:26:42 +00:00
George Bosilca
2d4a668540 Don't write more iovec than expected.
This commit was SVN r20375.
2009-01-28 16:32:56 +00:00
George Bosilca
0513e018b1 Fix the length of the line.
This commit was SVN r20373.
2009-01-28 15:40:59 +00:00
George Bosilca
321ac99814 Add a function to allow extraction of the iovec covering
the memory layout of the convertor.

This commit was SVN r20372.
2009-01-28 15:40:15 +00:00
Rainer Keller
fb0e0b854a - Again, no need for #include "orte/util/show_help.h"
- Use BEGIN_C_DECLS and END_C_DECLS

This commit was SVN r20358.
2009-01-27 19:19:04 +00:00
Rainer Keller
9825e087b8 - In rb/rcache_rb.c, the reg->flags should only be operated under the
lock -- therefore move the OPAL_THREAD_UNLOCK after
   the if-OMPI_ERR_TEMP_OUT_OF_RESOURCE block.

 - As mca_rcache_rb_mru_delete is the only setter of rc, move the
   error-check right after mca_rcache_rb_mru_delete.

 - Removed a few nitty ompi/info/info.h and orte/util/show_help.h

This commit was SVN r20355.
2009-01-27 19:00:03 +00:00
Rainer Keller
de4c123ca2 - No dependency on orte/util/show_help.h, so get rid of #include
This commit was SVN r20354.
2009-01-27 16:30:21 +00:00
Rainer Keller
340d72a166 - There is no dependency on mpool -- so no need to include
This commit was SVN r20353.
2009-01-27 16:18:56 +00:00
Jeff Squyres
ca0f7d77e9 Fix a help message regarding the btl_openib_receive_queues MCA
parameter.

This commit was SVN r20350.
2009-01-26 18:57:07 +00:00
Jeff Squyres
f9c5adb86f Fix to enable the --disable-mpi-io configure option.
This commit was SVN r20330.
2009-01-23 14:15:51 +00:00
Matthias Jurenz
7a2a081670 Updated VT version to 5.4.7
This commit was SVN r20318.
2009-01-22 13:20:09 +00:00
Matthias Jurenz
1288c662ea - bugfix: select cycle counter timer only on i*86, x86, IA64, and PPC platforms
- minor cleanups

This commit was SVN r20317.
2009-01-22 12:29:10 +00:00
Jeff Squyres
207a61e8d9 Fixes trac:1072: allow MPI C++ constants to be used as array sizes, such
as:

  char name[MPI::MAX_PORT_NAME];

This commit was SVN r20310.

The following Trac tickets were found above:
  Ticket 1072 --> https://svn.open-mpi.org/trac/ompi/ticket/1072
2009-01-21 23:02:51 +00:00
Jeff Squyres
90e69ac6ff Fix some man page nits noticed by the Debian OMPI maintainers. Thanks
Dirk!

This commit was SVN r20307.
2009-01-21 18:38:37 +00:00
Ralph Castain
5d9de3326c Check for valid local/node ranks before using the returned values
This commit was SVN r20304.
2009-01-21 00:54:50 +00:00
Jeff Squyres
1573aaceb7 Add missing header file.
This commit was SVN r20290.
2009-01-17 12:21:42 +00:00
Jeff Squyres
6bde41c785 Forgot this #define -- ooops.
This commit was SVN r20288.
2009-01-16 19:15:17 +00:00
Jeff Squyres
84a3f84fdf Possible fix for random openib segv.
This commit was SVN r20282.
2009-01-15 17:10:18 +00:00
Jeff Squyres
8483c3c66e It is not an error if there are no op components found; we'll just
fallback to the base functions.

This commit was SVN r20281.
2009-01-15 02:01:32 +00:00
Jeff Squyres
4d8a187450 Two major things in this commit:
* New "op" MPI layer framework
 * Addition of the MPI_REDUCE_LOCAL proposed function (for MPI-2.2)

= Op framework =

Add new "op" framework in the ompi layer.  This framework replaces the
hard-coded MPI_Op back-end functions for (MPI_Op, MPI_Datatype) tuples
for pre-defined MPI_Ops, allowing components and modules to provide
the back-end functions.  The intent is that components can be written
to take advantage of hardware acceleration (GPU, FPGA, specialized CPU
instructions, etc.).  Similar to other frameworks, components are
intended to be able to discover at run-time if they can be used, and
if so, elect themselves to be selected (or disqualify themselves from
selection if they cannot run).  If specialized hardware is not
available, there is a default set of functions that will automatically
be used.

This framework is ''not'' used for user-defined MPI_Ops.

The new op framework is similar to the existing coll framework, in
that the final set of function pointers that are used on any given
intrinsic MPI_Op can be a mixed bag of function pointers, potentially
coming from multiple different op modules.  This allows for hardware
that only supports some of the operations, not all of them (e.g., a
GPU that only supports single-precision operations).

All the hard-coded back-end MPI_Op functions for (MPI_Op,
MPI_Datatype) tuples still exist, but unlike coll, they're in the
framework base (vs. being in a separate "basic" component) and are
automatically used if no component is found at runtime that provides a
module with the necessary function pointers.

There is an "example" op component that will hopefully be useful to
those writing meaningful op components.  It is currently
.ompi_ignore'd so that it doesn't impinge on other developers (it's
somewhat chatty in terms of opal_output() so that you can tell when
its functions have been invoked).  See the README file in the example
op component directory.  Developers of new op components are
encouraged to look at the following wiki pages:

  https://svn.open-mpi.org/trac/ompi/wiki/devel/Autogen
  https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateComponent
  https://svn.open-mpi.org/trac/ompi/wiki/devel/CreateFramework

= MPI_REDUCE_LOCAL =

Part of the MPI-2.2 proposal listed here:

    https://svn.mpi-forum.org/trac/mpi-forum-web/ticket/24

is to add a new function named MPI_REDUCE_LOCAL.  It is very easy to
implement, so I added it (also because it makes testing the op
framework pretty easy -- you can do it in serial rather than via
parallel reductions).  There's even a man page!

This commit was SVN r20280.
2009-01-14 23:44:31 +00:00
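The "mixed bag of function pointers" with a base fallback described in 4d8a187450 can be sketched as a per-op table indexed by datatype, where an empty slot falls back to the base function. Everything named with an ex_ or EX_ prefix is hypothetical, including the function signature; the loop body also shows the local reduction that MPI_REDUCE_LOCAL performs (inout = in op inout), here for a sum only. The fragment has no main().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical back-end signature for one (op, datatype) pair:
 * inout[i] = in[i] op inout[i] for i in [0, count). */
typedef void (*ex_op_fn_t)(const void *in, void *inout, size_t count);

enum { EX_TYPE_INT, EX_TYPE_FLOAT, EX_TYPE_MAX };

/* Base (always available) implementations of a sum. */
void ex_base_sum_int(const void *in, void *inout, size_t count)
{
    const int *a = in;
    int *b = inout;
    for (size_t i = 0; i < count; ++i) {
        b[i] = a[i] + b[i];
    }
}

void ex_base_sum_float(const void *in, void *inout, size_t count)
{
    const float *a = in;
    float *b = inout;
    for (size_t i = 0; i < count; ++i) {
        b[i] = a[i] + b[i];
    }
}

static const ex_op_fn_t ex_base_sum[EX_TYPE_MAX] = {
    ex_base_sum_int, ex_base_sum_float
};

/* Per-op table filled in by components at run time. A NULL slot means
 * no component offered a function for that datatype. */
struct ex_op {
    ex_op_fn_t fns[EX_TYPE_MAX];
};

/* Pick the component function if one exists, else the base function. */
ex_op_fn_t ex_op_select(const struct ex_op *op, int type)
{
    assert(op != NULL && type >= 0 && type < EX_TYPE_MAX);
    return (op->fns[type] != NULL) ? op->fns[type] : ex_base_sum[type];
}
```

Because selection happens per slot, a component that accelerates only some datatypes can fill just those slots and leave the rest to the base functions.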
Brian Barrett
cfc400eb57 * Enable eager sending for Accumulate
* If the accumulate is local, make it short-circuit the request path.  Accumulate requires local
  ops due to its window rules, so this is likely to help a bunch (on the codes I'm messing
  with at least)
* Do a better job at flushing everything that can go out on the wire in a resource constrained problem
* Move some debugging values around to make large problems somewhat easier to deal with

This commit was SVN r20277.
2009-01-14 20:15:15 +00:00
Edgar Gabriel
1072812bcf Not every element in the pointer array list contains a valid entry. Thus, do not try to free elements if the list returns NULL.
This commit was SVN r20275.
2009-01-14 19:11:30 +00:00
Jeff Squyres
895edd04f8 Fix CID 468: remove some dead code. r_proc_list was set to NULL but
never used.

This commit was SVN r20272.
2009-01-14 18:15:17 +00:00
Jeff Squyres
2ac22db130 Fix CID 724: clean up the return value checking in ompi_info component
opening and closing.

This commit was SVN r20268.
2009-01-14 15:45:38 +00:00
George Bosilca
01adc999c5 Correctly forward the right module if we call another collective function. Kudos to
Edgar for figuring out this tricky bug.

This commit was SVN r20267.
2009-01-14 03:22:54 +00:00
Jeff Squyres
1bedf18305 OMPI_DECLSPEC is no longer necessary when it's static. Duh.
This commit was SVN r20254.
2009-01-13 15:09:16 +00:00
Jeff Squyres
34b7b6cfe8 Really fixes trac:623: there still is a difference between MPI::SEEK_SET
and ::SEEK_SET (duh); that's why it's listed in constants.h.  So put
that back and make it (static const int) rather than extern, and then
remove the instantiation from mpicxx.cc.  Ditto for the other 2.

This commit was SVN r20251.

The following Trac tickets were found above:
  Ticket 623 --> https://svn.open-mpi.org/trac/ompi/ticket/623
2009-01-12 22:06:16 +00:00
Jeff Squyres
20831c36d2 Fixes trac:623: we changed SEEK_SET (and friends) to be (static const int)
in mpicxx.h a while ago, but somehow accidentally left "extern const
int" for SEEK_SET (and friends) in constants.h.  This commit removes
the extraneous "extern" versions.

This commit was SVN r20250.

The following Trac tickets were found above:
  Ticket 623 --> https://svn.open-mpi.org/trac/ompi/ticket/623
2009-01-12 21:50:50 +00:00
Jeff Squyres
d1c6f3f89a * Fix a truckload of Cisco copyrights to be the same as the rest of
the code base.
 * Fix a few misspellings in other copyrights.

This commit was SVN r20241.
2009-01-11 02:30:00 +00:00
George Bosilca
9da6fba64b Update the SCTP BTL regarding ticket #1725.
This commit was SVN r20231.
2009-01-08 16:38:35 +00:00
Rolf vandeVaart
e78add702a Increase the default maximum size that the sm btl file is allowed
to grow to. Without this change, jobs with np>120 get errors.   
This does not change anything for np<16 jobs.  It only comes into 
play with larger np count on a node. I imagine that this can be 
scaled back in the future if the usage of memory in the sm
btl is improved.

This fixes trac:1449.

This commit was SVN r20230.

The following Trac tickets were found above:
  Ticket 1449 --> https://svn.open-mpi.org/trac/ompi/ticket/1449
2009-01-08 14:39:00 +00:00
Donald Kerr
e57435a5d4 udapl btl fix for #1725; replace WAIT with GET
This commit was SVN r20227.
2009-01-08 13:41:36 +00:00
Pavel Shamis
391b101439 Renaming pending_frags to no_credits_pending_frags.
(this commit is part of bug fix for ticket #1693)

This commit was SVN r20217.
2009-01-07 14:41:20 +00:00
Pavel Shamis
2f7b66160b Adding real fix for ticket #1693 - XRC + coalescing segfault.
This commit was SVN r20214.
2009-01-07 14:10:58 +00:00
Ralph Castain
5e1d2eec58 Cosmetic changes to the timing output in mpi_init, restore the barrier timing measurement in mpi_finalize
This commit was SVN r20211.
2009-01-06 21:30:12 +00:00
George Bosilca
8e4107353f Update the last instance of bml_base_send to correctly cope with the
return values from the BTL. This is related to ticket 1734.

This commit was SVN r20210.
2009-01-06 19:44:48 +00:00
Ralph Castain
9dbcee9110 Increase efficiency for modex-less launch by storing byte objects in the profile file
This commit was SVN r20206.
2009-01-05 21:46:12 +00:00
Ralph Castain
5f26a8b084 Since we have such a flag in orte_process_info, set it to true when we do mpi_init so we know we have an mpi_proc (simplifies later logic checks)
This commit was SVN r20205.
2009-01-05 21:41:40 +00:00
Jeff Squyres
a9850c96c5 Cosmetic change.
This commit was SVN r20203.
2009-01-05 19:07:06 +00:00
George Bosilca
f2b9b3fa0b One less warning about an unmatched printf type.
This commit was SVN r20199.
2009-01-05 15:02:00 +00:00
George Bosilca
760e744294 Use a more clear name for the proc in the constructor and destructor functions.
Make sure the lock is created and destroyed as expected.

This commit was SVN r20197.
2009-01-05 14:14:38 +00:00
Jeff Squyres
11b375f8b5 CIDs 1080-1090: assert() checks were not sufficient to check for
NEGATIVE_RETURNS from _reg_int() because those are not always
checked.  So replace them with real if() checks.

This commit was SVN r20195.
2009-01-03 15:56:25 +00:00
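The change in 11b375f8b5 rests on the fact that assert() compiles to nothing when NDEBUG is defined, so it cannot serve as run-time error handling. A minimal sketch of the replacement pattern follows. ex_reg_int is a hypothetical stand-in for the _reg_int() call mentioned in the commit, and its behavior (negative on failure, an index otherwise) is an assumption made for the example. The fragment has no main().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a registration call: returns a negative
 * value on failure and a non-negative index on success. */
int ex_reg_int(const char *name)
{
    return (name != NULL && name[0] != '\0') ? 0 : -1;
}

/* Returns 0 on success and -1 on failure. */
int ex_register_param(const char *name, int *index_out)
{
    /* Programmer error: fine to guard with assert(). */
    assert(index_out != NULL);

    int index = ex_reg_int(name);

    /* Run-time failure: assert(index >= 0) would vanish under NDEBUG,
     * so use a real check that survives in optimized builds. */
    if (index < 0) {
        return -1;
    }
    *index_out = index;
    return 0;
}
```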
Jeff Squyres
679e2855b7 Fix CID 1135: the assignment to item was never used (it was
overwritten in the next loop iteration "item = next_item").

This commit was SVN r20189.
2009-01-03 15:15:42 +00:00
Jeff Squyres
62385a6c39 Add a comment explaining why there is an empty Makefile.am in this
tree.

This commit was SVN r20184.
2009-01-03 02:07:01 +00:00
Jeff Squyres
78b282d0d6 Cosmetic changes; replace tabs with spaces.
This commit was SVN r20183.
2009-01-03 01:45:52 +00:00
Jeff Squyres
fa30c6d8bc Cosmetic change; remove a tab.
This commit was SVN r20180.
2009-01-02 17:48:24 +00:00
Jeff Squyres
611ebeab33 Cosmetic: expunge some more old 2-space-indent code (re-indent with
"indent(1)").

This commit was SVN r20179.
2009-01-02 12:55:17 +00:00
Brian Barrett
e1f40c6a71 Fixes to make the rdma osc component work again:
* Don't overwrite the des_flags field, removing the
    all important always callback field
  * Fix up return status of bml_base_send, since
    the rest of the code expects OMPI_SUCCESS or
    an error code

This commit was SVN r20178.
2009-01-01 23:48:29 +00:00
Jeff Squyres
a7586bdd90 Cosmetic changes:
* Update to 4 space tabs where relevant (and some irrelevant white
   space changes)
 * Move a few constants to the left of !=/==
 * Add a few {}'s around one line blocks
 * Use BEGIN/END_C_DECLS
 * Change /**< to /** in a few places

This commit was SVN r20177.
2008-12-31 14:50:54 +00:00
Jeff Squyres
f13ea32830 Remove the code checking the MCA "coll" parameter for a list of coll
components to use.  This code was rendered obsolete (albeit harmless)
by the MCA base improvements that only open the components that were
specified by each framework's MCA parameter.

This commit was SVN r20176.
2008-12-31 13:40:51 +00:00
Jeff Squyres
759a295cc9 Gaah -- missed one s/m/component/g
This commit was SVN r20175.
2008-12-31 13:35:37 +00:00
Jeff Squyres
955d1e132d Rename a variable to be "component" (not "m"), to emphasize that it is
the component struct, not a module.

This commit was SVN r20174.
2008-12-31 13:32:46 +00:00
Jeff Squyres
865900dd27 Nothing of substance; just indenting changes (''finally'' update this
framework base to 4 space tabs!).

This commit was SVN r20173.
2008-12-31 12:17:08 +00:00
Jeff Squyres
ce313fa391 Minor fixes to a few comments
This commit was SVN r20172.
2008-12-31 11:34:27 +00:00
Jeff Squyres
d533215dac Fix a comment to reflect the right version number
This commit was SVN r20169.
2008-12-30 12:39:32 +00:00
Donald Kerr
213daa58da support for solaris relaxed ordering
This commit was SVN r20167.
2008-12-24 15:05:12 +00:00
Ralph Castain
7787f84540 Per the earlier RFC and some discussion at the Dec ORTE design meeting, add the ompi-top tool and all its supporting infrastructure. This includes a new OPAL pstat framework and data type, currently with rather weak support for Mac OSX and pretty complete support for Linux. The Sun team promised to add Solaris support as well.
Also, per chat with Jeff, modified the Makefile.am's of a few orte tools so that they were consistent in the way we generate the ompi-equivalent cmds.

This commit was SVN r20165.
2008-12-22 20:23:05 +00:00
George Bosilca
4d5fbc5955 Remove unused lock from the ompi_proc_t. This reduces the size of the ompi_proc_t
by 64 bytes.
Remove the useless pml_proc from the PML layer.

This commit was SVN r20157.
2008-12-19 19:56:27 +00:00
George Bosilca
7fc48ae11e Update the template BTL to fulfill the requirements for #1713.
This commit was SVN r20153.
2008-12-17 22:15:43 +00:00
George Bosilca
209b844017 Update the self BTL to fulfill the requirements for #1713.
This commit was SVN r20152.
2008-12-17 22:15:27 +00:00
George Bosilca
7d404d238d Update the tcp BTL to fulfill the requirements for #1713.
This commit was SVN r20151.
2008-12-17 22:15:12 +00:00
George Bosilca
341ee1389c Update the sm BTL to fulfill the requirements for #1713.
This commit was SVN r20150.
2008-12-17 22:14:59 +00:00
George Bosilca
7cec018149 Update the Elan BTL to fulfill the requirements for #1713.
This commit was SVN r20149.
2008-12-17 22:14:45 +00:00
Josh Hursey
c954045989 Add a patch to address a deadlock in the CRCP BKMRK component.
The problem was that we doubly decremented the active count on blocking receives that we stall to complete. This moved the active count into the negative. With a negative count for 'active', a message that should have been accounted for would be overlooked. This then causes the bookmark exchange to post a drain for a message that was never posted, thus locking the protocol. By eliminating the decrement on the 'active' count when we attempt to post the drain message, we only decrement this counter when the outstanding blocking recv completes during the stall operation.

Refs trac:1619
Does not close this ticket since there is an outstanding potential problem with ANY_SOURCE and ANY_TAG, as referenced in the ticket.

This should be moved to v1.3

This commit was SVN r20147.

The following Trac tickets were found above:
  Ticket 1619 --> https://svn.open-mpi.org/trac/ompi/ticket/1619
2008-12-17 17:23:39 +00:00
Jeff Squyres
2f94a151e1 Fix a const cast.
This commit was SVN r20146.
2008-12-17 15:29:02 +00:00
Brian Barrett
f8537c0059 Following ticket #1725, when a free list item cannot be allocated, return
the error to the upper layer and let it deal with the problem.

This commit was SVN r20143.
2008-12-16 22:38:02 +00:00
Brian Barrett
dbccb250f0 TYPE shouldn't be surrounded by parens because it causes issues for some versions of gcc when the construct a = ((int)) sizeof(b) comes up...
This commit was SVN r20140.
2008-12-16 19:49:54 +00:00
Ethan Mallove
9003e4d722 Add missing #include <errno.h> line (for SunStudio Solaris).
This commit was SVN r20138.
2008-12-16 15:30:02 +00:00
George Bosilca
f9a1700f55 Some 64-bit architectures support pointers aligned on 4 bytes
(where the sizeof(long) is 8). On such architectures, don't
assert if the datatype representation is not aligned on 64 bits.

This commit was SVN r20134.
2008-12-16 09:21:21 +00:00
George Bosilca
db424282f7 Fix an issue where the datatype description introduces a buffer misalignment. Because some
architectures (read SPARC64) require aligned accesses, we increase the storage space
when we pack a datatype description to keep the fields aligned. This has to be done
on both sides in order to be consistent.
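
The padding idea above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from this commit: `align_up` and `packed_size` are hypothetical names, and the 8-byte alignment is an assumption chosen to match a 64-bit target such as SPARC64.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round `offset` up to the next multiple of
 * `alignment`, which must be a nonzero power of two. */
static size_t align_up(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical layout: a header of `header_len` bytes followed by `count`
 * 8-byte fields. Padding is added after the header so that the 8-byte
 * fields start on an 8-byte boundary. The packing side and the unpacking
 * side must both apply the same rule, or the offsets will disagree. */
static size_t packed_size(size_t header_len, size_t count)
{
    return align_up(header_len, 8) + count * 8;
}
```

With a 12-byte header and two fields, `packed_size(12, 2)` reserves 4 bytes of padding so the fields begin at offset 16.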

This commit was SVN r20133.
2008-12-16 09:06:27 +00:00
Avneesh Pant
c1e508750b Check the active port MTU against the MTU statically configured for the HCA. QLogic HCA's capable of MTU had an issue when connected to switches running at 2K.
This commit was SVN r20131.
2008-12-15 21:17:58 +00:00
George Bosilca
fe87e28fee This is a temporary fix for the deadlock problem over MX. The real
problem seems to come from the free list, but due to lack of time to
understand it completely, I provide this fix. Basically, there is no
waiting in the MX BTL anymore, if we cannot allocate a fragment we
rely on the PML to take the corrective actions.

This commit was SVN r20124.
2008-12-15 03:45:34 +00:00
George Bosilca
aa4e9da26d Correct the disp array when creating a datatype based on the
MPI_COMBINER_INDEXED_BLOCK combiner.

This commit was SVN r20123.
2008-12-13 01:57:27 +00:00
George Bosilca
fec8692074 Get rid of all elan3 references.
This commit was SVN r20122.
2008-12-12 23:59:21 +00:00
Nysal Jan
ee8ec6f6b5 Remove dead/redundant code. Minimize number of calloc invocations
This commit was SVN r20121.
2008-12-12 10:55:50 +00:00
George Bosilca
7631eb8eed A fix for http://www.open-mpi.org/community/lists/users/2008/12/7502.php.
The solution is not to compute the OVERLAP flag, as the best we can do
is an approximate answer. Without this flag the unpack can lead to
unexpected answers if the datatype contains any overlapping regions.
As such datatypes are illegal in MPI, this becomes a user responsibility.

This commit was SVN r20120.
2008-12-12 00:25:40 +00:00
Nysal Jan
6a5454b76a Fixes crash in openib BTL on a heterogeneous cluster Refs trac:1700
This commit was SVN r20113.

The following Trac tickets were found above:
  Ticket 1700 --> https://svn.open-mpi.org/trac/ompi/ticket/1700
2008-12-10 22:07:48 +00:00
Shiqing Fan
20cea164db - 3/4 commit for Windows Visual Studio and CCP support:
corrections to non-windows files (but within ifdef __WINDOWS__)
  type casts, event library for windows use win32. 
  in orte runtime, add windows sockets handling and object construction.

This commit was SVN r20110.
2008-12-10 21:13:10 +00:00
Shiqing Fan
a5281f0434 - 1/4 commit for Windows Visual Studio and CCP support:
CMakeLists and .windows files.
  In contribs preconfigured and precompiled parts.

This commit was SVN r20108.
2008-12-10 20:59:20 +00:00
Josh Hursey
df75abd6b2 Fix a warning. Thanks to Jeff for noticing.
This should be moved to v1.3 as well.

This commit was SVN r20101.
2008-12-10 15:38:12 +00:00
Ralph Castain
1ace83c470 Enable modex-less launch. Consists of:
1. minor modification to include two new opal MCA params:
   (a) opal_profile: outputs what components were selected by each framework
       currently enabled for most, but not all, frameworks
   (b) opal_profile_file: name of file that contains profile info required
       for modex

2. introduction of two new tools:
   (a) ompi-probe: MPI process that simply calls MPI_Init/Finalize with
       opal_profile set. Also reports back the rml IP address for all
       interfaces on the node
   (b) ompi-profiler: uses ompi-probe to create the profile_file, also
       reports out a summary of what framework components are actually
       being used to help with configuration options

3. modification of the grpcomm basic component to utilize the
   profile file in place of the modex where possible

4. modification of orterun so it properly sees opal mca params and
   handles opal_profile correctly to ensure we don't get its profile

5. similar mod to orted as for orterun

6. addition of new test that calls orte_init followed by calls to
   grpcomm.barrier

This is all completely benign unless actively selected. At the moment, it only supports modex-less launch for openib-based systems. Minor mod to the TCP btl would be required to enable it as well, if people are interested. Similarly, anyone interested in enabling other BTL's for modex-less operation should let me know and I'll give you the magic details.

This seems to significantly improve scalability provided the file can be locally located on the nodes. I'm looking at an alternative means of disseminating the info (perhaps in launch message) as an option for removing that constraint.

This commit was SVN r20098.
2008-12-09 23:49:02 +00:00
Pavel Shamis
068054132a Temporary work around for #1693 (osu_bibw segfault in xrc + coalescing mode)
This commit was SVN r20093.
2008-12-09 19:21:54 +00:00
George Bosilca
df13c2810d Undo the last commit related to the Fortran profiling. After spending a few hours
pondering this problem, we came to the conclusion that the best approach
is to keep what we had before (i.e. the original approach).

The main reason for this is to be nice to tool developers. In the current
incarnation, they can either catch the Fortran calls or the C calls. If they
provide both, then they will have to figure out how to cope with the double
calls (as your example highlights).

Here is the behavior Open MPI will stick to:
Fortran MPI  -> C MPI
Fortran PMPI -> C MPI

However, there is another possible approach. This might avoid the double calls
while preserving friendliness to tool writers. This possible approach will do:
   Fortran MPI  -> C MPI
   Fortran PMPI -> C PMPI
                     ^
Unfortunately, we will have to heavily modify all files in the Fortran
interface layer in order to support this approach.

This commit was SVN r20079.
2008-12-06 00:35:32 +00:00
George Bosilca
54d9df317f Fix the Fortran profiling layer to ensure that we call the C PMPI_ functions instead of
their MPI_ counterparts. This allows the profiling layer to catch each MPI function only once,
from C and Fortran.

This commit was SVN r20076.
2008-12-05 16:52:25 +00:00
Jeff Squyres
01feac443e Add a missing header file; we won't find mca_topo_base_comm_1_0_0_t
in optimized builds without this.

This commit was SVN r20075.
2008-12-05 14:32:50 +00:00
Shiqing Fan
d06604c258 Get rid of the compiler warning message when --enable-picky is used.
Do the checks according to inter/intracommunicator flags.

This commit was SVN r20063.
2008-12-03 17:44:21 +00:00
Brad Benton
0b83ab39e5 Added the part number for IBM's 2nd rev of the eHCA to the eHCA param stanza
This commit was SVN r20057.
2008-12-03 14:35:43 +00:00
Jon Mason
54b4e9901c Gracefully handle NULL strings when calling orte_show_help for preventing usage
of more than one of the btl_openib_if_include, btl_openib_if_exclude,
btl_openib_ipaddr_include, or btl_openib_ipaddr_exclude MCA parameters.

This commit was SVN r20053.
2008-12-02 23:17:46 +00:00
Jon Mason
91b26eba67 This commit adds comments regarding IP Aliases and the default behavior when
determining which IP address to use when transmitting data.  Also it adds logic
to prevent usage of more than one of the btl_openib_if_include,
btl_openib_if_exclude, btl_openib_ipaddr_include, or btl_openib_ipaddr_exclude
MCA parameters.

This should complete the code modifications needed for ticket 1665.

This commit was SVN r20052.
2008-12-02 22:42:01 +00:00
Edgar Gabriel
b05393b363 now that we have the unexpected message queue for unknown communicators, there is no need to have this additional synchronization operation for multi-threaded communicator creations.
This commit was SVN r20046.
2008-12-02 16:30:15 +00:00
Shiqing Fan
abd21b6d17 - An update for memchecker :
1. fix a bug in pml_ob1_recvreq/sendreq.c, buffer was made defined where the request has already been released.
2. complete memchecker support for collective functions.
3. change the wrongly spelled function name of memchecker, i.e. '*_isaddressible' should be '*_isaddressable'

This commit was SVN r20043.
2008-11-27 16:34:02 +00:00
Donald Kerr
668eafec88 Support to handle platforms which do not define RLIMIT_MEMLOCK
This commit was SVN r20035.
2008-11-25 03:13:09 +00:00
George Bosilca
16598d7d39 When copying data using the same datatype, don't ignore the gap at
the beginning.

This commit was SVN r20034.
2008-11-25 00:26:12 +00:00
George Bosilca
3927c03e18 Correctly use the snprintf. Don't zap the last char.
This commit was SVN r20033.
2008-11-25 00:25:28 +00:00
George Bosilca
69afbc084a Don't forget to cast or the compiler will do the division
as a double and then convert.

This commit was SVN r20029.
2008-11-24 15:53:56 +00:00
Jon Mason
4757970438 This patch consists of two parts. Part one fixes a bug in the
determination of the IP subnet.  The netmask was being used improperly when
determining which subnet each connection is on.  Part two is the ability to
include/exclude specific subnets.
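
For reference, the usual way to decide which subnet an IPv4 address is on is to AND the address with the netmask and compare the results. The sketch below illustrates that general technique only. It does not reproduce the openib BTL code, all function names are hypothetical, and addresses are assumed to be in host byte order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build an IPv4 netmask (host byte order) from a prefix length in 0..32. */
static uint32_t netmask_from_prefix(unsigned prefix_len)
{
    assert(prefix_len <= 32);
    /* Shifting a 32-bit value by 32 bits is undefined in C, so a prefix
     * length of 0 is handled separately. */
    return prefix_len == 0 ? 0u : 0xFFFFFFFFu << (32 - prefix_len);
}

/* The subnet is the address with all host bits cleared. */
static uint32_t subnet_of(uint32_t addr, unsigned prefix_len)
{
    return addr & netmask_from_prefix(prefix_len);
}

/* Two addresses share a subnet when their masked values are equal. */
static bool same_subnet(uint32_t a, uint32_t b, unsigned prefix_len)
{
    return subnet_of(a, prefix_len) == subnet_of(b, prefix_len);
}
```

For example, 192.168.1.10 (`0xC0A8010A`) and 192.168.2.10 (`0xC0A8020A`) are on different subnets with a /24 prefix but on the same subnet with a /16 prefix.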

This patch fixes ticket #1665

This commit was SVN r20016.
2008-11-17 20:20:24 +00:00
Shiqing Fan
4d2c118d3b - fix a type cast. The whole libmpi library has to be compiled as CXX on Windows, and MS compiler recognizes this as an error.
This commit was SVN r20012.
2008-11-17 12:18:01 +00:00
Patrick Geoffray
0f331b4c13 Define a "fake" mpool to provide a memory release callback for the
memory hooks (munmap) and initialize the mallopt component, and 
nothing else.
Use this mpool in the MX common initialization, supporting both BTL 
and MTL. Automatically set the MX_RCACHE environment variable to 
enable registration cache in MX.

Tested with success for munmap() and large free().

This commit was SVN r20003.
2008-11-15 04:17:58 +00:00
Nysal Jan
e4bdaac6d8 Fixed the case where a device does not support inline data. Redefined the interpretation of max_inline_data MCA parameter.
* If max_inline_data == -1 perform runtime detection 
* If max_inline_data >=0 use the value provided 
* If the user does not explicitly set this via command line, use the value from INI file
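
The three rules above can be read as a small precedence function. The following is a sketch of one possible reading, not the code from this commit: the function name and its parameters are hypothetical, and it assumes that a value of -1 coming from the INI file also triggers runtime detection.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical resolution of the effective inline-data size.
 *   user_set       - whether the user set max_inline_data explicitly
 *   user_value     - the value given by the user (ignored if !user_set)
 *   ini_value      - the value from the INI file
 *   detected_value - the result of runtime detection on the device
 */
static int resolve_max_inline_data(bool user_set, int user_value,
                                   int ini_value, int detected_value)
{
    /* Without an explicit user setting, fall back to the INI file. */
    int value = user_set ? user_value : ini_value;

    assert(value >= -1);
    if (value == -1) {
        return detected_value;   /* -1 requests runtime detection */
    }
    return value;                /* >= 0: use the value as provided */
}
```

Under this reading, an explicit 0 from the user disables inline data even if the INI file or the device would allow more.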

This commit fixes trac:1662

This commit was SVN r19995.

The following Trac tickets were found above:
  Ticket 1662 --> https://svn.open-mpi.org/trac/ompi/ticket/1662
2008-11-14 12:15:35 +00:00
Rolf vandeVaart
76f8ce01cf Need to add sppp to list of default excluded interfaces
to support Sun M9000 server.

This commit was SVN r19988.
2008-11-12 20:30:14 +00:00
Ralph Castain
ce26e3a2fb Update the notifier framework in prep for move to v1.3. Add an API to handle the case where error messages have been expressed via "show_help" so they can look similar to what was presented to users. Add three key calls in the openib btl to drop messages into syslog.
This will sit in trunk for a few days - would like to actually see some errors reported to syslog before moving the code to 1.3

This commit was SVN r19986.
2008-11-12 18:03:51 +00:00
Jeff Squyres
a48b2d45be Fix wonky copyright year.
This commit was SVN r19985.
2008-11-12 17:51:54 +00:00
Jeff Squyres
bb0b5b04bd Remove duplicate copyright notice (found by script).
This commit was SVN r19984.
2008-11-12 17:42:40 +00:00
Kenneth Matney
07f7f00c91 This disables sendi, since it may do 0-byte requests and it still has
another bug.  This also causes 0-byte requests to be treated as a buffer
error, causing the base request to be requeued.  On Cray XT, it may be
temporarily impossible to make allocations for buffer requests, as the
default stack size is small (8 MB) and there is no true swap device.
Even with the stack size increased, there will be cases in which this
condition recurs.

One possibility is to make the buffer allocations off of the heap; but,
this does not change the fact that eventually an out-of-memory condition
will occur and we need to support multiple receives in transit, a
condition for which the available buffer space may change.  On the other
hand, if we switch to allocating the buffer space from the heap, we will
need to return an error when the allocation fails and there are no other
buffers in transit.

This commit was SVN r19981.
2008-11-12 16:04:14 +00:00
George Bosilca
e84af7920e Move __counter outside the #ifdef section. Cleanup the usage of __counter.
This commit was SVN r19979.
2008-11-11 16:46:11 +00:00
George Bosilca
584154c2d3 Remove the group header file dependency.
This commit was SVN r19965.
2008-11-10 19:37:52 +00:00
Josh Hursey
080e581422 This commit removes some duplicate finalize code between the component's finalize, and the version that C/R needed in the ft_event function. From my testing everything looks fine, but should probably soak overnight just to be sure. It will need to be moved to v1.3
Thanks to Jeff, Pasha, and Tim M. for bringing this to my attention.

This commit was SVN r19963.
2008-11-10 18:35:57 +00:00
Josh Hursey
460e84f174 A fix for the intel "MPI_Send_init_ator_c" test.
It highlighted a bug in the bookmark component where for persistent sends we were not copying the context, but just moving it. This caused us to lose track of the message if it is started/completed multiple times.

This will need to be brought over to the v1.3 branch, but it should soak overnight to get a round of testing first.

This commit was SVN r19962.
2008-11-10 16:55:58 +00:00
Pavel Shamis
29cc6de40b OOB, XOOB, RDMACM and IBCM do not support qp creation and connection for self communication. So we must use self.
This commit was SVN r19960.
2008-11-10 11:24:57 +00:00
Jeff Squyres
4f028171a2 Refs trac:1603:
 * Add an OMPI_F77_CHECK_REAL16_C_EQUV test to check whether REAL*16 is bit
   equivalent to long double.  AC_DEFINE OMPI_REAL16_MATCHES_C with
   result (0 or 1).
 * Update ompi_info to only show real16 support if
   OMPI_REAL16_MATCHES_C is 1.
 * Update DDT to only support REAL16 and COMPLEX32 if
   1==OMPI_REAL16_MATCHES_C.
 * MPI Op function pointer tables will have NULL for the REAL16 and
   COMPLEX32 entries if 0==OMPI_REAL16_MATCHES_C.
 * Slightly cleaned up OMPI_F77_GET_ALIGNMENT and OMPI_F77_CHECK m4
   tests (use OMPI_VAR_SCOPE_PUSH/POP).

This commit was SVN r19948.

The following Trac tickets were found above:
  Ticket 1603 --> https://svn.open-mpi.org/trac/ompi/ticket/1603
2008-11-07 20:37:21 +00:00
Matthias Jurenz
aafa318248 Fixed faulty length-parameter in snprintf call
This commit was SVN r19947.
2008-11-07 17:15:07 +00:00
Jeff Squyres
1788518bca Only set ompi_mpi_leave_pinned (a bool) to true if the MCA param value
is >= 1.  The default value of the MCA param is now -1, which means
"let someone else turn it on if they want to."  So we should default
to ''off'' (false), and let the openib BTL (etc.) turn it on if it
can/wants to.

Failure to do this will default _pinned to true because
-1(int)==true(bool).  This causes a problem if the user tries to set
mpi_leave_pinned_pipeline to 1: they'll get a warning that you can't
set both _pinned and _pinned_pipeline to 1.  This happens because
_pinned will get the bool-ified value of the MCA parameter (-1),
and then the user sets the value of _pinned_pipeline to 1/true.
Hence, both of them are set to true.  Bzzt!
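
The pitfall described above is ordinary C integer-to-bool conversion: any nonzero value, including -1, converts to true. A minimal sketch follows. The function names are hypothetical and this is not the Open MPI source, only an illustration of the `>= 1` test the message describes.

```c
#include <assert.h>
#include <stdbool.h>

/* Problematic conversion: assigning the int parameter directly to a bool
 * turns the "unset" default of -1 into true. */
static bool leave_pinned_naive(int param_value)
{
    return param_value;
}

/* Conversion described in the message: only values >= 1 enable the
 * feature, so -1 ("let someone else turn it on") and 0 both mean off. */
static bool leave_pinned_checked(int param_value)
{
    return param_value >= 1;
}

/* Small self-check showing the difference for the default value of -1. */
static void check_leave_pinned(void)
{
    assert(leave_pinned_naive(-1) == true);
    assert(leave_pinned_checked(-1) == false);
    assert(leave_pinned_checked(1) == true);
}
```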

This commit was SVN r19942.
2008-11-06 21:22:07 +00:00
George Bosilca
b2227ebd37 Update the comment to be simpler to understand. Change the name of the variables
to pinpoint the reason why they are there.

This commit was SVN r19940.
2008-11-06 00:00:15 +00:00
Terry Dontje
0f4a1a26fa Forgot one more file for ref #1644.
This commit was SVN r19935.
2008-11-05 20:39:53 +00:00
Terry Dontje
cd2c83932d This commit fixes trac:1644.
This commit was SVN r19934.

The following Trac tickets were found above:
  Ticket 1644 --> https://svn.open-mpi.org/trac/ompi/ticket/1644
2008-11-05 20:30:34 +00:00
Ralph Castain
25491628b8 Discovered while documenting the "preconnect" mca params that several of them didn't make sense any more. After chatting with Jeff, we agreed to the following:
1. register "mpi_preconnect_all" as a deprecated synonym for "mpi_preconnect_mpi"

2. remove "mpi_preconnect_oob" and "mpi_preconnect_oob_simultaneous" as these are no longer valid.

3. remove the routed framework's "warmup_routes" API. With the removal of the direct routed component, this function at best only wasted communications. The daemon routes are completely "warmed up" during launch, so having MPI procs order the sending of additional messages is simply wasteful.

4. remove the call to orte_routed.warmup_routes from MPI_Init. This was the only place it was used anyway.

The FAQs will be updated to reflect this changed situation, and a CMR filed to move this to the 1.3 branch.

This commit was SVN r19933.
2008-11-05 19:41:16 +00:00
Rolf vandeVaart
cad49da72d Fix the tcp btl so it makes use of the btl_tcp_if_include and btl_tcp_if_exclude
parameters on the connecting side also.  Also move define of IF_NAMESIZE
into if.h file.  And lastly, add one verbose debug message which may be
useful if we run into other issues like this.

This commit fixes trac:1573.

This commit was SVN r19932.

The following Trac tickets were found above:
  Ticket 1573 --> https://svn.open-mpi.org/trac/ompi/ticket/1573
2008-11-05 18:45:42 +00:00
Jeff Squyres
84e30534a2 Update btl_<foo>_flags help message
This commit was SVN r19930.
2008-11-05 00:00:55 +00:00
George Bosilca
82d1d5d785 The patch for "Unexpected message queue for unknown CID's required" ticket #1460.
I'm unable to split it into two parts, my patch and Edgar's one. So I just updated the
copyright information for both of us.
What this patch does:
- it uses the unexpected queue created by commit r19562 to dispatch the
  unexpected message to the right communicator (once this communicator
  is created and initialized).
- delay the PML comm_add until we have the context_id for the new communicator.
- only do the PML comm_add on processes that really belong to the new
  communicator. Please read the lengthy comment in the source code for the
  reason behind this.

This commit was SVN r19929.

The following SVN revision numbers were found above:
  r19562 --> open-mpi/ompi@acd3406aa7
2008-11-04 21:58:06 +00:00
George Bosilca
cf96404075 All convertors with a zero length are considered as contiguous.
This commit was SVN r19913.
2008-11-04 16:52:06 +00:00
George Bosilca
d37706f6f8 Reset the variables to NULL after releasing the memory.
This commit was SVN r19912.
2008-11-04 16:49:46 +00:00
Terry Dontje
19cbe4567e This commit fixes trac:1635.
This commit was SVN r19911.

The following Trac tickets were found above:
  Ticket 1635 --> https://svn.open-mpi.org/trac/ompi/ticket/1635
2008-11-04 16:46:45 +00:00