1
1

5120 Коммитов

Автор SHA1 Сообщение Дата
Christopher Yeoh
768ea2bab0 fixes trac:2351 - race in use of ompi free lists
Adds memory barriers which are definitely needed on powerpc

This commit was SVN r22879.

The following Trac tickets were found above:
  Ticket 2351 --> https://svn.open-mpi.org/trac/ompi/ticket/2351
2010-03-25 03:38:14 +00:00
Christopher Yeoh
81e06a2baf fixes trac:2340 - race in mca_mpool_base_free
This commit was SVN r22878.

The following Trac tickets were found above:
  Ticket 2340 --> https://svn.open-mpi.org/trac/ompi/ticket/2340
2010-03-25 03:29:27 +00:00
Christopher Yeoh
0b93c87c2c Correct year for copyright notices
This commit was SVN r22877.
2010-03-25 03:14:21 +00:00
George Bosilca
c0ff44b9fe Don't let ROMIO mishandle the displacement for contiguous data with a non-zero
true_lb. Thanks to Pascal Deveze for the patch.

This commit was SVN r22864.
2010-03-23 01:23:45 +00:00
Matthias Jurenz
f01f70b8f6 - added missing header include
- moved warning message into an ifdef of OTF_VERBOSE

This commit was SVN r22860.
2010-03-22 13:08:13 +00:00
Shiqing Fan
66f1f1a69a Need to export this function for MPI C++ library on Windows.
This commit was SVN r22856.
2010-03-22 09:09:49 +00:00
Ralph Castain
e291fc2c69 With Jeff's help, get the libraries to link as required.
Update ompi_info and orte-info to include the new framework.

Fix some selection logic and a typo'd variable name

Still remains ompi_ignored until we complete testing

This commit was SVN r22848.
2010-03-18 02:12:59 +00:00
Matthias Jurenz
9aec91838b Fixed segfault which might occur if the application uses fork's
This commit was SVN r22847.
2010-03-17 11:51:04 +00:00
George Bosilca
1ed7fe5057 The mpool should take the same route as the rest of the pcie modules.
This commit was SVN r22844.
2010-03-17 04:16:23 +00:00
Ralph Castain
b400b84162 Merge in the modified thread configure option branch per today's telecon.
Remove the --enable-progress-threads option as this is no longer functional, and hardcode OPAL_ENABLE_PROGRESS_THREADS to 0.

Replace the --enable-mpi-threads option with --enable-mpi-thread-multiple as this is clearer as to meaning. This option automatically turns "on" opal thread support if it wasn't already so specified. If the user specifies --disable-opal-multi-threads --enable-mpi-thread-multiple, we will error out with a message

Add a new --enable-opal-multi-threads option that turns "on" opal thread support without doing anything wrt mpi-thread-multiple

This commit was SVN r22841.
2010-03-16 23:10:50 +00:00
Ralph Castain
4990cc41b6 Remove stale config file - the pcie btl has already been removed
This commit was SVN r22840.
2010-03-16 23:06:38 +00:00
Jeff Squyres
7b3ac4fb73 Refs trac:2273
After talking to both Brian and George, the conensus was to just
remove the flag and the test function.  Begone, evil spirits, BEGONE!

This commit was SVN r22831.

The following Trac tickets were found above:
  Ticket 2273 --> https://svn.open-mpi.org/trac/ompi/ticket/2273
2010-03-16 00:47:10 +00:00
Rainer Keller
814fb9399f - Further patches for support on NetBSD (and DragonFly) by
Aleksej Saushev.
   Dont use bash or bashism in shell scripts
   We should use Posix' setpgid(0,0), which is equivalent to setpgrp().

This commit was SVN r22829.
2010-03-15 05:33:42 +00:00
Jeff Squyres
bb314911b3 If we get OMPI_ERR_UNREACH from the PML, print a slightly more
specific error.  Suggested by Nick Edmonds: 

    http://www.open-mpi.org/community/lists/users/2010/03/12339.php

This commit was SVN r22828.
2010-03-14 00:09:55 +00:00
Rainer Keller
f6e4694d67 - Print the name correctly when a certain sync module is disabled
This should be cmr'd to v1.5 and v1.4.2 (but the svn post hook won't
   let me at the moment).

This commit was SVN r22827.
2010-03-13 21:07:34 +00:00
Josh Hursey
e9b5162d79 Fix the configure logic for --with-ft so that it properly takes a comma separated list.
Many of the OPAL_ENABLE_FT should be OPAL_ENABLE_FT_CR, so fix those.

The OPAL Layer INC should call opal_output on restart so that it can refresh the string it prints to reflect the current pid/hostname which may have changed.

This commit was SVN r22824.
2010-03-12 23:57:50 +00:00
Matthias Jurenz
86e58eb6d3 VT configure fixes:
- skip test for libdl on BlueGene? and CrayXT platforms (particularly on CrayXT this library can be linked but it isn't suitable)
- set cache variables for functions 'PMPI_Win_test', 'PMPI_Win_lock', 'PMPI_Win_unlock', and 'MPI_Register_datarep', if VT is configuring for Open MPI
- added test for pthread functions 'pthread_condattr_<set|get>pshared' and 'pthread_mutexattr_<set|get>pshared', because they are not available on some platforms
VT fixes:
- cut 'nm' collected symbol names at '??'
- vtunify:
        - fixed unsafe usage of some strncpy's
        - fixed potential segmentation fault in vtunify-mpi which might occur on 32bit platforms using MPICH2
OTF general:
- updated date in copyright header of each source file
OTF fixes:
- minor code cleanups (indentation, nicer error messages, more correct free's)
- otfaux:
        - fix to place final statistics after the very last record instead of right before
        - changed fatal error to a warning when a file is closed twice (or unexpectedly)

This commit was SVN r22820.
2010-03-12 11:03:45 +00:00
Josh Hursey
3db01f0795 Add the process name to the error message resulting from a failed mmap(), open(), or ftruncate() so that it is slightly easier to figure out which process in the system caused the problem with sm.
This commit was SVN r22803.
2010-03-10 00:18:04 +00:00
Samuel Gutierrez
15f9f35a49 Another small typo fix.
This commit was SVN r22802.
2010-03-09 21:23:21 +00:00
Samuel Gutierrez
dcb5a2331f Fixed some typos in comments.
This commit was SVN r22801.
2010-03-09 20:41:25 +00:00
Rainer Keller
0feb158aaf - Since r22727 orte_app_idx_t was introduced, being a uint32_t (was
previously an orte_std_cntr_t, which is int32_t).
   Comparison with < 0 don't make any sense, here.

This commit was SVN r22799.

The following SVN revision numbers were found above:
  r22727 --> open-mpi/ompi@2541aa98ab
2010-03-08 22:56:33 +00:00
Rainer Keller
06f5ba1c19 - Reverse the logic (OPAL_LIKELY -> OPAL_UNLIKELY)
This commit was SVN r22796.
2010-03-08 14:00:59 +00:00
Jeff Squyres
95d7e08a66 More more discussion and testing has occurred off-ticket.
Short version: there is a bug in OS X/Snow Leopard, but there is also
a bug in Open MPI.  Fixing the bug in Open MPI is both trivial (a
1-line change) and avoids the bug in OS X.  We'll file an OS X bug
report upstream with Apple, but it should no longer affect us here in
OMPI.

Fixes trac:2039.

More details:

Some background first: 

 1. IPv4 sockets can only accept incoming IPv4 connections.  However,
    IPv6 sockets can be configured to accept ''only'' incoming IPv6
    connection, or ''both'' incoming IPv4 and IPv6 connections.  An
    IPv6 socket attribute sets which listening behavior is used.
 1. IPv4 and IPv6 have different port namespaces.  Hence, it is
    permissable to bind a v4 socket to port X ''and'' also bind a v6
    socket to that same port X on the same interface (assuming that
    the v6 socket is only accepting incoming v6 connections).
    Incoming v4 connections to port X on the interface should get
    matched to the listening v4 socket; incoming v6 connections should
    get matched to the listening v6 socket.
 1. When v6 sockets accept ''both'' incoming v4 and v6 connections, it
    should claim port X in both namespaces.
 1. Linux's default behavior is to only allow one listening socket to
    be bound to a given port (i.e., ''either'' a v6 or v4 socket to be
    bound to a single port X -- not both).  A v6 socket can listen for
    both v4 and v6 incoming connections on that port, but still --
    only one socket will be bound to that port.
 1. Snow Leopard's default behavior is to share ports -- i.e., let
    both a v4 and a v6 listening socket to be bound to port X
    (assuming that the v6 socket is only accepting incoming v6
    connections).

The TCP BTL creates a listening socket for each address family.
Hence, it creates a v4 listening socket on INADDR_ANY and a v6
listening socket on the v6 equivalent of INADDR_ANY.  OMPI then
iteratively tries to find ports to listen on within the range of
[mca_btl_tcp_port_min, mca_btl_tcp_port_min + mca_btl_tcp_port_range).

On Linux, the v4 socket will be bound to port X and the v6 socket will
likely be bound to port Y (where X != Y).  On Snow Leopard, the v4
socket will be bound to port X and the v6 socket may ''also'' be bound
to port X.  Since the namespaces are separate, this shouldn't be a
problem.

However, Open MPI was accidentally setting the v6 listening behavior
to accept ''both'' v4 and v6 incoming connections.  This is a trivial
thing to fix -- change a 0 to a 1 in the code.  On Linux, this issue
didn't matter because the v4 and v6 sockets were on different ports.
So even though the v6 socket ''would'' have accepted incoming v4
connections, that never happened because OMPI would direct v4
connections to the v4 port.

But on Snow Leopard, the v4 and v6 listening ports could end up
sharing the same port number.  As mentioned above, this ''shouldn't''
have been a problem, but it looks like Snow Leopard has the following
bugs:

 * If a v4 socket is already bound to port X, we're pretty sure that a
   v6 socket with the "accept both v4 and v6 incoming connections"
   listening behavior should not be able to claim port X (because
   there's already a v4 socket listening on X).  However, Snow Leopard
   would allow binding a v4 socket to port X, and then allow a v6
   socket configured to allow incoming v4 and v6 connections to
   ''also'' be bound to port X.
 * After binding the v6 socket to port X, Snow Leopard then lets
   ''another'' v4 socket ''also'' get bound to port X.  Hence, there's
   now '''three''' sockets all listening on port X.

This obviously led to mis-matched TCP connections, and things went
downhill from there.

That being said, Snow Leopard doesn't exhibit this behavior if v6
sockets only allow incoming v6 connections.  And technically, that is
exactly the behavior we want (we want v6 sockets to only accept
incoming v6 connections).  So if we just change the flag to make our
v6 listening socket us this behavior, the problem on OS X goes away.

That's what this commit does -- it changes a 0 to a 1, indicating
"only let this v6 socket allow incoming v6 connections."

That was simple, wasn't it?

This commit was SVN r22788.

The following Trac tickets were found above:
  Ticket 2039 --> https://svn.open-mpi.org/trac/ompi/ticket/2039
2010-03-05 17:37:57 +00:00
Matthias Jurenz
75d71239d1 Fixed bug in parsing nm-file:
Do not trigger a parse error if address is out of range. Ignore symbol instead.

This commit was SVN r22778.
2010-03-04 16:03:53 +00:00
Shiqing Fan
4c1fc87502 Set the compile flags for F77 on Windows more correctly.
This commit was SVN r22774.
2010-03-04 11:41:42 +00:00
Matthias Jurenz
5b9515225d - fixed stack shutdown if maximum number of buffer flushes was reached
- fixed potential stack underflow in vtfilter which might be cause a segmentation fault

This commit was SVN r22773.
2010-03-04 08:08:20 +00:00
Iain Bason
18d9e96301 Fixed two problems:
1. The code that looks at btl_tcp_if_exclude before doing a
   modex_send uses strcmp rather than strncmp. That means that
   "lo0" gets sent even though "lo" is excluded.

2. The code that determines whether a particular local TCP
   interface can connect to a particular remote interface doesn't
   check for loopback interfaces. With this fix, users can now
   enable "lo" and be assured that it will only be used for intra-
   node communication.

This commit was SVN r22762.
2010-03-03 15:51:15 +00:00
George Bosilca
ec7fcf3f91 While building the profiling interface, ignore the
I/O functions if support for I/O is not requested.

This commit was SVN r22761.
2010-03-02 21:05:04 +00:00
Ralph Castain
c88fe1ea54 Create a new mca parameter to control creation of session directories. Defaults to true so that the current behavior of always creating them is preserved. If set to false (0), then don't create session directories. Helps in those environments where session directories are a problem.
Tell the sm btl that it cannot run if no session directories were created.

This commit was SVN r22756.
2010-03-02 15:18:33 +00:00
Matthias Jurenz
5f368a094f Restored support for Automake's silent rules
This commit was SVN r22741.
2010-03-01 13:10:27 +00:00
Matthias Jurenz
157942809c Use more portable 'nm' command instead of the BFD library to collect symbol information for instrumentation with the GNU, Intel, and PathScale
This commit was SVN r22737.
2010-03-01 12:20:41 +00:00
Ralph Castain
f4c3cceb5e Get the function prototypes to match so we eliminate an annoying warning
This commit was SVN r22726.
2010-02-27 16:41:16 +00:00
Jeff Squyres
b0eaebf46f Add Intel's OUI.
This commit was SVN r22723.
2010-02-26 19:54:16 +00:00
Rolf vandeVaart
2715141f6d Fix minor bug in the way we handle btl_tcp_if_include list.
This commit was SVN r22722.
2010-02-26 18:08:04 +00:00
Shiqing Fan
e1c009932b Add a few more fortran compile flags, and enable dynamic build for f77 library now.
This commit was SVN r22720.
2010-02-26 07:55:32 +00:00
Jeff Squyres
2e91de0bdd This has bugged me for a long, long time: rename btl_openib_iwarp.* ->
btl_openib_ip.*.  The routines in these files are not specific to
iwarp -- they are specific to IP interfaces used with IBV devices
(even IB or IBoE/RoCEE/whatever devices).

This commit was SVN r22718.
2010-02-25 21:04:09 +00:00
Jeff Squyres
a4a81698c2 Mostly a patch from Vasily/Mellanox to fix multi-port and 32/64 bit
issues with iwarp.c.  These fixes are needed for IBoE / ROCEE /
whateveritscalledtoday.  I added a few minor changes to his base
patch.

This commit was SVN r22717.
2010-02-25 20:57:05 +00:00
Eugene Loh
316892b49f Fix spelling of "degradation".
This commit was SVN r22714.
2010-02-25 19:41:59 +00:00
Pavel Shamis
9fbfe6b1c0 The fix resolves the bug #2307. QP creation may fail, since the calculation for _reserved_ does not check for QP type. As result the max_recv_wr may get wrong value . Needs to go to both cmr:v1.4.2 and cmr:v1.5.0
This commit was SVN r22713.
2010-02-25 11:15:20 +00:00
Ralph Castain
18c7aaff08 Update the grpcomm framework to be more thread-friendly.
Modify the orte configure options to specify --enable-multicast such that it directs components to build or not instead of littering the code base with #if's. Remove those #if's where they used to occur.

Add a new grpcomm "mcast" module to support multicast operations. Still some work required to properly perform daemon collectives for comm_spawn operations. New module only builds when --enable-multicast is provided, and when specifically selected.

This commit was SVN r22709.
2010-02-25 01:11:29 +00:00
Jeff Squyres
dd4945c194 New part ID's from Chelsio and Intel. May still get more from
Chelsio. 

This commit was SVN r22708.
2010-02-24 20:39:40 +00:00
Jeff Squyres
af6f1f4b00 Add pkg-config(1) config files to Open MPI. Additionally, fix a minor
bug: libmpi_f90 had libmpi.la in its LIBADD instead of libmpi_f77.la.

Fixes trac:2244.

This commit was SVN r22704.

The following Trac tickets were found above:
  Ticket 2244 --> https://svn.open-mpi.org/trac/ompi/ticket/2244
2010-02-24 18:46:06 +00:00
Pavel Shamis
99ee62771d The fix resolves bug #2292. We may to call for prepare_device_for_use() only after adding the btl to mca_btl_openib_component.openib_btls. Needs to go to both cmr:v1.4.2 and cmr:v1.5.0
This commit was SVN r22702.
2010-02-24 10:13:06 +00:00
Christopher Yeoh
774a7a58b0 Fixes case where there is unprotected access to
mca_osc_rdma_component.c_modules in ompi_osc_rdma_windx_to_module
Fixes case where there is unprotected access to
mca_osc_rdma_component.c_modules in ompi_osc_rdma_windx_to_module

This commit was SVN r22700.
2010-02-24 01:28:37 +00:00
Jeff Squyres
d9b6b5af0c This commit converts us to the "one big libmpi" scheme that has been
discussed extensively.  See
https://svn.open-mpi.org/trac/ompi/ticket/2092 and the RFC thread
http://www.open-mpi.org/community/lists/devel/2010/02/7447.php.

Specifically:

 * Create LT convenience libraries for OPAL and ORTE if the layer
   above them is being created (use the already-defined
   AM_CONDITIONALs to know if the project above us is being built).
 * ORTE slurps in the LT convenience library for OPAL; OMPI slurps in
   the LT convenience library for ORTE.
 * Wrapper compilers now only -l one library (e.g., ortecc only does
   -lopen-ret, and mpicc only does -lmpi).

This commit was SVN r22691.
2010-02-23 22:20:01 +00:00
Jeff Squyres
5ec2d8764b Amendment to r22671: change the name of the new communicator flag from
INTERNAL to EXTRA_RETAIN, because not all "internal" communicators
have this flag set (only internal communicators with CIDs less than
their parent).  Hence, what this flag ''really'' means is that there
was an extra RETAIN performed on it.  So name the flag just that --
EXTRA_RETAIN -- indicating that an extra RETAIN has occurred.

This commit was SVN r22690.

The following SVN revision numbers were found above:
  r22671 --> open-mpi/ompi@61dee816db
2010-02-23 21:24:07 +00:00
Jeff Squyres
583394e30b This help message got a little jumbled.
This commit was SVN r22689.
2010-02-23 21:09:16 +00:00
Christopher Yeoh
f79263550c This fixes trac:2265 removing a race in the openib btl endpoint when
increasing sequence numbers. cmr:v1.4

This commit was SVN r22684.

The following Trac tickets were found above:
  Ticket 2265 --> https://svn.open-mpi.org/trac/ompi/ticket/2265
2010-02-23 12:46:06 +00:00
Christopher Yeoh
c1dcf1c164 The release of memory used by registration lists in rcaches must be delayed until the rcache lock is not held or deadlock
can occur ( fixes trac:2111 ).
Should not deregister memory with the rcache lock held otherwise a deadlock can occur as the lower
level infiniband libraries can free memory ( fixes trac:2110 )

cmr:v1.4

This commit was SVN r22683.

The following Trac tickets were found above:
  Ticket 2110 --> https://svn.open-mpi.org/trac/ompi/ticket/2110
  Ticket 2111 --> https://svn.open-mpi.org/trac/ompi/ticket/2111
2010-02-23 11:31:58 +00:00
Christopher Yeoh
322e73d8c4 The ib_procs list in the openib btl is accessed without the ib lock in some cases. This causes races when running multithreaded. This patch adds protection of the ib_procs list with the ib_lock.
fixes trac:2149 cmr:v1.4

This commit was SVN r22682.

The following Trac tickets were found above:
  Ticket 2149 --> https://svn.open-mpi.org/trac/ompi/ticket/2149
2010-02-23 05:19:03 +00:00