
1555 Commits

Author SHA1 Message Date
Shiqing Fan
b8555448b5 Remove the unnecessary/duplicated unistd.h.
This commit was SVN r22346.
2009-12-28 16:22:16 +00:00
Shiqing Fan
d0f85beaf3 Correctly include those header files.
This commit was SVN r22344.
2009-12-28 16:13:06 +00:00
Shiqing Fan
90e3092ce5 Fix a type cast.
This commit was SVN r22343.
2009-12-28 16:12:46 +00:00
George Bosilca
e127b20038 Correct a type in the name of the help string.
This commit was SVN r22336.
2009-12-21 19:13:25 +00:00
Vasily Filipov
897b7c0aa8 Fix orte_show_help message type error.
This commit was SVN r22321.
2009-12-16 14:11:43 +00:00
Vasily Filipov
e73274f9a9 Disabling SRQ limit event for devices that don't support this feature.
This commit was SVN r22320.
2009-12-16 14:05:35 +00:00
Vasily Filipov
87e71b26fe Jeff Squyres fixes
This commit was SVN r22319.
2009-12-16 10:23:58 +00:00
George Bosilca
b3d3a8e7b3 Remove useless lines.
This commit was SVN r22316.
2009-12-15 23:55:14 +00:00
George Bosilca
b85c3ca081 Enable support for the INRIA
knem (http://runtime.bordeaux.inria.fr/knem/) kernel device. This
is part of Ma Teng's work on Open MPI.

This commit was SVN r22315.
2009-12-15 23:34:09 +00:00
Vasily Filipov
c036c6ef95 Adding support for on-demand SRQ pre-post (receive wqe allocation)
This commit was SVN r22313.
2009-12-15 15:52:10 +00:00
Vasily Filipov
354bfe527f Improving support for non homogeneous OpenFabrics network configurations
This commit was SVN r22312.
2009-12-15 14:25:07 +00:00
Pavel Shamis
4d02aea54c Enabling, by default, RDMACM connection manager for RDMAoE devices
This commit was SVN r22311.
2009-12-15 13:52:19 +00:00
Christopher Yeoh
d5253aa0f1 Fixes multithread race which causes corruption of no_credits_pending_frags
list in the ib btl. See #2128 for details.

This commit was SVN r22298.
2009-12-14 01:41:45 +00:00
Eugene Loh
8177d91835 Minor change so that if the number of shared-memory FIFOs is greater
than can be used (e.g., number of on-node peers), that no additional
room is set aside for those FIFOs that will never be created.  This
makes it easier to have dedicated FIFOs:  just set btl_sm_num_fifos
to be very large rather than setting it to be the local number of
procs.  In practice, we ask for extra headroom anyhow, so this change
generally won't matter.

This commit was SVN r22291.
2009-12-10 19:28:39 +00:00
Pavel Shamis
b024aee10c Removing unused lists from mca_btl_openib_qp_info_t. The lists were moved to device.
This commit was SVN r22271.
2009-12-07 17:42:09 +00:00
Pavel Shamis
7d46985096 Removing unneeded spaces
This commit was SVN r22246.
2009-12-01 11:15:40 +00:00
Pavel Shamis
75a48f4b3c Bugfix for possible race in rdmacm_destroy_dummy_qp
This commit was SVN r22245.
2009-12-01 08:09:43 +00:00
Shiqing Fan
7cf427c39b Include the missing thread header, which is needed when building with --enable-progress-thread.
This commit was SVN r22239.
2009-11-27 14:49:24 +00:00
Rainer Keller
276b813f48 - Output according to their type.
This commit was SVN r22206.
2009-11-09 14:28:15 +00:00
Rainer Keller
366bd96c88 - Allow working without the xt-catamount module on Jaguar,
reducing the number of components that until now needed to be
   deselected.

This commit was SVN r22205.
2009-11-09 14:26:24 +00:00
Eugene Loh
88c0921c5e Corrected the usage of "rc" in mca_btl_sm_component_progress.
The return code for this function should be the number of events
received.

This commit was SVN r22191.
2009-11-04 03:10:35 +00:00
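
The return-value convention described in r22191 can be illustrated with a minimal sketch. Everything here is hypothetical (`sketch_fifo_t`, `sketch_progress`); it is not the code of mca_btl_sm_component_progress, only an example of a progress function whose return value is the number of events it handled rather than a status code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a receive FIFO: just a count of pending fragments. */
typedef struct {
    int pending;
} sketch_fifo_t;

/* Poll every FIFO once and return how many events were handled.
 * The return value is a count, not a success/error status. */
static int sketch_progress(sketch_fifo_t *fifos, size_t nfifos)
{
    int nevents = 0;

    for (size_t i = 0; i < nfifos; ++i) {
        while (fifos[i].pending > 0) {
            fifos[i].pending--;   /* "handle" one fragment */
            nevents++;
        }
    }
    return nevents;
}
```

A caller can then use a return value of 0 to mean that nothing arrived during this poll.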
Eugene Loh
1a44fc478d In sm_btl_first_time_init(), when we figure the size of the shared
area, we cap the size at LONG_MAX.  But we are figuring out how much
we need.  So, if that amount exceeds LONG_MAX, we should return an
"out of resource" error code.

This commit was SVN r22172.
2009-10-29 23:06:32 +00:00
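
The idea in r22172 (report an error instead of silently capping the size) can be sketched as follows. The function name, the error constants and the "per-process size times process count" formula are assumptions made for illustration; the real computation in sm_btl_first_time_init() is not shown in the commit message.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical status codes for this sketch only. */
#define SKETCH_SUCCESS              0
#define SKETCH_ERR_OUT_OF_RESOURCE (-2)

/* Compute per_proc * nprocs as a long. If the required amount would
 * exceed LONG_MAX, return an error rather than capping the result. */
static int sketch_shared_area_size(size_t per_proc, size_t nprocs, long *out)
{
    if (per_proc != 0 && nprocs > (size_t)LONG_MAX / per_proc) {
        return SKETCH_ERR_OUT_OF_RESOURCE;
    }
    *out = (long)(per_proc * nprocs);
    return SKETCH_SUCCESS;
}
```

Checking with a division before multiplying avoids relying on the wrapped result of an overflowing product.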
Jeff Squyres
0f8ac9223f Refs trac:2023, #2027.
This commit does a bunch of things:

 * Address all remaining code review items from CMR #2023:

   * Defer mmap setup to be lazy; only set it up the first time we
     invoke a collective.  In this way, we don't penalize apps that
     make lots of communicators but don't invoke collectives on them
     (per #2027).
   * Remove the extra assignments of mca_coll_sm_one (fixing a
     convertor count setup that was the real problem).
   * Remove another extra/unnecessary assignment.
   * Increase libevent polling frequency when using the RML to
     bootstrap mmap'ed memory.
   * Fix a minor procs-related memory leak in btl_sm.
 * Commit a datatype fix that George and I discovered along the way to
   fixing the coll sm.
 * Improve error messages when mmap fails, potentially trying to
   de-alloc any allocated memory when that happens.
 * Fix a previously-unnoticed confusion between extent and true_extent
   in coll sm reduce.

This commit was SVN r22049.

The following Trac tickets were found above:
  Ticket 2023 --> https://svn.open-mpi.org/trac/ompi/ticket/2023
2009-10-02 17:13:56 +00:00
Josh Hursey
59143be39d Fix a minor C/R bug related to cleaning up session directories when sm is present.
Before this, we would restore the topmost old session directory. This commit makes sure that we remove it when we are done with it.

This commit was SVN r21971.
2009-09-17 14:43:06 +00:00
Jeff Squyres
4a40be650e Improve the MCA param help messages for btl_tcp_if_in|exclude.
This commit was SVN r21968.
2009-09-15 17:19:57 +00:00
Jeff Squyres
533633b8cb Fixes trac:1988. The little bug that turned out to be huge. Yoinks.
* Various cosmetic/style updates in the btl sm
 * Clean up concept of mpool module (I think that code was written way
   back when the concept of "modules" was fuzzy)
 * Bring over some old fixes from the /tmp/timattox-sm-coll/ tree to
   fix potential segv's when mmap'ed regions were at different
   addresses in different processes (thanks Tim!).
 * Change sm coll to no longer use mpool as its main source of shmem;
   rather, just mmap its own segment (because it's fixed size --
   there was nothing to be gained by using mpool; shedding the use of
   mpool saved a lot of complexity in the sm coll setup).  This
   effectively made Tim's fixes moot (because now everything is an
   offset into the mmap that is computed locally; there are no global
   pointers).  :-)
 * Slightly updated common/sm to allow making mmap's for a specific
   set of procs (vs. ''all'' procs in the process).  This potentially
   allows for same-host-inter-proc mmaps -- yay!
 * Fixed many, many things in the coll sm (particularly in reduce):
   * Fixed handling of MPI_IN_PLACE in reduce and allreduce
   * Fixed handling of non-contiguous datatypes in reduce
   * Changed the order of reductions to go from process (n-1)'s data
     to process 0's data, because that's how all other OMPI coll
     components work
   * Fixed lots of usage of ddt functions
   * When using a non-contiguous datatype, if the root process is not
     (n-1), now we used a 2nd convertor to copy from shmem to the rbuf
     (saves a memory copy vs. what was done before)
   * Lots and lots of little cleanups, clarifications, and minor
     optimizations (although still more could be done -- e.g., I think
     the use of write memory barriers is fairly sub-optimal; they
     could be ganged together at the root, for example)

I'm marking this as "fixes trac:1988" and closing the ticket; if something
is still broken, we can re-open the ticket.

This commit was SVN r21967.

The following Trac tickets were found above:
  Ticket 1988 --> https://svn.open-mpi.org/trac/ompi/ticket/1988
2009-09-15 00:25:21 +00:00
Lenny Verkhovsky
796b765952 Fixed finding the minimum distance to ibv_device,
thanks to Pasha.

This commit was SVN r21916.
2009-08-31 07:54:22 +00:00
Rainer Keller
8e1b23779f - Replace combinations of
      #if defined (c_plusplus) || defined (__cplusplus)
   followed by
      extern "C" {
   and the closing counterpart by BEGIN_C_DECLS and END_C_DECLS.

   Notable exceptions are:
    - opal/include/opal_config_bottom.h:
      This is our generated code, which itself defines BEGIN_C_DECLS and
      END_C_DECLS
    - ompi/mpi/cxx/mpicxx.h:
      Here we do not include opal_config_bottom.h
    - Belongs to external code:
      opal/mca/backtrace/darwin/MoreBacktrace/MoreDebugging/MoreBacktrace.c
      opal/mca/backtrace/darwin/MoreBacktrace/MoreDebugging/MoreBacktrace.h
    - opal/include/opal/prefetch.h:
      Has C++ specific macros that are protected

    - Had #if ... } #endif  _and_ END_C_DECLS (aka end up with 2x
      END_C_DECLS)
      ompi/mca/btl/openib/btl_openib.h
    - opal/event/event.h has #ifdef __cplusplus as BEGIN_C_DECLS...
    - opal/win32/ompi_process.h: had extern "C"\n {...
      opal/win32/ompi_process.h: ditto
    - ompi/mca/btl/pcie/btl_pcie_lex.l: needed to add *_C_DECLS
      ompi/mpi/f90/test/align_c.c: ditto
    - ompi/debuggers/msgq_interface.h: used #ifdef __cplusplus
    - ompi/mpi/f90/xml/common-C.xsl: Amend

   Tested on linux using --with-openib and --with-mx

   The following do not contain either opal_config.h, orte_config.h or
   ompi_config.h
   (but possibly other header files, that include one of the above):
      ompi/mca/bml/r2/bml_r2_ft.h
      ompi/mca/btl/gm/btl_gm_endpoint.h
      ompi/mca/btl/gm/btl_gm_proc.h
      ompi/mca/btl/mx/btl_mx_endpoint.h
      ompi/mca/btl/ofud/btl_ofud_endpoint.h
      ompi/mca/btl/ofud/btl_ofud_frag.h
      ompi/mca/btl/ofud/btl_ofud_proc.h
      ompi/mca/btl/openib/btl_openib_mca.h
      ompi/mca/btl/portals/btl_portals_endpoint.h
      ompi/mca/btl/portals/btl_portals_frag.h
      ompi/mca/btl/sctp/btl_sctp_endpoint.h
      ompi/mca/btl/sctp/btl_sctp_proc.h
      ompi/mca/btl/tcp/btl_tcp_endpoint.h
      ompi/mca/btl/tcp/btl_tcp_ft.h
      ompi/mca/btl/tcp/btl_tcp_proc.h
      ompi/mca/btl/template/btl_template_endpoint.h
      ompi/mca/btl/template/btl_template_proc.h
      ompi/mca/btl/udapl/btl_udapl_eager_rdma.h
      ompi/mca/btl/udapl/btl_udapl_endpoint.h
      ompi/mca/btl/udapl/btl_udapl_mca.h
      ompi/mca/btl/udapl/btl_udapl_proc.h
      ompi/mca/mtl/mx/mtl_mx_endpoint.h
      ompi/mca/mtl/mx/mtl_mx.h
      ompi/mca/mtl/psm/mtl_psm_endpoint.h
      ompi/mca/mtl/psm/mtl_psm.h
      ompi/mca/pml/cm/pml_cm_component.h
      ompi/mca/pml/csum/pml_csum_comm.h
      ompi/mca/pml/dr/pml_dr_comm.h
      ompi/mca/pml/dr/pml_dr_component.h
      ompi/mca/pml/dr/pml_dr_endpoint.h
      ompi/mca/pml/dr/pml_dr_recvfrag.h
      ompi/mca/pml/example/pml_example.h
      ompi/mca/pml/ob1/pml_ob1_comm.h
      ompi/mca/pml/ob1/pml_ob1_component.h
      ompi/mca/pml/ob1/pml_ob1_endpoint.h
      ompi/mca/pml/ob1/pml_ob1_rdmafrag.h
      ompi/mca/pml/ob1/pml_ob1_recvfrag.h
      ompi/mca/pml/v/pml_v_output.h
      opal/include/opal/prefetch.h
      opal/mca/timer/aix/timer_aix.h
      opal/util/qsort.h
      test/support/components.h

This commit was SVN r21855.

The following SVN revision numbers were found above:
  r2 --> open-mpi/ompi@58fdc18855
2009-08-20 11:42:18 +00:00
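
For readers unfamiliar with the pattern that r21855 consolidates, here is a minimal sketch of such a macro pair. The definitions below are written from scratch for illustration and are not copied from opal_config_bottom.h; `sketch_add` is a hypothetical function.

```c
/* Illustrative definitions: expand to extern "C" { ... } only when the
 * header is compiled as C++, and to nothing when compiled as C. */
#ifndef BEGIN_C_DECLS
#  if defined(c_plusplus) || defined(__cplusplus)
#    define BEGIN_C_DECLS extern "C" {
#    define END_C_DECLS   }
#  else
#    define BEGIN_C_DECLS
#    define END_C_DECLS
#  endif
#endif

BEGIN_C_DECLS

/* Hypothetical declaration that must keep C linkage when seen from C++. */
int sketch_add(int a, int b);

END_C_DECLS

int sketch_add(int a, int b)
{
    return a + b;
}
```

Using the macro pair keeps each header free of a repeated preprocessor conditional and makes a mismatched opening or closing brace easier to spot.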
Ralph Castain
ded58ae483 Silence some compiler warnings about print statements
This commit was SVN r21814.
2009-08-13 13:45:38 +00:00
Rainer Keller
02a39a208d - Patch r18658 introduced NUMA awareness and memory affinity for
BTL/sm. This static variable needlessly ends up in the .so file.
   init_maffinity is called once from sm_btl_first_time_init.

   Checked with lennyve, static here is not necessary.

This commit was SVN r21813.

The following SVN revision numbers were found above:
  r18658 --> open-mpi/ompi@f4811d6c4d
2009-08-13 13:08:39 +00:00
Shiqing Fan
bce2f44154 Update related .windows files with proper compiling properties, in order to have a successful DSO build.
This commit was SVN r21805.
2009-08-12 08:55:58 +00:00
Pavel Shamis
31a88b149a Fixing thread deadlock flow in openib btl (mpi-thread enabled mode)
This commit was SVN r21793.
2009-08-11 10:43:52 +00:00
Rainer Keller
6050020c54 - Use OMPI_SUCCESS.
Fails to compile in environments with --disable-mpi

This commit was SVN r21785.
2009-08-10 17:46:25 +00:00
Steve Wise
ed39853f41 add new device ids for the Chelsio T3 RNIC
This commit was SVN r21774.
2009-08-07 14:14:08 +00:00
Rolf vandeVaart
c82e468ede Undo revision r21767 - sorry folks
This commit was SVN r21769.

The following SVN revision numbers were found above:
  r21767 --> open-mpi/ompi@41f38110ff
2009-08-05 22:23:26 +00:00
Rolf vandeVaart
41f38110ff HCA failover support in openib BTL
This commit was SVN r21767.
2009-08-05 21:53:02 +00:00
George Bosilca
0bf381e931 This patch tries to solve an issue on Leopard. The supposedly global
variables that are not initialized and are declared in a file that
doesn't export any globally visible function are marked as
non-initialized constants, i.e. uninitialized common symbols. For some
obscure reason, they get removed from the object files on Mac OS X.

So far I have found two solutions to this problem. One requires the addition
of "-c" to the linker command; the second one (corresponding to this
patch) forces them to become common initialized symbols.

This commit was SVN r21739.
2009-07-28 17:06:16 +00:00
Avneesh Pant
38e48d4e2f Add support for MCA parameters for PSM MTL to specify IB unit, port, IB service level and PSM debug level to use. Also specify in the openib btl params file that QLogic hardware supports a max inlined messages size of 0 only.
This commit was SVN r21734.
2009-07-24 20:09:39 +00:00
George Bosilca
8275120656 Get rid of the ompi_convertor.h header file. Replace all references to ompi_convertor
by opal_convertor.
Cleanup the pcie BTL.

This commit was SVN r21703.
2009-07-16 19:13:30 +00:00
George Bosilca
2b57fc3835 Add the missing headers.
This commit was SVN r21701.
2009-07-16 18:42:14 +00:00
George Bosilca
3e971e61f3 The system headers are supposed to be protected by #ifdef and not by #if.
This commit was SVN r21700.
2009-07-16 18:27:33 +00:00
George Bosilca
d07ffedc54 No opal datatype functions in the BTL. The datatype attached to the
convertor is an ompi_datatype_t so calling the ompi level functions
is the way to go.

This commit was SVN r21698.
2009-07-16 18:25:08 +00:00
Jeff Squyres
9dc7f884b2 Fix yet another compile error from the great DDT split (r21641). Sigh.
This commit was SVN r21697.

The following SVN revision numbers were found above:
  r21641 --> open-mpi/ompi@6c5532072a
2009-07-16 18:08:03 +00:00
Rainer Keller
8243831d76 - Get OpenIB BTL to work with old libibverbs installation
Tested on smoky.

This commit was SVN r21685.
2009-07-15 16:12:47 +00:00
George Bosilca
2143424eb5 The MCA parameter should always be taken into account, independent of
how many networks are available on the node.

This commit was SVN r21652.
2009-07-13 19:40:00 +00:00
Rainer Keller
6c5532072a - Split the datatype engine into two parts: an MPI specific part in
OMPI
   and a language agnostic part in OPAL. The convertor is completely
   moved into OPAL.  This offers several benefits as described in RFC
   http://www.open-mpi.org/community/lists/devel/2009/07/6387.php
   namely:
    - Fewer basic types (int* and float* types, boolean and wchar)
    - Fixing naming scheme to ompi-nomenclature.
    - Usability outside of the ompi-layer.
 - Due to the fixed nature of simple opal types, their information is
   completely
   known at compile time and therefore constified
 - With fewer datatypes (22), the actual sizes of bit-field types may be
   reduced
   from 64 to 32 bits, allowing reorganizing the opal_datatype
   structure, eliminating holes and keeping data required in convertor
   (upon send/recv) in one cacheline...
   This has implications to the convertor-datastructure and other parts
   of the code.
 - Several performance tests have been run, the netpipe latency does not
   change with
   this patch on Linux/x86-64 on the smoky cluster.
 - Extensive tests have been done to verify correctness (no new
   regressions) using:
   1. mpi_test_suite on linux/x86-64 using clean ompi-trunk and
    ompi-ddt:
    a. running both trunk and ompi-ddt resulted in no differences
       (except for MPI_SHORT_INT and MPI_TYPE_MIX_LB_UB do now run
       correctly).
    b. with --enable-memchecker and running under valgrind (one buglet
       when run with static found in test-suite, committed)
   2. ibm testsuite on linux/x86-64 using clean ompi-trunk and ompi-ddt:
      all passed (except for the dynamic/ tests, which failed as on trunk/MTT)
   3. compilation and usage of HDF5 tests on Jaguar using PGI and
      PathScale compilers.
   4. compilation and usage on Scicortex.
 - Please note, that for the heterogeneous case, (-m32 compiled
   binaries/ompi), neither
   ompi-trunk, nor ompi-ddt branch would successfully launch.

This commit was SVN r21641.
2009-07-13 04:56:31 +00:00
Brian Barrett
2f3c0b4fcf Drain pipe from service thread to main thread during shutdown. By this
point, the event engine has been shut down until btl finalization is
done, so opal_progress in the wait loop is not an option - we have
to drain from inside the btl.

Clean up the looping structure for the finalize routine

Update copyrights.

This commit was SVN r21620.
2009-07-09 22:13:10 +00:00
Brian Barrett
ac34b1de69 RDMA CM doesn't retry if a packet is dropped; it just times out during route
discovery, and we don't recover.  Instead,
try to recover by retrying a couple of times.

This commit was SVN r21619.
2009-07-09 22:10:06 +00:00
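
The recovery strategy described in r21619 amounts to a bounded retry loop. The sketch below is generic and hypothetical: `sketch_retry`, the status codes and the `sketch_flaky` stand-in are invented for illustration and do not use the RDMA CM API.

```c
/* Hypothetical status codes for this sketch only. */
#define SKETCH_OK      0
#define SKETCH_TIMEOUT 1

/* One attempt at an operation that may time out (e.g. route discovery). */
typedef int (*sketch_attempt_fn)(void *ctx);

/* Run the attempt once, then retry up to max_retries more times,
 * but only while the failure is a timeout. */
static int sketch_retry(sketch_attempt_fn attempt, void *ctx, int max_retries)
{
    int rc = SKETCH_TIMEOUT;

    for (int tries = 0; tries <= max_retries; ++tries) {
        rc = attempt(ctx);
        if (rc != SKETCH_TIMEOUT) {
            break;              /* success or a non-retryable error */
        }
    }
    return rc;
}

/* Stand-in attempt: times out until the counter in ctx reaches zero. */
static int sketch_flaky(void *ctx)
{
    int *remaining = (int *)ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return SKETCH_TIMEOUT;
    }
    return SKETCH_OK;
}
```

Bounding the number of retries keeps a permanently unreachable peer from stalling the caller forever.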
George Bosilca
311e27b42f Pretty print an error message when the specified range of ports (for both
IPv4 and IPv6) is outside the legal boundaries. This fixes trac:1869.

This commit was SVN r21612.

The following Trac tickets were found above:
  Ticket 1869 --> https://svn.open-mpi.org/trac/ompi/ticket/1869
2009-07-07 17:52:30 +00:00
George Bosilca
4038834dfb Convert the port number in network order before binding the socket.
Thanks to Mariusz Mamonski (mamonski@man.poznan.pl) for the bug
report and patch.

This commit was SVN r21610.
2009-07-07 17:21:28 +00:00
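
The fix in r21610 concerns a common sockets pitfall: the port stored in a `struct sockaddr_in` must be in network byte order before `bind()` is called. A minimal sketch, assuming POSIX sockets; the helper name `sketch_fill_bind_addr` is hypothetical and this is not the TCP BTL code.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Fill an IPv4 wildcard address suitable for passing to bind().
 * host_port is given in host byte order; sin_port must hold it in
 * network byte order, hence the htons() conversion. */
static void sketch_fill_bind_addr(struct sockaddr_in *addr, uint16_t host_port)
{
    memset(addr, 0, sizeof(*addr));
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_ANY);
    addr->sin_port = htons(host_port);
}
```

On a big-endian host the conversion is a no-op, which is why a missing `htons()` often goes unnoticed until the code runs on a little-endian machine.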