1
1
Граф коммитов

4488 Коммитов

Автор SHA1 Сообщение Дата
Jeff Squyres
0ab48ad0d2 Fix some annoying flex warnings that have been there for years.
Many thanks to Tom Fogal for the initial patch.

cmr=v1.7.4:reviewer=rhc:subject=Fix annoying flex warnings

This commit was SVN r29904.
2013-12-14 00:36:12 +00:00
Rolf vandeVaart
b955dbd6d9 Fix various items discovered by review of ticket #3951.
This commit was SVN r29900.
2013-12-13 21:25:07 +00:00
Jeff Squyres
f4afa4fd1f Add missing include, exposed in "external libevent" work.
Refs trac:3694

This commit was SVN r29898.

The following Trac tickets were found above:
  Ticket 3694 --> https://svn.open-mpi.org/trac/ompi/ticket/3694
2013-12-13 21:21:30 +00:00
Jeff Squyres
bcfe2156d5 Bring over m4 quoting fix from v1.7 branch (in r29894) that was
discovered when removing some components.

This commit was SVN r29895.

The following SVN revision numbers were found above:
  r29894 --> open-mpi/ompi@58ed00296c
2013-12-13 20:27:33 +00:00
Ralph Castain
f763be26c4 Closes trac:2433. Check for hetero architecture and disqualify sm connections if that is found as the sm btl currently doesn't support hetero operations.
cmr=v1.7.4:reviewer=brbarret:subject=Disqualify sm btl for hetero procs

This commit was SVN r29882.

The following Trac tickets were found above:
  Ticket 2433 --> https://svn.open-mpi.org/trac/ompi/ticket/2433
2013-12-13 15:23:33 +00:00
Mike Dubman
fb3f94a16e remove debug print
Refs trac:3969

This commit was SVN r29876.

The following Trac tickets were found above:
  Ticket 3969 --> https://svn.open-mpi.org/trac/ompi/ticket/3969
2013-12-13 06:08:44 +00:00
Mike Dubman
21be95c9b5 Initialize sm global variables in mca_btl_sm_component_open(), because they are destructed in mca_btl_sm_component_close(), and init() function might not be called or fail.
For exammple, mca_btl_sm.knem_fd remained 0, and mca_btl_sm_component_close() ended up doing closing fd 0 which belongs to someone else.

fixed by Yossi, reviewed by miked
cmr=v1.7.4:reviewer=ompi-rm1.7

This commit was SVN r29875.
2013-12-13 06:01:24 +00:00
Jeff Squyres
bac67e0d81 Per discussion @Chicago OMPI dev meeting Dec 2013: remove all MX support.
This commit was SVN r29873.
2013-12-12 18:54:47 +00:00
Nathan Hjelm
3262080391 Cleanup udcm structures to avoid issues with nesting structures with
flexible members.

UDCM is ready to go for 1.7.4 with this patch.

cmr=v1.7.4:ticket=3940

This commit was SVN r29861.

The following Trac tickets were found above:
  Ticket 3940 --> https://svn.open-mpi.org/trac/ompi/ticket/3940
2013-12-12 05:24:37 +00:00
Nathan Hjelm
e0e94a6029 Fix warning caused by typo in r29815
This commit was SVN r29860.

The following SVN revision numbers were found above:
  r29815 --> open-mpi/ompi@d556b60b21
2013-12-11 21:45:39 +00:00
Nathan Hjelm
6ab69c758b Fix warnings in udcm.
cmr=v1.7.4:reviewer=rhc:ticket=3940

This commit was SVN r29859.

The following Trac tickets were found above:
  Ticket 3940 --> https://svn.open-mpi.org/trac/ompi/ticket/3940
2013-12-11 21:40:06 +00:00
Rolf vandeVaart
3ae88f8a24 Ensure no fork support with GDR. CUDA-aware code only.
This commit was SVN r29854.
2013-12-10 18:08:53 +00:00
Rolf vandeVaart
1cc55f305f Add extra check for GDR. Adjust some names and replace opal_output with opal_show_help.
This commit was SVN r29853.
2013-12-10 16:04:08 +00:00
Jeff Squyres
0f61bb651e Technically, the PORT_ACTIVE is not a "bad" event.
Note that this event should never happen within a single OMPI job,
because OMPI will ignore usnic ports that are down.  The PORT_ACTIVE
event should only occur if a port ''was'' down and is now ''up''.  But
what the heck -- if we ever do get this event, it is harmless -- just
ignore it.

This commit was SVN r29852.
2013-12-09 20:45:55 +00:00
Edgar Gabriel
c253c2eec6 fix the condition for the lazy open of shared filepointers.
This commit was SVN r29850.
2013-12-09 19:37:21 +00:00
Mike Dubman
9a65e0d8c6 cosmetic fixed fpr hcol autotools
Refs: #3694

This commit was SVN r29841.
2013-12-08 09:45:13 +00:00
Mike Dubman
2e124454b4 cosmitic fix to remove redundant -lfca
use CPP extra flags var which propagated to coll/fca and scoll/fca
Refs: #3694

This commit was SVN r29832.
2013-12-07 15:00:54 +00:00
Jeff Squyres
3bd9c603ff Clean up variables used in configure with OPAL_VAR_SCOPE.
This is helpful in the work for #3694: ensure that many places that
eventually end up in configure don't overly-pollute the global shell
variable space (because debugging accidental shell variable pollution
can be a real pain).

Refs trac:3694

This commit was SVN r29830.

The following Trac tickets were found above:
  Ticket 3694 --> https://svn.open-mpi.org/trac/ompi/ticket/3694
2013-12-06 23:40:34 +00:00
Rolf vandeVaart
d556b60b21 Chnage some CUDA configure code and macro names per review request by jsquyres in ticket #3880.
Functionally, nothing changes.

This commit was SVN r29815.
2013-12-06 14:35:10 +00:00
Nathan Hjelm
231ebb09c9 Update romio configury to remove a warning message.
cmr=v1.7.4:ticket=3158

This commit was SVN r29811.

The following Trac tickets were found above:
  Ticket 3158 --> https://svn.open-mpi.org/trac/ompi/ticket/3158
2013-12-06 00:12:35 +00:00
Dave Goodell
da26226e3c usnic: add some extra debug-build sanity checks
On the off chance that the PML is twiddling fields that it really
shouldn't be...

Reviewed-by: Reese Faucette <rfaucett@cisco.com>

This commit was SVN r29804.
2013-12-05 00:28:11 +00:00
Jeff Squyres
ba018b3603 Protect the container_of #define.
MOFED apparently has a /usr/include/infiniband/verbs.h that also
defines a (slightly different but fully compatible) container_of
macro.  So put proper #ifndef protection around our definition of
container_of.

Thanks to Rolf vandeVaart for pointing out the issue.

Reviewed by Dave Goodell.

cmr=v1.7.4:reviewer=ompi-rm1.7

This commit was SVN r29799.
2013-12-04 14:24:56 +00:00
Yossi Etigin
a913b00f89 mtl mxm: update configuration parsing api to mxm 2.1, drop
older version support (1.0 and 1.1), and cleanup the code.

reviewed by miked.

cmr=v1.7.4:reviewer=ompi-gk1.7

This commit was SVN r29797.
2013-12-04 09:11:55 +00:00
Jeff Squyres
c74c1e86d3 Per suggestion from Paul Kapinos, report in BTL verbosity if a device
is skipped because it is too far away.

(see thread starting here:
http://www.open-mpi.org/community/lists/devel/2013/06/12470.php)

This commit was SVN r29790.
2013-12-03 22:44:11 +00:00
Rolf vandeVaart
218c05a4d1 Make sure synchronous copies are complete before moving the data.
This commit was SVN r29789.
2013-12-03 21:20:14 +00:00
Rolf vandeVaart
ab77435d9b Fix the CUDA-aware case where we are not sending any GPU data.
This commit was SVN r29788.
2013-12-03 20:25:58 +00:00
Devendar Bureddy
4554770ee4 hcol fixes
cmr=v1.7.4:reviewer=jladd

This commit was SVN r29787.
2013-12-03 20:21:40 +00:00
Nathan Hjelm
fe327d9859 udcm: cleanup code and improve the ack handling
Originally udcm acks used the immediate data to indicate which message was
being acknowleged. This data was (mysteriously) junk when using QLogic HCAs so I
updated udcm to use the source info (slid, qp, etc) to determine which message was being
acked. This works as long as we don't have two messages simultaneously in flight
to a particular peer and then loose the first of the two messages. The chances of this
happening are tiny. To fix this case I updated the udcm message header to include
a pointer to the in flight message. This pointer is then sent back to the sending
process to ack receipt.

cmr=v1.7.4:ticket=trac:3940

This commit was SVN r29775.

The following Trac tickets were found above:
  Ticket 3940 --> https://svn.open-mpi.org/trac/ompi/ticket/3940
2013-12-02 20:18:46 +00:00
Jeff Squyres
3a7af4ab40 Fix another clang warning: sendreq is undefined if proc==NULL.
cmr=v1.7.4:reviewer=hjelmn:subject=fix ob1 undefined sendreq value

This commit was SVN r29774.
2013-12-02 19:44:42 +00:00
Nathan Hjelm
fb0b0442c4 openib/connect: re-enable xrc support in the openib btl
This commit updates the udcm cpc to support xrc. The steps followed by udcm
mimic those in the removed xoob cpc. This update has been tested with both XRC
and RC.

Mellanox, this is intended to go into 1.7.4. Please review carefully and let
me know if there are any issues.

cmr=v1.7.4:reviewer=miked

This commit was SVN r29767.
2013-11-27 22:28:04 +00:00
George Bosilca
cb24277737 Restrict the usage of MPI_Type_extent only to receiving processes
(aka the root). This commit is based on a patch provided by Pierre 
Jolivet.
Fix all the output to match the failing MPI call.

This commit was SVN r29761.
2013-11-27 12:09:31 +00:00
Rolf vandeVaart
aa98b0333b Call function from function table. Discovered during static build.
This commit was SVN r29755.
2013-11-25 22:46:07 +00:00
Ralph Castain
ac9820c46f Link against common cuda library
Thanks to Jorg Bornschein for pointing it out

cmr=v1.7.4:reviewer=rolfv

This commit was SVN r29750.
2013-11-24 17:06:51 +00:00
George Bosilca
68268377af Fix an error message for the igather and the usage of the extent on
non non-root processes for the iscatter. Thanks to Pierre
Jolivet for the bug report and the patch.

This commit was SVN r29736.
2013-11-23 00:59:22 +00:00
Edgar Gabriel
4f425872be fix the streams used in opal_output in the sharedfp components.
This commit was SVN r29726.
2013-11-21 16:11:49 +00:00
Devendar Bureddy
4a311ae9fd continue search sorted openib device list if no btls found with nearest HCA.
cmr=v1.7.4:reviewer=jladd

This commit was SVN r29725.
2013-11-20 22:23:12 +00:00
Nathan Hjelm
24a7e7aa34 Add support for the udreg registration cache and dynamics on XE/XK/XC.
To support the new mpool two changes were made to the mpool infrastructure:

 1) Added an mpool flag to indicate that an mpool does not need the memory
    hooks to use the leave pinned protocols. This flag is checked in the
    mpool lookup.

 2) Add a mpool context to the base registration. This new member is used
    by the udreg mpool to store the udreg context associated with the
    particular registration. The new member will not break the ABI
    compatibility as the new member is only currently used by the udreg
    mpool.

Dynamics support for Cray systems makes use of the global rank provided by
orte to give the ugni library a unique rank for each process. Dynamics
support is not available under direct-launch (srun.)

cmr=v1.7.4

This commit was SVN r29719.
2013-11-18 04:58:37 +00:00
Jeff Squyres
5206e877be Help decrease conflicts between SVN trunk and Cisco git branch of OMPI v1.6 branch
This commit was SVN r29715.
2013-11-15 21:35:56 +00:00
Jeff Squyres
e6ed7c9f4d Avoid trivial "don't mix declarations and code" compiler warning
This commit was SVN r29714.
2013-11-15 21:31:10 +00:00
Rolf vandeVaart
92e6aaa808 Adjust a default value. Adjust some levels of verbosity and one more debug message.
This commit was SVN r29712.
2013-11-14 21:47:27 +00:00
Ralph Castain
7480beb7f0 Per request from Nathan, add an offset value to the job struct so we can construct a "global rank" that spans multiple jobs during dynamic launch operations. Store a new ORTE_DB_GLOBAL_RANK value for each process in the database, and ensure that we share our own value during connect_accept so both sides can see it.
This isn't being used yet - just enabling Nathan to do what he needs.

***** NOTE: any use of the OMPI_DB_GLOBAL_RANK database key must be protected by #ifdef OMPI_DB_GLOBAL_RANK as not all RTE's will define this key. *****

This commit was SVN r29708.
2013-11-14 17:01:43 +00:00
Ralph Castain
22e30a680d Given that the oob and xoob cpc's are no longer operable and haven't been since the OOB update, remove them to avoid confusion
cmr:v1.7.4:reviewer=hjelmn:subject=Remove stale cpcs from openib

This commit was SVN r29703.
2013-11-14 04:16:53 +00:00
Nathan Hjelm
6b3cf0c1ba Merge branch 'romio_refresh'
This commit was SVN r29695.
2013-11-13 21:02:55 +00:00
Rolf vandeVaart
4964a5e98b Per this RFC from October 8, 2013 and as discuessed in telecon.
http://www.open-mpi.org/community/lists/devel/2013/10/13072.php

Add support for pinning GPU Direct RDMA in openib BTL for better small message latency of GPU buffers. 
Note that none of this is compiled in unless CUDA-aware support is requested.

This commit was SVN r29680.
2013-11-13 13:22:39 +00:00
Jeff Squyres
98ff91cfeb Refs trac:3091
Gah!  The "device" variable isn't used at all in this loop (my eye
glossed over the next line and thought that "device" was used in the
free() statement, but it's actually "devices" -- not "device").

This commit was SVN r29665.

The following Trac tickets were found above:
  Ticket 3091 --> https://svn.open-mpi.org/trac/ompi/ticket/3091
2013-11-12 23:01:04 +00:00
Jeff Squyres
7cb31111a6 Refs trac:3901
Feedback from Dave's review.

This commit was SVN r29664.

The following Trac tickets were found above:
  Ticket 3901 --> https://svn.open-mpi.org/trac/ompi/ticket/3901
2013-11-12 22:51:20 +00:00
Ralph Castain
762400d559 Silence warning
Refs trac:3898

This commit was SVN r29659.

The following Trac tickets were found above:
  Ticket 3898 --> https://svn.open-mpi.org/trac/ompi/ticket/3898
2013-11-11 22:53:09 +00:00
Jeff Squyres
5a940f5ee7 Arrgh -- remove debugging printf.
This commit was SVN r29657.
2013-11-11 22:44:28 +00:00
Jeff Squyres
e20217eccc Expand the "btl_usnic" MPI_T enumeration to have strings of the form:
<usnic device name>,<eth device>,<ip address>/<CIDR prefix>

For example:

   usnic_0,eth4,10.1.0.15/16

This is just handy for mapping the usnic_X device back to the IP
network to which it corresponds.

This commit was SVN r29656.
2013-11-11 22:25:30 +00:00
Nathan Hjelm
6a331275d8 Set transfers as active before starting them.
cmr=v1.7.4:ticket=trac:3898

This commit was SVN r29654.

The following Trac tickets were found above:
  Ticket 3898 --> https://svn.open-mpi.org/trac/ompi/ticket/3898
2013-11-11 21:50:54 +00:00