1
1

73 Коммитов

Автор SHA1 Сообщение Дата
Alex Mikheev
74ab30b738 OSHMEM: spml ikrit: improve mxm transport sanity check
Do not allow combination of transports that is not compliant with
shmem spec. Especially do not allow mix of hw and software atomic
ops

Issue: 4721
Change-Id: Ide382f7510495df3d385f2a5ae5f9def6ef5332c
2014-10-14 15:44:57 +03:00
Alex Mikheev
1bcc88cfb1 OSHMEM: spml ikrit: hardware rdma endpoint
Create additional endpoint that is capable of true
one sided RDMA transfers.

MXM atomics component now uses this endpoint
2014-10-14 15:31:09 +03:00
Alina Sklarevich
e974bec57e OSHMEM: fix check-help-string.pl errors and warnings.
This commit was SVN r32511.
2014-08-12 11:30:14 +00:00
Gilles Gouaillardet
03fbd9a12d check-help-strings cleanup
This commit was SVN r32490.
2014-08-11 03:19:01 +00:00
Mike Dubman
e819a45cee shmem: opal refactoring voices
http://www.open-mpi.org/community/lists/devel/2014/08/15590.php

This commit was SVN r32489.
2014-08-10 08:06:37 +00:00
Mike Dubman
bb53dff57a oshmem: fix after opal refactoring
http://www.open-mpi.org/community/lists/devel/2014/08/15590.php

This commit was SVN r32488.
2014-08-10 07:30:12 +00:00
Ralph Castain
552c9ca5a0 George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-)
WHAT:    Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL

All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have express interest in having a more generic communication infrastructure, without all the OMPI layer dependencies.  This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP.  Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose.  UTK with support from Sandia, developped a version of Open MPI where the entire communication infrastucture has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to completing this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs.  A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic.

This commit was SVN r32317.
2014-07-26 00:47:28 +00:00
Jeff Squyres
75230ee574 spml_yoda_getreq.c: fix compile error related to r32196
This commit was SVN r32197.

The following SVN revision numbers were found above:
  r32196 --> open-mpi/ompi@a14e0f10d4
2014-07-10 17:17:19 +00:00
Nathan Hjelm
a14e0f10d4 Per RFC: Remove des_src and des_dst members from the
mca_btl_base_segment_t and replace them with des_local and des_remote

This change also updates the BTL version to 3.0.0. This commit does
not represent the final version of BTL 3.0.0. More changes are coming.

In making this change I updated all of the BTLs as well as BTL user's
to use the new structure members. Please evaluate your component to
ensure the changes are correct.

RFC text:

This is the first of several BTL interface changes I am proposing for
the 1.9/2.0 release series.

What: Change naming of btl descriptor members. I propose we change
des_src and des_dst (and their associated counts) to be des_local and
des_remote. For receive callbacks the des_local member will be used to
communicate the segment information to the callback. The proposed change
will include updating all of the doxygen in btl.h as well as updating
all BTLs and BTL users to use the new naming scheme.

Why: My btl usage makes use of both put and get operations on the same
descriptor. With the current naming scheme I need to ensure that there
is consistency beteen the segments described in des_src and des_dst
depending on whether a put or get operation is executed. Additionally,
the current naming prevents BTLs that do not require prepare/RMA matched
operations (do not set MCA_BTL_FLAGS_RDMA_MATCHED) from executing
multiple simultaneous put AND get operations. At the moment the
descriptor can only be used with one or the other. The naming change
makes it easier for BTL users to setup/modify descriptors for RMA
operations as the local segment and remote segment are always in the
same member field. The only issue I forsee with this change is that it
will require a little more work to move BTL fixes to the 1.8 release
series.

This commit was SVN r32196.
2014-07-10 16:31:15 +00:00
Alex Mikheev
c3e017c190 OSHMEM: refactoring of fix wrong btl/sm processing
Use exising fields of mkey struct to identify 'shared memory'
segments.

mkey.u.key is now always initialized to MAP_SEGMENT_SHM_INVALID instead
of 0

reviewed by Mike and Igor
cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32174.
2014-07-09 08:57:27 +00:00
Mike Dubman
247da2819f OSHMEM: fix wrong btl/sm processing and typo
fixed by Igor reviewed by Alex,Mike,Yossi

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r32100.
2014-06-28 18:40:28 +00:00
Gilles Gouaillardet
53ae38cfb1 Handle error case in mca_spml_yoda_register
if source memory could not be registered, then return NULL
some cleanup might be needed, please refer to the FIXME in the code

cmr=v1.8.2:reviewer=miked

This commit was SVN r32081.
2014-06-25 08:58:45 +00:00
Alex Mikheev
3b5fa97790 OSHMEM: fixes problem with local heap2heap copy
check for possibility of heap2heap copy was incorrect
in case when shared heaps have different virtual
addresses on same host.

It seems that ibv_exp_reg_mr() on CIB cards may return
different VAs for heap on same node. On CX3 addresses are
the same.

reviewed by miked

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r31969.
2014-06-09 09:41:44 +00:00
Mike Dubman
55e35e0f6e OSHMEM: Fix issue with incorrect mca variables registration
Few components had wrong mca variables registration procedure
List of them:
- atomic basic and mxm
- spml yoda and ikrit
Two mca variables as runtime_api_verbose and runtime_lock_recursive change
names to oshmem_api_verbose and oshmem_lock_recursive otherwise they
were not shown by oshmem_info tool.

fixed by Igor, reviewed by Miked
cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r31962.
2014-06-06 17:36:47 +00:00
Mike Dubman
5dd4c68a0f OSHMEM: use bulk connect/disconnect API
fixed by Yossi, reviewed by MikeD

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r31856.
2014-05-21 08:00:21 +00:00
Mike Dubman
95e637f5ba OSHMEM: fix error message when aborting on OOM
fixed by Roman, reviewed by Miked

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r31752.
2014-05-14 13:45:16 +00:00
Ralph Castain
5602156a1c Use the correct abstraction layer name for the data dirs
This commit was SVN r31684.
2014-05-08 14:32:24 +00:00
Ralph Castain
11faab1091 The final step of the RFC: convert the <foo>libdir and friends to fit their respective code areas, and equate them all at the top. Note that we can't entirely separate things as the opal_install_dirs framework can't handle separated locations for the various trees.
This commit was SVN r31679.
2014-05-08 02:01:35 +00:00
Alex Mikheev
c29d426153 OSHMEM: fixes mxm rc transport mkey ecxhange
cmr=v1.8.2:reviewer=ompi-rm1.8
reviewed by miked

This commit was SVN r31627.
2014-05-04 14:26:54 +00:00
Mike Dubman
ff42999037 OSHMEM: Added missing API for Java bindings (int16/32/64 stuff)
fixed by Roman, reviewed by Mike

cmr=v1.8.2:reviewer=ompi-rm1.8

This commit was SVN r31564.
2014-04-30 12:03:23 +00:00
Mike Dubman
d8288fa39d OSHMEM: Fix call prepare_src with a NULL endpoint
see issue: https://svn.open-mpi.org/trac/ompi/ticket/4399

Refs trac:4399

fixed by Igor, reviewed by Alex
cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r31168.

The following Trac tickets were found above:
  Ticket 4399 --> https://svn.open-mpi.org/trac/ompi/ticket/4399
2014-03-20 13:11:25 +00:00
Mike Dubman
c195118959 OSHMEM: Fix issue 'OSHMEM passing endpoint==NULL to BTL prepare_src()'
fixed by Igor, reviewed by Miked

fixes trac:4359

This commit was SVN r30996.

The following Trac tickets were found above:
  Ticket 4359 --> https://svn.open-mpi.org/trac/ompi/ticket/4359
2014-03-11 18:00:32 +00:00
Mike Dubman
2828afddce OSHMEM: fix output, lower prio for scoll/mpi
fixed by Roman/Elena, reviewed by Igor/Mike

cmr=v1.7.5:revewer=ompi-rm1.7

This commit was SVN r30957.
2014-03-06 16:17:58 +00:00
Mike Dubman
361f15d5d7 OSHMEM: fix use of opal_verbose
fixed by Roman, reviewed by Igor/Mike

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30943.
2014-03-05 08:49:14 +00:00
Mike Dubman
d584869dda OSHMEM: memheap mkey exchange fix
fix situations where cluster nodes can have different btls

Fixed by Roman, reviewed by Igor, Mike
cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30877.
2014-02-27 14:02:30 +00:00
Mike Dubman
323e4418b9 OSHMEM: extract memheap allocate methods into separate framework
- similar to opal/shmem
- next step is some refactoring and merge into opal/shmem
 Developed by Igor, reviewed by AlexM, MikeD

This commit fixes trac:4261.

This commit was SVN r30855.

The following Trac tickets were found above:
  Ticket 4261 --> https://svn.open-mpi.org/trac/ompi/ticket/4261
2014-02-26 16:32:23 +00:00
Ralph Castain
fbeb0cac10 Silence warning
cmr=v1.7.5:reviewer=ompi-gk1.7

This commit was SVN r30828.
2014-02-25 23:50:12 +00:00
Mike Dubman
202bf90287 OSHMEM: ikrit optimization
no need pre-register heap if MXM/ud is used
fixed by Alex, reviewed by Miked

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30809.
2014-02-25 15:03:53 +00:00
Mike Dubman
5ed50793d5 OSHMEM: misc ikrit/mxm enhancements
- fix mxm/tl selection logic
- do not require memory registration if mxm/ud was selected

fixed by Alex, reviewed by Miked

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30802.
2014-02-24 07:06:57 +00:00
Mike Dubman
684e78e669 OSHMEM: OOM in yoda
fix: do not fail on blm allocation error, wait for some puts to complete and retry

fixed by Roman, reviewed by Mike/Alex
cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30779.
2014-02-20 09:53:32 +00:00
Mike Dubman
49ee63f4b8 MXM: do not enforce version check
- MXM uses libtool versioning scheme which is enough, no need additional in OMPI

reviewed by yossi

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30768.
2014-02-18 19:44:37 +00:00
Mike Dubman
982149d8c8 OSHMEM: various fixes
1. fix in oshmem scoll component: basic algorithms should
   call basic collectives since their implementation
   incompatible with others (fca, hcoll).

2. Set OPAL_EVLOOP_ONCE flag ON for libevent in the case 
   of yoda smpl. Otherwise there is possible deadlock in 
   atomic_basic_lock call

fixed by Val, Igor, reviewed by Miked

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30762.
2014-02-18 15:07:03 +00:00
Mike Dubman
fe692eb107 OSHMEM: remove unused code, rename mca param
- remove old, unused code
- rename mca param for oshmem preconnect to match mpi naming scheme.

fixed by Alex, reviewed by Mike

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30748.
2014-02-17 14:57:12 +00:00
Mike Dubman
96142b31bd shmem: remove unused defines
fixed by Roman, reviewed by MikeD
cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30735.
2014-02-15 06:43:08 +00:00
Mike Dubman
732f108ae4 OSHMEM: fix segv on finalize with spml/yoda
avoid double call to bml
fixed by AlexMa, reviewed by AlexM and MikeD

cmr=v1.7.5:reviewer=ompi-rm1.7

This commit was SVN r30715.
2014-02-13 04:42:19 +00:00
Mike Dubman
28949efcaf OSHMEM: MXM2: a2a perf improvement on large scale
Allow only limited number of coonections to have 'puts'
that do not require remote completion ack. That will
greatly improve performance of shmem_fence()/shmem_quiet()
and shmem_barrier() when there are many active connections.

fixed by Alex, reviewed by Miked

Refs trac:3763

This commit was SVN r30573.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2014-02-06 08:42:45 +00:00
Mike Dubman
27a763c86c OSHMEM: finalization fixes
Fixes mxm endpoint destruction and hung during
SHMEM finalization.

Add a barrier between spml del procs and finalization.
Not having it caused hungs because ikrit spml can not properly
disconnect if its peer already finizalized.

Refs trac:3763

This commit was SVN r30572.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2014-02-06 08:40:43 +00:00
George Bosilca
489f093b59 It didn't compile. Cleanup a little the headers inclusion.
This commit was SVN r30473.
2014-01-29 14:30:55 +00:00
Mike Dubman
30e1e49a9e OSHMEM: refactoring to reuse common functions from different components.
This is preparation for moving verbs dependent code out from memheap/base component

Refs: #3763

This commit was SVN r30454.
2014-01-28 07:30:36 +00:00
Mike Dubman
b7750ccbf4 OSHMEM: bml initialization is moved into ompi_init
it fixes race of mca_var segfault in finalization of shmem

based on this thread:
http://www.open-mpi.org/community/lists/devel/2014/01/13778.php

Refs trac:3763

fixed by Igor, reviewed by Brian

This commit was SVN r30304.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2014-01-17 06:09:29 +00:00
Jeff Squyres
d953e46a68 Fix compiler warnings: PRIu64 is the proper way to output uint64_t
values.  See https://svn.open-mpi.org/trac/ompi/wiki/PrintfCodes.

This commit was SVN r30149.
2014-01-08 02:31:30 +00:00
Brian Barrett
8b778903d8 Fix longstanding issue with our multi-project support. Rather than using
pkg{data,lib,includedir}, use our own ompi{data,lib,includedir}, which is
always set to {datadir,libdir,includedir}/openmpi.  This will keep us from
having help files in prefix/share/open-rte when building without Open MPI,
but in prefix/share/openmpi when building with Open MPI.

This commit was SVN r30140.
2014-01-07 22:11:15 +00:00
Mike Dubman
6fb0dbdab5 OSHMEM: port 6 patches from git mirror to svn
Subject: [PATCH 1/6] OSHMEM: mkey refactoring
mkey can be either shared memory style id or it can be
arbitrary byte string
removed hack that used spml_context to store generic keys
coding style fixes

Subject: [PATCH 2/6] OSHMEM: added support of MXM 2.0 rc transport
coding style fixed, typos, check error condition

Subject: [PATCH 3/6] OSHMEM: mxm2.0: remove PTL_SELF
There is no need to have special case for 'self'
connection in mxm 2.0. It also solves the problem
of passing incorrect mkey when doing put/get to
self

Subject: [PATCH 4/6] OSHMEM: fixes mxm fadd
give a dummy buffer if doing atomic add

Subject: [PATCH 5/6] OSHMEM: mxm2.0: do not use MXM_REQ_FLAG_SEND_LAZY
Subject: [PATCH 6/6] OSHMEM: remove unused include, causes compilation fail on ubuntu

Refs trac:3763

This commit was SVN r30129.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2014-01-07 11:56:36 +00:00
Mike Dubman
92cf175e9e OSHMEM: exchange mxm(ikrit) endpoints via MPI_Allgather, code cleanup, remove unused
Refs trac:3763

This commit was SVN r30089.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2013-12-26 10:53:48 +00:00
Mike Dubman
e2f372ac4b OSHMEM: set default transport for mxm under OSHMEM to be ud, can be overwritten with MXM_OSHMEM_TL= variable
Refs trac:3763

This commit was SVN r30088.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2013-12-26 08:05:08 +00:00
Mike Dubman
2e138ddd05 OSHMEM: Use MPI calls for mkey exchange
fixed by Alex, reviewed by miked
Refs: 3763

This commit was SVN r30056.
2013-12-23 09:20:42 +00:00
Mike Dubman
6dbce7f9f8 fix compile warning
Refs trac:3763

This commit was SVN r30004.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2013-12-20 11:03:09 +00:00
Mike Dubman
d70f93b2dc fix corrupted verbose output in oshmem
set yoda prio lower than ikrit
fix anon unions in ikrit
Refs trac:3763

This commit was SVN r29976.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2013-12-19 11:59:32 +00:00
Mike Dubman
b95a9d865a rework SHMEM verbose macros to enable if --enable-debug specified
Refs trac:3763

This commit was SVN r29921.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2013-12-16 09:13:27 +00:00
Mike Dubman
ac4573b6db code formatting according to OMPI code style
Refs trac:3763

This commit was SVN r29908.

The following Trac tickets were found above:
  Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
2013-12-14 14:39:56 +00:00