calls, which means that the operation was successfully started but not
immediately completed. This is a "good" return code that should not be handled
as an error.
Signed-off-by: Pavel Shamis (Pasha) <pasharesearch@gmail.com>
- pass field_mask to ucp_init().
- use non-blocking disconnect.
- recv() with pre-allocated request.
- call opal_progress() from iprobe() and improbe().
- use shift pattern in connect/disconnect.
store oshmem related per proc data in an oshmem_proc_data_t struct,
that is stored in the padding section of an ompi_proc_t
this data can be accessed via the OSHMEM_PROC_DATA(proc) macro
Fixesopen-mpi/ompi#2023
and fix oshmem_group_proc_{init,create} so they use the number of procs in oshmem_comm_world
Thanks Debendra Das for the report and Josh Ladd for the guidance
Fixesopen-mpi/ompi#1966
Some MXM tls such as self, shm can comlete requests immediately.
Make sure that opal_progress() is called before before request
is completed.
fix ud_only logic when hw rdma channel is using ud and main
transport is rc or dc.
Most functionality of oshmem_proc duplicates ompi_proc. In addition
to that, Current logic does not allow to do oshmem initialization
w/o ompi startup.
So this refactoring allows to avoid code duplication, decrease used
memory and make oshmem support easier.
Now oshmem_proc is transparent ompi_proc structure, that can be
extended by oshmem specific data.
Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
This commit does two things. It removes checks for C99 required
headers (stdlib.h, string.h, signal.h, etc). Additionally it removes
definitions for required C99 types (intptr_t, int64_t, int32_t, etc).
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
This commit adds support for project_framework_component_* parameter
matching. This is the first step in allowing the same framework name
in multiple projects. This change also bumps the MCA component version
to 2.1.0.
All master frameworks have been updated to use the new component
versioning macro. An mca.h has been added to each project to add a
project specific versioning macro of the form
PROJECT_MCA_VERSION_2_1_0.
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
Use of the old ompi_free_list_t and ompi_free_list_item_t is
deprecated. These classes will be removed in a future commit.
This commit updates the entire code base to use opal_free_list_t and
opal_free_list_item_t.
Notes:
OMPI_FREE_LIST_*_MT -> opal_free_list_* (uses opal_using_threads ())
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit make spml/yoda compatible with BTL 3.0. This is meant as a
starting point only. More work will be needed to make optimial use of
the new interface.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
We recognize that this means other users of OPAL will need to "wrap" the opal_process_name_t if they desire to abstract it in some fashion. This is regrettable, and we are looking at possible alternatives that might mitigate that requirement. Meantime, however, we have to put the needs of the OMPI community first, and are taking this step to restore hetero and SPARC support.
Force completion of all puts before deregestering memheap/bss memory
Fixes a possible race condition where put request completion callback
is called when request context is already cleared.
Change-Id: I7ed887ec0b03a66ce5d3076a7edcf64061f57370
Do not allow combination of transports that is not compliant with
shmem spec. Especially do not allow mix of hw and software atomic
ops
Issue: 4721
Change-Id: Ide382f7510495df3d385f2a5ae5f9def6ef5332c
WHAT: Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL
All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have express interest in having a more generic communication infrastructure, without all the OMPI layer dependencies. This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP. Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose. UTK with support from Sandia, developped a version of Open MPI where the entire communication infrastucture has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to completing this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs. A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic.
This commit was SVN r32317.
mca_btl_base_segment_t and replace them with des_local and des_remote
This change also updates the BTL version to 3.0.0. This commit does
not represent the final version of BTL 3.0.0. More changes are coming.
In making this change I updated all of the BTLs as well as BTL user's
to use the new structure members. Please evaluate your component to
ensure the changes are correct.
RFC text:
This is the first of several BTL interface changes I am proposing for
the 1.9/2.0 release series.
What: Change naming of btl descriptor members. I propose we change
des_src and des_dst (and their associated counts) to be des_local and
des_remote. For receive callbacks the des_local member will be used to
communicate the segment information to the callback. The proposed change
will include updating all of the doxygen in btl.h as well as updating
all BTLs and BTL users to use the new naming scheme.
Why: My btl usage makes use of both put and get operations on the same
descriptor. With the current naming scheme I need to ensure that there
is consistency beteen the segments described in des_src and des_dst
depending on whether a put or get operation is executed. Additionally,
the current naming prevents BTLs that do not require prepare/RMA matched
operations (do not set MCA_BTL_FLAGS_RDMA_MATCHED) from executing
multiple simultaneous put AND get operations. At the moment the
descriptor can only be used with one or the other. The naming change
makes it easier for BTL users to setup/modify descriptors for RMA
operations as the local segment and remote segment are always in the
same member field. The only issue I forsee with this change is that it
will require a little more work to move BTL fixes to the 1.8 release
series.
This commit was SVN r32196.
Use exising fields of mkey struct to identify 'shared memory'
segments.
mkey.u.key is now always initialized to MAP_SEGMENT_SHM_INVALID instead
of 0
reviewed by Mike and Igor
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r32174.
if source memory could not be registered, then return NULL
some cleanup might be needed, please refer to the FIXME in the code
cmr=v1.8.2:reviewer=miked
This commit was SVN r32081.
check for possibility of heap2heap copy was incorrect
in case when shared heaps have different virtual
addresses on same host.
It seems that ibv_exp_reg_mr() on CIB cards may return
different VAs for heap on same node. On CX3 addresses are
the same.
reviewed by miked
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r31969.
Few components had wrong mca variables registration procedure
List of them:
- atomic basic and mxm
- spml yoda and ikrit
Two mca variables as runtime_api_verbose and runtime_lock_recursive change
names to oshmem_api_verbose and oshmem_lock_recursive otherwise they
were not shown by oshmem_info tool.
fixed by Igor, reviewed by Miked
cmr=v1.8.2:reviewer=ompi-rm1.8
This commit was SVN r31962.
fixed by Igor, reviewed by Miked
fixes trac:4359
This commit was SVN r30996.
The following Trac tickets were found above:
Ticket 4359 --> https://svn.open-mpi.org/trac/ompi/ticket/4359
fix situations where cluster nodes can have different btls
Fixed by Roman, reviewed by Igor, Mike
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30877.
- similar to opal/shmem
- next step is some refactoring and merge into opal/shmem
Developed by Igor, reviewed by AlexM, MikeD
This commit fixes trac:4261.
This commit was SVN r30855.
The following Trac tickets were found above:
Ticket 4261 --> https://svn.open-mpi.org/trac/ompi/ticket/4261
- fix mxm/tl selection logic
- do not require memory registration if mxm/ud was selected
fixed by Alex, reviewed by Miked
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30802.
fix: do not fail on blm allocation error, wait for some puts to complete and retry
fixed by Roman, reviewed by Mike/Alex
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30779.
- MXM uses libtool versioning scheme which is enough, no need additional in OMPI
reviewed by yossi
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30768.
1. fix in oshmem scoll component: basic algorithms should
call basic collectives since their implementation
incompatible with others (fca, hcoll).
2. Set OPAL_EVLOOP_ONCE flag ON for libevent in the case
of yoda smpl. Otherwise there is possible deadlock in
atomic_basic_lock call
fixed by Val, Igor, reviewed by Miked
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30762.
- remove old, unused code
- rename mca param for oshmem preconnect to match mpi naming scheme.
fixed by Alex, reviewed by Mike
cmr=v1.7.5:reviewer=ompi-rm1.7
This commit was SVN r30748.
Allow only limited number of coonections to have 'puts'
that do not require remote completion ack. That will
greatly improve performance of shmem_fence()/shmem_quiet()
and shmem_barrier() when there are many active connections.
fixed by Alex, reviewed by Miked
Refs trac:3763
This commit was SVN r30573.
The following Trac tickets were found above:
Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763
Fixes mxm endpoint destruction and hung during
SHMEM finalization.
Add a barrier between spml del procs and finalization.
Not having it caused hungs because ikrit spml can not properly
disconnect if its peer already finizalized.
Refs trac:3763
This commit was SVN r30572.
The following Trac tickets were found above:
Ticket 3763 --> https://svn.open-mpi.org/trac/ompi/ticket/3763