1
1
Граф коммитов

189 Коммитов

Автор SHA1 Сообщение Дата
Gilles Gouaillardet
8f2d3aeb65 oshmem: do not include pml/ob1 headers
this is an abstraction violation and that can cause linker failure
2015-09-11 09:34:10 +09:00
Nathan Hjelm
202c6a38e4 scoll/mpi: work around bug in oshmem/proc design
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-10 08:55:56 -06:00
Ralph Castain
cf6137b530 Integrate PMIx 1.0 with OMPI.
Bring Slurm PMI-1 component online
Bring the s2 component online

Little cleanup - let the various PMIx modules set the process name during init, and then just raise it up to the ORTE level. Required as the different PMI environments all pass the jobid in different ways.

Bring the OMPI pubsub/pmi component online

Get comm_spawn working again

Ensure we always provide a cpuset, even if it is NULL

pmix/cray: adjust cray pmix component for pmix

Make changes so cray pmix can work within the integrated
ompi/pmix framework.

Bring singletons back online. Implement the comm_spawn operation using pmix - not tested yet

Cleanup comm_spawn - procs now starting, error in connect_accept

Complete integration
2015-08-29 16:04:10 -07:00
Gilles Gouaillardet
1a238d3a4f configury: fix fca detection
* do not add -I/.../include/fca -I /.../include/fca_core to CPPFLAGS
 * allow configure --with-fca
 * search fca libs in both DIR/lib and DIR/lib64
 * fix the description of the --with-fca option
2015-08-13 11:09:15 +09:00
Jeff Squyres
5065978a1e oshmem: __FUNCTION__ -> __func__ fixes 2015-08-05 05:39:38 -07:00
yosefe
41f3b77e31 ikrit: set DC defaults. 2015-07-24 21:01:13 +03:00
Nathan Hjelm
4d92c9989e more c99 updates
This commit does two things. It removes checks for C99 required
headers (stdlib.h, string.h, signal.h, etc). Additionally it removes
definitions for required C99 types (intptr_t, int64_t, int32_t, etc).

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-06-25 10:14:13 -06:00
Ralph Castain
869041f770 Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
Gilles Gouaillardet
11e11e1be9 initialize common symbols from oshmem 2015-05-08 10:11:58 +09:00
Nathan Hjelm
033894b493 Merge pull request #541 from hjelmn/c99_components
C99 component initialization
2015-04-20 10:45:39 -06:00
Devendar Bureddy
3dbd95fa73 OSHMEM: enable mpi collective by default 2015-04-20 19:39:36 +03:00
Nathan Hjelm
c4a61969c0 oshmem: use C99 subobject naming for component initialization
This commit helps future-proof oshmem components by initializing each
component member by name.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-04-18 10:29:58 -06:00
Nathan Hjelm
b68d66bb9b MCA: Add the project/project version to the MCA base component
This commit adds support for project_framework_component_* parameter
matching. This is the first step in allowing the same framework name
in multiple projects. This change also bumps the MCA component version
to 2.1.0.

All master frameworks have been updated to use the new component
versioning macro. An mca.h has been added to each project to add a
project specific versioning macro of the form
PROJECT_MCA_VERSION_2_1_0.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-03-27 10:59:04 -06:00
Jeff Squyres
8d04215741 coll: trivial spelling fix
s/Algoritm/Algorithm/g
2015-02-27 18:20:17 -08:00
Alina Sklarevich
e4c4e7df5e Fix the calls to ibv_fork_init and remove btl_openib_want_fork_support.
In order to have an effect, ibv_fork_init should be called in the
beginning of the verbs initialization flow - before the calls to the
ibv_create_qp and ibv_create_cq verbs.
These functions are called from the oob/ud code and by the time the
other verbs components (btl openib, pml yalla, ...) call ibv_fork_init,
it's too late. This commit forces the call to ibv_fork_init (if it's
requested) right at the beginning of all the components that are using
verbs.
(ibv_fork_init() can be safely called multiple times)

This commit also removes the btl_openib_want_fork_support mca parameter
and adds a new mca parameter instead - opal_verbs_want_fork_support.
Through this new parameter, fork support may be requested for ALL
components.
The default value for this parameter is set to 1.

Before this commit the btl_openib_want_fork_support parameter didn't
provide fork support for the openib btl if its value was set to 1.
(because when openib called ibv_fork_init, it was already after the
calls to ibv_create_* in oob/ud and thereofre it failed).
2015-02-25 10:58:50 +02:00
igor-ivanov
0f44cdd779 Merge pull request #421 from igor-ivanov/pr/fix-oshmem-coverity
oshmem: Fix set of coverity issues
2015-02-24 21:40:06 +04:00
Nathan Hjelm
5f1254d710 Update code base to use the new opal_free_list_t
Use of the old ompi_free_list_t and ompi_free_list_item_t is
deprecated. These classes will be removed in a future commit.

This commit updates the entire code base to use opal_free_list_t and
opal_free_list_item_t.

Notes:

OMPI_FREE_LIST_*_MT -> opal_free_list_* (uses opal_using_threads ())

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-24 10:05:45 -07:00
Igor Ivanov
3e2dd782ea oshmem: Fix set of coverity issues
Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
2015-02-24 19:03:10 +02:00
Igor Ivanov
010dce307a Fix set of coverity issues
List of CIDs (scan.coverity.com):
oshmem:
1269787, 1269907, 1270161, 1270162, 1270977, 1270978
ompi:
1270170, 1270172, 1270173

Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
2015-02-20 17:45:46 +04:00
Igor Ivanov
426d1ce146 oshmem: Fix set of coverity issues
List of CIDs (scan.coverity.com):
1269721, 1269725, 1269787, 1269907, 1269909, 1269910, 1269911, 1269912,
1269959, 1269960, 1269984, 1269985, 1270136, 1270157, 1269845, 1269875,
1269876, 1269877, 1269878, 1269884, 1269885, 1270161, 1270162, 1270175,
1269734, 1269739, 1269742, 1269743

Signed-off-by: Igor Ivanov <Igor.Ivanov@itseez.com>
2015-02-19 23:00:17 +04:00
Nathan Hjelm
16ae7d97d1 spml/yoda: update for BTL 3.0 interface
This commit make spml/yoda compatible with BTL 3.0. This is meant as a
starting point only. More work will be needed to make optimial use of
the new interface.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:38 -07:00
Mike Dubman
6611f4ce38 OSHMEM: fix warnings 2015-02-09 20:49:03 -08:00
Mike Dubman
54a072caaa OSHMEM: fix infinite recursion and stack size violation
send reply before posting the receive request again to limit the recursion size to
number of receive requests.
send can call opal_progress which calls this function again. If recv req is started
stack size will be proportional to number of job ranks.
2015-01-04 16:31:19 +02:00
Alex Mikheev
c76261da07 OSHMEM: atomic mxm: fix mkey conversion
Correctly return mxm_empty_mem_key when shmem mkey is empty
2014-12-16 16:34:42 +02:00
Alex Mikheev
71ebbca26d OSHMEM: spml ikrit: fix spelling in help file 2014-12-16 16:18:38 +02:00
Alex Mikheev
3f7ed56548 OSHMEM: spml ikrit: fix mxm disconnect flow
Add out of band barrier before performing mxm disconnect.
It will make sure that every pe is ready to disconnect. Otherwise
bad things may happen.
2014-12-16 15:07:17 +02:00
Alex Mikheev
428add390e OSHMEM: spml ikrit: add skew to connect/disconnect
Each pe connects/disconnects starting from itself instead of pe=0. This
will distribute network traffic in a more friendly way.
2014-12-03 15:36:45 +02:00
Alex Mikheev
8de50d8420 OSHMEM: spml ikrit: add call to mxm_mq_destroy()
Make valgrind happy by calling mxm_mq_destroy() on module
close.
2014-12-01 12:36:46 +02:00
Nathan Hjelm
d495d49b1c Merge pull request #273 from open-mpi/topic/yoda_rdma_flags
OSHMEM: spml yoda: use flags to check if btl is RDMA capable
2014-11-16 12:04:04 -07:00
Alex Mikheev
fbb9dc5b1e OSHMEM: spml ikrit valgrind fix
always initialize request flags
2014-11-16 17:24:16 +02:00
Alex Mikheev
3443c1d5e5 OSHMEM: spml yoda: use flags to check if btl is RDMA capable 2014-11-16 17:20:20 +02:00
Gilles Gouaillardet
2177f9ec3e fix missing copyright, no code change 2014-11-13 14:56:09 +09:00
Gilles Gouaillardet
cd6e3ecb07 oshmem/yoda: fix a typo in mca_spml_yoda_get_completion 2014-11-13 14:53:32 +09:00
Ralph Castain
780c93ee57 Per the PR and discussion on today's telecon, extend the process name definition as a two-field struct of uint32_t's down to the OPAL layer. This resolves issues created by prior commits that impacted both heterogeneous and SPARC support. This also simplifies the OMPI code base by removing the need for frequent memcpy's when transitioning between the OMPI/ORTE layers and OPAL.
We recognize that this means other users of OPAL will need to "wrap" the opal_process_name_t if they desire to abstract it in some fashion. This is regrettable, and we are looking at possible alternatives that might mitigate that requirement. Meantime, however, we have to put the needs of the OMPI community first, and are taking this step to restore hetero and SPARC support.
2014-11-11 17:00:42 -08:00
Alex Mikheev
097b469f61 OSHMEM: sshmem verbs: fix shared_mr detection
It seems that 5ce2f10067
changed default flag values but it did not modify detection code.
2014-11-10 13:34:04 +02:00
Alex Mikheev
7327b13823 OSHMEM: sshmem mmap: removed unused help topics 2014-11-05 16:39:20 +02:00
Alex Mikheev
1f2ab43ba9 OSHMEM: spml ikrit: remove empty lines in helpfile 2014-11-04 11:26:09 +02:00
Alex Mikheev
d06fb85350 OSHMEM: fixes 'improve mxm transport sanity check'
The code that actually checks for valid transport combos
somehow did not make it to the original commit:

74ab30b738
2014-11-04 11:07:22 +02:00
Alex Mikheev
e1cf6f37ba OSHMEM: spml ikrit: disable rdmap op DCI pool
Instead use single pool for both rdma and send receive ops.
2014-11-03 10:01:07 +02:00
Mike Dubman
5ce2f10067 OSHMEM: integrate review comments from open-mpi/ompi-release#7 2014-10-24 17:21:46 +03:00
Alex Mikheev
5af4d02bd3 OSHMEM: spml ikrit: complete puts b4 memheap destruction
Force completion of all puts before deregestering memheap/bss memory

Fixes a possible race condition where put request completion callback
is called when request context is already cleared.

Change-Id: I7ed887ec0b03a66ce5d3076a7edcf64061f57370
2014-10-19 14:04:34 +03:00
Mike Dubman
ab22dcb875 Merge pull request #229 from nkogteva/master
oshmem mmap: new mca parameters were introduced - sshmem_mmap_anonymous,...
2014-10-15 10:24:29 +03:00
Alex Mikheev
643e64497d OSHMEM: spml ikrit: hw rdma channel is disabled by default 2014-10-14 16:09:51 +03:00
Alex Mikheev
74ab30b738 OSHMEM: spml ikrit: improve mxm transport sanity check
Do not allow combination of transports that is not compliant with
shmem spec. Especially do not allow mix of hw and software atomic
ops

Issue: 4721
Change-Id: Ide382f7510495df3d385f2a5ae5f9def6ef5332c
2014-10-14 15:44:57 +03:00
Alex Mikheev
1bcc88cfb1 OSHMEM: spml ikrit: hardware rdma endpoint
Create additional endpoint that is capable of true
one sided RDMA transfers.

MXM atomics component now uses this endpoint
2014-10-14 15:31:09 +03:00
Alina Sklarevich
1eb6286547 OSHMEM: fix the makefile.
(oshmem/mca/sshmem/base/Makefile.am)
2014-10-14 11:57:46 +03:00
Nadezhda Kogteva
b2a93943dc oshmem mmap: set lvl4 for sshmem_mmap_anonymous and sshmem_mmap_fixed variables, define MAP_ANONYMOUS returned. 2014-10-14 08:54:44 +03:00
Mike Dubman
ec1f761d8e OSHMEM: add missing help file, got lost during merge. Thanks to Yossi/Igor for finding it.
Change-Id: I466e40a3fea70e8045dd1e897edcc50ccf0451a3

Conflicts:
	oshmem/mca/sshmem/base/Makefile.am
	oshmem/mca/sshmem/base/help-oshmem-sshmem.txt
2014-10-13 16:58:35 +03:00
Alex Mikheev
8fcbcba516 Merge branch 'topic/oshmem_shared_mr_fix' 2014-10-13 15:24:12 +03:00
Alex Mikheev
cd67642183 OSHMEM: sshmem verbs: workaround shared_mr procfs bug
dereg shared_mr before doing dereg on its mr.
2014-10-13 15:14:34 +03:00