1
1
Граф коммитов

647 Коммитов

Автор SHA1 Сообщение Дата
Yossi Itigin
f2c5b2d778 Merge pull request #7373 from hoopoepg/topic/oshmem-inc-max-segments
OSHMEM/SEGMENTS: increase number of max segments
2020-02-09 11:27:21 +02:00
Gilles Gouaillardet
174e967dbc
Remove ORTE project
Will be replaced by PRRTE. Ensure that OMPI and OPAL layers build
without reference to ORTE. Setup opal/pmix framework to be static.
Remove support for all PMI-1 and PMI-2 libraries. Add support for
"external" pmix component as well as internal v4 one.

remove orte: misc fixes

 - UCX fixes
 - VPATH issue
 - oshmem fixes
 - remove useless definition
 - Add PRRTE submodule
 - Get autogen.pl to traverse PRRTE submodule
 - Remove stale orcm reference
 - Configure embedded PRRTE
 - Correctly pass the prefix to PRRTE
 - Correctly set the OMPI_WANT_PRRTE am_conditional
 - Move prrte configuration to the end of OMPI's configure.ac
 - Make mpirun a symlink to prun, when available
 - Fix makedist with --no-orte/--no-prrte option
 - Add a `--no-prrte` option which is the same as the legacy
   `--no-orte` option.
 - Remove embedded PMIx tarball. Replace it with new submodule
   pointing to OpenPMIx master repo's master branch
 - Some cleanup in PRRTE integration and add config summary entry
 - Correctly set the hostname
 - Fix locality
 - Fix singleton operations
 - Fix support for "tune" and "am" options

Signed-off-by: Ralph Castain <rhc@pmix.org>
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2020-02-07 18:20:06 -08:00
Sergey Oblomov
f742f289ea OSHMEM/SEGMENTS: increase number of max segments
- increase number of max segments to allow application be launched
  on some Ubuntu configurations

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2020-02-07 15:20:58 +02:00
Dmitry Gladkov
b6a658b24a SPML/UCX: Fix compilation warnings with GCC
Signed-off-by: Dmitry Gladkov <dmitrygla@mellanox.com>
2020-02-03 05:11:49 -08:00
Sergey Oblomov
8543860689 SPML/UCX: fixed coverity issues
- fixed sizeof(char***) by variable datatype
- fixed resorce leak in proc_add

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2020-01-23 14:02:08 +02:00
Tomislav Janjusic
3d6bf9fd8e oshmem/ucx: improves spml ucx performance for multi-threaded
applications.

Improves multi-threaded performance by adding the option to create
multiple ucx workers in threaded applications.

Co-authored with:
Artem Y. Polyakov <artemp@mellanox.com>,
Manjunath Gorentla Venkata <manjunath@mellanox.com>

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2020-01-22 21:41:09 +02:00
Tomislav Janjusic
1b58e3d073 oshmem/ucx: Improves performance for non-blocking put/get operations.
Improves the performance when excess non-blocking operations are posted
by periodically calling progress on ucx workers.

Co-authored with:
Artem Y. Polyakov <artemp@mellanox.com>,
Manjunath Gorentla Venkata <manjunath@mellanox.com>

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2020-01-22 00:59:26 +02:00
Charles Shereda
cbc6feaab2 Created opal_gethostname() as safer gethostname substitute.
The opal_gethostname() function provides a more robust mechanism
to retrieve the hostname than gethostname(), which can return
results that are not null-terminated, and which can vary in its
behavior from system to system.

opal_gethostname() just returns the value in opal_process_info.nodename;
this is populated in opal_init_gethostname() inside opal_init.c.

-Changed all gethostname calls in opal subtree to opal_gethostname
-Changed all gethostname calls in orte subtree to opal_gethostname
-Changed all gethostname calls in ompi subdir to opal_gethostname
-Changed all gethostname calls in oshmem subdir to opal_gethostname
-Changed opal_if.c in test subdir to use opal_gethostname
-Changed opal_init.c to include opal_init_gethostname. This function
 returns an int and directly sets opal_process_info.nodename per
 jsquyres' modifications.

Relates to open-mpi#6801

Signed-off-by: Charles Shereda <cpshereda@lanl.gov>
2020-01-13 08:52:17 -08:00
Eisuke Kawashima
d26d4e1d63
Fix typo and update URLs (https, redirection) [skip ci]
Signed-off-by: Eisuke Kawashima <e-kwsm@users.noreply.github.com>
2020-01-07 03:52:25 +09:00
Tomislav Janjusic
2d8f9b1d09 oshmem/extended: Fix shmem_atomic_set for float and double.
Co-authored with: Artem Polyakov <artemp@mellanox.com>

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-12-19 21:15:41 +02:00
Tomislav Janjusic
cb5ff55b27 oshmem/ucx: fixed a build issue
Co-authored with: Artem Polyakov <artemp@mellanox.com>

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-12-19 21:14:54 +02:00
Tomislav Janjusic
ebb985dea9 oshmem:ucx, fix race condition and add context recycling
1) Race condition: Do not add private contexts to active list.
Private contexts are only visible to the user.
2) Recycled contexts: Destroyed contexts are put on an idle list until
finalize, continuous context creation will lead to oom condition.
Instead, check if context from idle list meets new context requirements
and reuse it.

Co-authored with: Artem Y. Polyakov <artemp@mellanox.com>,
                  Manjunath Gorentla Venkata <manjunath@mellanox.com>

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-11-23 04:39:05 +02:00
Barton Chittenden
47816ef83b Remove mxm, yalla and ikrit
Signed-off-by: Barton Chittenden <bartonski@gmail.com>
2019-11-22 13:40:16 -08:00
Joseph Schuchart
9f2c6a42c3 Shmem: use bitwise and instead of logical and to check for allocator capabilities
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2019-11-05 15:44:59 +01:00
Sergey Oblomov
991082abf2 IKRIT: restored compilation
- due to some refactoring and adding new functionality compilation
  of ikrit module was broken
- this commit restores compilation

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-09-16 19:00:34 +03:00
Sergey Oblomov
01dacaa6a4 SPML/UCX: fixed comment
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-08-21 12:08:58 +03:00
Sergey Oblomov
182023febb SPML/UCX: fixed hang in SHMEM_FINALIZE
- used MPI _Barrier to synchronize processes

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-08-21 12:04:46 +03:00
Gilles Gouaillardet
3f081e0750 oshmem: fix make uninstall
This fixes a typo introduced in open-mpi/ompi@e0e924c4ed

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-08-19 10:21:53 +09:00
Sergey Oblomov
43186e494b UCX: added PPN hint for UCX context
- added PPN hint for UCX context init

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-08-05 18:07:06 +03:00
Scott Miller
ca59cabc67
shmem/c: Fix shmem type for calls to shmem_test and shmem_wait_until with [u]int32_t and [u]int64_t
Signed-off-by: Scott Miller <scott.miller1@ibm.com>
2019-05-30 17:20:30 -04:00
Sergey Oblomov
0b108411f8 SPML/UCX: added synchronized flush on quiet
- added synchronized flush operation on quiet call.
- flush is implemented using get operation

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-27 16:07:04 +03:00
Yossi Itigin
88503f0f8d
Merge pull request #6667 from hoopoepg/topic/alloc-with-hint-realloc-inplace
ALLOC_WITH_HINT: added inplace realloc
2019-05-26 11:12:55 +03:00
Sergey Oblomov
d6a0912024 OSHMEM: minor optimization of realloc in shadow allocator
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-24 09:16:56 +03:00
Sergey Oblomov
7d8cb75b2e SSHMEM/COLL: added sshmem/mpi implementation for shmem_collect call
- added MPI based implementation of shmem_collect call

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-21 11:42:10 +03:00
Sergey Oblomov
421a7fd47d SPML/UCX: fixed few compilation warnings
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-20 14:40:24 +03:00
Gilles Gouaillardet
5c14f8439a oshmem: remove useless reference to orte header
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2019-05-17 15:38:31 +09:00
Sergey Oblomov
a51badd627 SHADOW ALLOCATOR: minor code optimization
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-16 09:38:01 +03:00
Sergey Oblomov
277c2a9e5c ALLOC_WITH_HINT: added implace realloc
- in some cases realloc operation may be completed without
  allocation of new buffer (and without additional data copy)
- added logic to reallocate buffer inplace if possible

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-15 19:37:38 +03:00
Yossi Itigin
f7086746e9 OSHMEM/MMAP/SYSV: Return ERR_NOT_IMPLEMENTED if segment hint != 0
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2019-05-15 16:12:19 +03:00
Mikhail Brinskii
d81dc533f6 SPML/UCX: Fix coverity error
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2019-05-14 22:34:01 +03:00
Sergey Oblomov
4df8c1b3e3 OSHMEM/free: suppressed coverity issue
- removed dead code

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-13 11:59:02 +03:00
Yossi Itigin
94b5e91194 OSHMEM: Add support for shmemx_malloc_with_hint()
- added multiple segments processing
- added shmemx_malloc_with_hint call + set of hints

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-05-10 20:04:57 +03:00
Mikhail Brinskii
d4843b1651 SPML/UCS: CR comments p2
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2019-04-30 16:49:11 +03:00
Mikhail Brinskii
c4c99457db SPML/UCX: CR comments p1
Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2019-04-30 16:26:45 +03:00
Mikhail Brinskii
2ef5bd8b36 SPML/UCX: Add shmemx_alltoall_global_nb routine to shmemx.h
The new routine transfers the data asynchronously from the source PE to all
PEs in the OpenSHMEM job. The routine returns immediately. The source and
target buffers are reusable only after the completion of the routine.
After the data is transferred to the target buffers, the counter object
is updated atomically. The counter object can be read either using atomic
operations such as shmem_atomic_fetch or can use point-to-point synchronization
routines such as shmem_wait_until and shmem_test.

Signed-off-by: Mikhail Brinskii <mikhailb@mellanox.com>
2019-04-26 14:47:58 +03:00
Valentin Petrov
2fa93332ce Fixes the O(N^2) loop in the mca_scoll_mpi_comm_query
The new proc group is created from the "world_group" based on the
      ranks mapping which can be directly taken from proc_name->vpid.

Signed-off-by: Valentin Petrov <valentinp@mellanox.com>
2019-04-08 17:49:00 +03:00
Ben Menadue
063596b828 Add missing #include to oshmem/shmem/c/shmem_context.c.
Signed-off-by: Ben Menadue <ben.menadue@nci.org.au>
2019-04-03 15:58:13 +11:00
Geoff Paulsen
44b3aa244b
Merge pull request #6510 from sam6258/int4_cswap_fix
shmem/fortran: Fix invalid datatype size in call to atomic cswap
2019-03-25 11:49:00 -05:00
Xin Zhao
9c3d00b144 ompi/oshmem/spml/ucx: use lockfree array to optimize spml_ucx_progress/delete oshmem_barrier in shmem_ctx_destroy
ompi/oshmem/spml/ucx: optimize spml ucx progress

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-21 23:01:45 +02:00
Xin Zhao
e0414006b0 ompi/oshmem/spml/ucx:delete oob path of getting rkeys in spml ucx
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-21 23:01:45 +02:00
Xin Zhao
e1c1ab0202 ompi/oshmem/spml/ucx: defer clean up shmem_ctx to shmem_finalize
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-21 23:01:37 +02:00
Scott Miller
6b294e0641 shmem/fortran: Fix invalid datatype size in call to atomic cswap
Signed-off-by: Scott Miller <scott.miller1@ibm.com>
2019-03-20 21:57:08 -04:00
Xin Zhao
48033ac1f4 ompi/oshmem: add spml_context back to sshmem_type in memheap, to keep track of ucx_ctx_default's rkeys
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-15 18:48:21 +02:00
Xin Zhao
9a06000962 ompi/oshmem/spml/ucx: let shmem_finalize to clean up any ctx left
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-15 18:48:07 +02:00
Xin Zhao
289595e45d OMPI/OSHMEM: bug-fix: store mkeys for each oshmem ctx.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-15 18:47:50 +02:00
Xin Zhao
79ba752667 ompi/oshmem/spml/ucx: fix eps destroy in shmem_ctx_destroy().
Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-15 18:47:38 +02:00
Xin Zhao
b00209e1f5 Revert "OMPI/OSHMEM: bug-fix: store mkeys for each oshmem ctx."
This reverts commit f1b095c784.

Signed-off-by: Tomislav Janjusic <tomislavj@mellanox.com>
2019-03-15 18:46:56 +02:00
Sergey Oblomov
d8e3562bae PML/SPML/UCX: added evaluation of mmap events
- there was a set of UCX related issues reported which caused
  by mmap API hooks conflicts. We added diagnostic of such
  problems to simplify bug-resolving pipeline

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2019-03-12 21:14:27 +02:00
Joshua Ladd
e57e18f6cc
Merge pull request #6290 from xinzhao3/topic/oshmem_mkeys
OMPI/OSHMEM: bug-fix: store mkeys for each oshmem ctx.
2019-02-25 13:09:44 -05:00
Xin Zhao
f1b095c784 OMPI/OSHMEM: bug-fix: store mkeys for each oshmem ctx.
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2019-02-25 16:19:08 +02:00