
30737 Commits

Author SHA1 Message Date
Brian Barrett
0a21a58f08
Merge pull request #7771 from dancejic/multi
common/ofi: Fixing compilation issue with ofi versions that do not support fi_info.nic
2020-06-01 18:42:07 -07:00
Nikola Dancejic
ae2a447b0e common/ofi: Fixing compilation issue with ofi versions that do not support fi_info.nic
Added the flag OPAL_OFI_PCI_DATA_AVAILABLE to remove accessing the nic
object in fi_info when the ofi version does not support that structure.

Signed-off-by: Nikola Dancejic <dancejic@amazon.com>
2020-06-01 23:14:41 +00:00
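The guard described in the commit above can be illustrated with a small, self-contained sketch. Only the flag name OPAL_OFI_PCI_DATA_AVAILABLE comes from the commit message; the `mock_nic` and `mock_info` types, the `provider_pci_bus()` helper, and the 0/1 convention for the flag are assumptions made for illustration and stand in for the real libfabric structures.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: the build system defines this flag to 1 when the installed
 * OFI headers provide the nic member, and to 0 otherwise. */
#ifndef OPAL_OFI_PCI_DATA_AVAILABLE
#define OPAL_OFI_PCI_DATA_AVAILABLE 0
#endif

/* Hypothetical stand-ins for the real libfabric types. */
struct mock_nic {
    int pci_bus;
};

struct mock_info {
    const char *name;
#if OPAL_OFI_PCI_DATA_AVAILABLE
    struct mock_nic *nic;   /* only exists in newer header versions */
#endif
};

/* Returns the PCI bus of the provider's NIC, or -1 when the information
 * is unavailable (old headers, or no NIC attached to this entry). */
int provider_pci_bus(const struct mock_info *info)
{
    assert(info != NULL);
#if OPAL_OFI_PCI_DATA_AVAILABLE
    if (info->nic != NULL) {
        return info->nic->pci_bus;
    }
#endif
    return -1;
}
```

Because every access to the member sits behind the same preprocessor check as its declaration, the file compiles against both old and new headers, and callers only need to handle the -1 fallback.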
Howard Pritchard
c074a23e8f
Merge pull request #7675 from hppritcha/topic/fix_issue_7578
rework argobots configury to be smarter
2020-06-01 14:02:32 -06:00
Gilles Gouaillardet
1036eca117
Merge pull request #7773 from ggouaillardet/topic/opal_str_to_bool
opal/util: fix opal_str_to_bool()
2020-06-01 10:15:16 +09:00
Gilles Gouaillardet
c450b21405 opal/util: fix opal_str_to_bool()
correctly use strlen(char *) instead of sizeof(char *)

Thanks Georg Geiser for reporting this issue.

Refs. open-mpi/ompi#7772

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2020-05-30 20:47:41 +09:00
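The strlen/sizeof mix-up fixed in the commit above is easy to reproduce in isolation. The sketch below is not the actual opal_str_to_bool() code; `trimmed_length()` and `pointer_size()` are hypothetical helpers, and trimming trailing whitespace is assumed here only as a plausible reason to need the string length.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns the length of str with trailing whitespace ignored. */
size_t trimmed_length(const char *str)
{
    assert(str != NULL);
    size_t len = strlen(str);   /* counts characters up to the terminator */
    while (len > 0 && isspace((unsigned char)str[len - 1])) {
        len--;
    }
    return len;
}

/* What the mistaken expression measures: the size of the pointer itself,
 * which does not depend on the text it points to. */
size_t pointer_size(const char *str)
{
    return sizeof(str);         /* identical to sizeof(char *) */
}
```

With sizeof, any index computed from the "length" is a fixed platform constant, so short strings are read past their end and long strings are only partly examined.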
Austen Lauria
d5c4f6b92a
Merge pull request #7770 from hoopoepg/topic/fixed-typo-in-hcoll-var-desc
OMPI/HCOLL: fixed typo in vars description
2020-05-29 14:33:15 -04:00
Sergey Oblomov
df0f2ac026 OMPI/HCOLL: fixed typo in vars description
Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2020-05-29 20:13:35 +03:00
Ralph Castain
7c9da91362
Merge pull request #7767 from rhc54/topic/syn
Sync to PMIx and PRRTE masters
2020-05-26 20:48:21 -07:00
Ralph Castain
b27db0e2a3
Sync to PMIx and PRRTE masters
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-05-26 20:11:14 -07:00
Howard Pritchard
e2948c96bc
Merge pull request #7761 from hppritcha/topic/fix_issue_7755
OFI common: set include list explicitly to NULL
2020-05-26 06:43:14 -06:00
Howard Pritchard
b9498ec31b rework argobots configury to be smarter
Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2020-05-23 14:46:41 -07:00
Howard Pritchard
45b643d0cf OFI common: set include list explicitly to NULL
related to #7755

Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2020-05-23 14:05:29 -06:00
Jeff Squyres
4d0c23c029
Merge pull request #7760 from jsquyres/pr/remove-stale-lt-init-options
configure.ac: remove stale LT_INIT options
2020-05-23 09:24:55 -04:00
Jeff Squyres
62c9a25bea configure.ac: remove stale LT_INIT options
1. We haven't used the -dlopen or -preopen options for years (if
   ever?); no need for the `dlopen` LT_INIT option.
2. We haven't supported Windows for years; no need for the `win32-dll`
   LT_INIT option.

Also, this commit includes a minor fix to a comment.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2020-05-22 12:42:59 -07:00
bosilca
2b1f053345
Merge pull request #7758 from wckzhang/fixdynamic
coll/tuned: Fix dynamic message size for gather and scatter
2020-05-22 14:15:32 -04:00
Austen Lauria
b419edead4
Merge pull request #7732 from karasevb/fix_sys_limits
sys limits: fixed soft limit setting if it is less than hard limit
2020-05-20 16:25:34 -04:00
Austen Lauria
16ff51ef37
Merge pull request #7754 from hpcraink/fixes
Restore testing all datatypes.
2020-05-20 16:23:11 -04:00
Rainer Keller
a8cdc0d38b Restore testing all datatypes.
Signed-off-by: Rainer Keller <rainer.keller@hs-esslingen.de>
2020-05-20 17:21:54 +02:00
Michael Heinz
e21c31f54c
Merge pull request #7722 from mwheinz/mwheinz-7721
Add check for PSM2 reference counting to PSM2 MTL #7721
2020-05-19 08:06:41 -04:00
Geoff Paulsen
fa483b686d
Merge pull request #7749 from jjhursey/stronger-event-lsf
A slightly stronger check for LSF's libevent
2020-05-18 15:32:29 -05:00
Michael Heinz
f10305a49f Add check for PSM2 reference counting to PSM2 MTL #7721
As discussed, a feature is being added to libpsm2 to correctly handle
the case where the library is opened by multiple OMPI transports in the same
process. (For example, the OFI BTL and the PSM2 MTL).

* Improved error message to indicate required libpsm2 version.

* Adds a test at autogen/configure time for the existence of
  PSM2_LIB_REFCOUNT_CAP.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Signed-off-by: Michael Heinz <michael.william.heinz@intel.com>
2020-05-18 15:25:22 -04:00
Joshua Hursey
05e095a1ee A slightly stronger check for LSF's libevent
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2020-05-18 15:08:10 -04:00
Jeff Squyres
7a2af9e977
Merge pull request #7747 from cniethammer/README-fixes
Fix minor inconsistencies and typos in README
2020-05-18 13:39:38 -04:00
Ralph Castain
d791f73259
Merge pull request #7748 from rhc54/topic/syn
Sync to PRRTE master
2020-05-18 10:36:33 -07:00
Ralph Castain
03f5c93dd3
Sync to PRRTE master
Port variety of bug fixes

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-05-18 09:54:41 -07:00
Christoph Niethammer
a59a138dd7 Fix minor inconsistencies and typos
Signed-off-by: Christoph Niethammer <niethammer@hlrs.de>
2020-05-18 09:08:02 +02:00
Ralph Castain
5911e84982
Merge pull request #7745 from rhc54/topic/prt
Sync up with PRRTE and cleanup stale code
2020-05-16 15:31:27 -07:00
Ralph Castain
4468691eeb
Sync up with PRRTE and cleanup stale code
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-05-16 14:48:31 -07:00
Ralph Castain
c1374afd0d
Merge pull request #7744 from rhc54/topic/sync
Pickup the OMPI system-default parameters
2020-05-16 14:00:33 -07:00
Ralph Castain
54f8b6d23c
Pickup the OMPI system-default parameters
Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-05-16 12:43:04 -07:00
Ralph Castain
ac5ec62563
Merge pull request #7741 from rhc54/topic/sync
Sync to PMIx and PRRTE masters
2020-05-16 11:06:45 -07:00
Ralph Castain
337fcb0047
Sync to PMIx and PRRTE masters
Roll in new mapping/binding methods and report outputs. Fix a few bugs

Signed-off-by: Ralph Castain <rhc@pmix.org>
2020-05-16 07:39:31 -07:00
Josh Hursey
9c0a2bb2d6
Merge pull request #7734 from jjhursey/fix-lsf-libevent
Move to `libevent_core` and add checks for libevent.so conflict with LSF
2020-05-15 14:36:22 -05:00
William Zhang
50823fe9a9 coll/tuned: Fix dynamic message size for gather and scatter
The gather and scatter operations did not use the correct message size
(they used only datatype size * communicator size). This did not reflect the
total message size and prevented fine tuning within a communicator size. This
patch multiplies the value by the number of elements sent.

Signed-off-by: William Zhang <wilzhang@amazon.com>
2020-05-14 12:17:52 -07:00
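The arithmetic behind the coll/tuned fix above can be sketched as two small functions. The function and parameter names are hypothetical and are not taken from the coll/tuned source; only the formula (datatype size times communicator size, then additionally times the element count) follows the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Size as described for the old behavior: the element count is ignored. */
size_t msg_size_without_count(size_t dtype_size, size_t comm_size)
{
    return dtype_size * comm_size;
}

/* Size as described for the patched behavior: the element count is included. */
size_t msg_size_with_count(size_t dtype_size, size_t count, size_t comm_size)
{
    assert(dtype_size > 0);
    return dtype_size * count * comm_size;
}
```

For an 8-byte datatype on 16 ranks, the first function always yields 128 regardless of how many elements are sent, so a size-based rule could never distinguish a 1-element gather from a 1000-element one on the same communicator.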
Boris Karasev
fb9eca55cf sys limits: fixed soft limit setting if it is less than hard limit
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2020-05-14 10:54:16 +07:00
Austen Lauria
9996b9f54d
Merge pull request #7720 from abouteiller/bugfix/tcp-failed-lock
Race condition when closing TCP endpoint with error
2020-05-13 16:52:21 -04:00
Joshua Hursey
33afdb6649 Move from legacy -levent to recommended -levent_core
* `libevent_core.so` contains the core functionality that we depend upon
   - `libevent.so` library has been identified as the legacy target.
   - `libevent_core.so` exists as far back as Libevent 2.0.5 (oldest supported by OMPI)
 * `libevent_pthreads.so` can work with either `-levent` or `-levent_core`

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
(cherry picked from commit 886f41fe3381a338eac215f26360980c612e6bb8)
2020-05-13 10:48:24 -04:00
Joshua Hursey
959353b421 Add checks for libevent.so conflict with LSF
* LSF ships a `libevent.so` that is not related to the `libevent.so`
   shipped with Libevent.
 * Add some checks to the configure logic to detect scenarios where this
   conflict can be detected, and provide the user with a descriptive
   warning message.
   - When detected by `event/external` this is just a warning since
     the internal component may be able to be used instead.
     - This happens when the user supplies the LSF path via the
       `LDFLAGS` envar instead of via `--with-lsf-libdir`.
   - When detected by a LSF component and LSF was explicitly requested
     then this becomes an error. Otherwise it will just print the warning
     and that component will fail to build.
 * Note for `master` the `orter_check_lsf.m4` portion of this cherry-pick
   was moved to `prrte/config/prrte_check_lsf.m4`

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
(cherry picked from commit fc4199e3ba567a672ce1da0dc46efbfd996d71f6)
2020-05-13 10:47:02 -04:00
Joshua Hursey
a73a89f6cf event/external: Fix typo in LDFLAGS vs LIBS var before check
* This should have been `LDFLAGS` not `LIBS`. Either works, but
   `LDFLAGS` is more correct. We should also include `CPPFLAGS`
   just in case the header is important to the check.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
(cherry picked from commit 22d8fa197b73eff7afc6d5fd11a99ced396c388a)
2020-05-13 10:45:26 -04:00
Howard Pritchard
f744668f5f
Merge pull request #7646 from hppritcha/topic/ofi_common_wl
add a common ofi whitelist/blacklist
2020-05-13 06:44:05 -06:00
Michael Heinz
4a5622a436
Merge pull request #7713 from mwheinz/master-7699
PSM2: Call add_procs through PML
2020-05-13 07:59:43 -04:00
Michael Heinz
548060e43f PSM2: Call add_procs through PML
Change ompi_mtl_ofi_get_endpoint() to call the active PML's add_procs()
rather than the OFI MTL add_procs() directly when discovering a new
process during operation.

Functionally, this has no impact in correct operation. However, the
current behavior means that the heterogeneous and active PML checks
are not being executed in the dynamic discovery case.

Signed-off-by: Michael Heinz <michael.william.heinz@intel.com>
2020-05-12 12:35:39 -04:00
Howard Pritchard
3078485eee
Merge pull request #7712 from shintaro-iwasaki/fix7697
opal/mca/threads/argobots: fix compilation error
2020-05-11 09:02:22 -06:00
Aurelien Bouteiller
0e93d0f647
Bugfix: when a TCP socket is closed in error, it could update the
endpoint state without holding the endpoint lock, resulting in a race
condition.

Signed-off-by: Aurelien Bouteiller <bouteill@icl.utk.edu>
2020-05-11 01:11:05 -04:00
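The kind of fix described in the commit above, updating shared endpoint state only while holding the endpoint lock, can be sketched with POSIX threads. The `endpoint_t` type, its state values, and the helper functions are hypothetical and much simpler than the real TCP BTL endpoint; the sketch only shows the locking discipline.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef enum { ENDPOINT_CONNECTED, ENDPOINT_FAILED } endpoint_state_t;

/* Hypothetical endpoint: a state field protected by a mutex. */
typedef struct {
    pthread_mutex_t lock;
    endpoint_state_t state;
} endpoint_t;

void endpoint_init(endpoint_t *ep)
{
    assert(ep != NULL);
    pthread_mutex_init(&ep->lock, NULL);
    ep->state = ENDPOINT_CONNECTED;
}

/* Error path: take the lock before touching the state, so a concurrent
 * sender or receiver never observes a half-updated endpoint. */
void endpoint_close_on_error(endpoint_t *ep)
{
    assert(ep != NULL);
    pthread_mutex_lock(&ep->lock);
    ep->state = ENDPOINT_FAILED;
    pthread_mutex_unlock(&ep->lock);
}

/* Readers follow the same discipline. */
endpoint_state_t endpoint_get_state(endpoint_t *ep)
{
    assert(ep != NULL);
    pthread_mutex_lock(&ep->lock);
    endpoint_state_t state = ep->state;
    pthread_mutex_unlock(&ep->lock);
    return state;
}
```

The race arises when the error path skips the lock/unlock pair while other threads read or modify the same state under the lock.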
Howard Pritchard
9f1081a07a add a common ofi whitelist/blacklist
also add common verbose variable.

Note the verbosity thing is a little tricky owing to the way the MCA frameworks and components are registered
and initialized.  The BTLs are registered/initialized prior to the MTL components even getting registered.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2020-05-09 14:50:31 -06:00
bosilca
4460e8ba8e
Merge pull request #7714 from devreal/opal-progress-unregister-oob
Fix potential out-of-bounds write in opal_progress_unregister
2020-05-08 16:57:28 -04:00
Joseph Schuchart
fa1b12ac33 Fix potential out-of-bounds write in opal_progress_unregister
Signed-off-by: Joseph Schuchart <schuchart@hlrs.de>
2020-05-08 21:11:51 +02:00
Michael Heinz
dbbdb8f2e2
Merge pull request #7621 from jsquyres/pr/remove-osc-pt2pt
Remove OSC pt2pt component
2020-05-08 12:43:57 -04:00
Brian Barrett
0dc2325297
Merge pull request #7641 from dancejic/multi-NIC
Added multi-NIC support to provider selection
2020-05-07 15:24:41 -07:00
Shintaro Iwasaki
0fc2033c75 opal/mca/threads/argobots: fix compilation error
Fixes #7697

Signed-off-by: Shintaro Iwasaki <siwasaki@anl.gov>
2020-05-07 16:07:12 +00:00