if PMIx (version > 1.x) is active, since all diagnostic messages will instead flow through
the PMIx connection. Unfortunately, PMIx v1 does not support this
feature, but we can remove the stddiag support once PMIx v1 slides out
of the support window.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
Since some tasks might end up having /dev/null as their stdin,
simply avoid pipe creation and destruction for these tasks.
From a pragmatic and MPI point of view, and unless explicitly required
otherwise, all MPI tasks but one (the first) end up with /dev/null
as their stdin.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
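The idea in the message above can be illustrated with a minimal sketch. This is not the actual Open MPI code: `stdin_fd_for_task` and its `wants_stdin` parameter are hypothetical names. It only shows how a task that does not need forwarded input can be given /dev/null directly, so that no pipe is created or torn down for it.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: pick the descriptor a task will use as its stdin.
 * If the task should receive forwarded input, create a pipe, return the
 * read end and store the write end in *write_end. Otherwise open
 * /dev/null directly, so no pipe is created (and none must be closed).
 * Returns the read descriptor, or -1 on failure. */
static int stdin_fd_for_task(int wants_stdin, int *write_end)
{
    int p[2];

    *write_end = -1;
    if (!wants_stdin) {
        return open("/dev/null", O_RDONLY);
    }
    if (pipe(p) != 0) {
        return -1;
    }
    *write_end = p[1];
    return p[0];
}
```

A launcher would then `dup2()` the returned descriptor onto the child's descriptor 0; reading from the /dev/null descriptor reports end-of-file immediately.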
ompio has historically changed the WRONLY flag provided by the application
to RDWR to allow for the data sieving optimization within the two-phase I/O
fcoll component. This change did not have a performance impact
on regular UNIX file systems, but seems to hurt performance on NFS (and maybe Lustre?).
So provide an option that allows keeping the WRONLY flag, and raise an error
if the fcoll/two-phase component would actually like to use data sieving.
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
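The behavior described above can be sketched as follows. This is an illustration under assumptions, not the ompio implementation: it uses POSIX `open()` flags as a stand-in for the MPI access mode, and `effective_open_flags`, `keep_wronly`, and `check_data_sieving` are hypothetical names.

```c
#include <fcntl.h>

/* Hypothetical helper: decide which access flags to open the file with.
 * By default a write-only request is promoted to read-write, because data
 * sieving has to read a block before modifying and writing it back.
 * With keep_wronly set, the request of the application is left untouched. */
static int effective_open_flags(int requested, int keep_wronly)
{
    if ((requested & O_ACCMODE) == O_WRONLY && !keep_wronly) {
        return (requested & ~O_ACCMODE) | O_RDWR;
    }
    return requested;
}

/* Hypothetical check: data sieving is only possible on a read-write file.
 * Returns 0 if it may be used, -1 if an error must be raised instead. */
static int check_data_sieving(int flags)
{
    return ((flags & O_ACCMODE) == O_RDWR) ? 0 : -1;
}
```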
Sometimes, Ethernet interfaces can get quite high kernel indices. struct
ifreq (see netdevice(7)) defines ifr_ifindex as an int. The OOB component
used int16_t internally for matching (in case of -mca oob_tcp_if_[in|ex]clude),
which meant that any interface index > 32767 would never be matched because the
integer would be truncated to int16_t upon return from the function. OOB would
then refuse to work because it didn't find any usable interfaces, and the MPI job
would abort.
Signed-off-by: Wojtek Wasko <wwasko@nvidia.com>
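The truncation described above can be reproduced with a small sketch. The two functions are hypothetical stand-ins for an internal lookup that returns the kernel index; they are not the OOB component's real functions.

```c
#include <stdint.h>

/* Hypothetical stand-in for the old lookup: the kernel index is an int,
 * but it is returned through an int16_t. Any value above 32767 cannot be
 * represented and is converted to some other value (the result of this
 * narrowing conversion is implementation-defined). */
static int16_t index_as_int16(int kernel_index)
{
    return (int16_t)kernel_index;
}

/* Hypothetical stand-in for the fixed lookup: keep the full int width,
 * matching the type of ifr_ifindex. */
static int index_as_int(int kernel_index)
{
    return kernel_index;
}
```

With an interface whose kernel index is, say, 40000, comparing the requested index against `index_as_int16(40000)` can never succeed, whereas `index_as_int(40000)` compares equal.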
gcc 5.2 complains:
```
mtl_ofi_component.c: In function ‘ompi_mtl_ofi_finalize’:
mtl_ofi_component.c:613:5: warning: suggest parentheses around assignment used as truth value [-Wparentheses]
if (ret = fi_close((fid_t)ompi_mtl_ofi.fabric)) {
^
```
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
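One common way to silence the warning above is sketched below. The message does not say which form the fix took, so this is an assumption; `close_resource` is a hypothetical stand-in for a call such as `fi_close()`, so the example builds without libfabric.

```c
/* Hypothetical stand-in for a call such as fi_close():
 * returns 0 on success and a nonzero value on failure. */
static int close_resource(int should_fail)
{
    return should_fail ? -1 : 0;
}

static int finalize(int should_fail)
{
    int ret;

    /* The assignment is wrapped in its own parentheses and compared
     * explicitly, so the compiler can tell it is intentional and not a
     * mistyped "==". */
    if (0 != (ret = close_resource(should_fail))) {
        return ret;
    }
    return 0;
}
```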
Before this commit, the presence of usNIC devices -- which will
(currently) return no data when fi_getinfo() is queried for tagged
matching providers -- would cause an error message to be displayed.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
The value of ret is negative (e.g., -61), but it is displayed in the
help message as `%zd`, which renders as unsigned (i.e., a giant
positive value). So make sure to negate the negative value before
rendering it (e.g., so we display "61", not "4294967235").
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
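The two renderings mentioned above can be compared with a short sketch. The function names are hypothetical, and a 32-bit unsigned type is used explicitly so the large value is reproducible.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Renders the raw negative code as a 32-bit unsigned value,
 * which turns a small negative number into a huge positive one. */
static int format_as_unsigned(char *buf, size_t len, int ret)
{
    return snprintf(buf, len, "%" PRIu32, (uint32_t)ret);
}

/* Negates the (negative) code first, so the message shows the
 * positive error number the user expects. */
static int format_negated(char *buf, size_t len, int ret)
{
    return snprintf(buf, len, "%d", -ret);
}
```

For `ret == -61`, the first function writes "4294967235" and the second writes "61".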
so the bool type is defined when using old compilers that do not support gcc builtin atomics (such as gcc 4.1.x from CentOS 5)
Fixes open-mpi/ompi#4478
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
The following issues have been fixed for `mindist`:
- computing the job map on the backend nodes
- using slots count (`-host node1:<s1>,nodeN:<sN>`)
- fixed `dist:span` job mapping method
- fixed `oversubscribe` option with `-host`
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
Rework the logic to handle the out-of-sequence fragments on the receiver
side. A large number of OOS messages are still arriving even in single
threaded scenarios.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Set proper error codes in mca_fs_ufs_file_open by mapping the errno value to
the MPI error code.
Refs. open-mpi/ompi#4443
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
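A mapping of this kind might look like the sketch below. The message does not list which errno values are handled, so the cases here are assumptions. The enum values are hypothetical stand-ins for the MPI error classes named in the comments (used so the example builds without mpi.h).

```c
#include <errno.h>

/* Hypothetical stand-ins for MPI error classes. */
enum sketch_err {
    SKETCH_SUCCESS,          /* MPI_SUCCESS */
    SKETCH_ERR_ACCESS,       /* MPI_ERR_ACCESS */
    SKETCH_ERR_NO_SUCH_FILE, /* MPI_ERR_NO_SUCH_FILE */
    SKETCH_ERR_FILE_EXISTS,  /* MPI_ERR_FILE_EXISTS */
    SKETCH_ERR_NO_SPACE,     /* MPI_ERR_NO_SPACE */
    SKETCH_ERR_READ_ONLY,    /* MPI_ERR_READ_ONLY */
    SKETCH_ERR_OTHER         /* MPI_ERR_OTHER */
};

/* Translate the errno left behind by a failed open() into an error
 * class, falling back to a generic class for anything not listed. */
static enum sketch_err map_errno(int err)
{
    switch (err) {
    case 0:      return SKETCH_SUCCESS;
    case EACCES: return SKETCH_ERR_ACCESS;
    case ENOENT: return SKETCH_ERR_NO_SUCH_FILE;
    case EEXIST: return SKETCH_ERR_FILE_EXISTS;
    case ENOSPC: return SKETCH_ERR_NO_SPACE;
    case EROFS:  return SKETCH_ERR_READ_ONLY;
    default:     return SKETCH_ERR_OTHER;
    }
}
```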
This is a bug fix based on a problem reported on the mailing list.
For very large read/write operations, ompio breaks the operation
down into multiple cycles. The problem was that
one of the variables that has to keep its value
across the different cycles did not do so, and as a
result the calculation of the memory offsets was wrong.
Fixes issue #4453
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
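The class of bug described above can be illustrated generically. The message does not name the variable involved, so the sketch below is not the ompio code: `split_into_cycles` and its parameters are hypothetical. It only shows why the running position has to persist across cycles.

```c
#include <stddef.h>

/* Hypothetical helper: split a transfer of `total` bytes into cycles of
 * at most `cycle_size` bytes and record the memory offset at which each
 * cycle starts. Returns the number of cycles recorded. */
static size_t split_into_cycles(size_t total, size_t cycle_size,
                                size_t *offsets, size_t max_cycles)
{
    /* `done` lives outside the loop so it keeps its value from one cycle
     * to the next. Resetting it inside the loop would make every cycle
     * start from offset 0 again. */
    size_t done = 0;
    size_t n = 0;

    while (done < total && n < max_cycles) {
        size_t remaining = total - done;
        size_t chunk = remaining < cycle_size ? remaining : cycle_size;

        offsets[n++] = done;
        done += chunk;
    }
    return n;
}
```

For a 10-byte transfer with 4-byte cycles, the recorded offsets are 0, 4, and 8.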
The fs/lustre component has missed out on a number of updates to the fs/ufs component.
This commit tries to import all the changes made to the fs/ufs component
with respect to the file_open operation, including updates to how the amode is set,
how errors are propagated, and the setting of the fs_block_size value (which is required for
locking purposes).
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>