1
1

28158 Коммитов

Автор SHA1 Сообщение Дата
Gilles Gouaillardet
8209fca842 pmix/ext3x: bring external component up-to-date with the embedded pmix3x
add the callback prototype for the upcoming PMIx_IOF_push() API

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-31 13:35:34 +09:00
Gilles Gouaillardet
0481277e93 pmix/ext3x: bring external component up-to-date with the embedded pmix3x
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-31 13:33:33 +09:00
Nathan Hjelm
bb212e0c94
Merge pull request #4767 from ggouaillardet/topic/vader_backing_file
btl/vader: make the backing file job specific
2018-01-30 21:27:02 -07:00
Gilles Gouaillardet
0285c63348 pmix/ext3x: generate component source when only static libraries are built
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-31 13:21:14 +09:00
Yossi Itigin
46967cfa63
Merge pull request #4764 from hoopoepg/topic/set-recv-status-canceled
request/state: update state for canceled request
2018-01-30 12:22:59 +02:00
Gilles Gouaillardet
611d7c2d27 btl/vader: make the backing file job specific
Since open-mpi/ompi@47fd2313ab
the backing file is now in /dev/shm by default. As a consequence,
the backing file name has to include the jobid so more than one job
can run at a time.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-30 16:52:51 +09:00
Sergey Oblomov
7a5811d0a8 request/state: update state for canceled request
- fixed issue in set state for canceled request

Signed-off-by: Sergey Oblomov <sergeyo@mellanox.com>
2018-01-29 18:26:20 +02:00
Ralph Castain
f3fbc1172d
Merge pull request #4757 from ggouaillardet/topic/iof_hnp
iof: do not release a sink before all read data is written.
2018-01-26 14:43:01 -08:00
Ralph Castain
e284a3e98b
Merge branch 'master' into topic/iof_hnp 2018-01-26 13:55:49 -08:00
Ralph Castain
5190f4e4ec
Merge pull request #4759 from rhc54/topic/missing
Properly terminate the job when executable not found
2018-01-26 13:55:32 -08:00
Ralph Castain
b643852d8a Properly terminate the job when executable not found
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-26 12:09:24 -08:00
Ralph Castain
c166e26265
Merge branch 'master' into topic/iof_hnp 2018-01-26 06:15:58 -08:00
Ralph Castain
d83d2be9ea
Merge pull request #4758 from rhc54/topic/sync
Refresh ORTE PMIx support
2018-01-25 15:00:37 -08:00
Ralph Castain
a17df810ed Sync with PMIx iof rfc
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 10:51:38 -08:00
Ralph Castain
e9cd7fd7e6 Update orte
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:53:43 -08:00
Ralph Castain
d1071397ac Update the orte/ess framework
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:43:44 -08:00
Ralph Castain
9fb80bd239 Update the opal/pmix base framework elements
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:37:52 -08:00
Ralph Castain
187352eb3d Update the PMIx external components
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:35:57 -08:00
Ralph Castain
a5679ef000 Update the PMIx 3.x component
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:34:44 -08:00
Gilles Gouaillardet
54fb8ac5d5 iof: do not release a sink before all read data is written.
When too much data is available on stdin, it might not be
forwarded immediatly to the task (write() might fail with -EAGAIN),
so when stdin is terminated, there might be some remaining data
to be pushed to the task. In this case, delay the release of the sink
so no data is discarded.

Refs open-mpi/ompi#4744

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-25 16:29:22 +09:00
Gilles Gouaillardet
ebffaded5d iof/base: remove the unused iof_base_input_files MCA parameter
this option was only used by the iof/mr_hnp (aka Map/Reduce)
component that is no more part of master nor v3 branches.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-25 11:29:14 +09:00
Edgar Gabriel
57f946a798
Merge pull request #4749 from edgargabriel/pr/fs_ufs_bad_return_statement
fs/ufs and fs/lustre: remove erroneous return statement
2018-01-24 16:28:39 -06:00
Edgar Gabriel
bcf26d419f fs/ufs and fs/lustre: remove erroneous return statement
an erroneous return statement has creeped in commit 1885d99
which leads to some processes not resetting stripe_size
and stripe_count correctly. This can lead in 3.0.x to different
fcoll modules being selected. The impact is not that dramatic on
master and 3.1.x, but could lead to problems as well.

Fixes #4745

Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-01-24 14:07:21 -06:00
Jeff Squyres
5b0df815d2
Merge pull request #4743 from jsquyres/pr/extension-attribute-config-test-fix
opal_check_attributes: fix __extension__ test
2018-01-24 13:52:23 -05:00
Jeff Squyres
ff31da6f74 opal_check_attributes: fix __extension__ test
Per
https://gcc.gnu.org/onlinedocs/gcc/Alternate-Keywords.html#index-_005f_005fextension_005f_005f,
use __extension__ in a C statement that will actually verify if the
compiler supports it or not.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-01-23 13:44:43 -08:00
Gilles Gouaillardet
88e26c63e0 spml/ucx: fix a double free() issue
in mca_spml_ucx_add_procs() error path

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-22 13:42:16 +09:00
Joshua Ladd
13bbc394cc
Merge pull request #4728 from xinzhao3/topic/osc-ucx-fetch-op-fix
OMPI/OSC/UCX: adding atomic lock for fetch_and_op and compare_and_swap
2018-01-19 15:32:51 -05:00
Ralph Castain
f92c9f35e6
Merge pull request #4729 from rhc54/topic/revert
Revert changes to OPAL_CHECK_PACKAGE
2018-01-17 17:01:17 -08:00
Ralph Castain
01e6539127 Revert "Filter /usr[/local]/include from opal CPPFLAGS when used explictly --with-package=DIR"
This reverts commit c4fe4ecfb918eef88bcc8dc10fdd743e3dc7fa38.

Revert "Fix  DIR, DIR/include search for --with-pmix"

This reverts commit 2e3f4017639e0b248c2f0d1eb14e7bb31f6287be.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-17 16:02:19 -08:00
Xin Zhao
72ff2b1135 OMPI/OSC/UCX: adding atomic lock for fetch_and_op and compare_and_swap
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2018-01-18 00:36:22 +02:00
Joshua Ladd
dbefb35aad
Merge pull request #4635 from karasevb/oshmem/spec_1.3/broadcast
oshmem: remove "shmem_broadcast" in accordance with the spec v1.3
2018-01-17 12:11:09 -05:00
Yossi Itigin
f2851fd502
Merge pull request #4724 from alex-mikheev/topic/ucx_as_default
ompi/oshmem: ucx is selected over yalla/ikrit by default
2018-01-17 17:41:49 +02:00
Yossi Itigin
df1136dc63
Merge pull request #4719 from alex-mikheev/topic/pml_ucx_send_nbr
ompi: pml/ucx: blocking send using ucp_tag_send_nbr
2018-01-17 17:01:30 +02:00
Alex Mikheev
640e945b9c ompi: pml/ucx: blocking send using ucp_tag_send_nbr
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2018-01-17 15:54:18 +02:00
Alex Mikheev
ae326546f4
ompi/oshmem: ucx is selected over yalla/ikrit by default
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2018-01-17 15:08:04 +02:00
Yossi Itigin
79ca1c4f18
Merge pull request #4697 from yosefe/topic/opal-progress-avoid-checking-timer
opal_progress: check timer only once per 8 calls
2018-01-17 10:48:34 +02:00
Ralph Castain
6b3cf6fcf1
Merge pull request #4722 from pkovacs/master-opal-check
opal_check_package: filter /usr[/local]/include from CPPFLAGS
2018-01-16 16:23:49 -08:00
Philip Kovacs
c4fe4ecfb9 Filter /usr[/local]/include from opal CPPFLAGS when used explictly --with-package=DIR
Signed-off-by: Philip Kovacs <pkdevel@yahoo.com>
2018-01-16 16:45:20 -05:00
Ralph Castain
8eea942b80
Merge pull request #4721 from rhc54/topic/nidmap
Remove the orte_nidmap test
2018-01-16 12:35:43 -08:00
Ralph Castain
345916f2f3 Remove the orte_nidmap test
Moved to the ompi-tests repo

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-16 11:47:44 -08:00
Yossi Itigin
7cee60346e opal_progress: check timer only once per 8 calls
Reading the system clock on every call to opal_progress() is an
expensive operation on most architectures, and it can negatively affect
the performance, for example of message rate benchmarks.

We change opal_progress() to read the clock once per 8 calls, unless
there are active users of the event mechanism.

Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-01-16 19:18:53 +02:00
Ralph Castain
87fbe5f98e
Merge pull request #4680 from rhc54/topic/addhost
Continue resolving add_host behavior
2018-01-16 00:07:33 -08:00
Ralph Castain
75eb56522c Continue resolving add_host behavior
Fix a problem in packing/unpacking job updates. There remains a race condition that causes messages to attempt to be sent to the second new daemon before it is completely ready. Not entirely sure where it is coming from.

Refs #4665

Rebase to master. Reset orte_nidmap_communicated if hosts are added. Check for duplicate hostnames in an add_host command. Turn off tree_spawn for dynamic launch of additional daemons.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-15 08:21:01 -08:00
Gilles Gouaillardet
cb5dfbe5b1 man: fix indentation of MPI_Comm_spawn[_multiple]
no code change.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-15 11:29:15 +09:00
Ralph Castain
64ba33cb32
Merge pull request #4708 from rhc54/topic/cl4
Restrict MPI apps to cleaning up job-level dirs
2018-01-12 18:47:19 -08:00
Ralph Castain
4fc0cdd24e
Merge pull request #4709 from rhc54/topic/ws
Whitespace cleanup
2018-01-12 18:47:02 -08:00
Ralph Castain
1cd8e34765 Restrict MPI apps to cleaning up job-level dirs
MPI apps should only cleanup the session directory to the level of their
own job.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-12 17:14:24 -08:00
Ralph Castain
7a818a26a9 Whitespace cleanup
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-12 10:32:49 -08:00
Matias Cabral
8049c06a96
Merge pull request #4580 from matcabral/fix_comments_pr_425_osc_rmda
osc/rmda: fix missing opal_argv_free in mtls search.
2018-01-12 09:20:11 -08:00
Ralph Castain
e35347f9e3
Merge pull request #4704 from ggouaillardet/topic/regx_misc
orte/regx: fix, revamp and enhancement
2018-01-12 06:50:58 -08:00