Ralph Castain
f3fbc1172d
Merge pull request #4757 from ggouaillardet/topic/iof_hnp
...
iof: do not release a sink before all read data is written.
2018-01-26 14:43:01 -08:00
Ralph Castain
e284a3e98b
Merge branch 'master' into topic/iof_hnp
2018-01-26 13:55:49 -08:00
Ralph Castain
5190f4e4ec
Merge pull request #4759 from rhc54/topic/missing
...
Properly terminate the job when executable not found
2018-01-26 13:55:32 -08:00
Ralph Castain
b643852d8a
Properly terminate the job when executable not found
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-26 12:09:24 -08:00
Ralph Castain
c166e26265
Merge branch 'master' into topic/iof_hnp
2018-01-26 06:15:58 -08:00
Ralph Castain
d83d2be9ea
Merge pull request #4758 from rhc54/topic/sync
...
Refresh ORTE PMIx support
2018-01-25 15:00:37 -08:00
Ralph Castain
a17df810ed
Sync with PMIx iof rfc
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 10:51:38 -08:00
Ralph Castain
e9cd7fd7e6
Update orte
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:53:43 -08:00
Ralph Castain
d1071397ac
Update the orte/ess framework
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:43:44 -08:00
Ralph Castain
9fb80bd239
Update the opal/pmix base framework elements
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:37:52 -08:00
Ralph Castain
187352eb3d
Update the PMIx external components
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:35:57 -08:00
Ralph Castain
a5679ef000
Update the PMIx 3.x component
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-25 08:34:44 -08:00
Gilles Gouaillardet
54fb8ac5d5
iof: do not release a sink before all read data is written.
...
When too much data is available on stdin, it might not be
forwarded immediately to the task (write() might fail with EAGAIN),
so when stdin is closed, there might still be data waiting
to be pushed to the task. In this case, delay the release of the sink
so no data is discarded.
Refs open-mpi/ompi#4744
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-25 16:29:22 +09:00
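The deferred release described in the commit above can be pictured with a minimal sketch. This is not the actual iof code: the `sketch_sink_t` type, its fields, and both functions are hypothetical names chosen for illustration, and the real component tracks queued output differently.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sink; the names are illustrative, not ORTE's real types. */
typedef struct {
    size_t pending;          /* bytes read from stdin but not yet written */
    bool release_requested;  /* stdin was closed and the sink should go away */
    bool released;           /* set once the sink has actually been torn down */
} sketch_sink_t;

/* Called when stdin is closed: release now only if nothing is queued. */
static void sketch_sink_request_release(sketch_sink_t *sink)
{
    sink->release_requested = true;
    if (0 == sink->pending) {
        sink->released = true;
    }
}

/* Called after a later write() accepted `written` bytes of queued data. */
static void sketch_sink_on_write(sketch_sink_t *sink, size_t written)
{
    sink->pending -= (written > sink->pending) ? sink->pending : written;
    /* Complete a deferred release once the queue has drained. */
    if (sink->release_requested && 0 == sink->pending) {
        sink->released = true;
    }
}
```

With 10 bytes still queued when stdin closes, the sink in this sketch stays alive until the write path reports that all 10 bytes went out.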
Gilles Gouaillardet
ebffaded5d
iof/base: remove the unused iof_base_input_files MCA parameter
...
This option was only used by the iof/mr_hnp (aka Map/Reduce)
component, which is no longer part of the master or v3 branches.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-25 11:29:14 +09:00
Edgar Gabriel
57f946a798
Merge pull request #4749 from edgargabriel/pr/fs_ufs_bad_return_statement
...
fs/ufs and fs/lustre: remove erroneous return statement
2018-01-24 16:28:39 -06:00
Edgar Gabriel
bcf26d419f
fs/ufs and fs/lustre: remove erroneous return statement
...
An erroneous return statement crept in with commit 1885d99,
which leads to some processes not resetting stripe_size
and stripe_count correctly. In 3.0.x this can lead to different
fcoll modules being selected. The impact is less severe on
master and 3.1.x, but could lead to problems there as well.
Fixes #4745
Signed-off-by: Edgar Gabriel <egabriel@central.uh.edu>
2018-01-24 14:07:21 -06:00
Jeff Squyres
5b0df815d2
Merge pull request #4743 from jsquyres/pr/extension-attribute-config-test-fix
...
opal_check_attributes: fix __extension__ test
2018-01-24 13:52:23 -05:00
Jeff Squyres
ff31da6f74
opal_check_attributes: fix __extension__ test
...
Per
https://gcc.gnu.org/onlinedocs/gcc/Alternate-Keywords.html#index-_005f_005fextension_005f_005f ,
use __extension__ in a C statement that actually verifies whether the
compiler supports it.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2018-01-23 13:44:43 -08:00
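As an illustration of what "using __extension__ in a C statement" can look like, here is a small probe, assuming a GCC-compatible compiler. It is a sketch only, not the test program that opal_check_attributes actually generates, and the function name `sketch_probe_extension` is hypothetical.

```c
/* Compiles only if the compiler accepts __extension__ in front of an
 * expression; a compiler without the keyword should reject this function. */
static int sketch_probe_extension(int value)
{
    int copy = __extension__ (value + 0);
    return copy;
}
```

A configure-style check would compile a fragment along these lines and treat a compile failure as "not supported".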
Gilles Gouaillardet
88e26c63e0
spml/ucx: fix a double free() issue
...
in the mca_spml_ucx_add_procs() error path
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-22 13:42:16 +09:00
Joshua Ladd
13bbc394cc
Merge pull request #4728 from xinzhao3/topic/osc-ucx-fetch-op-fix
...
OMPI/OSC/UCX: adding atomic lock for fetch_and_op and compare_and_swap
2018-01-19 15:32:51 -05:00
Ralph Castain
f92c9f35e6
Merge pull request #4729 from rhc54/topic/revert
...
Revert changes to OPAL_CHECK_PACKAGE
2018-01-17 17:01:17 -08:00
Ralph Castain
01e6539127
Revert "Filter /usr[/local]/include from opal CPPFLAGS when used explictly --with-package=DIR"
...
This reverts commit c4fe4ecfb918eef88bcc8dc10fdd743e3dc7fa38.
Revert "Fix DIR, DIR/include search for --with-pmix"
This reverts commit 2e3f4017639e0b248c2f0d1eb14e7bb31f6287be.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-17 16:02:19 -08:00
Xin Zhao
72ff2b1135
OMPI/OSC/UCX: adding atomic lock for fetch_and_op and compare_and_swap
...
Signed-off-by: Xin Zhao <xinz@mellanox.com>
2018-01-18 00:36:22 +02:00
Joshua Ladd
dbefb35aad
Merge pull request #4635 from karasevb/oshmem/spec_1.3/broadcast
...
oshmem: remove "shmem_broadcast" in accordance with the spec v1.3
2018-01-17 12:11:09 -05:00
Yossi Itigin
f2851fd502
Merge pull request #4724 from alex-mikheev/topic/ucx_as_default
...
ompi/oshmem: ucx is selected over yalla/ikrit by default
2018-01-17 17:41:49 +02:00
Yossi Itigin
df1136dc63
Merge pull request #4719 from alex-mikheev/topic/pml_ucx_send_nbr
...
ompi: pml/ucx: blocking send using ucp_tag_send_nbr
2018-01-17 17:01:30 +02:00
Alex Mikheev
640e945b9c
ompi: pml/ucx: blocking send using ucp_tag_send_nbr
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2018-01-17 15:54:18 +02:00
Alex Mikheev
ae326546f4
ompi/oshmem: ucx is selected over yalla/ikrit by default
...
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2018-01-17 15:08:04 +02:00
Yossi Itigin
79ca1c4f18
Merge pull request #4697 from yosefe/topic/opal-progress-avoid-checking-timer
...
opal_progress: check timer only once per 8 calls
2018-01-17 10:48:34 +02:00
Ralph Castain
6b3cf6fcf1
Merge pull request #4722 from pkovacs/master-opal-check
...
opal_check_package: filter /usr[/local]/include from CPPFLAGS
2018-01-16 16:23:49 -08:00
Philip Kovacs
c4fe4ecfb9
Filter /usr[/local]/include from opal CPPFLAGS when used explictly --with-package=DIR
...
Signed-off-by: Philip Kovacs <pkdevel@yahoo.com>
2018-01-16 16:45:20 -05:00
Ralph Castain
8eea942b80
Merge pull request #4721 from rhc54/topic/nidmap
...
Remove the orte_nidmap test
2018-01-16 12:35:43 -08:00
Ralph Castain
345916f2f3
Remove the orte_nidmap test
...
Moved to the ompi-tests repo
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-16 11:47:44 -08:00
Yossi Itigin
7cee60346e
opal_progress: check timer only once per 8 calls
...
Reading the system clock on every call to opal_progress() is an
expensive operation on most architectures, and it can hurt
performance, for example in message rate benchmarks.
Change opal_progress() to read the clock once per 8 calls, unless
there are active users of the event mechanism.
Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-01-16 19:18:53 +02:00
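The "once per 8 calls" idea from the commit above can be sketched with a call counter and a bit mask. This is an illustration under assumptions, not the opal_progress() implementation: every `sketch_` name is hypothetical, and the clock read is replaced by a counter so the behavior is easy to observe.

```c
#include <stdint.h>

static uint32_t sketch_call_count = 0;   /* number of progress calls so far */
static uint32_t sketch_clock_reads = 0;  /* stand-in for expensive clock reads */
static int sketch_event_users = 0;       /* active users of the event mechanism */

/* Stand-in for reading the system clock. */
static void sketch_read_clock(void)
{
    ++sketch_clock_reads;
}

static void sketch_progress(void)
{
    uint32_t n = sketch_call_count++;
    /* Read the clock on every call while event users are active,
     * otherwise only when the low three bits of the counter are zero,
     * which is once per 8 calls. */
    if (sketch_event_users > 0 || 0 == (n & 0x7)) {
        sketch_read_clock();
    }
}
```

Using a power-of-two interval keeps the check to a single AND, which matters on a path that is called in a tight loop.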
Ralph Castain
87fbe5f98e
Merge pull request #4680 from rhc54/topic/addhost
...
Continue resolving add_host behavior
2018-01-16 00:07:33 -08:00
Ralph Castain
75eb56522c
Continue resolving add_host behavior
...
Fix a problem in packing/unpacking job updates. A race condition remains in which messages are sent to the second new daemon before it is completely ready. It is not yet clear where it comes from.
Refs #4665
Rebase to master. Reset orte_nidmap_communicated if hosts are added. Check for duplicate hostnames in an add_host command. Turn off tree_spawn for dynamic launch of additional daemons.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-15 08:21:01 -08:00
Gilles Gouaillardet
cb5dfbe5b1
man: fix indentation of MPI_Comm_spawn[_multiple]
...
no code change.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-15 11:29:15 +09:00
Ralph Castain
64ba33cb32
Merge pull request #4708 from rhc54/topic/cl4
...
Restrict MPI apps to cleaning up job-level dirs
2018-01-12 18:47:19 -08:00
Ralph Castain
4fc0cdd24e
Merge pull request #4709 from rhc54/topic/ws
...
Whitespace cleanup
2018-01-12 18:47:02 -08:00
Ralph Castain
1cd8e34765
Restrict MPI apps to cleaning up job-level dirs
...
MPI apps should only clean up the session directory to the level of their
own job.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-12 17:14:24 -08:00
Ralph Castain
7a818a26a9
Whitespace cleanup
...
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-12 10:32:49 -08:00
Matias Cabral
8049c06a96
Merge pull request #4580 from matcabral/fix_comments_pr_425_osc_rmda
...
osc/rmda: fix missing opal_argv_free in mtls search.
2018-01-12 09:20:11 -08:00
Ralph Castain
e35347f9e3
Merge pull request #4704 from ggouaillardet/topic/regx_misc
...
orte/regx: fix, revamp and enhancement
2018-01-12 06:50:58 -08:00
Gilles Gouaillardet
c988011afd
test/util: test the regx framework
...
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-12 16:12:42 +09:00
Ralph Castain
1821c6530c
Merge pull request #4703 from rhc54/topic/dvm
...
Ensure that prun doesn't prematurely exit
2018-01-11 19:59:26 -08:00
Ralph Castain
ac522a521f
Ensure that prun doesn't prematurely exit
...
Ensure that prun doesn't exit until notified that its own child job
terminated.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-11 19:03:32 -08:00
Gilles Gouaillardet
4130c93976
regx/reverse: add the reverse component
...
Search for the digits to be compressed from the end of the node names.
For example, if the nodelist is c712f6n01,c712f6n02,c712f6n03,
the regx/fwd component generates c[3:712]f6n01,c[3:712]f6n02,c[3:712]f6n03@(3),
whereas the regx/reverse component generates c712f6n[2:1-3]@0(3), which is
a better fit here.
Josh Hursey authored the changes and must be credited.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-12 11:45:49 +09:00
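The difference between the two scan directions can be shown with a short sketch. It only locates the digit run each approach would pick for a single node name; it does not build the compressed expression, and both function names are hypothetical rather than taken from the regx components.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Reverse scan: length of the run of digits at the end of the name. */
static size_t sketch_trailing_digits(const char *name)
{
    size_t len = strlen(name);
    size_t n = 0;
    while (n < len && isdigit((unsigned char)name[len - 1 - n])) {
        ++n;
    }
    return n;
}

/* Forward scan: length of the first run of digits; its start index
 * is stored in *offset. */
static size_t sketch_first_digits(const char *name, size_t *offset)
{
    size_t start = 0;
    size_t n = 0;
    while ('\0' != name[start] && !isdigit((unsigned char)name[start])) {
        ++start;
    }
    while (isdigit((unsigned char)name[start + n])) {
        ++n;
    }
    *offset = start;
    return n;
}
```

For c712f6n01 the forward scan picks the three digits "712" right after the leading "c", which are identical on every node, while the reverse scan picks the two trailing digits "01", which are the part that actually varies across the node list.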
Gilles Gouaillardet
c2a358ff45
regx: move most functions from the fwd component to base
...
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-12 11:45:48 +09:00
Gilles Gouaillardet
0c686f01e5
regx: add the extract_node_names callback
...
typedef int (*orte_regx_base_module_extract_node_names_fn_t)(char *regexp, char ***names);
Among other things, this makes testing much easier.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-12 10:58:41 +09:00
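To show how a callback with this signature lends itself to testing, here is a toy function that conforms to the typedef. It is a hypothetical stand-in, not a regx component: it assumes the input is a plain comma-separated list rather than a compressed expression, returns 0 on success and -1 on allocation failure (both return values are assumptions), and hands back a NULL-terminated array that the caller frees.

```c
#include <stdlib.h>
#include <string.h>

/* Typedef as quoted in the commit message. */
typedef int (*orte_regx_base_module_extract_node_names_fn_t)(char *regexp, char ***names);

/* Hypothetical stand-in: splits a comma-separated list into names. */
static int sketch_extract_node_names(char *regexp, char ***names)
{
    size_t count = 1;
    for (const char *p = regexp; '\0' != *p; ++p) {
        if (',' == *p) {
            ++count;
        }
    }

    /* One extra slot keeps the array NULL-terminated. */
    char **out = calloc(count + 1, sizeof(*out));
    if (NULL == out) {
        return -1;
    }

    size_t i = 0;
    const char *start = regexp;
    for (;;) {
        const char *end = strchr(start, ',');
        size_t len = (NULL != end) ? (size_t)(end - start) : strlen(start);
        out[i] = malloc(len + 1);
        if (NULL == out[i]) {
            /* Undo the partial result before reporting failure. */
            while (i > 0) {
                free(out[--i]);
            }
            free(out);
            return -1;
        }
        memcpy(out[i], start, len);
        out[i][len] = '\0';
        ++i;
        if (NULL == end) {
            break;
        }
        start = end + 1;
    }

    *names = out;
    return 0;
}
```

A test can then assign any implementation to a variable of the callback type, feed it a known input, and compare the returned names against the expected node list.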
Gilles Gouaillardet
a056fdea2d
regx/fwd: correctly handle node names with multiple set of digits
...
Refs. open-mpi/ompi#4689
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-12 10:58:36 +09:00