28328 Commits

Author SHA1 Message Date
Yossi Itigin
f2851fd502
Merge pull request #4724 from alex-mikheev/topic/ucx_as_default
ompi/oshmem: ucx is selected over yalla/ikrit by default
2018-01-17 17:41:49 +02:00
Yossi Itigin
df1136dc63
Merge pull request #4719 from alex-mikheev/topic/pml_ucx_send_nbr
ompi: pml/ucx: blocking send using ucp_tag_send_nbr
2018-01-17 17:01:30 +02:00
Alex Mikheev
640e945b9c ompi: pml/ucx: blocking send using ucp_tag_send_nbr
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2018-01-17 15:54:18 +02:00
Alex Mikheev
ae326546f4
ompi/oshmem: ucx is selected over yalla/ikrit by default
Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2018-01-17 15:08:04 +02:00
Yossi Itigin
79ca1c4f18
Merge pull request #4697 from yosefe/topic/opal-progress-avoid-checking-timer
opal_progress: check timer only once per 8 calls
2018-01-17 10:48:34 +02:00
Ralph Castain
6b3cf6fcf1
Merge pull request #4722 from pkovacs/master-opal-check
opal_check_package: filter /usr[/local]/include from CPPFLAGS
2018-01-16 16:23:49 -08:00
Philip Kovacs
c4fe4ecfb9 Filter /usr[/local]/include from opal CPPFLAGS when used explicitly --with-package=DIR
Signed-off-by: Philip Kovacs <pkdevel@yahoo.com>
2018-01-16 16:45:20 -05:00
Ralph Castain
8eea942b80
Merge pull request #4721 from rhc54/topic/nidmap
Remove the orte_nidmap test
2018-01-16 12:35:43 -08:00
Ralph Castain
345916f2f3 Remove the orte_nidmap test
Moved to the ompi-tests repo

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-16 11:47:44 -08:00
Yossi Itigin
7cee60346e opal_progress: check timer only once per 8 calls
Reading the system clock on every call to opal_progress() is an
expensive operation on most architectures, and it can hurt
performance, for example in message-rate benchmarks.

We change opal_progress() to read the clock once per 8 calls, unless
there are active users of the event mechanism.

Signed-off-by: Yossi Itigin <yosefe@mellanox.com>
2018-01-16 19:18:53 +02:00
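The throttling described in commit 7cee60346e can be illustrated with a minimal sketch. It is not the Open MPI implementation: the struct, the function names, and the counter used in place of a real clock read are all hypothetical, and only the "once per 8 calls unless the event mechanism has active users" rule comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout: none of these identifiers are taken
 * from Open MPI. Only the interval of 8 comes from the commit message. */
#define CLOCK_CHECK_INTERVAL 8

typedef struct {
    uint32_t calls;        /* progress calls made so far */
    uint32_t event_users;  /* active users of the event mechanism */
    uint64_t clock_reads;  /* stand-in counter for expensive clock reads */
} progress_state_t;

/* Decide whether this call should pay for a clock read. */
static bool should_check_timer(progress_state_t *st)
{
    st->calls++;
    if (st->event_users > 0) {
        return true;       /* event users need timely checks on every call */
    }
    return (st->calls % CLOCK_CHECK_INTERVAL) == 0;
}

/* One iteration of a progress loop. */
static void progress_sketch(progress_state_t *st)
{
    if (should_check_timer(st)) {
        st->clock_reads++; /* a real loop would read the system clock here */
    }
    /* ... poll communication components here ... */
}
```

With no event users, calling progress_sketch() 16 times triggers the clock path on calls 8 and 16 only; as soon as event_users is non-zero, every call takes it.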
Ralph Castain
87fbe5f98e
Merge pull request #4680 from rhc54/topic/addhost
Continue resolving add_host behavior
2018-01-16 00:07:33 -08:00
Ralph Castain
75eb56522c Continue resolving add_host behavior
Fix a problem in packing/unpacking job updates. There remains a race condition that causes messages to attempt to be sent to the second new daemon before it is completely ready. Not entirely sure where it is coming from.

Refs #4665

Rebase to master. Reset orte_nidmap_communicated if hosts are added. Check for duplicate hostnames in an add_host command. Turn off tree_spawn for dynamic launch of additional daemons.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-15 08:21:01 -08:00
Gilles Gouaillardet
cb5dfbe5b1 man: fix indentation of MPI_Comm_spawn[_multiple]
no code change.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-15 11:29:15 +09:00
Ralph Castain
64ba33cb32
Merge pull request #4708 from rhc54/topic/cl4
Restrict MPI apps to cleaning up job-level dirs
2018-01-12 18:47:19 -08:00
Ralph Castain
4fc0cdd24e
Merge pull request #4709 from rhc54/topic/ws
Whitespace cleanup
2018-01-12 18:47:02 -08:00
Ralph Castain
1cd8e34765 Restrict MPI apps to cleaning up job-level dirs
MPI apps should only clean up the session directory to the level of their
own job.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-12 17:14:24 -08:00
Ralph Castain
7a818a26a9 Whitespace cleanup
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-12 10:32:49 -08:00
Matias Cabral
8049c06a96
Merge pull request #4580 from matcabral/fix_comments_pr_425_osc_rmda
osc/rmda: fix missing opal_argv_free in mtls search.
2018-01-12 09:20:11 -08:00
Ralph Castain
e35347f9e3
Merge pull request #4704 from ggouaillardet/topic/regx_misc
orte/regx: fix, revamp and enhancement
2018-01-12 06:50:58 -08:00
Gilles Gouaillardet
c988011afd test/util: test the regx framework
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-12 16:12:42 +09:00
Ralph Castain
1821c6530c
Merge pull request #4703 from rhc54/topic/dvm
Ensure that prun doesn't prematurely exit
2018-01-11 19:59:26 -08:00
Ralph Castain
ac522a521f Ensure that prun doesn't prematurely exit
Ensure that prun doesn't exit until notified that its own child job
terminated.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-11 19:03:32 -08:00
Gilles Gouaillardet
4130c93976 regx/reverse: add the reverse component
Search for the digits to be compressed from the end of the node names.

For example, if the nodelist is c712f6n01,c712f6n02,c712f6n03,
the regx/fwd component generates c[3:712]f6n01,c[3:712]f6n02,c[3:712]f6n03@(3),
whereas the regx/reverse component generates c712f6n[2:1-3]@0(3), which is
a better fit here.

Josh Hursey authored the changes and must be credited.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-12 11:45:49 +09:00
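The idea in commit 4130c93976 of searching for digits from the end of a node name can be sketched as follows. This is an illustrative helper only, not code from the regx/reverse component: the struct and function names are hypothetical, and it covers just the first step (isolating the trailing digit run), not the generation of the compressed expression.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type, not part of the regx framework. */
typedef struct {
    size_t prefix_len;    /* characters before the trailing digit run */
    size_t num_digits;    /* width of the trailing digit run */
    unsigned long value;  /* numeric value of the trailing digits */
} trailing_digits_t;

/* Scan backwards from the end of the name to find the last run of digits. */
static trailing_digits_t split_trailing_digits(const char *name)
{
    trailing_digits_t r = {0, 0, 0};
    size_t len = strlen(name);
    size_t start = len;

    while (start > 0 && isdigit((unsigned char)name[start - 1])) {
        start--;
    }
    r.prefix_len = start;
    r.num_digits = len - start;
    if (r.num_digits > 0) {
        r.value = strtoul(name + start, NULL, 10);
    }
    return r;
}
```

For c712f6n01 this yields the prefix c712f6n (7 characters), a digit width of 2, and the value 1. Names that share a prefix and width could then be merged into a single range, which is presumably how an expression such as c712f6n[2:1-3] arises.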
Gilles Gouaillardet
c2a358ff45 regx: move most functions from the fwd component to base
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-12 11:45:48 +09:00
Gilles Gouaillardet
0c686f01e5 regx: add the extract_node_names callback
typedef int (*orte_regx_base_module_extract_node_names_fn_t)(char *regexp, char ***names);

Among other things, this will make testing much easier.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-12 10:58:41 +09:00
Gilles Gouaillardet
a056fdea2d regx/fwd: correctly handle node names with multiple sets of digits
Refs. open-mpi/ompi#4689

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-12 10:58:36 +09:00
Matias A Cabral
009ba475e1 osc/rmda: fix missing opal_argv_free in mtls search.
Use asprintf in description message to avoid missing default values
Signed-off-by: Matias Cabral <matias.a.cabral@intel.com>
2018-01-11 14:29:16 -08:00
Ralph Castain
8f02596777
Merge pull request #4701 from rhc54/topic/cl3
Ensure cleanup of registered files/dirs
2018-01-11 11:58:11 -08:00
Ralph Castain
6216225bda Ensure cleanup of registered files/dirs
Resolve a race condition between registering for a file to be removed upon termination and actual creation of that file by providing attributes that identify whether the path is a file or directory. This removes the need for PMIx to detect the difference.

Refs #4686

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-11 11:05:30 -08:00
Ralph Castain
614696f03c
Merge pull request #4699 from rhc54/topic/regx
Convert nidmap to regx framework
2018-01-10 21:16:20 -08:00
Ralph Castain
4cd7f3b202 Convert nidmap to regx framework
Handle the need for different regex generator/parsers by moving the
orte/util/nidmap and orte/util/regex code into a new "regx" framework.
Use the original code to complete a "fwd" component, and create a
scaffold for IBM's "reverse" component.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-10 20:28:21 -08:00
Ralph Castain
71c7ae8236
Merge pull request #4698 from rhc54/topic/cl2
Ensure the epilog gets executed in PMIx server
2018-01-10 19:22:40 -08:00
Ralph Castain
6dacf40a8c Ensure the epilog gets executed in PMIx server
If we abnormally terminate, then we still want any cleanups to be
executed.

Remove debug

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-10 18:28:05 -08:00
Ralph Castain
3c78b8525b
Merge pull request #4683 from pkovacs/master-pmix-dirs
Fix DIR, DIR/include search for --with-pmix
2018-01-10 16:25:07 -08:00
Philip Kovacs
2e3f401763 Fix DIR, DIR/include search for --with-pmix
Signed-off-by: Philip Kovacs <pkdevel@yahoo.com>
2018-01-10 17:17:48 -05:00
Jeff Squyres
1c5664fdec
Merge pull request #4681 from nathanweeks/issue/f08_mpi_errcodes_ignore
Fix type of mpi_f08 MPI_ERRCODES_IGNORE
2018-01-10 12:34:16 -05:00
Gilles Gouaillardet
c1b1bfc6c4
Merge pull request #4684 from ggouaillardet/topic/monotonic_datatype
MPI_File_set_view(): check datatypes are monotonic
2018-01-09 22:17:37 +09:00
Gilles Gouaillardet
02f8215b25 ompi: enhance MPI_File_set_view datatype check.
Per MPI 3.1 chapter 13.3 :
"Derived etypes can be constructed by using any of the MPI
datatype constructor routines, provided all resulting typemap
displacements are non-negative and monotonically nondecreasing."
Same restriction applies to ftypes.

Add the OMPI_DATATYPE_CHECK_FOR_VIEW() macro, which checks that
the underlying opal_datatype_t is monotonic, on top
of all checks performed in OMPI_DATATYPE_CHECK_FOR_RECV().

Since checking monotonicity is expensive, the check is only performed
when needed, and the result is cached by ompi_datatype_is_monotonic().

Thanks Wei-keng Liao for the valuable feedback.
Thanks George for the guidance.

Refs. open-mpi/ompi#4682

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-09 18:05:15 +09:00
Gilles Gouaillardet
1a17cb3b1c opal/datatype: add opal_datatype_is_monotonic()
Return true if the datatype's displacements are non-negative and
monotonically nondecreasing, and false otherwise.

Thanks George for the guidance.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-09 18:05:14 +09:00
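The property that commits 02f8215b25 and 1a17cb3b1c test for can be sketched on a flat array of displacements. This is an assumption-laden illustration: the real opal_datatype_is_monotonic() works on an opal_datatype_t, whose internal layout is not shown in this log, and the function name below is hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: checks that displacements are non-negative and
 * monotonically nondecreasing, as MPI 3.1 requires for etypes/filetypes. */
static bool displacements_are_monotonic(const ptrdiff_t *disp, size_t count)
{
    ptrdiff_t prev = 0;

    for (size_t i = 0; i < count; i++) {
        if (disp[i] < 0) {
            return false;  /* negative displacement */
        }
        if (disp[i] < prev) {
            return false;  /* displacement moved backwards */
        }
        prev = disp[i];
    }
    return true;
}
```

Equal consecutive displacements are accepted because the requirement is nondecreasing rather than strictly increasing. The scan is linear in the number of entries, which is in line with the commit's note that the check is expensive and worth caching.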
Ralph Castain
b6840ad769
Merge pull request #4679 from rhc54/topic/iso
Silence some warnings
2018-01-07 08:41:53 -08:00
Nathan T. Weeks
3158d2c5ed Fix type of mpi_f08 MPI_ERRCODES_IGNORE
Signed-off-by: Nathan T. Weeks <weeks@iastate.edu>
2018-01-07 05:36:41 -08:00
Ralph Castain
e2bc941f1e Silence some warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-05 11:28:20 -08:00
Ralph Castain
af2a3b452c
Merge pull request #4678 from rhc54/topic/mca
Correct the comment in the default MCA param template
2018-01-05 08:57:57 -08:00
Ralph Castain
d620070c77 Correct the comment in the default MCA param template - we do not support a param called "component_path". The correct syntax is "mca_base_component_path"
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2018-01-05 08:46:44 -08:00
KAWASHIMA Takahiro
710080be63
Merge pull request #4667 from kawashima-fj/pr/f08-pmpi
fortran: Fix PMPI interface bugs in mpi_f08 module
2018-01-05 03:45:10 -06:00
Gilles Gouaillardet
56fe714776
Merge pull request #4637 from ggouaillardet/topic/tree_spawn_no_regex
orted: fix tree-spawn when the node regex is too long
2018-01-04 13:03:31 +09:00
Gilles Gouaillardet
03da5218ea orte: remove some dead code related to the new tree_spawn method
Now that the daemon calls remote_spawn itself, there is no longer
a need for the "tree_spawn" command nor the associated command
processing code since the HNP is no longer sending a tree-spawn
message to the orted.

Thanks Ralph for the guidance!

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-04 09:35:17 +09:00
Gilles Gouaillardet
4527584840 orted: fix tree-spawn when the node regex is too long
When the node regex is too long to be sent on the command line,
retrieve it first from the parent, and then spawn the remote orted.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-04 09:33:46 +09:00
Gilles Gouaillardet
799152e7fb plm/base: add the orte_plm_base_node_regex_threshold MCA parameter
This parameter sets the maximum length of a node regex that can
be passed on the orted command line.
For testing purposes, it can be set to zero to force orted to
retrieve the node regex from its parent.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-04 09:33:46 +09:00
Gilles Gouaillardet
f7e29127bc sstore/stage: fix parameter handling in sstore_stage_local_compress_waitpid_cb()
Since open-mpi/ompi@8f496b01b7,
sstore_stage_local_compress_waitpid_cb is invoked with an orte_wait_tracker_t *,
which must be used to reach the orte_sstore_stage_local_app_snapshot_info_t *.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2018-01-04 09:33:46 +09:00