Commit graph

27319 commits

Author SHA1 Message Date
Piotr Lesnicki
99453e6b10 mtl/portals4: get retransmission REPLY code
Signed-off-by: Todd Kordenbrock <thkgcode@gmail.com>
2017-07-09 22:12:25 -05:00
Piotr Lesnicki
06b15cebbf mtl/portals4: add timeout to get retransmit
Signed-off-by: Todd Kordenbrock <thkgcode@gmail.com>
2017-07-09 22:12:08 -05:00
Ralph Castain
c632784ca3 Merge pull request #3835 from rhc54/topic/hetero
Remove --enable-heterogeneous until fix is ready
2017-07-07 10:57:12 -07:00
Jeff Squyres
83746fba71 Merge pull request #3822 from tjcw/tjcw-fix-mpi-sizeof
Fix MPI_SIZEOF for gfortran 4.8
2017-07-07 13:49:52 -04:00
Ralph Castain
8e25733760 Remove --enable-heterogeneous until fix is ready
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-07 10:09:30 -07:00
Ralph Castain
b2f90e5d1b Merge pull request #3831 from rhc54/topic/fix
Prefix the MB macro in one more place
2017-07-07 08:04:30 -07:00
Chris Ward
3e6a196714 Merge pull request #1 from jsquyres/tjcw-tjcw-fix-mpi-sizeof
README: minor tweak to specifically mention GNU Fortran
2017-07-07 15:42:06 +01:00
Jeff Squyres
75ec541610 README: minor tweak to specifically mention GNU Fortran
Lots of people still use GFortran, and lots of people still use
somewhat old versions of it (e.g., if it's bundled in their
older-but-still-installed Linux distros).  So let's specifically
mention it.  This may be a bit overkill, but more specific docs are
usually a Good Thing (i.e., they can prevent questions from being sent
to the mailing list).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-07-07 07:30:03 -07:00
Ralph Castain
a190b4b89f Prefix the MB macro in one more place
Fixes #3830

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-07 06:07:47 -07:00
Chris Ward
5de3d5dde6 Fix MPI_SIZEOF for gfortran 4.8
Add copyrights.

Revise the README to take out the 'most notably' statement about GNU Fortran 4.8

Signed-off-by: Chris Ward <tjcw@uk.ibm.com>
2017-07-07 13:47:35 +01:00
Gilles Gouaillardet
823382f5d7 plm/base: do not abort when configure'd with --enable-heterogeneous
and a mix of BE/LE is detected

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-07 10:43:54 +09:00
Ralph Castain
2a580fa71e Merge pull request #3801 from rhc54/topic/hetero
Detect that we have a mix of BE/LE in the system
2017-07-06 15:29:06 -07:00
Josh Hursey
753e3b0156 Merge pull request #3824 from jjhursey/doc/xl-f08-readme
README: Update F08 language about IBM XL compiler
2017-07-06 16:26:23 -05:00
Joshua Hursey
bf5a58dcca README: Update F08 language about IBM XL compiler
- MPI bindings build/link correctly, so remove note about that.
- OpenSHMEM bindings do not build/link correctly by default.
  - Note the workaround and the issue on GitHub for users.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-07-06 15:52:48 -05:00
Ralph Castain
1bc366b374 Merge pull request #3820 from rhc54/topic/cov
Silence Coverity warnings
2017-07-06 06:53:43 -07:00
Ralph Castain
9c9e0a9773 Merge pull request #3819 from rhc54/topic/esh
Not really necessary, but technically correct
2017-07-06 06:49:44 -07:00
Ralph Castain
8979bfe71e Silence Coverity warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-06 06:07:28 -07:00
Ralph Castain
ed43492867 Not really necessary, but technically correct
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-06 06:00:03 -07:00
Gilles Gouaillardet
fc11c37223 Merge pull request #3646 from ggouaillardet/spacc-fix-coverity-warnings
coll/spacc: misc fixes
2017-07-06 11:39:14 +09:00
Ralph Castain
7bea824194 Merge pull request #3813 from rhc54/topic/esh
Replace syntax with something less strictly C99
2017-07-05 19:14:38 -07:00
Mikhail Kurnosov
44acc92104 Fix buffer overflow
Add check for bounds of sindex[] and rindex[].

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2017-07-06 10:49:08 +09:00
Gilles Gouaillardet
5fceca235b coll/spacc: silence more coverity warnings in mca_coll_spacc_allreduce_intra_redscat_allgather()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-06 10:49:08 +09:00
Mikhail Kurnosov
2f0f476642 Silence spacc coverity warnings
1. Add assert for opal_hibit return value: comm_size is always > 1.
2. Modified verbose output (dead-code warning).

Signed-off-by: Mikhail Kurnosov <mkurnosov@gmail.com>
2017-07-06 10:49:08 +09:00
Ralph Castain
e7a44a1483 Merge pull request #3814 from anandhis/ofi-choose-provider-at-send
Choosing the ofi provider when opening conduit and sending message to peer
2017-07-05 16:55:06 -07:00
Ralph Castain
31130a4bee Replace syntax with something less strictly C99
Fixes #3809

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-05 16:54:36 -07:00
anandhi
793ebc272e When opening conduit, checking for the transport preference in below order -
(1) rml_ofi_transports mca parameter.  This parameter should have the list of transports (currently ethernet,fabric are valid)
    fabric is higher priority if provided.
(2) ORTE_RML_TRANSPORT_TYPE key with values "ethernet" or "fabric". "fabric" is higher priority.
If specific provider is required use ORTE_RML_OFI_PROV_NAME key with values "socket" or "OPA" or any other supported in system.

	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c

On send_msg choose the provider on local and peer to follow below rules -
1. if the user specified the transport for this conduit (even giving us a prioritized list of candidates), then the one we selected is the _only_ one we will use. If the remote peer has a matching endpoint, then we use it - otherwise, we error out

2. if the user didn't specify a transport, then we look for matches against _all_ of our available transports, starting with fabric and then going to Ethernet, taking the first one that matches.

3. if we can't find any match, then we error out

	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c

send_msg() -> Fixed case when the local provider chosen at time of opening conduit
is not present in peer (destination) node
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c

Signed-off-by: Anandhi Jayakumar <anandhi.s.jayakumar@intel.com>
2017-07-05 15:40:14 -07:00
Gilles Gouaillardet
fbeb7b94f4 Merge pull request #3802 from ggouaillardet/topic/gcc_builtin_atomics
configury: fix gcc builtin atomic detection
2017-07-04 10:33:43 +09:00
Gilles Gouaillardet
e77874bbaf configury: fix gcc builtin atomic detection
test for both 32 and 64 bits.
clang only support 32 bits builtin atomics when -m32 is used

Thanks Paul Hargrove for reporting this.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-04 09:47:45 +09:00
Howard Pritchard
f2c6e70ef0 Merge pull request #3800 from hppritcha/topic/fix_cray_pmix_problem
pmix/cray: fix handling of multiple finis
2017-07-03 17:07:32 -06:00
Ralph Castain
2753f53e6d Detect that we have a mix of BE/LE in the system, provide a warning that OMPI doesn't currently support this environment, and error out
Fixes #2817

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-03 15:47:05 -07:00
Howard Pritchard
1f2f3db553 pmix/cray: fix handling of multiple finis
The fini code for cray pmix wasn't correct.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-07-03 14:30:34 -05:00
Gilles Gouaillardet
d1c5955b73 coll/base: optimize handling of zero-byte datatypes in mca_coll_base_alltoallv_intra_basic_inplace()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-06-30 09:47:08 +09:00
Ralph Castain
7cbea77238 Merge pull request #3778 from rhc54/topic/warn
Attempt to detect when we are direct-launched without the necessary P…
2017-06-29 16:53:12 -07:00
Ralph Castain
cb19296b71 Merge pull request #3794 from rhc54/topic/shutdown
Stop all progress threads prior to releasing the peer objects
2017-06-29 16:52:57 -07:00
Ralph Castain
85f8eb4c6b Stop all progress threads prior to releasing the peer objects to avoid a race condition whereby a lost connection could be reported after a peer object was freed and before the threads were stopped.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-29 15:48:18 -07:00
Ralph Castain
bd4a6fee22 Attempt to detect when we are direct-launched without the necessary PMI support, and thus are incorrectly identified as being "singleton". Advise the user on the required PMI(x) support and error out.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-29 15:26:53 -07:00
Gilles Gouaillardet
7e5e5fe887 Merge pull request #3719 from ggouaillardet/topic/libnbc_revamp
coll/libnbc: revisit NBC_Handle usage
2017-06-29 11:13:58 +09:00
Ralph Castain
eee4579f5a Merge pull request #3788 from rhc54/topic/fix
Deregister event handlers only on final call to finalize. Ensure we pass PMIx mca params
2017-06-28 16:36:07 -07:00
Ralph Castain
9178219e6b Deregister event handlers only on final call to finalize. Ensure we pass PMIx mca params
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-28 15:00:43 -07:00
Ralph Castain
e07ed6dccd Merge pull request #3785 from rhc54/topic/lock
Fix a threadlock when notifying clients of failures
2017-06-28 10:13:19 -07:00
Ralph Castain
d619de4f4c Fix a threadlock when notifying clients of failures
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-28 08:58:41 -07:00
Ralph Castain
cefcf7d72a Merge pull request #3773 from rhc54/topic/term
Need to signal -pgrp to get to all members of a process group.
2017-06-27 14:59:08 -07:00
Nathan Hjelm
022c658bbf osc/rdma: rework locking code to improve behavior of unlock
This commit changes the locking code to allow the lock release to be
non-blocking. This helps with releasing the accumulate lock which may
occur in a BTL callback.

Fixes #3616

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-27 15:29:51 -06:00
Ralph Castain
c6c0258cd8 Need to signal -pgrp to get to all members of a process group.
Thanks to Ted Sussman for the report and patience in tracking it down

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-27 12:10:34 -07:00
Ralph Castain
055ce802cd Merge pull request #3771 from rhc54/topic/recovery
Enable ORTE to continue running when a node fails
2017-06-27 10:47:35 -07:00
George Bosilca
f8ffec926e Protect the monitoring infrastructure initialization.
2017-06-27 18:35:24 +02:00
Ralph Castain
0b9d8f8a41 Update ignores
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-27 09:07:19 -07:00
Ralph Castain
8a4565874e Enable ORTE to continue running when a node fails - user takes responsibility for zombies. Minor cleanup to orte-clean
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-27 09:05:26 -07:00
Clément FOYER
c885ee3f3c Fix Coverity warning CID 1413323 (#3764)
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>
2017-06-27 12:39:31 +02:00
Ralph Castain
1c336b8fad Merge pull request #3761 from rhc54/topic/pmixup
Track PMIx v2.0.1
2017-06-26 12:01:09 -07:00