Commit Graph

26030 Commits

Author SHA1 Message Date
Gilles Gouaillardet
f2a80dc09f configury: check libnl version and abort in case of conflict
libnl and libnl-3 are known to conflict with each other, so detect
and abort if these two libs are both used directly (e.g. Open MPI
uses libnl-3) or indirectly (e.g. libibverbs.so might depend on libnl)
2016-10-25 09:23:59 +09:00
Ralph Castain
227d4d9609 Open the conduits for application procs - we probably can remove all the
RML-related frameworks from MPI applications now, but let's wait a bit
to ensure we have cleaned up all the points where messaging might occur.
2016-10-24 16:53:19 -07:00
rhc54
dae02c7e43 Merge pull request #2282 from rhc54/topic/routed
Update ORTE to more fully support the new conduit messaging system
2016-10-24 08:06:36 -07:00
Gilles Gouaillardet
4a886ac4cc pml/ob1: correctly reset receive request type before init
recvreq->req_recv.req_base.req_type should always be set before invoking
MCA_PML_OB1_RECV_REQUEST_INIT(recvreq, ...); otherwise, the previous type
might still be set, and you could end up with MCA_PML_REQUEST_IMPROBE when
MCA_PML_REQUEST_RECV is expected.

Thanks Chris Pattison for the report and test case.

Fixes open-mpi/ompi#2275
2016-10-24 16:50:23 +09:00
Ralph Castain
649301a3a2 Revise the routed framework to be multi-select so it can support the new conduit system. Update all calls to rml.send* to the new syntax. Define an orte_mgmt_conduit for admin and IOF messages, and an orte_coll_conduit for all collective operations (e.g., xcast, modex, and barrier).
Still not completely done as we need a better way of tracking the routed module being used down in the OOB - e.g., when a peer drops connection, we want to remove that route from all conduits that (a) use the OOB and (b) are routed, but we don't want to remove it from an OFI conduit.
2016-10-23 21:52:39 -07:00
Gilles Gouaillardet
055df6f7c6 fortran: correctly define MPI_DISPLACEMENT_CURRENT with KIND=MPI_OFFSET_KIND
and remove unused ompi/include/mpif-mpi-io.h
2016-10-24 09:53:53 +09:00
Gilles Gouaillardet
e2769e4343 fortran/use-mpi-ignore-tkr: fix typo in MPI_File_write_at_all_begin interface 2016-10-24 09:53:52 +09:00
Gilles Gouaillardet
e02dc1e637 fortran/use-mpi-ignore-tkr: fix typo in MPI_File_read_ordered_begin interface 2016-10-24 09:53:52 +09:00
Gilles Gouaillardet
6714f6aee7 coll/libnbc: fix MPI_Ialltoallv with MPI_IN_PLACE and without MPI param check 2016-10-24 09:29:06 +09:00
Gilles Gouaillardet
98f62690f1 man: fix typos in MPI_Info_get_{nkeys,nthkey}
Thanks Nicolas Joly for the patch
2016-10-24 09:04:29 +09:00
Jeff Squyres
7d7cf100ea Merge pull request #2268 from karasevb/fix_oshmem_examples
oshmem: updated API oshmem examples to OSHMEM 1.3
2016-10-22 09:29:16 -04:00
rhc54
4a65b197f9 Merge pull request #2271 from rhc54/topic/ft
Properly mark a node as down and decrease the number of daemons so any
2016-10-21 14:10:09 -05:00
rhc54
900ae15d49 Merge pull request #2221 from bharatpotnuri/master
btl/openib: remove unwanted ompi header inclusion in opal code.
2016-10-21 14:05:55 -05:00
Josh Hursey
d1ecc83e14 Merge pull request #2245 from jjhursey/topic/libnbc-error-path
coll/libnbc: Fix error path on internal error
2016-10-21 13:27:17 -05:00
Ralph Castain
df8ac7b747 Properly mark a node as down and decrease the number of daemons so any
subsequent grpcomm collectives can correctly operate. Note that only the
direct grpcomm component knows how to deal with down nodes.
2016-10-21 09:53:37 -07:00
Joshua Hursey
8748e54c11 coll/libnbc: Fix error path on internal error
 * If an error is detected internal to libnbc (e.g., PML truncation error)
   this patch makes sure that the request is completed and the `MPI_ERROR`
   field is set appropriately.
 * Make an attempt to clean up outstanding requests before returning.
   - This is a "best attempt" since not all PMLs support canceling requests.
2016-10-21 11:41:08 -04:00
Boris Karasev
e894a89db8 oshmem: updated API oshmem examples to OSHMEM 1.3 2016-10-21 12:36:53 +06:00
rhc54
2a9f818d24 Merge pull request #2267 from rhc54/topic/pmixup
Update to latest PMIx master
2016-10-21 00:57:28 -05:00
Ralph Castain
9131eca9c6 Update to latest PMIx master 2016-10-20 21:13:40 -07:00
Ralph Castain
be3197fe27 Ensure that the libevent headers are installed for external libevent when --with-devel-headers is given. Correct the path for opal_config.h in the external hwloc header 2016-10-20 20:57:50 -07:00
Gilles Gouaillardet
45336d0bea libnbc: fix iallgather[v]
In order to optimize for MPI_IN_PLACE, data is sent from the receive buffer.
Consequently, it should be sent with the receive type and count.

Thanks Josh Hursey for the report and test case

Refs open-mpi/ompi#2256
2016-10-21 10:24:25 +09:00
rhc54
2d94845f90 Merge pull request #2257 from rhc54/topic/pmixext3
Cleanup external PMIx v3 component for copy/paste errors - component and module require unique names
2016-10-20 12:04:38 -05:00
Ralph Castain
2f966bf3bf Cleanup external PMIx v3 component for copy/paste errors - component and module require unique names 2016-10-20 09:11:46 -07:00
Joshua Ladd
9538f4d715 Merge pull request #2255 from alex-mikheev/topic/oshmem_v1.3_cpr_updates
OSHMEM: updates copyrights in fortran fetch/set
2016-10-20 09:06:44 -04:00
Alex Mikheev
6c798fe08d OSHMEM: updates copyrights in fortran fetch/set
(cherry picked from commit f5297ccdb277208a96aaffd72a6454afe712fdb4)
2016-10-20 15:09:27 +03:00
rhc54
4237f1192e Merge pull request #2225 from ggouaillardet/topic/port_in_hostfile
add support for port=<port> in a hostfile for plm/rsh
2016-10-19 10:26:01 -05:00
Gilles Gouaillardet
1846c2d8ad plm/rsh: use an alternate port if the ORTE_NODE_PORT attribute is set 2016-10-19 16:18:52 +09:00
Gilles Gouaillardet
40424c9d0f orte/util/hostfile: add the port=<port> option
add the option to pass an alternate port to the plm.
For example,
node0 port=2222
directs the plm (via the ORTE_NODE_PORT attribute) to use
the non-default port 2222 (e.g. ssh -p 2222 node0 ...)
2016-10-19 15:04:01 +09:00
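
The hostfile option described in the commit above can be illustrated with a minimal sketch. This is not ORTE's implementation: the helper names `parse_hostfile_line` and `ssh_argv` are hypothetical, and the parsing rules (whitespace-separated `key=value` fields, `#` starting a comment) are simplifying assumptions. It only shows how a `port=<port>` entry could translate into the `ssh -p <port> <host>` form the commit message mentions.

```python
def parse_hostfile_line(line):
    """Split one hostfile line into (hostname, options).

    Hypothetical helper: returns None for blank or comment-only lines.
    Options are kept as strings, e.g. {"port": "2222"}.
    """
    line = line.split("#", 1)[0].strip()  # assumption: '#' starts a comment
    if not line:
        return None
    fields = line.split()
    host, options = fields[0], {}
    for field in fields[1:]:
        key, sep, value = field.partition("=")
        if sep:  # keep only key=value fields
            options[key] = value
    return host, options


def ssh_argv(host, options):
    """Build an ssh argument list, adding -p only when a port was given."""
    argv = ["ssh"]
    port = options.get("port")
    if port is not None:
        argv += ["-p", str(int(port))]  # int() rejects non-numeric ports
    argv.append(host)
    return argv


if __name__ == "__main__":
    host, options = parse_hostfile_line("node0 port=2222")
    print(" ".join(ssh_argv(host, options)))
```

A line without a `port=` field yields a plain `ssh <host>` argument list, which mirrors the idea that the alternate port is used only when the attribute is set.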
Gilles Gouaillardet
73ea87800b orte/util: add the ORTE_NODE_PORT attribute
this can be used to direct the plm component to use an alternate port
(e.g. ssh -p 2222 ...)
2016-10-19 15:04:01 +09:00
Gilles Gouaillardet
e78fcc4db9 coll/base: fix ompi_coll_base_{gather,scatter}_intra_binomial
receive type is only relevant for root with gather,
send type is only relevant for root with scatter,
so do not access these types on a non-root task
2016-10-19 14:05:22 +09:00
Gilles Gouaillardet
cb76d93b4e ompi_wrapper_script: fix $extra_ldflags
use @OMPI_PKG_CONFIG_LDFLAGS@ instead of @OMPI_WRAPPER_EXTRA_LDFLAGS@
so @{libdir} is substituted with ${libdir}

Thanks Manesh Nanavalla for the report
2016-10-19 09:57:55 +09:00
rhc54
2826da727a Merge pull request #2244 from rhc54/topic/pmixext
Create PMIx v3 external component
2016-10-18 16:46:18 -05:00
Ralph Castain
8113a8d1b0 Now that we are hiding symbols in the internal PMIx component, we cannot reuse that component for integration to the external PMIx master as the symbols don't match. So create a new "ext3x" component and copy the PMIx v3 integration over there.
Also, remove a couple of build-product files from the pmix3x component.
2016-10-18 13:15:32 -07:00
rhc54
1884aa68e5 Merge pull request #2240 from rhc54/topic/badapp
Properly report failure to launch when someone mis-types the name of the application
2016-10-18 13:08:06 -05:00
Ralph Castain
16540c7422 Properly report failure to launch when someone mis-types the name of the application
Fixes #2233
2016-10-18 10:09:30 -07:00
Ralph Castain
7be607582e ORTE applications need to commit any modex sends prior to calling fence 2016-10-18 09:22:56 -07:00
Ralph Castain
7910aa23eb Set lazy_wait_in_init "on" by default for test in master 2016-10-18 08:47:04 -07:00
rhc54
0e5d46ae7a Merge pull request #2237 from rhc54/topic/thread
Ensure the PMIx progress thread is stopped prior to tearing anything down.
2016-10-18 10:38:03 -05:00
Ralph Castain
50c9f3de55 Ensure the PMIx progress thread is stopped prior to tearing anything down. Thanks to Gilles for spotting this error! 2016-10-18 00:27:52 -07:00
rhc54
a659cb2fda Merge pull request #2229 from rhc54/topic/dvm
Pickup the npernode and npersocket options and include them in the job object
2016-10-17 15:27:21 -05:00
Ralph Castain
57114a09ae Pickup the npernode and npersocket options and include them in the job object 2016-10-17 12:26:21 -07:00
Gilles Gouaillardet
1e3191115b Merge pull request #2172 from ggouaillardet/topic/ialltoall_in_place
support MPI_IN_PLACE in MPI_Ialltoall*
2016-10-17 17:00:47 +09:00
Gilles Gouaillardet
bd1b6fe661 rml/oob: add a missing include file 2016-10-16 10:25:00 +09:00
Gilles Gouaillardet
c530b0a07c mpi/cxx: remove duplicate and now useless typedef 2016-10-15 14:30:00 +09:00
Ralph Castain
50bb0ded70 Update the PMIx nightly scripts to generalize locations 2016-10-14 08:40:05 -07:00
Joshua Ladd
64a15188bd Merge pull request #2199 from vspetrov/coll_hcoll_ialltoallv
coll/hcoll: ialltoallv interface
2016-10-14 07:59:23 -06:00
Gilles Gouaillardet
9389de4199 topo/treematch: fix displacements in mca_topo_treematch_dist_graph_create() 2016-10-14 17:16:49 +09:00
Gilles Gouaillardet
4e19cd51b1 hwloc/external: add a missing include file 2016-10-14 09:27:33 +09:00
rhc54
ef0610dd56 Merge pull request #2223 from rhc54/topic/pmixfix
Repair event notification support and resync to PMIx master
2016-10-13 19:26:44 -05:00
Ralph Castain
6f65d0a173 Repair event notification support. Cleanup the long-suffering "epoll: warning" coming out of libevent whenever a process abnormally terminated.
Add changes to test program

Sync to PMIx master
2016-10-13 16:27:39 -07:00