26047 Commits

Author SHA1 Message Date
Gilles Gouaillardet
5543b19e9a fortran/use-mpi-tkr: rename mpi-f90-cptr-interfaces.F90 into mpi-f90-cptr-interfaces.h
this file is meant to be included and not compiled, so use a consistent naming

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-10-27 08:54:07 +09:00
rhc54
2b18044051 Merge pull request #2301 from rhc54/topic/update
Update PMIx to latest master tarball. Ensure we set the HNP name for …
2016-10-26 16:42:15 -07:00
rhc54
60099c9d0e Merge pull request #2285 from anandhis/ofi2
Ofi2
2016-10-26 15:52:37 -07:00
Ralph Castain
f298f294e1 Update PMIx to latest master tarball. Ensure we set the HNP name for orted's so that PMIx_Lookup can find the server
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-10-26 15:48:56 -07:00
Anandhi S Jayakumar
94593ca20b Adding ofi plugin to allow for opening a conduit to use ethernet/fabric.
modified:   ../orte/mca/rml/base/rml_base_frame.c
	modified:   ../orte/mca/rml/base/rml_base_stubs.c
	deleted:    ../orte/mca/rml/ofi/.opal_ignore
	modified:   ../orte/mca/rml/ofi/Makefile.am
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c
	modified:   ../orte/test/system/ofi_conduit_stress.c

	Removed stale include directive
	modified:   ../orte/mca/rml/ofi/Makefile.am

The ofi plugin supports multiple providers, and identifies them
by ofi_prov_id,  changed the previous name conduit_id to ofi_prov_id
	modified:   ../orte/mca/rml/base/base.h
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_request.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c

Adding ofi plugin to allow for opening a conduit to use ethernet/fabric.

	modified:   ../orte/mca/rml/base/rml_base_frame.c
	modified:   ../orte/mca/rml/base/rml_base_stubs.c
	deleted:    ../orte/mca/rml/ofi/.opal_ignore
	modified:   ../orte/mca/rml/ofi/Makefile.am
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c
	modified:   ../orte/test/system/ofi_conduit_stress.c

	Removed stale include directive
	modified:   ../orte/mca/rml/ofi/Makefile.am

The ofi plugin supports multiple providers, and identifies them
by ofi_prov_id,  changed the previous name conduit_id to ofi_prov_id
	modified:   ../orte/mca/rml/base/base.h
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_request.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c

Fixed merge issues, and minor pull-request comments
	modified:   ../orte/mca/rml/base/base.h
	modified:   ../orte/mca/rml/base/rml_base_frame.c
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c

Adding ofi plugin to allow for opening a conduit to use ethernet/fabric.

	modified:   ../orte/mca/rml/base/rml_base_frame.c
	modified:   ../orte/mca/rml/base/rml_base_stubs.c
	deleted:    ../orte/mca/rml/ofi/.opal_ignore
	modified:   ../orte/mca/rml/ofi/Makefile.am
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c
	modified:   ../orte/test/system/ofi_conduit_stress.c

	Removed stale include directive
	modified:   ../orte/mca/rml/ofi/Makefile.am

The ofi plugin supports multiple providers, and identifies them
by ofi_prov_id,  changed the previous name conduit_id to ofi_prov_id
	modified:   ../orte/mca/rml/base/base.h
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_request.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c

Adding ofi plugin to allow for opening a conduit to use ethernet/fabric.

	modified:   ../orte/mca/rml/base/rml_base_frame.c
	modified:   ../orte/mca/rml/base/rml_base_stubs.c
	deleted:    ../orte/mca/rml/ofi/.opal_ignore
	modified:   ../orte/mca/rml/ofi/Makefile.am
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c
	modified:   ../orte/test/system/ofi_conduit_stress.c

	Removed stale include directive
	modified:   ../orte/mca/rml/ofi/Makefile.am

Fixed merge issues, and minor pull-request comments
	modified:   ../orte/mca/rml/base/base.h
	modified:   ../orte/mca/rml/base/rml_base_frame.c
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c

Removed trailing space
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c

Cleaned up test- ofi_conduit_stress.c
	modified:   ../orte/test/system/ofi_conduit_stress.c

cleaned up printing the provider info during initialisation
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c

Signed-off-by: Anandhi S Jayakumar <anandhi.s.jayakumar@intel.com>

Fixing warnings
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c

Signed-off-by: Anandhi S Jayakumar <anandhi.s.jayakumar@intel.com>

minor cleanup
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c

Signed-off-by: Anandhi S Jayakumar <anandhi.s.jayakumar@intel.com>

more cleanup
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c

Signed-off-by: Anandhi S Jayakumar <anandhi.s.jayakumar@intel.com>

Sending the ethernet address only in the get_contact_info, rest will be sent through modex
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c

Signed-off-by: Anandhi S Jayakumar <anandhi.s.jayakumar@intel.com>

Adding error logging on failures
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c

Signed-off-by: Anandhi S Jayakumar <anandhi.s.jayakumar@intel.com>

Handling the OPAL_MODEX_SEND/RECV generically for all ofi providers.
	modified:   ../orte/mca/rml/ofi/rml_ofi.h
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
	modified:   ../orte/mca/rml/ofi/rml_ofi_send.c

Signed-off-by: Anandhi S Jayakumar <anandhi.s.jayakumar@intel.com>

Adding to build ofi for limited people
	new file:   ../orte/mca/rml/ofi/.opal_ignore
	new file:   ../orte/mca/rml/ofi/.opal_unignore

Signed-off-by: Anandhi S Jayakumar <anandhi.s.jayakumar@intel.com>

Removing the error logging for now
	modified:   ../orte/mca/rml/ofi/rml_ofi_component.c
2016-10-26 13:11:07 -07:00
Alex Mikheev
f630b43285 OSHMEM: fixes crash during initialization
Do not call mpi comm_dup() if mpi failed to initialize. Also do not set
signal handlers.
Small code styling fixes.

Signed-off-by: Alex Mikheev <alexm@mellanox.com>
2016-10-26 11:15:06 +03:00
Gilles Gouaillardet
8cc3f288c9 opal: fix opal_class_finalize() usage
the class system can be initialized/finalized as many times as we like,
so there is no more need to have opal_class_finalize() invoked in a destructor

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-10-26 15:15:54 +09:00
rhc54
31799ece6f Merge pull request #2298 from rhc54/topic/notify
When mpirun operates in --continuous mode, we won't terminate the job…
2016-10-25 15:00:04 -07:00
Ralph Castain
d031946c46 When mpirun operates in --continuous mode, we won't terminate the job when a remote process dies. In that case, we have to activate both the waitpid _and_ the IOF complete states to ensure we properly mark the proc as dead and perform any required notifications
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-10-25 12:18:14 -07:00
Edgar Gabriel
2076622924 Merge pull request #2238 from edgargabriel/pr/delete-error-codes
update the error codes reported by file_delete
2016-10-25 12:38:03 -05:00
George Bosilca
028e747470 Do not alter ompi_coll_tuned_use_dynamic_rules.
This is set globally as an MCA parameter and should never be
altered based on a single communicator setting.
2016-10-25 12:17:25 -04:00
George Bosilca
253eb80e26 Code cleaning of the tuned module. 2016-10-25 12:17:25 -04:00
Edgar Gabriel
74441b960b update the error codes reported by file_delete 2016-10-25 10:15:14 -05:00
Jeff Squyres
582f290519 Merge pull request #1014 from ggouaillardet/poc/libnl_version_check
configury: check libnl version and warn in case of conflict
2016-10-25 10:34:18 -04:00
Gilles Gouaillardet
b1aedf457f Merge pull request #2283 from ggouaillardet/topic/pml_ob1_recv_request_type_reset
pml/ob1: correctly reset receive request type before init
2016-10-25 15:15:59 +09:00
Gilles Gouaillardet
8e788b5aee pml/ob1: refactor append_recv_req_to_queue() to improve readability
and fix a typo in a comment

Thanks George for the patch
2016-10-25 10:50:40 +09:00
rhc54
3e430caed8 Merge pull request #2290 from rhc54/topic/rmlapp
Open the conduits for application procs
2016-10-24 18:18:06 -07:00
Gilles Gouaillardet
f2a80dc09f configury: check libnl version and abort in case of conflict
libnl and libnl-3 are known to conflict with each other, so detect
and abort if these two libs are both used directly (e.g. Open MPI
uses libnl-3) or indirectly (e.g. libibverbs.so might depend on libnl)
2016-10-25 09:23:59 +09:00
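The conflict described in f2a80dc09f can be pictured with a small sketch. This is not the configure logic itself, only an illustration of the rule "both libnl flavors in use means a conflict". The function name `libnl_conflicts`, the input mapping, and the sonames passed to it are assumptions made for the example.

```python
def libnl_conflicts(deps):
    """Group dependents by libnl flavor; return the groups only if both are used.

    `deps` maps a component name to the sonames it links against
    (hypothetical input, e.g. collected from a dependency listing).
    """
    users = {"libnl": [], "libnl-3": []}
    for name, sonames in deps.items():
        for soname in sonames:
            if soname.startswith("libnl-3.so"):
                users["libnl-3"].append(name)
            elif soname.startswith("libnl.so"):
                users["libnl"].append(name)
    # A conflict exists only when both flavors have at least one user.
    if users["libnl"] and users["libnl-3"]:
        return users
    return None


# Illustrative input: one direct user of libnl-3, one indirect user of libnl.
example = {
    "Open MPI": ["libnl-3.so.200"],
    "libibverbs.so": ["libnl.so.1"],
}
print(libnl_conflicts(example))
```

A build check along these lines would report which components pull in each flavor, which is usually what is needed to resolve the conflict.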
Ralph Castain
227d4d9609 Open the conduits for application procs - we probably can remove all the
RML-related frameworks from MPI applications now, but let's wait a bit
to ensure we have cleaned up all the points where messaging might occur.
2016-10-24 16:53:19 -07:00
rhc54
dae02c7e43 Merge pull request #2282 from rhc54/topic/routed
Update ORTE to more fully support the new conduit messaging system
2016-10-24 08:06:36 -07:00
Gilles Gouaillardet
4a886ac4cc pml/ob1: correctly reset receive request type before init
recvreq->req_recv.req_base.req_type should always be set before invoking
MCA_PML_OB1_RECV_REQUEST_INIT(recvreq, ...) otherwise, the previous type
might be set, and you could end up with MCA_PML_REQUEST_IMPROBE when
MCA_PML_REQUEST_RECV is expected.

Thanks Chris Pattison for the report and test case.

Fixes open-mpi/ompi#2275
2016-10-24 16:50:23 +09:00
Ralph Castain
649301a3a2 Revise the routed framework to be multi-select so it can support the new conduit system. Update all calls to rml.send* to the new syntax. Define an orte_mgmt_conduit for admin and IOF messages, and an orte_coll_conduit for all collective operations (e.g., xcast, modex, and barrier).
Still not completely done as we need a better way of tracking the routed module being used down in the OOB - e.g., when a peer drops connection, we want to remove that route from all conduits that (a) use the OOB and (b) are routed, but we don't want to remove it from an OFI conduit.
2016-10-23 21:52:39 -07:00
Gilles Gouaillardet
055df6f7c6 fortran: correctly define MPI_DISPLACEMENT_CURRENT with KIND=MPI_OFFSET_KIND
and remove unused ompi/include/mpif-mpi-io.h
2016-10-24 09:53:53 +09:00
Gilles Gouaillardet
e2769e4343 fortran/use-mpi-ignore-tkr: fix typo in MPI_File_write_at_all_begin interface 2016-10-24 09:53:52 +09:00
Gilles Gouaillardet
e02dc1e637 fortran/use-mpi-ignore-tkr: fix typo in MPI_File_read_ordered_begin interface 2016-10-24 09:53:52 +09:00
Gilles Gouaillardet
6714f6aee7 coll/libnbc: fix MPI_Ialltoallv with MPI_IN_PLACE and without MPI param check 2016-10-24 09:29:06 +09:00
Gilles Gouaillardet
98f62690f1 man: fix typos in MPI_Info_get_{nkeys,nthkey}
Thanks Nicolas Joly for the patch
2016-10-24 09:04:29 +09:00
Jeff Squyres
7d7cf100ea Merge pull request #2268 from karasevb/fix_oshmem_examples
oshmem: updated API oshmem examples to OSHMEM 1.3
2016-10-22 09:29:16 -04:00
rhc54
4a65b197f9 Merge pull request #2271 from rhc54/topic/ft
Properly mark a node as down and decrease the number of daemons so any
2016-10-21 14:10:09 -05:00
rhc54
900ae15d49 Merge pull request #2221 from bharatpotnuri/master
btl/openib: remove unwanted ompi header inclusion in opal code.
2016-10-21 14:05:55 -05:00
Josh Hursey
d1ecc83e14 Merge pull request #2245 from jjhursey/topic/libnbc-error-path
coll/libnbc: Fix error path on internal error
2016-10-21 13:27:17 -05:00
Ralph Castain
df8ac7b747 Properly mark a node as down and decrease the number of daemons so any
subsequent grpcomm collectives can correctly operate. Note that only the
direct grpcomm component knows how to deal with down nodes.
2016-10-21 09:53:37 -07:00
Joshua Hursey
8748e54c11 coll/libnbc: Fix error path on internal error
* If an error is detected internal to libnbc (e.g., PML truncation error)
   this patch makes sure that the request is completed and the `MPI_ERROR`
   field is set appropriately.
 * Make an attempt to cleanup outstanding requests before returning.
   - This is a "best attempt" since not all PMLs support canceling requests.
2016-10-21 11:41:08 -04:00
Boris Karasev
e894a89db8 oshmem: updated API oshmem examples to OSHMEM 1.3 2016-10-21 12:36:53 +06:00
rhc54
2a9f818d24 Merge pull request #2267 from rhc54/topic/pmixup
Update to latest PMIx master
2016-10-21 00:57:28 -05:00
Ralph Castain
9131eca9c6 Update to latest PMIx master 2016-10-20 21:13:40 -07:00
Ralph Castain
be3197fe27 Ensure that the libevent headers are installed for external libevent when --with-devel-headers is given. Correct the path for opal_config.h in the external hwloc header 2016-10-20 20:57:50 -07:00
Gilles Gouaillardet
45336d0bea libnbc: fix iallgather[v]
In order to optimize for MPI_IN_PLACE, data is sent from the receive buffer.
consequently, it should be sent with the receive type and count.

Thanks Josh Hursey for the report and test case

Refs open-mpi/ompi#2256
2016-10-21 10:24:25 +09:00
rhc54
2d94845f90 Merge pull request #2257 from rhc54/topic/pmixext3
Cleanup external PMIx v3 component for copy/paste errors - component and module require unique names
2016-10-20 12:04:38 -05:00
Ralph Castain
2f966bf3bf Cleanup external PMIx v3 component for copy/paste errors - component and module require unique names 2016-10-20 09:11:46 -07:00
Joshua Ladd
9538f4d715 Merge pull request #2255 from alex-mikheev/topic/oshmem_v1.3_cpr_updates
OSHMEM: updates copyrights in fortran fetch/set
2016-10-20 09:06:44 -04:00
Alex Mikheev
6c798fe08d OSHMEM: updates copyrights in fortran fetch/set
(cherry picked from commit f5297ccdb277208a96aaffd72a6454afe712fdb4)
2016-10-20 15:09:27 +03:00
rhc54
4237f1192e Merge pull request #2225 from ggouaillardet/topic/port_in_hostfile
add support for port=<port> in a hostfile for plm/rsh
2016-10-19 10:26:01 -05:00
Gilles Gouaillardet
1846c2d8ad plm/rsh: use an alternate port if the ORTE_NODE_PORT attribute is set 2016-10-19 16:18:52 +09:00
Gilles Gouaillardet
40424c9d0f orte/util/hostfile: add the port=<port> option
add the option to pass an alternate port to plm
for example
node0 port=2222
directs the plm (via the ORTE_NODE_PORT attribute) to use
the non-default port 2222 (e.g. ssh -p 2222 node0 ...)
2016-10-19 15:04:01 +09:00
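As a rough illustration of the `port=<port>` option introduced in 40424c9d0f, the sketch below parses a hostfile line and builds the matching ssh argument list. It is not the ORTE parser: `parse_hostfile_line` and `ssh_argv` are hypothetical helpers, the `#` comment handling is a simplification, and `hostname` is just a placeholder remote command.

```python
import shlex


def parse_hostfile_line(line):
    """Split a hostfile line into a host name and its key=value options."""
    # Drop trailing comments and surrounding whitespace.
    line = line.split("#", 1)[0].strip()
    if not line:
        return None
    fields = line.split()
    host, options = fields[0], {}
    for field in fields[1:]:
        key, sep, value = field.partition("=")
        if sep:
            options[key] = value
    return host, options


def ssh_argv(host, options, command):
    """Build an ssh argument list, adding -p only when a port was given."""
    argv = ["ssh"]
    if "port" in options:
        # int() raises ValueError for a non-numeric port.
        argv += ["-p", str(int(options["port"]))]
    return argv + [host] + shlex.split(command)


host, options = parse_hostfile_line("node0 port=2222")
print(ssh_argv(host, options, "hostname"))
```

Hosts without a `port=` entry fall through to plain `ssh <host> ...`, which matches the default-port behavior the commit message implies.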
Gilles Gouaillardet
73ea87800b orte/util: add the ORTE_NODE_PORT attribute
this can be used to direct the plm component to use an alternate port
(e.g. ssh -p 2222 ...)
2016-10-19 15:04:01 +09:00
Gilles Gouaillardet
e78fcc4db9 coll/base: fix ompi_coll_base_{gather,scatter}_intra_binomial
receive type is only relevant for root with gather,
send type is only relevant for root with scatter,
so do not access these types on a non root task
2016-10-19 14:05:22 +09:00
Gilles Gouaillardet
cb76d93b4e ompi_wrapper_script: fix $extra_ldflags
use @OMPI_PKG_CONFIG_LDFLAGS@ instead of @OMPI_WRAPPER_EXTRA_LDFLAGS@
so @{libdir} is substituted with ${libdir}

Thanks Manesh Nanavalla for the report
2016-10-19 09:57:55 +09:00
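The two placeholder forms mentioned in cb76d93b4e differ only in their leading character. The sketch below shows that rewrite on a flag string; it is an illustration of the `@{libdir}` to `${libdir}` substitution, not the build system's actual mechanism, and `at_braces_to_shell` and the sample flags are assumptions.

```python
import re


def at_braces_to_shell(flags):
    """Rewrite every @{name} placeholder as the shell-style ${name}."""
    return re.sub(r"@\{(\w+)\}", r"${\1}", flags)


print(at_braces_to_shell("-L@{libdir} -Wl,-rpath -Wl,@{libdir}"))
```

Text without `@{...}` placeholders passes through unchanged, so applying the rewrite to already-converted flags is harmless.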
rhc54
2826da727a Merge pull request #2244 from rhc54/topic/pmixext
Create PMIx v3 external component
2016-10-18 16:46:18 -05:00
Ralph Castain
8113a8d1b0 Now that we are hiding symbols in the internal PMIx component, we cannot reuse that component for integration to the external PMIx master as the symbols don't match. So create a new "ext3x" component and copy the PMIx v3 integration over there.
Also, remove a couple of build-product files from the pmix3x component.
2016-10-18 13:15:32 -07:00