1749 Commits

Author SHA1 Message Date
Howard Pritchard
0cf2b478e0 Merge pull request #391 from hppritcha/topic/cray_pmi_kvs
pmix/cray: initial kvs removal work
2015-02-11 19:55:34 -07:00
Howard Pritchard
9955834ff1 pmix/cray: initial kvs removal work
Remove use of the Cray PMI KVS - which is designed for a lightweight
MPI that exchanges only a minimal amount of connection info
(about 128 bytes per rank) - within cray/pmix.  Use Cray PMI
collective extensions instead.

This is the first of several steps to accelerate launch of
Open MPI on Cray systems using either native aprun or nativized
slurm.
2015-02-11 15:14:55 -08:00
Rolf vandeVaart
08dceda2c0 Fix logic for handling priority and eager RDMA. There was some refactoring that was done
in this code and it ended up changing the logic that is used to set up eager RDMA.
Rather than setting up eager RDMA with a high priority message, it did it the other
way around.  For some reason, CUDA-aware support did not like this.  So, basically,
restore the logic to the way it was prior to the refactoring.  The refactoring did not
intend to change this.  Lightly reviewed by hjelmn.
2015-02-11 16:38:36 -05:00
Jeff Squyres
4f1996df5d various: remove $(LTDLINCL) from Makefile.am's that didn't need it 2015-02-11 12:25:20 -08:00
Ralph Castain
3de8c5c7c6 Cleanup the munge support - the credential cannot be reused for multiple connections 2015-02-10 20:34:35 -08:00
George Bosilca
e173f9b0c0 Somehow we lost one of the most critical parameters
allowing the PML to decide how to order the different
interconnects. Bring it back!
2015-02-10 20:32:05 -05:00
Ralph Castain
3ae3b96c17 Fix master compilation - a buried header dependency must have been removed. 2015-02-10 07:22:10 -08:00
Mike Dubman
6816e3421f Merge pull request #377 from regrant/ib_wr_fix
fix problem with get_pathrecord posting too many recv requests
2015-02-10 08:47:23 +02:00
Ralph Castain
bef830efef Fix debug output 2015-02-09 20:49:04 -08:00
Ralph Castain
07134f5b17 Add munge security 2015-02-09 20:49:03 -08:00
Ralph Castain
a3275aa867 Once again, fix the blasted singleton comm_spawn 2015-02-05 17:34:25 -08:00
Jeff Squyres
0dbbffb753 pmix_base_frame: use the "= { 0 }" initializer
Per open-mpi/ompi#381, convert the specific initialization of opal_pmix
to use the generic "= { 0 }" initializer.  This form can be used to
initialize any type when the intent is just to zero out / assign
*some* value.
2015-02-05 17:51:06 -05:00
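A minimal sketch of the generic "= { 0 }" initializer described in the commit above. The structure and its members are hypothetical stand-ins, not the real opal_pmix definition; the point is only that a single "0" zero-initializes the whole object regardless of its member types.

```c
#include <stddef.h>

/* Hypothetical stand-in for a module structure such as opal_pmix;
 * the real structure has different members. */
typedef struct {
    int (*init)(void);      /* function pointer member */
    int (*finalize)(void);  /* function pointer member */
    void *context;          /* opaque data pointer */
} demo_module_t;

/* Generic initializer: the first member is set to 0 explicitly, and
 * every remaining member is zero-initialized implicitly by the C
 * rules for partially specified initializer lists. */
demo_module_t demo_module = { 0 };

/* Returns 1 when every member of the module is NULL, 0 otherwise. */
int demo_module_is_zeroed(const demo_module_t *m)
{
    return m->init == NULL && m->finalize == NULL && m->context == NULL;
}
```

Because the initializer does not name any member, it keeps working if members are added, removed, or reordered, which is presumably why the commit prefers it over a member-specific initialization.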
Ralph Castain
4d882796b6 Silence warnings 2015-02-05 11:41:00 -08:00
Howard Pritchard
e508a4078e Merge pull request #376 from regrant/ib_error_fix
fixes OpenIB connect error reporting for ibv_* calls that return an errn...
2015-02-04 10:22:03 -07:00
Jeff Squyres
621af3aa07 pmix_base: fix global opal_pmix symbol for static linking on OS X
OS X has weirdness when static linking.  If a symbol is not
initialized, it is put into the common block section, and Weird Things
happen (linking when trying to use that global symbol will fail).
If you initialize the variable, it goes into a different section (and
linking to it will work).

This link (that might go stale someday) has some information about OS
X linker scope and treatment of symbol definitions:
https://developer.apple.com/library/mac/documentation/DeveloperTools/Conceptual/MachOTopics/1-Articles/executing_files.html#//apple_ref/doc/uid/TP40001829-98432-TPXREF120

Fixes #375.
2015-02-04 12:12:31 -05:00
Ryan Grant
de93497789 fix problem with get_pathrecord posting too many recv requests 2015-02-04 09:53:58 -07:00
Ryan Grant
5d5e9bc1f8 fixes OpenIB connect error reporting for ibv_* calls that return an errno 2015-02-04 09:09:14 -07:00
Jeff Squyres
a3728f09af libfabric: add another missing file to the Makefile.am 2015-02-04 04:02:27 -08:00
Jeff Squyres
66a680879e libfabric: fix header file name in Makefile.am 2015-02-03 19:41:25 -08:00
Jeff Squyres
cb7cc171f9 usnic: update README.txt notes
Update notes about copying the usnic BTL between master and the v1.8
branch.
2015-02-03 15:54:36 -08:00
Jeff Squyres
edf7232e00 usnic: enable building with an external libfabric 2015-02-03 13:46:06 -08:00
Jeff Squyres
bfa54d5d7b usnic: update to match new libfabric 2015-02-03 13:46:06 -08:00
Jeff Squyres
d2490d2fd8 libfabric: update Makefile.am to match new libfabric drop 2015-02-03 13:46:05 -08:00
Jeff Squyres
3dc0abfbc4 libfabric: update to (just past) 1.0rc1
Updated to Github ofiwg/libfabric@6b005d0d19.
2015-02-03 13:46:05 -08:00
Ralph Castain
d3267c200f Add missing OMPI-changes to libevent 2.0.22 2015-02-02 20:57:40 -08:00
Jeff Squyres
965ccab6cc libfabric: remove a few warnings
Embedding libfabric is a temporary measure; I'm removing some warning
notifications so that the output isn't so cluttered (we're getting
the real warnings fixed upstream, but the OMPI community doesn't
really care/need to see the warnings in the meantime).
2015-01-29 17:38:02 -08:00
Todd Kordenbrock
37e6096fe7 Copyright update. 2015-01-29 11:08:13 -06:00
Todd Kordenbrock
ca30e129e8 Add the option to use the Portals4 logical to physical table.
This commit adds an MCA variable to select Portals4 logical
addressing, populates the logical-to-physical mapping table and
initializes the NI in this mode.
2015-01-29 11:08:13 -06:00
George Bosilca
b9a63cbe7a One less warning. 2015-01-27 13:25:55 -05:00
Ralph Castain
294ebc907a Fix singleton operations so they can work inside a slurm environment 2015-01-27 09:29:42 -06:00
Ralph Castain
ba25e8a0ce Fix singletons 2015-01-27 09:29:42 -06:00
Ralph Castain
028b00154d Complete implementation of the schizo framework to support OMPI component 2015-01-27 09:29:42 -06:00
Jeff Squyres
436223959d usnic: update to match new libfabric APIs 2015-01-24 05:49:36 -08:00
Jeff Squyres
7d5755f62b libfabric: update to ofiwg/libfabric@b3f7af4c67
Pull down a new embedded copy of libfabric from
https://github.com/ofiwg/libfabric.
2015-01-24 05:48:48 -08:00
Howard Pritchard
056daa05bf btl/ugni: use PMIX_GLOBAL for modex_send in ugni
Using PMIX_REMOTE is not the right thing for ugni
BTL when it's possible that spawned ranks end up
on the same node as some of the spawnee ranks.
2015-01-22 06:53:45 -08:00
Gilles Gouaillardet
9f80aa2d28 btl/openib: regression fix when rdmacm or udcm are disabled
This fixes a regression introduced in open-mpi/ompi@661c35ca67

Thanks to Mark Santcroos for reporting this issue
2015-01-20 11:31:50 +09:00
Rolf vandeVaart
66f6026214 Improve error message to help user figure out what to do 2015-01-16 13:55:27 -05:00
Jeff Squyres
65a279019e usnic: fix typo in memchecker usage 2015-01-16 09:42:19 -08:00
Jeff Squyres
3969fe3a94 libfabric: ensure wrapper libs are loaded for static builds
For static builds, we need to also set
<framework>_<component>_WRAPPER_EXTRA_LIBS so that the wrappers know
what other libraries to add to link executables.
2015-01-16 09:29:52 -08:00
Gilles Gouaillardet
661c35ca67 cleanup dead code caused by the removal of the --with-threads configure option 2015-01-16 19:13:59 +09:00
Nathan Hjelm
006074c48d Merge pull request #332 from hjelmn/openib_updates
Openib updates
2015-01-15 15:05:18 -06:00
Jeff Squyres
d13c14ec82 CSCus22527: fix off-by-one error in checking the number of VFs
Ensure to count *this* process when checking for how many VFs we need
on the local server.

(cherry picked from commit 386c01934e98cb8dcb48ff648ecdfb0c8677baa9)
2015-01-15 11:44:29 -08:00
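The off-by-one fix above can be illustrated with a small sketch. It assumes, purely for illustration, that each process on the server needs exactly one VF; the function and parameter names are hypothetical and do not come from the usnic BTL.

```c
/* Hypothetical check: are there enough VFs on the local server?
 *
 * num_local_peers: number of OTHER processes on this server
 * vfs_available:   number of VFs the device offers
 *
 * Assumption (not from the source): one VF per process. */
int demo_have_enough_vfs(int num_local_peers, int vfs_available)
{
    /* The "+ 1" counts *this* process in addition to its peers;
     * omitting it is the off-by-one error. */
    int vfs_needed = num_local_peers + 1;

    return vfs_available >= vfs_needed;
}
```

With 3 peers and 3 VFs, a check that forgot the calling process would pass, while this version correctly reports a shortage.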
Jeff Squyres
4685767b2d libfabric: update usnic configury
Use new common m4 macro for choosing between libnl3 and libnl.
2015-01-15 07:12:39 -08:00
Jeff Squyres
400b02e566 libfabric: update to github:ofiwg/libfabric HEAD
Specifically: bbf0f3ea8e92c92a7cee56473ecdbbbb34cceb7d (15 Jan 2015)
2015-01-15 07:11:54 -08:00
Aurélien Bouteiller
f49981bb2a Disable coalescing until pull request #332 gets in. 2015-01-14 14:12:47 -05:00
Nathan Hjelm
cf4975501d rcache/vma: fix parent class of mca_rcache_vma_t
There was a mismatch between the structure for mca_rcache_vma_t and
the OBJ_CLASS_INSTANCE. One was opal_list_item_t and the other was
ompi_free_list_item_t. The super class in the structure looks like it
is the correct one. Changed the superclass in OBJ_CLASS_INSTANCE to
match.
2015-01-14 10:21:24 -07:00
Jeff Squyres
e4e5e7dbc0 usnic: ensure to clean up nicely in case of low resources
If there are not enough resources (e.g., low VFs), we can end up
calling finalize_one_channel() on the same channel multiple times.  So
ensure to NULL out fields that we have freed already so that we do not
try to free them a second time.

Fixes CSCus26648.
2015-01-13 14:37:31 -08:00
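The cleanup pattern described above (NULL out fields once they are freed so a repeated finalize is harmless) might look like the following sketch. The channel structure and function names are hypothetical; the real finalize_one_channel() operates on usnic-specific resources, not plain heap buffers.

```c
#include <stdlib.h>

/* Hypothetical channel with two owned resources. */
typedef struct {
    void *recv_buffer;
    void *send_buffer;
} demo_channel_t;

/* Allocate the channel's resources.  Returns 0 on success, -1 on
 * failure; on failure nothing is left allocated. */
int demo_channel_init(demo_channel_t *ch)
{
    ch->recv_buffer = malloc(64);
    ch->send_buffer = malloc(64);
    if (NULL == ch->recv_buffer || NULL == ch->send_buffer) {
        free(ch->recv_buffer);
        free(ch->send_buffer);
        ch->recv_buffer = NULL;
        ch->send_buffer = NULL;
        return -1;
    }
    return 0;
}

/* Release the channel's resources.  Each field is set to NULL right
 * after it is freed, so calling this function again on the same
 * channel does not free anything a second time. */
void demo_channel_finalize(demo_channel_t *ch)
{
    if (NULL != ch->recv_buffer) {
        free(ch->recv_buffer);
        ch->recv_buffer = NULL;
    }
    if (NULL != ch->send_buffer) {
        free(ch->send_buffer);
        ch->send_buffer = NULL;
    }
}
```

Calling demo_channel_finalize() twice on the same channel is then safe, which matters on error paths where several cleanup routines may reach the same object.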
Jeff Squyres
8807ae2497 usnic libfabric: also set the us_netmask_be field.
From libfabric upstream commit ofiwg/libfabric@3976745.

Part of the fix for CSCus22495.
2015-01-13 12:04:57 -08:00
Jeff Squyres
d00cede718 usnic: fix if_include/exclude of CIDR-specified networks
Fix the ordering so that we obtain the usnic netmask information
*before* we do the filtering based on CIDR-specified networks.

Also requires upstream Github libfabric commit 3976745.

Fixes CSCus22495.
2015-01-13 12:04:51 -08:00
Jeff Squyres
a220b92cf8 usnic: fix function name in opal_output 2015-01-13 12:04:07 -08:00