Howard Pritchard
43cdcb745f
btl/ugni: add missing mutex lock
2014-12-24 11:50:23 -07:00
Howard Pritchard
83bcbd1cf9
btl/ugni: compilation fixes
...
Fix compilation problems in ugni btl associated with
async progress additions.
2014-12-24 11:50:23 -07:00
Howard Pritchard
13ab8a9e5a
btl/ugni: use MCA_BTL_DES_FLAGS_SIGNAL
...
Use MCA_BTL_DES_FLAGS_SIGNAL frag flag to indicate
whether or not an interrupt needs to be delivered
along with a control message going through smsg.
2014-12-24 11:50:23 -07:00
Howard Pritchard
3fc7b389ff
initial async progress changes for gni
2014-12-24 11:50:23 -07:00
Devendar Bureddy
ccafc62c07
OMPI: btl openib: fix max registrable memory calculation
...
- By default, allow registering the maximum possible memory
(i.e., 2 * total_memory). This behaviour can be turned off using the MCA parameter
"btl_openib_allow_max_memory_registration".
- In the fallback case, use device-specific parameters to calculate
the memory limit.
2014-12-23 23:35:54 +02:00
Howard Pritchard
ffbf9738a3
btl/vader: disable SGI UV xpmem for now
...
This commit allows master to build again on SGI UV systems.
Fixes #322
2014-12-23 12:04:25 -07:00
Gilles Gouaillardet
f6da257477
configury: test external hwloc version is 1.8 or greater
...
hwloc_topology_dup is only available from hwloc 1.8
2014-12-22 13:42:38 +09:00
Jeff Squyres
40dd4c5b76
configury: manually remove some stamp-h? files
...
Due to what might be a bug in Automake, we need to remove stamp-h?
files manually. See
http://debbugs.gnu.org/cgi/bugreport.cgi?bug=19418 .
2014-12-20 08:32:57 -08:00
Jeff Squyres
d5b3e5802e
libfabric configury: add more tests
...
Properly test for some dependent libraries; don't just assume
elsewhere in Open MPI's configury will find those libraries. Also
consolidate some CPPFLAGS and clarify some comments.
2014-12-20 08:32:47 -08:00
Jeff Squyres
012e008649
libfabric configury: make AC_CONFIG_FILES be unconditional
...
Also add the generated config.h file to .gitignore.
2014-12-20 08:32:47 -08:00
Jeff Squyres
45ef0352d7
libfabric: do a proper check for intrinsic atomics
2014-12-20 08:32:46 -08:00
Jeff Squyres
ff1364cbe4
Revert "libfabric: add missing header file"
...
That wasn't a missing header file; in fact, it should have been
.gitignored!
This reverts commit 35bf5fc60c.
2014-12-19 17:39:30 -08:00
Jeff Squyres
35bf5fc60c
libfabric: add missing header file
2014-12-19 17:33:11 -08:00
Jeff Squyres
e0f660cb9e
libfabric: fix clang compile error in usnic provider
...
From ofiwg/libfabric@0078c93ae4
2014-12-19 15:45:16 -08:00
Jeff Squyres
75797c4f30
libfabric: update embedded libfabric configury
...
To support the newly-copied libfabric downloaded from github
ofiwg/libfabric@8da3957de3 .
2014-12-19 14:45:30 -08:00
Jeff Squyres
e2362988a9
libfabric: update to ofiwg/libfabric@8da3957de3
...
Pull down a new embedded copy of libfabric from
https://github.com/ofiwg/libfabric .
2014-12-19 14:45:21 -08:00
Howard Pritchard
91b0d03bf2
pmix/cray: remove dead code
2014-12-19 13:08:23 -08:00
Ralph Castain
123fdd603f
If we are using hwthread cpus, then default to binding there, letting the user override to whatever they want
2014-12-19 08:04:28 -08:00
Rolf vandeVaart
26482db736
Bump up max send size. Gives much better performance for GPU transfers while only decreasing host transfers by a small amount.
2014-12-18 13:22:58 -08:00
Jeff Squyres
c621d1e622
libfabric: don't LIBADD the common library in the static case
...
Adding the libfabric common library in the --disable-dlopen case will
result in duplicate symbols.
2014-12-18 11:04:08 -08:00
Jeff Squyres
140bb3d421
hwloc configure: fix typo -- add missing $
...
Arrgh! Missed a "$" in the last commit, making the test always
false.
2014-12-18 10:25:43 -08:00
Jeff Squyres
be6d46490f
hwloc: only add CPPFLAGS if hwloc is actually being built
...
As pointed out by @ggouaillardet, we were adding some unnecessary -I
flags to CPPFLAGS when --without-hwloc was being used. This commit
slightly updates the hwloc191 component configury to only add such
things when the component is, in fact, going to be
compiled/installed.
2014-12-18 08:56:49 -08:00
Jeff Squyres
c205c70f39
usnic libfabric: remove useless "config.h" includes
...
This change was also committed upstream in libfabric.
2014-12-18 08:47:59 -08:00
Jeff Squyres
269d7f9713
openib: don't use opal_using_threads() in component_init
...
Use the flag that was passed in, instead.
2014-12-17 15:08:43 -08:00
Jeff Squyres
c1b43b6753
libfabric: the LIBADD should be unconditional
...
The LIBADD for the common libfabric library does not belong down in
the providers; it needs to be set when the libfabric core itself
decides to build.
2014-12-17 14:02:08 -08:00
Jeff Squyres
f1a5d3a90d
configury: propagate a libtool shared lib version for libfabric
2014-12-17 13:36:01 -08:00
Jeff Squyres
d6f059f538
configury: add some descriptive output messages in configure
...
Ensure that the ofi MTL and the usnic BTL have good descriptive output
messages in configure.
2014-12-17 13:36:01 -08:00
Jeff Squyres
6edc19d78d
libfabric: ensure that shell variables are initialized
...
Ensure that the <provider>_happy shell variables are initialized to
0. Without this, the --without-libfabric case would leave them
uninitialized, resulting in "test: -eq operator expecting a value" kinds
of errors.
2014-12-17 13:36:01 -08:00
Rolf vandeVaart
f55de452ab
Change the way we register the sm memory pool with CUDA. Rather than just registering local free lists, register the entire pool as the local process does not know which memory the remote processes are using for free lists. Fixes performance problem we were seeing with copying out of memory (since host piece was not pinned).
2014-12-17 14:21:34 -05:00
George Bosilca
830df07202
Fix the indentation.
2014-12-16 16:07:42 -05:00
George Bosilca
146ab96e29
These variables are now unnecessary.
2014-12-16 16:05:00 -05:00
Aurélien Bouteiller
ee3b090316
The fallback case when yama is not installed was not correct in CMA vader
2014-12-16 14:39:14 -05:00
Aurélien Bouteiller
0bf860ef02
indentation
2014-12-16 14:22:26 -05:00
Jeff Squyres
95da4a5a0e
usnic: no longer use opal_using_threads()
...
Instead, use the flag that is passed in.
2014-12-16 08:49:01 -08:00
George Bosilca
357daa834e
Stay on the safe side: Only one thread is allowed
...
to handle an event_base.
2014-12-15 23:19:51 -05:00
George Bosilca
2fec570fe7
There is no need to keep track of these events. They are scheduled
...
as triggers in libevent, so one set of bookkeeping should be enough.
2014-12-15 22:35:29 -05:00
George Bosilca
46baab350c
The event is automatically deleted by default.
2014-12-15 21:59:20 -05:00
George Bosilca
b01abfa0d7
Don't over-do it!
2014-12-15 21:33:32 -05:00
George Bosilca
f87a4b691b
Solve another handshake problem, where one thread was calling del_event
...
while cleaning up after receiving a zero byte on the connect socket
(locally started connection), while another was trying to accept a
new connection from the same peer. Create a zero-timed event and
delocalize the accept into a timer_event.
Add support for registering an error callback that can be used when a
connection is discovered as failed during the initialization process.
2014-12-15 20:27:32 -05:00
George Bosilca
e20413c885
Rearrange the code to remove a compiler complaint about
...
the missing return from a non-void function.
2014-12-15 15:42:57 -05:00
Ralph Castain
573a574a3c
Remove an unused dstore type that was redundant with another one. Define a corresponding PMIX_NODE_ID type (contains the vpid of the daemon hosting the proc) and ensure that the PMIx server includes that info in its process map
2014-12-15 12:11:13 -08:00
Ralph Castain
9658256a98
Restore the passing of the complete job map to the local proc on first get_attr so the info can be used by the MPI layer without continual calls back to the server. We'll find a more memory efficient method later.
2014-12-13 18:44:09 -08:00
George Bosilca
2edbe16c47
Add the necessary infrastructure to allow the dumping of all TCP
...
information related to an endpoint (status and all pending fragments).
Do some minor space cleanup.
2014-12-13 01:59:55 -05:00
George Bosilca
5b8616d890
Fix the race condition in endpoint connection initialization. The race
...
was quite subtle, and only happened on the process with the smallest
guid (as this process will tear down the connection created locally and
replace it with the result of accept). If multiple threads are active in
the system, the deadlock occurs during the recv event deletion as one
thread will hold the recv event lock of the endpoint and try to access
the TCP event base lock, while the other thread will hold the TCP event
base lock while trying to access the recv event lock (in case data is
available on the socket).
The proposed solution lets the event callback fail to process the data,
preventing the deadlock and allowing the other thread to always complete
its job. As the event is not executed, the same trigger will fire
again at the next opportunity, so this solution introduces a minimal
delay in the connection establishment.
2014-12-13 01:45:00 -05:00
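
The lock-order inversion described in the entry above (5b8616d890) can be illustrated with a minimal sketch. It is not the actual Open MPI code: the type `endpoint_t`, the field names, and the function `recv_callback` are hypothetical, and using `pthread_mutex_trylock` is one assumed way to let the callback "fail to process the data". The idea is that the callback, which runs while the event base lock is held, never blocks on the recv event lock. If it cannot take that lock it returns, and the event fires again later.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical endpoint; these are not the real TCP BTL structures. */
typedef struct {
    pthread_mutex_t recv_lock;  /* protects the endpoint's recv event */
    int processed;              /* how many callbacks handled data */
} endpoint_t;

/*
 * Assumed to be invoked by the event loop while it holds the event base
 * lock. Blocking on recv_lock here could deadlock against a thread that
 * holds recv_lock and is waiting for the event base lock, so the callback
 * only tries the lock and backs off if it is busy.
 *
 * Returns true if the data was processed, false if the callback backed off.
 */
static bool recv_callback(endpoint_t *ep)
{
    if (pthread_mutex_trylock(&ep->recv_lock) != 0) {
        /* Lock is busy: skip this round. The event stays pending and
         * will be triggered again at the next opportunity. */
        return false;
    }

    ep->processed++;  /* stand-in for reading from the socket */

    int rc = pthread_mutex_unlock(&ep->recv_lock);
    assert(rc == 0);
    (void)rc;
    return true;
}
```

The cost of this choice is the one the commit message names: a skipped callback delays connection establishment slightly, in exchange for never waiting on a second lock while the first is held.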
Ralph Castain
bffb2b7a4b
Correct some issues with variables used before being set
2014-12-12 17:23:32 -08:00
Ralph Castain
0630680f36
Two cleanups required for transfer to 1.8.4:
...
* Use %d format for the topo signature as some systems apparently have problems with %u
* Use correct variable in show_help message
2014-12-12 17:23:32 -08:00
Howard Pritchard
6cf258638a
mpool/udreg: minor comment improvement
2014-12-12 14:05:18 -07:00
Nathan Hjelm
38d66272c5
btl/vader: fix compile on SGI UV
2014-12-12 09:09:01 -07:00
Jeff Squyres
e4b3c6f1c4
libfabric psm: fix (void*) dereference
...
Committed upstream to libfabric as well.
2014-12-11 20:12:13 -08:00
Jeff Squyres
0f28233b35
libfabric: don't use __thread
...
There's no real reason that this routine should use thread local
storage. Plus, __thread appears to be a GCC extension.
2014-12-11 14:10:48 -08:00