1
1
Граф коммитов

2993 Коммитов

Автор SHA1 Сообщение Дата
Howard Pritchard
43cdcb745f btl/ugni: add missing mutex lock 2014-12-24 11:50:23 -07:00
Howard Pritchard
83bcbd1cf9 btl/ugni: compilation fixes
Fix compilation problems in ugni btl associated with
async progress additions.
2014-12-24 11:50:23 -07:00
Howard Pritchard
13ab8a9e5a btl/ugni: use MCA_BTL_DES_FLAGS_SIGNAL
Use MCA_BTL_DES_FLAGS_SIGNAL frag flag to indicate
whether or not an interrupt needs to be delivered
along with a control message going through smsg.
2014-12-24 11:50:23 -07:00
Howard Pritchard
3fc7b389ff initial async progress changes for gni 2014-12-24 11:50:23 -07:00
Devendar Bureddy
ccafc62c07 OMPI: btl openib: fix max registarable memory caluclation
- by default allow to register maximum possible (i.e 2 * total_memory)
      memory. This beheviour can be turned off using mca parameter
      "btl_openib_allow_max_memory_registration"

    - In fallback case, use device specific parameters to calulate
      memory limit.
2014-12-23 23:35:54 +02:00
Howard Pritchard
ffbf9738a3 btl/vader: disable SGI UV xpmem for now
This commit allows master to build again on SGI UV systems.

Fixes #322
2014-12-23 12:04:25 -07:00
Gilles Gouaillardet
f6da257477 configury: test external hwloc version is 1.8 or greater
hwloc_topology_dup is only available from hwloc 1.8
2014-12-22 13:42:38 +09:00
Jeff Squyres
40dd4c5b76 configury: manually remove some stamp-h? files
Due to what might be a bug in Automake, we need to remove stamp-h?
files manually.  See
http://debbugs.gnu.org/cgi/bugreport.cgi?bug=19418.
2014-12-20 08:32:57 -08:00
Jeff Squyres
d5b3e5802e libfabric configury: add more tests
Properly test for some dependent libraries; don't just assume
elsewhere in Open MPI's configury will find those libraries.  Also
consolidate some CPPFLAGS and clarify some comments.
2014-12-20 08:32:47 -08:00
Jeff Squyres
012e008649 libfabric configury: make AC_CONFIG_FILES be unconditional
Also add the generated config.h file to .gitignore.
2014-12-20 08:32:47 -08:00
Jeff Squyres
45ef0352d7 libfabric: do a proper check for intrinsic atomics 2014-12-20 08:32:46 -08:00
Jeff Squyres
ff1364cbe4 Revert "libfabric: add missing header file"
That wasn't a missing header file; in fact, it should have been
.gitignored!

This reverts commit 35bf5fc60c.
2014-12-19 17:39:30 -08:00
Jeff Squyres
35bf5fc60c libfabric: add missing header file 2014-12-19 17:33:11 -08:00
Jeff Squyres
e0f660cb9e libfabric: fix clang compile error in usnic provider
From ofiwg/libfabric@0078c93ae4
2014-12-19 15:45:16 -08:00
Jeff Squyres
75797c4f30 libfabric: update embedded libfabric configury
To support the newly-copied libfabric downloaded from github
ofiwg/libfabric@8da3957de3.
2014-12-19 14:45:30 -08:00
Jeff Squyres
e2362988a9 libfabric: update to ofiwg/libfabric@8da3957de3
Pull down a new embedded copy of libfabric from
https://github.com/ofiwg/libfabric.
2014-12-19 14:45:21 -08:00
Howard Pritchard
91b0d03bf2 pmix/cray: remove dead code 2014-12-19 13:08:23 -08:00
Ralph Castain
123fdd603f If we are using hwthread cpus, then default to binding there, letting the user override to whatever they want 2014-12-19 08:04:28 -08:00
Rolf vandeVaart
26482db736 Bump up max send size. Gives much better performance for GPU transfers while only decreasing host transfers by a small amount. 2014-12-18 13:22:58 -08:00
Jeff Squyres
de31b08a24 Merge pull request #319 from miked-mellanox/topic/opal_path_nfs_autofs
skip check for autofs if fstype is autofs jenkins: check
2014-12-18 15:47:16 -05:00
Mike Dubman
da5b8c6879 OPAL: skip comparison when when fs=autofs in mtab, because we are looking for reals fs type 2014-12-18 21:42:25 +02:00
Jeff Squyres
c621d1e622 libfabric: don't LIBADD the common library in the static case
Adding the libfabric common library in the --disable-dlopen case will
result in duplicate symbols.
2014-12-18 11:04:08 -08:00
Jeff Squyres
140bb3d421 hwloc configure: fix typo -- add missing $
Arrgh!  Missed a "$" in the last commit, making the test always
false.
2014-12-18 10:25:43 -08:00
Jeff Squyres
be6d46490f hwloc: only add CPPFLAGS if hwloc is actually being built
As pointed out by @ggouaillardet, we were adding some unnecessary -I
flags to CPPFLAFGS when --without-hwloc was being used.  This commit
slightly updates the hwloc191 component configury to only add such
things when the component is, in fact, going to be
compiled/installed.
2014-12-18 08:56:49 -08:00
Jeff Squyres
c205c70f39 usnic libfabric: remove useless "config.h" includes
This change was also committed upstream in libfabric.
2014-12-18 08:47:59 -08:00
Jeff Squyres
269d7f9713 openib: don't use opal_using_threads() in component_init
Use the flag that was passed in, instead.
2014-12-17 15:08:43 -08:00
Jeff Squyres
c1b43b6753 libfabric: the LIBADD should be unconditional
The LIBADD for the common libfabric library does not belong down in
the providers; it needs to be set when the libfabric core itself
decides to build.
2014-12-17 14:02:08 -08:00
Jeff Squyres
f1a5d3a90d configury: propagate a libtool shared lib version for libfabric 2014-12-17 13:36:01 -08:00
Jeff Squyres
d6f059f538 configury: add some descriptive output messages in configure
Ensure that the ofi MTL and the usnic BTL have good descriptive output
messages in configure.
2014-12-17 13:36:01 -08:00
Jeff Squyres
6edc19d78d libfabric: ensure that shell variables are initialized
Ensure that the <provider>_happy shell variables are initialized to
0.  Without this, the --without-libfabric case would leave them
initialized, resulting in "test: -eq operator expecting a value" kinds
of errors.
2014-12-17 13:36:01 -08:00
Rolf vandeVaart
f55de452ab Change the way we register the sm memory pool with CUDA. Rather than just registering local free lists, register the entire pool as the local process does not know which memory the remote processes are using for free lists. Fixes performance problem we were seeing with copying out of memory (since host piece was not pinned). 2014-12-17 14:21:34 -05:00
George Bosilca
830df07202 Fix the indentation. 2014-12-16 16:07:42 -05:00
George Bosilca
146ab96e29 These variables are now unnecessary. 2014-12-16 16:05:00 -05:00
Aurélien Bouteiller
ee3b090316 The fallback case when yama is not installed was not correct in CMA vader 2014-12-16 14:39:14 -05:00
Aurélien Bouteiller
0bf860ef02 indentation 2014-12-16 14:22:26 -05:00
Jeff Squyres
95da4a5a0e usnic: no longer use opal_using_threads()
Instead, use the flag that is passed in.
2014-12-16 08:49:01 -08:00
Artem Polyakov
01601f3284 Merge pull request #305 from artpol84/timing
Timing framework improvement
2014-12-16 15:13:48 +06:00
George Bosilca
357daa834e Stay on the safe side: Only one thread is allowed
to handle an event_base.
2014-12-15 23:19:51 -05:00
George Bosilca
2fec570fe7 There is no need to keep track of these events. They are scheduled
as triggers in libevent, so one bookkepping should be enough.
2014-12-15 22:35:29 -05:00
George Bosilca
46baab350c The event is automatically deleted by default. 2014-12-15 21:59:20 -05:00
George Bosilca
b01abfa0d7 Don't over-do it! 2014-12-15 21:33:32 -05:00
George Bosilca
f87a4b691b Solve another handshake problem, where one threads was calling del_event
while cleaning up after receiving a zero byte on the connect socket
(localyy started connection), while another was trying to accept a
new connection from the same peer. Create a zero-timed event and
delocalize the accept into a timer_event.
Add support for registering an error callback, that can be used when a
connection is discovered as failed during the initialization process.
2014-12-15 20:27:32 -05:00
George Bosilca
e20413c885 Rearrange the code to remove a compiler complaint about
the missing return from a non-void function.
2014-12-15 15:42:57 -05:00
Ralph Castain
573a574a3c Remove an unused dstore type that was redundant with another one. Define a corresponding PMIX_NODE_ID type (contains the vpid of the daemon hosting the proc) and ensure that the PMIx server includes that info in its process map 2014-12-15 12:11:13 -08:00
Mike Dubman
2fbe87defe Merge pull request #314 from miked-mellanox/topic/fix_opal_path_nfs
add support for autofs and make check pass. jenkins: check,src_rpm
2014-12-15 20:52:52 +02:00
Ralph Castain
9658256a98 Restore the passing of the complete job map to the local proc on first get_attr so the info can be used by the MPI layer without continual calls back to the server. We'll find a more memory efficient method later. 2014-12-13 18:44:09 -08:00
Mike Dubman
42f3fa0d1e OPAL: add support for autofs magic type 2014-12-13 20:27:47 +02:00
Jeff Squyres
9e6b157cb6 opal: minor update to guess_strlen
This is a minor update to
open-mpi/ompi@c52601f0c5.

If we have vsnprintf(), we might as well not have the rest of the
guess_strlen() routine.  Also document the nifty trick/behavior of
vsnprintf() that enables this shortcut (it was new to me!).
2014-12-13 08:09:34 -05:00
George Bosilca
2edbe16c47 Add the necessary infrastructure to allow the dumping of all TCP
informations related to an endpoint (status and all pending fragments).
Do some minor space cleanup.
2014-12-13 01:59:55 -05:00
George Bosilca
5b8616d890 Fix the race condition in endpoint connection initialization. The race
was quite subtle, and only happened on the process with the smallest
guid (as this process will tear down the connection created locally and
replace it with the result of accept). If multiple threads are active in
the system, the deadlock occurs during the recv event deletion as one
thread will hold the recv event lock of the endpoint and try to access
the TCP event base lock, while the other thread will hold the TCP event
base lock while trying to access the recv event lock (in case data is
available on the socket).

The proposed solution let the event callback fail to process the data,
preventing the deadlock and allowing the other thread to always complete
it's job. As the event is not execute the same triggered will trigger
again at the next opportunity, so this solution introduce a minimal
delay in the connection establishement.
2014-12-13 01:45:00 -05:00