Commit graph

21554 commits

Author SHA1 Message Date
Jeff Squyres
c1b43b6753 libfabric: the LIBADD should be unconditional
The LIBADD for the common libfabric library does not belong down in
the providers; it needs to be set when the libfabric core itself
decides to build.
2014-12-17 14:02:08 -08:00
Jeff Squyres
85a92a56c9 configure: arrgh; mistakenly removed the alps version number
The alps component still exists; it shouldn't have been removed by
open-mpi/ompi@6c468b8691.
2014-12-17 13:41:32 -08:00
Jeff Squyres
6c468b8691 configury: remove unused shared library version numbers
These components do not exist any more, so remove their shared library
version numbers.
2014-12-17 13:36:01 -08:00
Jeff Squyres
f1a5d3a90d configury: propagate a libtool shared lib version for libfabric 2014-12-17 13:36:01 -08:00
Jeff Squyres
d6f059f538 configury: add some descriptive output messages in configure
Ensure that the ofi MTL and the usnic BTL have good descriptive output
messages in configure.
2014-12-17 13:36:01 -08:00
Jeff Squyres
4dcb92ab0b ofi: remove use of non-existent macros 2014-12-17 13:36:01 -08:00
Jeff Squyres
6edc19d78d libfabric: ensure that shell variables are initialized
Ensure that the <provider>_happy shell variables are initialized to
0.  Without this, the --without-libfabric case would leave them
uninitialized, resulting in "test: -eq operator expecting a value" kinds
of errors.
2014-12-17 13:36:01 -08:00
Rolf vandeVaart
f55de452ab Change the way we register the sm memory pool with CUDA. Rather than just registering local free lists, register the entire pool as the local process does not know which memory the remote processes are using for free lists. Fixes performance problem we were seeing with copying out of memory (since host piece was not pinned). 2014-12-17 14:21:34 -05:00
Jeff Squyres
9d1d34c0c0 Fortran: do not dist mpif-h/sizeof_f.f90; it is generated 2014-12-17 10:24:31 -08:00
Jeff Squyres
01a24c4a6c Merge branch 'ggouaillardet-topic/ts_29113' 2014-12-17 03:04:16 -08:00
Jeff Squyres
e5b0b81ff7 Fortran: tweak wording of C_FUNLOC test message 2014-12-17 03:03:28 -08:00
Gilles Gouaillardet
27aec2ef5b configury: disable f08 fortran bindings if the compiler does
not support c_funloc with TS 29113 subclause 8.1 aka
removed restrictions on ISO_C_BINDING module procedures.
2014-12-17 17:35:45 +09:00
Jeff Squyres
f3be0a5882 ofi: ensure that null_addr is initialized to NULL
And when null_addr is freed, set it back to NULL so that we don't try
to free it again in the error: label.
2014-12-16 17:32:15 -08:00
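The pattern this message describes (initialize the pointer to NULL, and reset it to NULL after freeing so a shared error label cannot free it twice) can be sketched as follows. This is a minimal illustration, not the ofi code itself: `setup_addresses` and its `fail_late` parameter are hypothetical, and only the `null_addr` name and the `error:` label come from the message.

```c
#include <stdlib.h>

/* Hypothetical function illustrating the cleanup pattern. */
int setup_addresses(int fail_late)
{
    char *null_addr = NULL;   /* initialized, so the error path is always safe */
    int rc = 0;

    null_addr = malloc(16);
    if (NULL == null_addr) {
        rc = -1;
        goto error;
    }

    /* Done with the buffer: free it and reset the pointer to NULL. */
    free(null_addr);
    null_addr = NULL;

    if (fail_late) {          /* simulate a failure after the free */
        rc = -1;
        goto error;
    }
    return 0;

error:
    free(null_addr);          /* free(NULL) is a no-op, so no double free */
    return rc;
}
```

Without the reset to NULL, the late failure path would pass an already freed pointer to free().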
Jeff Squyres
8c7b6d266e ofi: add "unused" attribute to rc to prevent compiler warning 2014-12-16 17:30:46 -08:00
Jeff Squyres
60e99b92c5 Merge pull request #316 from yburette/ofi-mtl
Adding an Open Fabrics Interfaces (OFI) MTL.
2014-12-16 20:17:59 -05:00
Jeff Squyres
f91d8277a5 NEWS: Sync with NEWS on v1.8 branch for v1.8.4 release 2014-12-16 16:52:30 -08:00
Yohann Burette
58a7a1e4ac Adding an Open Fabrics Interfaces (OFI) MTL.
This MTL implementation uses the OFIWG libfabric's tag messaging capabilities.
2014-12-16 15:43:39 -08:00
Mangala Jyothi Bhaskar
68d78fd718 Aggregator selection logic Part 2 and reorganized Part1 2014-12-16 15:48:40 -06:00
George Bosilca
830df07202 Fix the indentation. 2014-12-16 16:07:42 -05:00
George Bosilca
146ab96e29 These variables are now unnecessary. 2014-12-16 16:05:00 -05:00
Aurélien Bouteiller
3c867157ca Merge branch 'master' of github.com:open-mpi/ompi 2014-12-16 14:40:16 -05:00
Aurélien Bouteiller
ee3b090316 The fallback case when yama is not installed was not correct in CMA vader 2014-12-16 14:39:14 -05:00
Mangala Jyothi Bhaskar
2bd52cc410 Initialize req variable to fix a warning 2014-12-16 13:24:28 -06:00
Aurélien Bouteiller
0bf860ef02 indentation 2014-12-16 14:22:26 -05:00
Jeff Squyres
1b63129de3 fortran: ensure to specify the shared library version 2014-12-16 11:16:46 -08:00
Howard Pritchard
6e1317db68 alps/config: add WRAPPER defines when alps found
Add WRAPPER flags for alps libraries to support
static builds.
2014-12-16 09:52:24 -08:00
Jeff Squyres
95da4a5a0e usnic: no longer use opal_using_threads()
Instead, use the flag that is passed in.
2014-12-16 08:49:01 -08:00
Alex Mikheev
c76261da07 OSHMEM: atomic mxm: fix mkey conversion
Correctly return mxm_empty_mem_key when shmem mkey is empty
2014-12-16 16:34:42 +02:00
Alex Mikheev
71ebbca26d OSHMEM: spml ikrit: fix spelling in help file 2014-12-16 16:18:38 +02:00
Alex Mikheev
3f7ed56548 OSHMEM: spml ikrit: fix mxm disconnect flow
Add out of band barrier before performing mxm disconnect.
It will make sure that every pe is ready to disconnect. Otherwise
bad things may happen.
2014-12-16 15:07:17 +02:00
Gilles Gouaillardet
cfcce01faf configury: test the __sun macro to detect solaris OS.
recent oraclestudio compilers do not set the __sun__ macro
2014-12-16 18:21:58 +09:00
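A check that accepts either spelling of the macro might look like the sketch below. It assumes only what the message states, namely that some compilers define `__sun` without defining `__sun__`; the function name `built_for_solaris` is hypothetical and is not part of the configury change.

```c
/* Hypothetical helper: returns 1 when compiled for Solaris, 0 otherwise.
 * Testing both macros covers compilers that define only one of them. */
int built_for_solaris(void)
{
#if defined(__sun) || defined(__sun__)
    return 1;
#else
    return 0;
#endif
}
```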
Artem Polyakov
01601f3284 Merge pull request #305 from artpol84/timing
Timing framework improvement
2014-12-16 15:13:48 +06:00
George Bosilca
357daa834e Stay on the safe side: Only one thread is allowed
to handle an event_base.
2014-12-15 23:19:51 -05:00
George Bosilca
2fec570fe7 There is no need to keep track of these events. They are scheduled
as triggers in libevent, so one bookkeeping should be enough.
2014-12-15 22:35:29 -05:00
George Bosilca
46baab350c The event is automatically deleted by default. 2014-12-15 21:59:20 -05:00
George Bosilca
b01abfa0d7 Don't over-do it! 2014-12-15 21:33:32 -05:00
George Bosilca
f87a4b691b Solve another handshake problem, where one thread was calling del_event
while cleaning up after receiving a zero byte on the connect socket
(locally started connection), while another was trying to accept a
new connection from the same peer. Create a zero-timed event and
delocalize the accept into a timer_event.
Add support for registering an error callback, that can be used when a
connection is discovered as failed during the initialization process.
2014-12-15 20:27:32 -05:00
George Bosilca
e20413c885 Rearrange the code to remove a compiler complaint about
the missing return from a non-void function.
2014-12-15 15:42:57 -05:00
Ralph Castain
573a574a3c Remove an unused dstore type that was redundant with another one. Define a corresponding PMIX_NODE_ID type (contains the vpid of the daemon hosting the proc) and ensure that the PMIx server includes that info in its process map 2014-12-15 12:11:13 -08:00
Mike Dubman
2fbe87defe Merge pull request #314 from miked-mellanox/topic/fix_opal_path_nfs
add support for autofs and make check pass. jenkins: check,src_rpm
2014-12-15 20:52:52 +02:00
Ralph Castain
91bec7e9dd Fix some type declarations so make check works for SPARC. Thanks to Paul Hargrove for the report and correction 2014-12-15 06:44:51 -08:00
Ralph Castain
a22cc45769 Close the pmix server sockets on exec 2014-12-13 20:30:21 -08:00
Ralph Castain
f4ff791335 Close oob/usock connections upon exec 2014-12-13 20:24:09 -08:00
Ralph Castain
6c4d5a51c4 Close tcp sockets upon exec 2014-12-13 20:23:53 -08:00
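The three commits above do not show how the descriptors are closed. One common POSIX way to get this behavior is the close-on-exec flag, sketched below under that assumption; `set_cloexec` is a hypothetical helper, not a function from the Open MPI tree.

```c
#include <fcntl.h>

/* Hypothetical helper: mark fd so the kernel closes it automatically when
 * the process calls one of the exec() functions.
 * Returns 0 on success, -1 on error. */
int set_cloexec(int fd)
{
    int flags = fcntl(fd, F_GETFD);   /* read the current descriptor flags */
    if (flags == -1) {
        return -1;
    }
    /* Keep the existing flags and add FD_CLOEXEC. */
    return (fcntl(fd, F_SETFD, flags | FD_CLOEXEC) == -1) ? -1 : 0;
}
```

A child launched with fork() followed by exec() then no longer inherits the marked descriptor.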
Ralph Castain
9658256a98 Restore the passing of the complete job map to the local proc on first get_attr so the info can be used by the MPI layer without continual calls back to the server. We'll find a more memory efficient method later. 2014-12-13 18:44:09 -08:00
Mike Dubman
42f3fa0d1e OPAL: add support for autofs magic type 2014-12-13 20:27:47 +02:00
Jeff Squyres
9e6b157cb6 opal: minor update to guess_strlen
This is a minor update to
open-mpi/ompi@c52601f0c5.

If we have vsnprintf(), we might as well not have the rest of the
guess_strlen() routine.  Also document the nifty trick/behavior of
vsnprintf() that enables this shortcut (it was new to me!).
2014-12-13 08:09:34 -05:00
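The vsnprintf() behavior mentioned here is presumably the C99 rule that a call with a size of zero writes nothing but still returns the length the full output would need. A minimal sketch of that rule follows; `formatted_length` is a hypothetical name, and this is not the actual guess_strlen() code.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: returns the number of characters (excluding the
 * terminating NUL) that the formatted output would occupy, or a negative
 * value if an encoding error occurs. */
int formatted_length(const char *fmt, ...)
{
    va_list ap;
    int len;

    va_start(ap, fmt);
    /* C99: with size 0 nothing is written and the buffer may be NULL, but
     * the return value is still the length of the complete output. */
    len = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    return len;
}
```

A caller can allocate the returned length plus one byte and format into that buffer with a second call.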
George Bosilca
3430714989 Correctly propagate the requested level of thread support during the
component init calls.
2014-12-13 02:36:21 -05:00
George Bosilca
2edbe16c47 Add the necessary infrastructure to allow the dumping of all TCP
information related to an endpoint (status and all pending fragments).
Do some minor space cleanup.
2014-12-13 01:59:55 -05:00
George Bosilca
5b8616d890 Fix the race condition in endpoint connection initialization. The race
was quite subtle, and only happened on the process with the smallest
guid (as this process will tear down the connection created locally and
replace it with the result of accept). If multiple threads are active in
the system, the deadlock occurs during the recv event deletion as one
thread will hold the recv event lock of the endpoint and try to access
the TCP event base lock, while the other thread will hold the TCP event
base lock while trying to access the recv event lock (in case data is
available on the socket).

The proposed solution lets the event callback fail to process the data,
preventing the deadlock and allowing the other thread to always complete
its job. As the event is not executed, the same trigger will fire
again at the next opportunity, so this solution introduces a minimal
delay in the connection establishment.
2014-12-13 01:45:00 -05:00
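The idea of letting a callback give up instead of blocking on a second lock can be sketched with pthread_mutex_trylock(). This is only an illustration of the general technique, assuming plain pthread mutexes; the names `event_base_lock` and `try_process_event` are hypothetical, and the message does not say which lock the real fix declines to wait on.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical lock standing in for the second lock in the cycle. */
static pthread_mutex_t event_base_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical callback body: processes the event only if the lock is
 * free right now. Returns false without blocking when it is busy, so the
 * caller can let the event fire again later. */
bool try_process_event(int *processed_count)
{
    if (pthread_mutex_trylock(&event_base_lock) != 0) {
        return false;            /* lock busy: skip instead of waiting */
    }
    ++*processed_count;          /* stand-in for the real event handling */
    pthread_mutex_unlock(&event_base_lock);
    return true;
}
```

Because the callback never waits for the second lock, the lock-order cycle described above cannot form; the cost is a retry on a later trigger.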