1
1
Граф коммитов

1275 Коммитов

Автор SHA1 Сообщение Дата
Ralph Castain
ee56d9dc1a Shorten the session directory name as some OS's are now providing unusually long temp directory names, causing us to overflow the sockaddr field 2016-07-05 14:59:50 -07:00
Joshua Hursey
96779f68e8 mpi/c: Add each check for count==0 in nonblocking reduce interface
* Matches the blocking versions of these interfaces
   - `iallreduce.c` to match `allreduce.c`
   - `ireduce.c` to match `reduce.c`
   - `ireduce_scatter.c` to match `reduce_scatter.c`
 * Workaround for IMB-NBC benchmark, similar to the workaround
   in place for the IMB-MPI1 benchmark for the blocking collectives.
2016-07-01 13:45:30 -05:00
Nathaniel Graham
bb9485bcd9 Fix Java Coverity issue
Fixing a possible error that Coverity pointed out in
ompi_java_exceptionCheck.

Signed-off-by: Nathaniel Graham <nrgraham23@gmail.com>
2016-06-23 15:09:07 +02:00
Nathaniel Graham
679a66ccc8 Merge pull request #1803 from nrgraham23/jni_error_handling
Java bindings exception handling fix
2016-06-22 04:14:00 -07:00
Nathaniel Graham
88dea4e4de Java bindings exception handling fix
Fixed an error where if there were no MPI exceptions, a
JNI error could still exist and not get handled.

Signed-off-by: Nathaniel Graham <nrgraham23@gmail.com>
2016-06-21 12:40:31 +02:00
Ralph Castain
5d330d5220 Enable the PMIx event notification capability and use that for all error notifications, including debugger release. This capability requires use of PMIx 2.0 or above as the features are not available with earlier PMIx releases. When OMPI master is built against an earlier external version, it will fallback to the prior behavior - i.e., debugger will be released via RML and all notifications will go strictly to the default error handler.
Add PMIx 2.0

Remove PMIx 1.1.4

Cleanup copying of component

Add missing file

Touchup a typo in the Makefile.am

Update the pmix ext114 component

Minor cleanups and resync to master

Update to latest PMIx 2.x

Update to the PMIx event notification branch latest changes
2016-06-14 13:08:41 -07:00
Jeff Squyres
5071602c59 PSM/PSM2: Disable signal handler hijacking by default
Per discussion on https://github.com/open-mpi/ompi/pull/1767 (and some
subsequent phone calls and off-issue email discussions), the PSM
library is hijacking signal handlers by default.  Specifically: unless
the environment variables `IPATH_NO_BACKTRACE=1` (for PSM / Intel
TrueScale) is set, the library constructor for this library will
hijack various signal handlers for the purpose of invoking its own
error reporting mechanisms.

This may be a bit *surprising*, but is not a *problem*, per se.  The
real problem is that older versions of at least the PSM library do not
unregister these signal handlers upon being unloaded from memory.
Hence, a segv can actually result in a double segv (i.e., the original
segv and then another segv when the now-non-existent signal handler is
invoked).

This PSM signal hijacking subverts Open MPI's own signal reporting
mechanism, which may be a bit surprising for some users (particularly
those who do not have Intel TrueScale).  As such, we disable it by
default so that Open MPI's own error-reporting mechanisms are used.

Additionally, there is a typo in the library destructor for the PSM2
library that may cause problems in the unloading of its signal
handlers.  This problem can be avoided by setting `HFI_NO_BACKTRACE=1`
(for PSM2 / Intel OmniPath).

This is further compounded by the fact that the PSM / PSM2 libraries
can be loaded by the OFI MTL and the usNIC BTL (because they are
loaded by libfabric), even when there is no Intel networking hardware
present.  Having the PSM/PSM2 libraries behave this way when no Intel
hardware is present is clearly undesirable (and is likely to be fixed
in future releases of the PSM/PSM2 libraries).

This commit sets the following two environment variables to disable
this behavior from the PSM/PSM2 libraries (if they are not already
set):

* IPATH_NO_BACKTRACE=1
* HFI_NO_BACKTRACE=1

If the user has set these variables before invoking Open MPI, we will
not override their values (i.e., their preferences will be honored).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-06-14 11:45:23 -07:00
Nathan Hjelm
e968ddfe64 start bug fixes (#1729)
* mpi/start: fix bugs in cm and ob1 start functions

There were several problems with the implementation of start in Open
MPI:

 - There are no checks whatsoever on the state of the request(s)
   provided to MPI_Start/MPI_Start_all. It is erroneous to provide an
   active request to either of these calls. Since we are already
   looping over the provided requests there is little overhead in
   verifying that the request can be started.

 - Both ob1 and cm were always throwing away the request on the
   initial call to start and start_all with a particular
   request. Subsequent calls would see that the request was
   pml_complete and reuse it. This introduced a leak as the initial
   request was never freed. Since the only pml request that can
   be mpi complete but not pml complete is a buffered send the
   code to reallocate the request has been moved. To detect that
   a request is indeed mpi complete but not pml complete isend_init
   in both cm and ob1 now marks the new request as pml complete.

 - If a new request was needed the callbacks on the original request
   were not copied over to the new request. This can cause osc/pt2pt
   to hang as the incoming message callback is never called.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>

* osc/pt2pt: add request for gc after starting a new request

Starting a new receive may cause a recursive call into the pt2pt
frag receive function. If this happens and the prior request is
on the garbage collection list it could cause problems. This commit
moves the gc insert until after the new request has been posted.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-06-02 20:22:40 -04:00
Jeff Squyres
59f4a765b3 Merge pull request #1656 from hpcraink/pr/make_manpage
In case, we do not build Fortran, Fortran 2008 or CXX, the regexp in …
2016-05-28 11:02:12 -04:00
bosilca
b90c83840f Refactor the request completion (#1422)
* Remodel the request.
Added the wait sync primitive and integrate it into the PML and MTL
infrastructure. The multi-threaded requests are now significantly
less heavy and less noisy (only the threads associated with completed
requests are signaled).

* Fix the condition to release the request.
2016-05-24 18:20:51 -05:00
George Bosilca
0641005dab Only check the parameters on valid dimensions. 2016-05-21 15:54:04 -04:00
Rainer Keller
0fb0913cd4 In case, we do not build Fortran, Fortran 2008 or CXX, the regexp in make_manpage.pl will delete all
lines up to the next ".fi" -- which for functions that do not implement the corresponding interface
as code will have all eliminated.
Change to delete the man page's content up to the next section header ".SH"

Also in case of make V=1, we'd like to see the command line, too.

Amend OMPI_Affinity_str according to the other man-pages definitions.
2016-05-17 14:21:35 +02:00
Aurélien Bouteiller
8344d00418 use-mpi extensions do not have a .la lib, so the fortran module should not depend on them. 2016-05-03 11:54:35 -04:00
Jeff Squyres
265e5b9795 Merge pull request #1552 from kmroz/wip-hostname-len-cleanup-1
ompi/opal/orte/oshmem/test: max hostname length cleanup
2016-05-02 09:44:18 -04:00
George Bosilca
6e6ed62a3c Allow NULL arrays for emoty datatypes.
When building an empty datatype (aka. size = 0) because the count of
included datatypes is 0, be less strict on what the arguments are
(allow NULL pointers).
2016-05-01 12:37:02 -04:00
Gilles Gouaillardet
01c90d4e71 fortran/mpif-h: fix *_create_keyval_f
correctly handle out parameter *_keyval when OMPI_SIZEOF_FORTRAN_INTEGER > SIZEOF_INT
2016-04-27 13:34:32 +09:00
Gilles Gouaillardet
178dde6a20 fortran/mpif-h: fix MPI_Win_shared_query
correctly handle out parameter disp_unit when OMPI_SIZEOF_FORTRAN_INTEGER > SIZEOF_INT
2016-04-27 11:22:09 +09:00
Gilles Gouaillardet
7f59d2a8c7 fortran/mpif-h: fix MPI_Win_free_keyval
initialize inout parameter when OMPI_SIZEOF_FORTRAN_INTEGER > SIZEOF_INT
2016-04-27 10:46:14 +09:00
Nathan Hjelm
c16e639b2f Merge pull request #1563 from hjelmn/ompi_coverity
ompi coverity fixes
2016-04-26 09:17:48 -06:00
Karol Mroz
3322347da9 ompi: fixup hostname max length usage
Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-04-25 07:08:23 +02:00
Nathan Hjelm
ae0ffbb67f Merge pull request #1397 from hjelmn/enable_thread_multiple
ompi: always enable MPI_THREAD_MULTIPLE support
2016-04-23 08:40:22 -06:00
KAWASHIMA Takahiro
e854404570 fortran: Change the line order of #pragma
No code change.
These lines were introduced in my recent commit 17d32ac.
I had a editing mistake and the order is different from
other lines/files.
2016-04-15 12:49:03 +09:00
Jeff Squyres
4566286b9a Merge pull request #1538 from kawashima-fj/pr/fortran-binding-fix
fortran: Fix many Fortran binding bugs
2016-04-14 17:18:59 -04:00
Jeff Squyres
fdf33674b3 Merge pull request #1532 from kmroz/wip-hindexed-cleanup-1
romio,java: cleanup deprecated hindexed call
2016-04-14 17:07:31 -04:00
Jeff Squyres
2374d8fcf7 Merge pull request #1536 from kawashima-fj/pr/inplace-fix
mpi/c, mpi/fortran: Fix `MPI_IN_PLACE`-related bugs
2016-04-14 15:56:55 -04:00
KAWASHIMA Takahiro
4944ba7edc datatype: Fix incorrect predefined datatype names and other datatype bugs (#1537)
* datatype: Fix a incorrect datatype name of `MPI_UNSIGNED`

Name of predefined datatype for C `unsigned int` gotten by
`MPI_TYPE_GET_NAME` should be `MPI_UNSIGNED`, not `MPI_UNSIGNED_INT`.

* datatype: Fix incorrect datatype names of `MPI_C_BOOL` and `MPI_CXX_*`

Names of predefined datatypes gotten by `MPI_TYPE_GET_NAME` are:

after this commit (correct) | before this commit (incorrect)
-----------------------------------------------------------
MPI_C_BOOL                    MPI_BOOL
MPI_CXX_BOOL                  MPI_BOOL
MPI_CXX_FLOAT_COMPLEX         MPI_C_FLOAT_COMPLEX
MPI_CXX_DOUBLE_COMPLEX        MPI_C_DOUBLE_COMPLEX
MPI_CXX_LONG_DOUBLE_COMPLEX   MPI_C_LONG_DOUBLE_COMPLEX

* datatype: Fix a incorrect datatype name of `MPI_2DOUBLE_PRECISION`

Name of the predefined datatype for Fortran two `double precision`
gotten by `MPI_TYPE_GET_NAME` should be `MPI_2DOUBLE_PRECISION`,
not `MPI_2DBLPREC`.

This bug was caused by setting the name to `opal_datatype_t::name`
instead of `ompi_datatype_t::name`.

* datatype: Fix `MPI_UNSIGNED_CHAR` internal flag

`MPI_UNSIGNED_CHAR` is an integer type.

* ompi/cxx: Fix C++ `MPI::LONG_DOUBLE_INT` definition

Just a typo fix. Without this fix, `MPI::MAX_LOC` and `MPI::MIN_LOC`
cannot be used with `MPI::LONG_DOUBLE_INT` in C++ programs.

I know the C++ binding is obsolete, but fixing this is harmless.

* Add FUJITSU copyright
2016-04-12 20:17:46 +02:00
KAWASHIMA Takahiro
17d32acbb6 fortran: Add missing (P)MPI_Alloc_mem_cptr_{f,f08} symbols
This commit adds the following symbols

  MPI_Alloc_mem_cptr_f
  MPI_Alloc_mem_cptr_f08
  PMPI_Alloc_mem_cptr_f
  PMPI_Alloc_mem_cptr_f08

These are implemented in the same way as other `_cptr` routines.
2016-04-12 22:40:58 +09:00
KAWASHIMA Takahiro
d48c8525ed fortran: Fix incorrect weak symbol names 2016-04-12 22:16:32 +09:00
KAWASHIMA Takahiro
5d32a601ff fortran: Add missing interfaces (part 2) 2016-04-12 22:06:35 +09:00
KAWASHIMA Takahiro
6f09d53e34 fortran: Add missing interfaces 2016-04-12 21:44:33 +09:00
KAWASHIMA Takahiro
f3b9a49ad1 fortran: Add missing PMPI interfaces 2016-04-12 20:55:41 +09:00
KAWASHIMA Takahiro
b6cb0bc257 fortran: Fix an incorrect interface name 2016-04-12 20:48:08 +09:00
KAWASHIMA Takahiro
96e93a9c5f fortran: Sort declared subroutins in alphabetical order
And insert necessary empty lines and remove unnecessary empty lines.

No code change.
2016-04-12 20:36:46 +09:00
KAWASHIMA Takahiro
334c63cf0a fortran: Change subroutine declaration order
Same order for `comm`, `type`, and `win`.

No code change.
2016-04-12 20:10:15 +09:00
KAWASHIMA Takahiro
10c11ff5b5 fortran: Add missing MPI_DUP_FN subroutine
Though the `MPI_DUP_FN` subroutine is depricated, it is not yet removed
as of MPI-3.1.
2016-04-12 20:06:50 +09:00
KAWASHIMA Takahiro
d3d6386578 mpi/forran: Support MPI_IN_PLACE on (I)ALLTOALLW and (I)EXSCAN
`MPI_IN_PLACE` support for `MPI_ALLTOALLW` and `MPI_EXSCAN` was
added in MPI-2.2 but it was missed in OMPI Fortran binding code.
2016-04-11 20:38:28 +09:00
KAWASHIMA Takahiro
eb5c31521b mpi/c: Fix MPI_IALLTOALLW memchecker 2016-04-11 18:47:30 +09:00
KAWASHIMA Takahiro
1ced7f213c mpi/c: Fix IALLTOALL{V|W} + MPI_IN_PLACE param check
`sendcounts`, `sdispls`, and `sendtype(s)` must be ignored
if `MPI_IN_PLACE` is specified for `sendbuf`.
This commit makes the param check code same as the blocking
`ALLTOALL{V|W}` function.
2016-04-11 18:34:11 +09:00
Karol Mroz
f8ecdbd623 java: replace deprecated hindexed call
Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-04-10 19:56:22 +02:00
Gilles Gouaillardet
7b803ac557 MPI_Unpack: fix error code when insize <= 0
this fixes a regression from open-mpi/ompi@f2e33c725f
2016-04-06 09:47:21 +09:00
Gilles Gouaillardet
f2e33c725f MPI_Unpack: fix return status
this regression was previously introduced in open-mpi/ompi@221e6e2eab
2016-03-31 09:56:54 +09:00
Gilles Gouaillardet
5932287cef datatype/[un]pack_external[_size]: move subroutines down to ompi/datatype
so it can be directly used by test/datatype/external32
2016-03-30 13:01:33 +09:00
Gilles Gouaillardet
221e6e2eab Add the datatype checks to the pack/unpack functions.
The datatype must satisfy the same constraints as for the
corresponding communication function (send for pack and
recv for unpack).
2016-03-30 11:40:08 +09:00
Gilles Gouaillardet
a89f113507 mpi/c: add missing OPAL_CR_EXIT_LIBRARY() in [un]pack[_external] 2016-03-30 11:25:21 +09:00
Nathan Hjelm
b9d100929b man: fix typo in MPI_Win_allocate_shared
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-16 14:47:40 -06:00
Nathan Hjelm
d4afb16f5a opal: rework mpool and rcache frameworks
This commit rewrites both the mpool and rcache frameworks. Summary of
changes:

 - Before this change a significant portion of the rcache
   functionality lived in mpool components. This meant that it was
   impossible to add a new memory pool to use with rdma networks
   (ugni, openib, etc) without duplicating the functionality of an
   existing mpool component. All the registration functionality has
   been removed from the mpool and placed in the rcache framework.

 - All registration cache mpools components (udreg, grdma, gpusm,
   rgpusm) have been changed to rcache components. rcaches are
   allocated and released in the same way mpool components were.

 - It is now valid to pass NULL as the resources argument when
   creating an rcache. At this time the gpusm and rgpusm components
   support this. All other rcache components require non-NULL
   resources.

 - A new mpool component has been added: hugepage. This component
   supports huge page allocations on linux.

 - Memory pools are now allocated using "hints". Each mpool component
   is queried with the hints and returns a priority. The current hints
   supported are NULL (uses posix_memalign/malloc), page_size=x (huge
   page mpool), and mpool=x.

 - The sm mpool has been moved to common/sm. This reflects that the sm
   mpool is specialized and not meant for any general
   allocations. This mpool may be moved back into the mpool framework
   if there is any objection.

 - The opal_free_list_init arguments have been updated. The unused0
   argument is not used to pass in the registration cache module. The
   mpool registration flags are now rcache registration flags.

 - All components have been updated to make use of the new framework
   interfaces.

As this commit makes significant changes to both the mpool and rcache
frameworks both versions have been bumped to 3.0.0.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-14 10:50:41 -06:00
Nathan Hjelm
60a3eb12ac comm_spawn_multiple_f: fix coverity issue
Fix CID 1327338: Resource leak (RESOURCE_LEAK):

Confirmed that the c_info array was being leaked. Free the array
before returning.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-11 13:07:01 -07:00
Gilles Gouaillardet
0da1374f22 man: fix typo in MPI_File related man pages 2016-03-11 14:16:21 +09:00
Jeff Squyres
89d0a033b7 cxx: "rank" is now a function in C++11
Use "myrank" instead (I tried using ::rank, but had varied
success... so I just renamed the variable).
2016-02-25 15:56:08 -06:00
Nathan Hjelm
230d04327e ompi: always enable MPI_THREAD_MULTIPLE support
This commit removes the --with-mpi-thread-multiple option and forces
MPI_THREAD_MULTIPLE support. This cleans up an abstration violation
in opal where OMPI_ENABLE_THREAD_MULTIPLE determines whether the
opal_using_threads is meaningful. To reduce the performance hit on
MPI_THREAD_SINGLE programs an OPAL_UNLIKELY is used for the
check on opal_using_threads in OPAL_THREAD_* macros.

This commit does not clean up the arguments to the various functions
that take whether muti-threading support is enabled. That should be
done at a later time.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-02-23 10:02:14 -07:00