Commit graph

20979 commits

Author SHA1 Message Date
Howard Pritchard
7069f2361a disqualify coll ml for MPI_THREAD_MULTIPLE
This commit was SVN r32814.
2014-09-29 21:02:15 +00:00
Howard Pritchard
1df933ea27 remove ompi/runtime/params.h include in ugni btl
This commit was SVN r32813.
2014-09-29 19:26:33 +00:00
Ralph Castain
eb95d6f892 ompi_info_get_bool returns "success" if the value isn't found, setting "flag" to false, but doesn't set the value of the param itself. So if you don't specify "blocking_fence" in MPI_Info, then the "blocking_fence" flag wasn't being set.
Initialize the blocking_fence flag to false as the code logic indicates that it should only be set if someone provides that flag.

Thanks to Lisandro Dalcin for reporting it

cmr=v1.8.4:reviewer=hjelmn

This commit was SVN r32812.
2014-09-29 17:21:28 +00:00
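Note on eb95d6f892: the pattern behind this fix can be sketched in a few lines of C. The helper `lookup_bool` below is a hypothetical stand-in, not the real `ompi_info_get_bool` signature; it only mimics the behavior described in the message (success is returned and the value is left untouched when the key is absent). The caller therefore has to initialize its flag before the lookup.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup: returns 0 ("success") even when the key is absent.
 * In that case *found is set to false and *value is left untouched. */
static int lookup_bool(const char *const *keys, const bool *vals, size_t n,
                       const char *key, bool *value, bool *found)
{
    assert(key != NULL && value != NULL && found != NULL);
    *found = false;
    for (size_t i = 0; i < n; i++) {
        if (strcmp(keys[i], key) == 0) {
            *value = vals[i];
            *found = true;
            break;
        }
    }
    return 0;
}

/* Caller-side pattern: give the flag a defined default before the lookup,
 * so a missing key cannot leave it uninitialized. */
static bool get_blocking_fence(const char *const *keys, const bool *vals,
                               size_t n)
{
    bool blocking_fence = false; /* default when the key is not supplied */
    bool found;
    (void)lookup_bool(keys, vals, n, "blocking_fence", &blocking_fence, &found);
    return blocking_fence;
}
```

With an empty key list the function returns the default `false`; with `blocking_fence` present it returns the stored value.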
Ralph Castain
4320457394 Fix the debug output - you can't print the cpuset pointer using the %p format without generating warnings
This commit was SVN r32811.
2014-09-29 17:10:38 +00:00
Howard Pritchard
201d4ec3ad fix setting of PMIX_NODE_RANK in cray pmix comp.
Per discussions with pmix folks, it was determined that
the way the cray pmi pmix component was computing the
PMIX_NODE_RANK attribute for a process was incorrect.
This commit fixes the problem.

This commit was SVN r32810.
2014-09-29 16:55:31 +00:00
Howard Pritchard
f8ac8bb6b0 remove improper use of hwloc_bitmap_free
When using the native aprun launcher, it was observed that
there were frequent memory corruption errors occurring either
during a PMI kvs-fence operation, or at MPI termination during
opal cleanup of allocated objects.  This was especially bad
when using

aprun --c none

In some cases, the application would even just hang in finalize
if using ptmalloc, owing to some kind of infinite loop in
cleanup of small blocks, etc.

It turns out that the problem was in orte_ess_base_proc_binding's
improper use of opal_hwloc_base_get_available_cpus.  The cpuset
(bitmap) returned from that function is not meant to be freed
by the caller.

This problem is likely never observed when using the mpirun launcher
as there's an early exit if the OMPI_MCA_orte_bound_at_launch
environment variable is set.

This commit was SVN r32809.
2014-09-29 16:10:37 +00:00
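Note on f8ac8bb6b0: the bug class is a caller freeing a pointer it only borrowed. The sketch below uses hypothetical types and function names (`bitmap_t`, `topo_obj_t`, `get_available_cpus`, `dup_bitmap`), not the hwloc or OPAL API, and assumes the bitmap is cached on an owning object. It shows the safe pattern: use the borrowed pointer as-is, and copy it if an independently owned bitmap is needed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bitmap type, for illustration only. */
typedef struct { unsigned long bits; } bitmap_t;

/* Hypothetical object that owns a cached bitmap. */
typedef struct { bitmap_t *cached; } topo_obj_t;

/* Returns a BORROWED pointer: the bitmap remains owned by obj and must
 * not be freed by the caller. */
static bitmap_t *get_available_cpus(topo_obj_t *obj)
{
    assert(obj != NULL);
    if (obj->cached == NULL) {
        obj->cached = calloc(1, sizeof(*obj->cached));
        if (obj->cached != NULL) {
            obj->cached->bits = 0xFUL; /* pretend CPUs 0-3 are available */
        }
    }
    return obj->cached;
}

/* A caller that needs its own bitmap copies it and frees only the copy. */
static bitmap_t *dup_bitmap(const bitmap_t *src)
{
    assert(src != NULL);
    bitmap_t *copy = malloc(sizeof(*copy));
    if (copy != NULL) {
        memcpy(copy, src, sizeof(*copy));
    }
    return copy;
}

/* Only the owner releases the cached bitmap, exactly once. */
static void topo_obj_release(topo_obj_t *obj)
{
    assert(obj != NULL);
    free(obj->cached);
    obj->cached = NULL;
}
```

Freeing the result of `get_available_cpus` directly would leave `obj->cached` dangling, which is the kind of corruption that tends to surface later, during cleanup.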
Rolf vandeVaart
16d6e82ed2 Fix issue reported on User list. Error out when --with-cuda and --disable-dlopen are together.
This commit was SVN r32808.
2014-09-29 13:25:17 +00:00
George Bosilca
49e79a9ade Fix the case of a single process.
This commit was SVN r32807.
2014-09-28 22:06:39 +00:00
Jeff Squyres
6d1409b17b NEWS: add credit for Joshua Randall for LSF fixes
This commit was SVN r32803.
2014-09-26 19:01:35 +00:00
Jeff Squyres
91114c22d4 fortran: strengthen the storage_size() check
It's not enough to AC_COMPILE_IFELSE, do AC_LINK_IFELSE to really make
sure the compiler suite supports it.

Refs trac:4917

This commit was SVN r32802.

The following Trac tickets were found above:
  Ticket 4917 --> https://svn.open-mpi.org/trac/ompi/ticket/4917
2014-09-26 18:17:55 +00:00
Rolf vandeVaart
399dc3db43 Code to check for managed memory. Configure support also.
This commit was SVN r32801.
2014-09-26 16:24:45 +00:00
Rolf vandeVaart
35858f837a Revert r32713. Have different code for this.
This commit was SVN r32800.

The following SVN revision numbers were found above:
  r32713 --> open-mpi/ompi@9a2bab0e27
2014-09-26 14:56:18 +00:00
Gilles Gouaillardet
9661e4537f oob/tcp: fix a race condition
Mimic the btl/tcp protocol to solve the race condition that happens
when two peers try to connect to each other at the same time.

cmr=v1.8.4:reviewer=rhc

This commit was SVN r32799.
2014-09-26 06:54:30 +00:00
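Note on 9661e4537f: the message does not spell out the protocol, so the following is only a sketch of one common way to resolve a simultaneous-connect race, under the assumption that each peer has a unique, comparable identifier. The function name and the direction of the comparison are illustrative choices, not taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Decide whether to keep an incoming connection from peer_id.
 * If we have no outgoing attempt in progress there is no conflict.
 * Otherwise both sides apply the same rule, so exactly one of the two
 * crossing connections survives: the one initiated by the higher id
 * (an arbitrary but deterministic choice for this sketch). */
static bool accept_incoming(uint32_t my_id, uint32_t peer_id,
                            bool outgoing_in_progress)
{
    assert(my_id != peer_id);
    if (!outgoing_in_progress) {
        return true;
    }
    return peer_id > my_id;
}
```

Because the rule is symmetric, peer 1 keeps the connection opened by peer 2, while peer 2 drops the one opened by peer 1 and keeps its own outgoing attempt.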
Nathan Hjelm
e0eb1f2e73 btl/vader: make vader registration lookup/caching thread safe
This commit was SVN r32798.
2014-09-25 22:24:06 +00:00
Jeff Squyres
318e3b426a fortran: work around Absoft linker issue
MTT found that the addition of the MPI_SIZEOF interfaces to mpif.h was
causing a linker error with the Absoft compiler.  Absoft is working on
a fix, but we can work around the issue for now.  See comment in
Makefile.am in this commit for a lengthy explanation.

Refs trac:4917

This commit was SVN r32797.

The following Trac tickets were found above:
  Ticket 4917 --> https://svn.open-mpi.org/trac/ompi/ticket/4917
2014-09-25 21:07:46 +00:00
Nathan Hjelm
9c788ff940 coll/basic: fix segmentation fault in neighborhood collectives if the degree
of the topology is higher than the communicator size

It is possible to have a topology degree higher than the size of the communicator.
For example, a periodic cartesian communicator on MPI_COMM_SELF. This will leave
the neighborhood collectives with a request buffer that is too small. This commit
adds a call that will dynamically increase the size of the request buffer if it
is too small.

A better fix would be to create the topology *before* calling the coll_select
routine on a communicator. This will take some discussion and the solution will
not likely be ready anytime soon.

Thanks to Lisandro Dalcin for reporting this.

Original thread: http://www.open-mpi.org/community/lists/devel/2014/08/15713.php

cmr=v1.8.3:reviewer=jsquyres

This commit was SVN r32796.
2014-09-25 17:43:29 +00:00
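Note on 9c788ff940: "dynamically increase the size of the request buffer if it is too small" can be sketched as a reserve-style helper. The structure and function names here are hypothetical and do not reflect the coll/basic data structures; the sketch only shows the grow-on-demand idea.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical request pool; names are illustrative only. */
typedef struct {
    void **reqs;     /* array of request handles */
    size_t capacity; /* number of slots currently allocated */
} req_pool_t;

/* Ensure room for at least `needed` requests.
 * Returns 0 on success, -1 on allocation failure (pool left unchanged). */
static int req_pool_reserve(req_pool_t *pool, size_t needed)
{
    assert(pool != NULL);
    if (needed <= pool->capacity) {
        return 0; /* already large enough, nothing to do */
    }
    void **grown = realloc(pool->reqs, needed * sizeof(*grown));
    if (grown == NULL) {
        return -1;
    }
    for (size_t i = pool->capacity; i < needed; i++) {
        grown[i] = NULL; /* new slots start out empty */
    }
    pool->reqs = grown;
    pool->capacity = needed;
    return 0;
}
```

A collective would call `req_pool_reserve` with the number of requests its topology requires before posting them, so a pool sized at communicator creation no longer has to be large enough for every topology.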
George Bosilca
53e012ae97 Fix typo.
This commit was SVN r32795.
2014-09-25 17:18:27 +00:00
Gilles Gouaillardet
b1c4daa956 configury: silence warning on Solaris
* remove config/ompi_config_solaris_threads.m4 which was dead code
* check if pthreads work "as is" on all platforms including Solaris
  (FWIW, the test should have been skipped if Solaris threads are used
   and not if configure is run on a Solaris box)

Refs trac:4911

This commit was SVN r32792.

The following Trac tickets were found above:
  Ticket 4911 --> https://svn.open-mpi.org/trac/ompi/ticket/4911
2014-09-25 07:26:16 +00:00
Gilles Gouaillardet
5d0e65085f configury: silence warning on Solaris
revert r32501

Refs trac:4911

This commit was SVN r32791.

The following SVN revision numbers were found above:
  r32501 --> open-mpi/ompi@0afed999cd

The following Trac tickets were found above:
  Ticket 4911 --> https://svn.open-mpi.org/trac/ompi/ticket/4911
2014-09-25 07:17:58 +00:00
Jeff Squyres
d13034d0b0 fortran: add configury to check for storage_size()
gfortran 4.8 does not support storage_size() on all relevant types
that we need.  So add a configure test to check and see if the
compiler's storage_size() intrinsic supports enough types for us to do
MPI_SIZEOF.

Also remove an accidentally redundant check for fortran INTERFACE.

Refs trac:4917

This commit was SVN r32790.

The following Trac tickets were found above:
  Ticket 4917 --> https://svn.open-mpi.org/trac/ompi/ticket/4917
2014-09-25 00:17:29 +00:00
Jeff Squyres
c9ea7f2732 fortran: ensure that sizeof_f08.h is built before mpi-f08.lo
mpi-f08.F90 includes sizeof_f08.h, so we need to add a Makefile
dependency to ensure that sizeof_f08.h is built first.

Refs trac:4917

This commit was SVN r32789.

The following Trac tickets were found above:
  Ticket 4917 --> https://svn.open-mpi.org/trac/ompi/ticket/4917
2014-09-24 23:59:18 +00:00
Nathan Hjelm
aba87f3776 btl/vader: silence warning
This commit was SVN r32788.
2014-09-24 22:10:23 +00:00
Ralph Castain
35b26805c8 Update NEWS for 1.8.3
cmr=v1.8.3:reviewer=ompi-gk1.8

This commit was SVN r32785.
2014-09-24 21:54:57 +00:00
Howard Pritchard
bae3837121 Title: Comment out all mpi_abort_print_stack in lanl platform files
Description:
setting mpi_abort_print_stack in mca params file now makes openmpi
unhappy. Comment these out in all the LANL platform files.
Requested by TOSS OpenMPI support person.

cmr=v1.8.3

This commit was SVN r32782.
2014-09-24 18:25:58 +00:00
Ralph Castain
4024c8af9e Have to include the mpisync directory so the Makefile.in gets built - just don't build the binary and install it if timing isn't enabled
This commit was SVN r32781.
2014-09-24 01:18:21 +00:00
Ralph Castain
17846411c3 Now that we have an ORTE thread running in apps, we can't just call "exit"
during RTE abort as that is happening in a thread, and (at least in some
environments) doesn't result in the main thread being immediately
terminated. Instead, we wind up going thru orte_finalize in the main
thread, which isn't what we want.

So replace the call to "exit" with the "quick exit" variant "_exit", which
causes the entire process to exit immediately.

(custom patch has been posted for 1.8.3)

This commit was SVN r32780.
2014-09-23 22:51:10 +00:00
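Note on 17846411c3: one observable difference between `exit` and `_exit` is that `_exit` terminates the process without running handlers registered with `atexit`. The sketch below demonstrates only that difference, on a POSIX system, using a forked child and a pipe; it does not use threads and is not the ORTE abort path. The names `cleanup_handler` and `cleanup_runs` are made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int report_fd = -1; /* write end of the pipe, used by the handler */

/* Stands in for finalize-style cleanup that runs during a normal exit(). */
static void cleanup_handler(void)
{
    char mark = 'x';
    if (write(report_fd, &mark, 1) != 1) {
        /* nothing useful can be done about a failed write here */
    }
}

/* Forks a child that registers cleanup_handler and then terminates with
 * either exit() or _exit(). Returns 1 if the handler ran, 0 if it did
 * not, and -1 on error. */
static int cleanup_runs(int use_underscore_exit)
{
    int fds[2];
    if (pipe(fds) != 0) {
        return -1;
    }
    fflush(NULL); /* avoid duplicating buffered stdio output in the child */

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        report_fd = fds[1];
        if (atexit(cleanup_handler) != 0) {
            _exit(2);
        }
        if (use_underscore_exit) {
            _exit(0); /* immediate termination, atexit handlers skipped */
        }
        exit(0);      /* normal termination, atexit handlers run */
    }

    close(fds[1]);
    char buf;
    ssize_t n = read(fds[0], &buf, 1); /* 1 byte if handler ran, 0 at EOF */
    close(fds[0]);

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    assert(WIFEXITED(status));
    return n == 1 ? 1 : 0;
}
```

`cleanup_runs(0)` reports that the handler ran, while `cleanup_runs(1)` reports that it was skipped.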
Nathan Hjelm
79881ca892 btl/vader: prevent double-destruction of endpoints and move endpoint teardown code into destructor
This commit was SVN r32779.
2014-09-23 21:51:15 +00:00
Nathan Hjelm
2d8fba0861 btl/vader: silence warning
This commit was SVN r32778.
2014-09-23 21:33:45 +00:00
Edgar Gabriel
05c34946f7 implementation of non-blocking read/write operations through aio
functions for the posix module. Some interface changes for the fbtl were
necessary for that.

This commit was SVN r32777.
2014-09-23 21:27:57 +00:00
Nathan Hjelm
8bd3160432 btl/vader: fix several typos in vader update
This commit was SVN r32775.
2014-09-23 20:25:36 +00:00
Nathan Hjelm
12bfd13150 btl/vader: improve performance for both single and multiple threads
This is a large update that does the following:

 - Only allocate fast boxes for a peer if a send count threshold
   has been reached (default: 16). This will greatly reduce the memory
   usage with large numbers of local peers.

 - Improve performance by limiting the number of fast boxes that can
   be allocated per peer (default: 32). This will reduce the amount
   of time spent polling for fast box messages.

 - Provide new MCA variables to configure the size, maximum count,
   and send count threshold for fast box allocations.

 - Updated buffer design to increase the range of message sizes that
   can be sent with a fast box.

 - Add thread protection around fast box allocation (locks). When
   spin locks are available this should be updated to use spin locks.

 - Various fixes and cleanup.

This commit was SVN r32774.
2014-09-23 18:11:22 +00:00
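Note on 12bfd13150: the first and fifth bullets (allocate a fast box only after a send count threshold, and protect the allocation with a lock) can be sketched as follows. The `peer_t` structure and function names are hypothetical, the real allocation is replaced by a boolean, and whether the threshold comparison is inclusive is an assumption. The per-peer cap on the number of fast boxes is left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define SEND_THRESHOLD 16u /* default value named in the commit message */

/* Hypothetical per-peer state; not the vader endpoint structure. */
typedef struct {
    pthread_mutex_t lock;  /* protects the fields below */
    unsigned send_count;   /* sends seen before a fast box exists */
    bool has_fast_box;
} peer_t;

static void peer_init(peer_t *peer)
{
    assert(peer != NULL);
    pthread_mutex_init(&peer->lock, NULL);
    peer->send_count = 0;
    peer->has_fast_box = false;
}

/* Records one send to the peer. Returns true once the peer has a fast
 * box, which happens on the send that reaches SEND_THRESHOLD (assumed
 * inclusive here). */
static bool peer_note_send(peer_t *peer)
{
    assert(peer != NULL);
    pthread_mutex_lock(&peer->lock);
    if (!peer->has_fast_box && ++peer->send_count >= SEND_THRESHOLD) {
        /* A real implementation would allocate shared memory here. */
        peer->has_fast_box = true;
    }
    bool result = peer->has_fast_box;
    pthread_mutex_unlock(&peer->lock);
    return result;
}
```

Peers that exchange only a handful of messages never reach the threshold, so no memory is set aside for them, which is the memory saving the first bullet describes.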
Howard Pritchard
1508a01325 Fixes to enable mpirun to work again on Cray
The ess pmi module was not handling aprun launched
daemons.  All daemons were thinking they were vpid 1.

Also, turns out that on cray systems using MOM nodes
for launched jobs, just detecting whether or not a
process is in a PAGG container is not sufficient.

Crank up the priority of the alps PLM component in the
event that the configure detected the presence of both
slurm and alps.

Have the ESS pmi component open the pmix framework and
select a pmix component.

This commit was SVN r32773.
2014-09-23 15:37:26 +00:00
Artem Polyakov
f2e586980b Fix timing framework:
1. Fixes according to (http://www.open-mpi.org/community/lists/devel/2014/09/15869.php)
2. Force mpisync:rank0 to gather results. Now sync info is written by rank0 to the output file.
3. Improve mpirun_prof: 1) adapt to the environment (SLURM/TORQUE); 2) recognize some noteset-related mpirun options.

This commit was SVN r32772.
2014-09-23 12:59:54 +00:00
Ralph Castain
9c20940190 Remove mpif-sizeof.h during distclean
This commit was SVN r32771.
2014-09-21 14:26:19 +00:00
Ralph Castain
70896550bf Per input from Artem, update the copyrights on these files, making sure to include all the licensing info for the files brought over from the mpiperf project.
This commit was SVN r32770.
2014-09-20 14:54:24 +00:00
Jeff Squyres
7f419dc5b6 fortran: set CLEANFILES properly
CLEANFILES was previously set; we need to use += to add to it.

refs trac:4917

This commit was SVN r32769.

The following Trac tickets were found above:
  Ticket 4917 --> https://svn.open-mpi.org/trac/ompi/ticket/4917
2014-09-20 10:43:49 +00:00
MPI Team
9cf584d6c6 Update git/hg ignore files
This commit was SVN r32768.
2014-09-20 05:00:28 +00:00
Artem Polyakov
70587d1804 Remove outdated OPAL parameter "opal_pmi_version". Now PMI selection is handled by PMIx MCA.
This commit was SVN r32767.
2014-09-20 02:30:23 +00:00
Jeff Squyres
040611556f fortran: don't complain about script args if we're not building fortran
refs trac:4917

This commit was SVN r32766.

The following Trac tickets were found above:
  Ticket 4917 --> https://svn.open-mpi.org/trac/ompi/ticket/4917
2014-09-20 01:22:40 +00:00
Jeff Squyres
d7eaca83fa Fortran: Fix MPI_SIZEOF. What a disaster. :-(
What started as a simple ticket ended up reaching all the way up to the
MPI Forum.

It turns out that we are supposed to have MPI_SIZEOF for all Fortran
interfaces: mpif.h, the mpi module, and the mpi_f08 module.

It further turns out that to properly support MPI_SIZEOF, your Fortran
compiler *has* to support the INTERFACE keyword and ISO_FORTRAN_ENV.  We
can't use "ignore TKR" functionality, because the whole point of
MPI_SIZEOF is that the implementation knows what type was passed to it
("ignore TKR" functionality, by definition, throws that information
away).  Hence, we have to have an MPI_SIZEOF interface+implementation
for all intrinsic types, kinds, and ranks.

This commit therefore adds a perl script that generates both the
interfaces and implementations for MPI_SIZEOF in each of mpif.h, the
mpi module, and mpi_f08 module (yay consolidation!).

The perl script uses the results of some new configure tests:

* check if the Fortran compiler supports the INTERFACE keyword
* check if the Fortran compiler supports ISO_FORTRAN_ENV
* find the max array rank (i.e., dimension) that the compiler supports

If the Fortran compiler supports both INTERFACE and ISO_FORTRAN_ENV,
then we'll build the MPI_SIZEOF interfaces.  If not, we'll skip
MPI_SIZEOF in mpif.h and the mpi module.  Note that we won't build the
mpi_f08 module -- to include the MPI_SIZEOF interfaces -- if the
Fortran compiler doesn't support INTERFACE, ISO_FORTRAN_ENV, and a
whole bunch of other modern Fortran stuff.

Since MPI_SIZEOF interfaces are now generated by the perl script, this
commit also removes all the old MPI_SIZEOF implementations (which were
laden with a zillion #if blocks).

cmr=v1.8.3

This commit was SVN r32764.
2014-09-19 13:44:52 +00:00
Jeff Squyres
0c98cf709e opal_get_version.m4: use better "git log" command
When you have a git repository, this is a more succinct way
of getting the git hash of the HEAD.

This commit was SVN r32758.
2014-09-18 21:03:31 +00:00
Rolf vandeVaart
5c73101a72 Fix typo.
This commit was SVN r32755.
2014-09-18 13:58:54 +00:00
Gilles Gouaillardet
5fa2b6c59c oob/tcp: fix a race condition
Refs trac:4909

This commit was SVN r32754.

The following Trac tickets were found above:
  Ticket 4909 --> https://svn.open-mpi.org/trac/ompi/ticket/4909
2014-09-18 08:17:25 +00:00
Vasily Filipov
c7c63fe73e COLL/TUNED: alltoall - restore the previous default values of the algorithm-selection decision thresholds (they were changed by r32735)
Reviewed by miked
cmr=v1.8.3:reviewer=ompi-rm1.8

This commit was SVN r32753.

The following SVN revision numbers were found above:
  r32735 --> open-mpi/ompi@5fecf65daf
2014-09-18 08:07:51 +00:00
Jeff Squyres
cf56a369c2 fortran: configure test for "optional" keyword was too aggressive
After discussions with Craig, it looks like the check in this
configure test was too aggressive and based on a prior mpi_f08
implementation model.  We don't use "value" any more, and we also
don't pass optional parameters back to C code, so this test was
checking far more than it needed to... in a non-standard way (which
breaks in the Intel 2015 Fortran compiler).  All we ''really'' need is
to check whether the compiler supports the "optional" keyword.  So
make the test ''much'' simpler and just check whether ''optional''
compiles successfully or not.

Reviewed by Craig Rasmussen

cmr=v1.8.3:reviewer=ompi-rm1.8

This commit was SVN r32752.
2014-09-17 22:10:11 +00:00
Jeff Squyres
0f29f222f2 fortran: remove 2 unused files
As noted in the comments of these files, they aren't used.  Instead,
the Fortran interfaces for WTICK/WTIME just BIND(C) invoke the
back-end C functions (yay BIND(C)!).  Hence, there's no need to keep
these old wrapper files around any more.

cmr=v1.8.3

This commit was SVN r32751.
2014-09-17 21:49:24 +00:00
Rolf vandeVaart
8db1f89dd1 Small change to allow CUDA-aware support to work with non-reduction nonblocking collectives.
Only used when CUDA-aware feature compiled in.

This commit was SVN r32750.
2014-09-17 16:55:01 +00:00
Ralph Castain
3a437cbdb3 Silence set-but-not-used warning when timing isn't enabled
This commit was SVN r32749.
2014-09-17 00:40:10 +00:00
Ralph Castain
414f4e9783 Try to provide a real hostname for the remote host to aid in debugging
Refs trac:4908

This commit was SVN r32748.

The following Trac tickets were found above:
  Ticket 4908 --> https://svn.open-mpi.org/trac/ompi/ticket/4908
2014-09-17 00:39:49 +00:00
Jeff Squyres
9dc49c5f92 oob_tcp_connection: print "<unknown>" instead of "NULL"
"NULL" doesn't mean anything to the user, and is somewhat confusing
to see in an error message.  "<unknown>" at least indicates that
there's an error, and we know who the peer is.

This commit was SVN r32747.
2014-09-16 22:47:57 +00:00