Initialize the blocking_fence flag to false as the code logic indicates that it should only be set if someone provides that flag.
Thanks to Lisandro Dalcin for reporting it
cmr=v1.8.4:reviewer=hjelmn
This commit was SVN r32812.
Per discussions with pmix folks, it was determined that
the way the cray pmi pmix component was computing the
PMIX_NODE_RANK attribute for a process was incorrect.
This commit fixes the problem.
This commit was SVN r32810.
When using the native aprun launcher, it was observed that
there were frequent memory corruption errors occuring either
during a PMI kvs-fence operation, or at mpi termation during
opal cleanup of allocated objects. This was especially bad
when using
aprun --c none
In some cases, the application would even just hang in finalize
if using ptmalloc, owing to some kind of infinite loop in
cleanup of small blocks, etc.
It turns out that the proble was in orte_ess_base_proc_binding's
improper use of opal_hwloc_base_get_available_cpus. The cpuset
(bitmap) returned from that function is not meant to be freed
by the caller.
This problem is likely never observed when using the mpirun launcher
as there's an early exit if the OMPI_MCA_orte_bound_at_launch
environment variable is set.
This commit was SVN r32809.
It's not enough to AC_COMPILE_IFELSE, do AC_LINK_IFELSE to really make
sure the compiler suite supports it.
Refs trac:4917
This commit was SVN r32802.
The following Trac tickets were found above:
Ticket 4917 --> https://svn.open-mpi.org/trac/ompi/ticket/4917
Mimick the btl/tcp protocol to solve the race condition that happens
when two peers try to connect to each other at the same time
cmr=v1.8.4:reviewer=rhc
This commit was SVN r32799.
MTT found that the addition of the MPI_SIZEOF interfaces to mpif.h was
causing a linker error with the Absoft compiler. Absoft is working on
a fix, but we can workaround the issue for now. See comment in
Makefile.am in this commit for a lengthy explanation.
Refs trac:4917
This commit was SVN r32797.
The following Trac tickets were found above:
Ticket 4917 --> https://svn.open-mpi.org/trac/ompi/ticket/4917
of the topology is higher than the communicator size
It is possible to have a topology degree higher than the size of the communicator.
For example, a periodic cartesian communicator on MPI_COMM_SELF. This will leave
the neighborhood collectives with a request buffer that is too small. This commit
adds a call that will dynamically increase the size of the request buffer if it
is too small.
A better fix would be to create the topology *before* calling the coll_select
routine on a communicator. This will take some discussion and the solution will
not likely be ready anytime soon.
Thanks to Lisandro Dalcin for reporting this.
Original thread: http://www.open-mpi.org/community/lists/devel/2014/08/15713.php
cmr=v1.8.3:reviewer=jsquyres
This commit was SVN r32796.
* remove config/ompi_config_solaris_threads.m4 which was dead code
* check if pthreads work "as is" on all platforms including Solaris
(FWIW, the test should have been skipped if Solaris threads are used
and not if configure is ran on a Solaris box)
Refs trac:4911
This commit was SVN r32792.
The following Trac tickets were found above:
Ticket 4911 --> https://svn.open-mpi.org/trac/ompi/ticket/4911
gfortran 4.8 does not support storage_size() on all relevant types
that we need. So add a configure test to check and see if the
compiler's storage_size() intrinsic supports enough types for us to do
MPI_SIZEOF.
Also remove an accidentally redundant check for fortran INTERFACE.
Refs trac:4917
This commit was SVN r32790.
The following Trac tickets were found above:
Ticket 4917 --> https://svn.open-mpi.org/trac/ompi/ticket/4917
mpi-f08.F90 includes sizeof_f08.h, so we need to add a Makefile
dependency to ensure that sizeof_f08.h is built first.
Refs trac:4917
This commit was SVN r32789.
The following Trac tickets were found above:
Ticket 4917 --> https://svn.open-mpi.org/trac/ompi/ticket/4917
Description:
setting mpi_abort_print_stack in mca params file now makes openmpi
unhappy. Comment these out in all the LANL platform files.
Requested by TOSS OpenMPI support person.
cmr=v1.8.3
This commit was SVN r32782.
during RTE abort as that is happening in a thread, and (at least in some
environments) doesn't result in the main thread being immediately
terminated. Instead, we wind up going thru orte_finalize in the main
thread, which isn't what we want.
So replace the call to "exit" with the "quick exit" variant "_exit", which
causes the entire process to exit immediately.
(custom patch has been posted for 1.8.3)
This commit was SVN r32780.
This is a large update that does the following:
- Only allocate fast boxes for a peer if a send count threshold
has been reached (default: 16). This will greatly reduce the memory
usage with large numbers of local peers.
- Improve performance by limiting the number of fast boxes that can
be allocated per peer (default: 32). This will reduce the amount
of time spent polling for fast box messages.
- Provide new MCA variables to configure the size, maximum count,
and send count thresholds for fast boxes allocations.
- Updated buffer design to increase the range of message sizes that
can be sent with a fast box.
- Add thread protection around fast box allocation (locks). When
spin locks are available this should be updated to use spin locks.
- Various fixes and cleanup.
This commit was SVN r32774.
The ess pmi module was not handling aprun launched
daemons. All daemons were thinking they were vpid 1.
Also, turns out that on cray systems using MOM nodes
for launched jobs, just detecting whether or not a
process is in a PAGG container is not sufficient.
Crank up the priority of the alps PLM component in the
event that the configure detected the presence of both
slurm and alps.
Have the ESS pmi component open the pmix framework and
select a pmix component.
This commit was SVN r32773.
1. Fixes according to (http://www.open-mpi.org/community/lists/devel/2014/09/15869.php)
2. Force mpisync:rank0 to gather results. Now sync info is written by rank0 to the output file.
3. Improve mpirun_prof: 1) adopt to the environment (SLURM/TORQUE); 2) recognize some noteset-related mpirun options.
This commit was SVN r32772.
CLEANFILES was previously set; we need to use += to add to it.
refs trac:4917
This commit was SVN r32769.
The following Trac tickets were found above:
Ticket 4917 --> https://svn.open-mpi.org/trac/ompi/ticket/4917
What started as a simple ticket ended up reaching the way up to the
MPI Forum.
It turns out that we are supposed to have MPI_SIZEOF for all Fortran
interfaces: mpif.h, the mpi module, and the mpi_f08 module.
It further turns out that to properly support MPI_SIZEOF, your Fortran
compiler *has* support the INTERFACE keyword and ISO_FORTRAN_ENV. We
can't use "ignore TKR" functionality, because the whole point of
MPI_SIZEOF is that the implementation knows what type was passed to it
("ignore TKR" functionality, by definition, throws that information
away). Hence, we have to have an MPI_SIZEOF interface+implementation
for all intrinsic types, kinds, and ranks.
This commit therefore adds a perl script that generates both the
interfaces and implementations for MPI_SIZEOF in each of mpif.h, the
mpi module, and mpi_f08 module (yay consolidation!).
The perl script uses the results of some new configure tests:
* check if the Fortran compiler supports the INTERFACE keyword
* check if the Fortran compiler supports ISO_FORTRAN_ENV
* find the max array rank (i.e., dimension) that the compiler supports
If the Fortran compiler supports both INTERFACE and ISO_FORTRAN_ENV,
then we'll build the MPI_SIZEOF interfaces. If not, we'll skip
MPI_SIZEOF in mpif.h and the mpi module. Note that we won't build the
mpi_f08 module -- to include the MPI_SIZEOF interfaces -- if the
Fortran compiler doesn't support INTERFACE, ISO_FORTRAN_ENV, and a
whole bunch of ther modern Fortran stuff.
Since MPI_SIZEOF interfaces are now generated by the perl script, this
commit also removes all the old MPI_SIZEOF implementations (which were
laden with a zillion #if blocks).
cmr=v1.8.3
This commit was SVN r32764.
reviewed by miked
cmr=v1.8.3:reviewer=ompi-rm1.8
This commit was SVN r32753.
The following SVN revision numbers were found above:
r32735 --> open-mpi/ompi@5fecf65daf
After discussions with Craig, it looks like the check in this
configure test was too aggressive and based on a prior mpi_f08
implementation model. We don't use "value" any more, and we also
don't pass optional parameters back to C code, so this test was
checking far more than it needed to... in a non-standard way (which
breaks in the Intel 2015 Fortran compiler). All we ''really'' need is
to check whether the compiler supports the "optional" keyword. So
make the test ''much'' simpler and just check whether ''optional''
compiles successfull or not.
Reviewed by Craig Rasmussen
cmr=v1.8.3:reviewer=ompi-rm1.8
This commit was SVN r32752.
As noted in the comments of these files, they aren't used. Instead,
the Fortran interfaces for WTICK/WTIME just BIND(C) invoke the
back-end C functions (yay BIND(C)!). Hence, there's no need to keep
these old wrapper files around any more.
cmr=v1.8.3
This commit was SVN r32751.
"NULL" doesn't meany anything to the user, and is somewhat confusing
to see in an error message. "<unknown>" at least indicates that
there's an error, and we know who the peer is.
This commit was SVN r32747.