always check the permissions of the created directory,
in case some one else created the very same directory but
with incompatible permissions
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Due to the conversion from ssize_t to int we were losing bytes, and
ended up writing outside the receiver buffer. Similarly on the send,
due to the conversion to a lesser type, we could missinterpret the
end of the fragment.
The linux timer code was multiplying the result of the x86 time stamp
counter by 1000000 before dividing by the cpu frequency. This can
cause us to overflow 64 bits if the time stamp counter grows larger
than ~ 1.8e13 (about 8400 seconds after boot). To fix the issue the
units of opal_timer_linux_freq have been changed to MHz.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
the hwloc topology might not contain a NUMA object with hwloc < v2
if the node is not NUMA, so force the NUMA object count to one
in order to correctly allocate mca_btl_sm_component.sm_mpools.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Add a missing constraint to the input operand list.
This fixes a regression caused by d4be138a7b.
Thanks to Orion Poplawski for reporting the issue.
Refs #2610
Signed-off-by: Nysal Jan K.A <jnysal@in.ibm.com>
This commit fixes a deadlock that can occur when the libc version
holds a lock when calling munmap. In this case we could end up calling
free() from vma_tree_delete which would in turn try to obtain the lock
in libc. To avoid the issue put any deleted vma's in a new list on the
vma module and release them on the next call to vma_tree_insert. This
should be safe as this function is not called from the memory hooks.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This should probably not go to the v2.x branch, since it changes the
output format of the usnic stats.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Double check the queue lengths that we get back from libfabric to
ensure that they are at least as long as we need. They *should* never
be shorter than we need, but let's just check to be sure.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Don't just blindly send ACKs; ensure that we have send credits before
doing so. If we don't have any send credits, just don't send the ACK
(it'll come again soon enough; it's not a tragedy if we don't send it
now).
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
The libfabric usnic provider may give you back TX/RX queues that are
longer than you asked for. So just use the TX/RQ/CQ lengths that we
asked for, regardless of what length comes back.
Additionally, keep the length of the priority channel CQ separate from
the length of the data CQ.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
Since the usnic BTL is single-threaded in this area, there really is
no danger, but don't use one of the pointers hanging off the frag
after we return it to the freelist. Instead, save the endpoint
pointer before returning the frag.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
The types are technically typedef equivalent, but it's less confusing
to use the types that agree with the name of the constructor.
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
on Linux, sending the header and then the message data does severely
impact performances of ptl/tcp :
on the receiver, reading the data can often result in an PMIX_ERR_RESOURCE_BUSY
or PMIX_ERR_WOULD_BLOCK, which ends up degrading performances)
this commit send both header and message data at the same time via writev()
and makes ptl/tcp virtually as efficient as ptl/usock.
Short writev generally occur when the kernel buffer is full, so there is no
point for retrying in this case.
fwiw, no such degradation was observed on OSX.
Refs open-mpi/ompi#2657
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
The MCA variable code calls the string from value function with a NULL
string to verify values. The verbosity enumerator was not correctly
checking for a non-NULL value before trying to set the string.
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
This commit fixes a bug in the timer check. When -fPIC is used we need
to save/restore ebx. The code copied from patcher was meant for 32-bit
systems and did not work correctly on 64-bit systems. This commit
updates the save/restore to use rbx instead of ebx.
Fixes#2678
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
there is no such thing as pthread_join(main_thread), so key destructors
are never invoked on the main thread, which causes valgrind report
some memory leaks. Manually store and then invoke the key destructors and
make valgrind happy.
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
Samples are taken after MPI_Init, and then again after MPI_Barrier. This allows the user to see memory consumption caused by add_procs, as well as any modex contribution from forming connections if pmix_base_async_modex is given.
Using the probe simply involves executing it via mpirun, with however many copies you want per node. Example:
$ mpirun -npernode 2 ./mpi_memprobe
Sampling memory usage after MPI_Init
Data for node rhc001
Daemon: 12.483398
Client: 6.514648
Data for node rhc002
Daemon: 11.865234
Client: 4.643555
Sampling memory usage after MPI_Barrier
Data for node rhc001
Daemon: 12.520508
Client: 6.576660
Data for node rhc002
Daemon: 11.879883
Client: 4.703125
Note that the client value on node rhc001 is larger - this is where rank=0 is housed, and apparently it gets a larger footprint for some reason.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>