Commit graph

3467 commits

Author SHA1 Message Date
Ralph Castain
a1b15c5666 Roll in update to PMIx master. Transfer updates from pmix2x component to ext2x
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-22 13:06:47 -07:00
Brice Goglin
2d242ab9f0 hwloc/shmem: don't abort on failure to load from shmem
Adopting can fail if the server-side hole isn't available on the client.

We can fall back to other ways to load the topology.

Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2017-08-21 19:57:38 +02:00
Brice Goglin
ffd209fc2e hwloc/shmem: dump /proc/self/maps if failed to find a hole and verbosity > 4
Signed-off-by: Brice Goglin <Brice.Goglin@inria.fr>
2017-08-21 19:57:38 +02:00
Ralph Castain
d515f48885 The local PMIx server is notifying its clients of all events, but for some reason I don't recall, the broadcast notification was marked for delivery only to non-default event handlers. This creates a discrepancy between the two behaviors, so don't restrict the broadcast notifications.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-18 17:26:11 -07:00
Brian Barrett
c667719a3f Merge pull request #3955 from mohanasudhan/master
Btl tcp: Improved diagnostic output and failure mode
2017-08-18 11:42:27 -07:00
Mohan
fc32ae401e Btl Tcp: Updated tcp handshake methods
This commit makes two changes:

1. Adding a magic string during the handshake can cause
issues when used with an older version of MPI. Hence set
the RCVTIMEO parameter to 2 seconds.
2. Use a single call during the handshake instead of
two calls.

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-18 10:06:52 -07:00
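The receive timeout described in the commit above can be illustrated with a minimal Python sketch. It is not the Open MPI code: `recv_handshake` and its parameters are hypothetical names, and `settimeout` stands in for the `SO_RCVTIMEO` socket option. The idea is that a peer which never sends the expected handshake bytes (for example, an older MPI version) does not block the receiver forever.

```python
import socket


def recv_handshake(sock, nbytes, timeout_s=2.0):
    """Read exactly nbytes, or return None if the peer stays silent or closes.

    Hypothetical helper: the timeout applies to each recv() call,
    which mirrors how a receive timeout on the socket behaves.
    """
    sock.settimeout(timeout_s)
    chunks = []
    remaining = nbytes
    try:
        while remaining > 0:
            chunk = sock.recv(remaining)
            if not chunk:  # peer closed the connection
                return None
            chunks.append(chunk)
            remaining -= len(chunk)
    except socket.timeout:  # nothing arrived within timeout_s
        return None
    finally:
        sock.settimeout(None)  # restore blocking mode
    return b"".join(chunks)
```

A caller would treat a `None` result as a failed handshake and close the connection instead of waiting indefinitely.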
Mohan
e3dfe11da9 Btl tcp: Improving verbose around tcp
As part of improving the tcp btl, we
are improving verbose output in general

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 17:22:16 -07:00
Mohan
4bc7b214dc Btl tcp: Improving verbose around IPV6
As part of improving tcp btl debugging
and verbose output, we are improving verbose output around IPv6

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 16:45:14 -07:00
Mohan
0741fad479 Btl tcp: BTL_ERROR to show_help & update func behaviour
As part of improving tcp debugging,
we are moving a few BTL_ERROR calls to show_help and also
updating the behaviour of
mca_btl_tcp_endpoint_complete_connect to return
SUCCESS and ERROR cases.

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 16:45:14 -07:00
Mohan
368f9f0dfc Btl tcp: Using magic string to verify mpi connection
As part of improving failure handling
in btl tcp, we are using a magic string to verify the MPI
connection. If the magic string is mismatched or missing,
we can identify that we are trying to
connect with some other process.

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 16:45:13 -07:00
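The magic-string check described in the commit above can be sketched as follows. This is an illustrative Python analogue, not the btl/tcp implementation: the value of `MAGIC` and the function names are assumptions made for the example.

```python
import socket

# Hypothetical magic value; the real string used by the TCP BTL is not shown in this log.
MAGIC = b"OPAL-TCP-BTL"


def send_magic(sock):
    """Connecting side: announce ourselves with the magic string."""
    sock.sendall(MAGIC)


def peer_is_mpi(sock):
    """Accepting side: True only if the first bytes received equal MAGIC."""
    received = b""
    while len(received) < len(MAGIC):
        chunk = sock.recv(len(MAGIC) - len(received))
        if not chunk:  # connection closed before the full string arrived
            return False
        received += chunk
    return received == MAGIC
```

In practice this check would be combined with a receive timeout, since a peer that sends fewer bytes and keeps the connection open would otherwise block the loop.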
Mohan
c30a42917c Btl tcp: Refactoring non-blocking send/receive function
Moving the non-blocking send/receive functions to btl_tcp
will help reuse these functions wherever needed.
In this case we plan to reuse the receive function to
retrieve the magic string to validate that the established connection
is from an MPI process.

Signed-off-by: Mohan Gandhi <mohgan@amazon.com>
2017-08-17 16:45:13 -07:00
Ralph Castain
088b6cdeee Silence coverity warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-17 09:49:35 -07:00
Ralph Castain
41df973359 Add diagnostics for hwloc get_topology
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-16 14:21:27 -07:00
Jeff Squyres
cd8db5313e Merge pull request #4101 from jsquyres/pr/usnic-restore-configure-summary-line
btl/usnic: restore configure usNIC summary line
2017-08-16 16:36:19 -04:00
Jeff Squyres
a591159fb4 btl/usnic: restore configure usNIC summary line
Not sure how/when this got deleted, but put back the "Cisco usNIC"
line in the transport summary at the end of configure.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-08-16 12:37:59 -07:00
Ralph Castain
c4d5dbfcdc Change test per recommendation of @jsquyres
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-16 11:19:15 -07:00
Jeff Squyres
ce3a032b5e rcash_base_frame: fix compiler warning
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-08-16 09:48:31 -07:00
Ralph Castain
eb69df02ae Update to PMIx v2.1.0rc1
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 19:59:15 -07:00
Ralph Castain
65fb6070d9 Update tool support by adding MCA params to direct orted's to drop
session and/or system-level tool rendezvous files. Ensure PMIx is
enabled for tools

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 17:49:47 -07:00
Ralph Castain
98f36711e3 Update hwloc to latest shmem branch. Correct typos in update-my-copyright.pl.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 13:32:12 -07:00
Ralph Castain
033a0eb373 Fix the --disable-dlopen --with-devel-headers case by not having libpmix link back to libopen-pal as the latter won't exist in time during this build case
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-15 10:51:35 -07:00
Ralph Castain
daf548b328 Apply patch from @bgoglin
Fixes #4027

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-11 07:16:14 -07:00
Ralph Castain
4290247d64 Update to latest PMIx v2.1.0a
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-10 18:48:07 -07:00
Howard Pritchard
6dfb48d866 Merge pull request #4056 from hppritcha/topic/swat_issue_4020
mca/registry: fix problem group_component_register
2017-08-09 10:25:00 -06:00
Jeff Squyres
6889948475 Merge pull request #4058 from thananon/pr/usnic_fix_credit
btl/usnic: assign the number of send credit correctly.
2017-08-09 11:46:42 -04:00
Howard Pritchard
55774d1390 mca/registry: fix problem group_component_register
Turns out that supplying NULL to group_register in the
mca_base_var_group_component_register is not a good
idea if one wants ompi_info to work as intended.

The ugni and vader BTLs both call this before
registering component variables.  This breaks
ompi_info, since NULL is supplied as the project
name.  So, now supply the project name rather than
just NULL to group register.

Fixes #4020.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-08-08 19:50:27 -06:00
Thananon Patinyasakdikul
68658e4bab btl/usnic: assign the number of send credit correctly.
usnic endpoints were always created with the default send credit value of 8. This
commit assigns the correct number from the hardware instead.

Signed-off-by: Thananon Patinyasakdikul <apatinya@cisco.com>
2017-08-08 17:01:16 -07:00
Ralph Castain
53c9270af7 Silence coverity warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-08 06:10:14 -07:00
Nathan Hjelm
b870d150dd rcache/base: remove erroneous comment
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-07 15:17:12 -06:00
Nathan Hjelm
76320a8ba5 opal: rename opal_atomic_init to opal_atomic_lock_init
This function is used to initialize an opal atomic lock. The old name
was confusing.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-07 14:15:11 -06:00
Ralph Castain
9921237f99 Merge pull request #4012 from rhc54/topic/p3
Cover the use-cases for OPAL_PREFIX and PMIX_INSTALL_PREFIX options
2017-08-07 11:42:53 -07:00
Ralph Castain
9499acc56a Merge pull request #4043 from rhc54/topic/libpmix
Fix libpmix linking
2017-08-07 11:28:15 -07:00
Ralph Castain
d593e5a4ce When we specify --with-devel-headers, we also emit a copy of libpmix. However, that library was built against the OPAL libevent component, which means all the libevent functions are prefixed with OPAL names. So ensure that the emitted libpmix is linked back against libopen-pal so those symbols will be resolved.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-07 09:36:16 -07:00
Nathan Hjelm
813762334e memory/patcher: hook madvise
It is not possible to use the patcher based memory hooks without
hooking madvise (MADV_DONTNEED). This commit updates the patcher
memory hooks to always hook madvise. This should be safe with recent
rcache updates.

References #3685. Close when merged into v2.0.x, v2.x, and v3.0.x.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-07 10:29:45 -06:00
Nathan Hjelm
b6bf3f4d95 rcache/base: reduce probability of deadlock when hooking madvise
The current VMA cache implementation backing rcache/grdma can run into
a deadlock situation in multi-threaded code when madvise is hooked and
the c library uses locks. In this case we may run into the following
situation:

Thread 1:

    ...
    free ()           <- Holding libc lock
    madvise_hook ()
    vma_iteration ()  <- Blocked waiting for vma lock

Thread 2:
    ...
    vma_insert ()     <- Holding vma lock
    vma_item_new ()
    malloc ()         <- Blocked waiting for libc lock

To fix this problem we previously chose to remove the madvise () hook, but that
fix is causing issue #3685. This commit aims to greatly reduce the
chance that the deadlock will be hit by putting vma items into a free
list. This moves the allocation outside the vma lock. In general there
are a relatively small number of vma items so the default is to
allocate 2048 vma items. This default is configurable, but it is more likely
that the number is too large than too small.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-08-07 10:29:45 -06:00
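The free-list idea from the commit above can be sketched in Python. This is only an analogue of the lock ordering: the class, its fields, and the two locks are hypothetical, and Python has no libc allocator lock to deadlock against. What it shows is that the item is obtained before the vma lock is taken, so no allocation happens while that lock is held.

```python
import threading
from collections import deque


class VmaCache:
    """Illustrative cache that pre-allocates items into a free list."""

    def __init__(self, prealloc=2048):
        self._vma_lock = threading.Lock()   # protects the item collection
        self._free_lock = threading.Lock()  # protects the free list only
        self._free_items = deque(
            {"base": None, "bound": None} for _ in range(prealloc)
        )
        self._items = []

    def _item_from_free_list(self):
        with self._free_lock:
            if self._free_items:
                return self._free_items.popleft()
        # Free list exhausted: allocate, still outside the vma lock.
        return {"base": None, "bound": None}

    def insert(self, base, bound):
        item = self._item_from_free_list()  # obtained BEFORE the vma lock
        item["base"], item["bound"] = base, bound
        with self._vma_lock:                # no allocation inside this lock
            self._items.append(item)

    def free_count(self):
        with self._free_lock:
            return len(self._free_items)

    def count(self):
        with self._vma_lock:
            return len(self._items)
```

As the commit message notes, this reduces the probability of the deadlock rather than eliminating it: once the free list is exhausted, an allocation is still needed.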
Ralph Castain
d1b7c3d8d5 Silence some compile-time warnings. Update scripts now that AUTHORS is gone
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-04 20:08:31 -07:00
Ralph Castain
a239b4c3c3 Per discussion on the PMIx side, do a better job of detecting mismatches between location directives for OPAL and PMIx. Provide a more helpful error message and error out if we find a mismatch. If any OPAL values are set and the PMIx equivalent is not, then transfer it.
Do not clear PMIX_INSTALL_PREFIX from the daemon's launch environment

Fixes #3980
Closes #4007
Refs #3985

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-04 19:36:00 -07:00
Joshua Hursey
196b314643 btl/sm: Missing argv header
Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-08-04 21:10:49 -04:00
Ralph Castain
f128b4c546 Fix incorrect usage of '==' in test comparisons
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-03 21:21:26 -07:00
Howard Pritchard
5ce07a6983 Merge pull request #3997 from hppritcha/topic/swat_compiler_warning
btl/ugni: swat compiler warning
2017-08-02 15:44:09 -06:00
Artem Polyakov
500c8be888 pmix: fix PMIx envar name for the installation prefix.
Signed-off-by: Artem Polyakov <artpol84@gmail.com>
2017-08-02 08:03:36 +03:00
Ralph Castain
f39ce67982 Merge pull request #3951 from rhc54/topic/hwloc2
Update to hwloc 2.0.0a
2017-08-01 15:18:31 -06:00
Ralph Castain
69612b3e2a Merge pull request #3990 from rhc54/topic/p2
Move handling of OPAL_PREFIX to PMIX_PREFIX down into embedded PMIx integration code
2017-08-01 15:13:59 -06:00
Brian Barrett
c4ae36f971 Merge pull request #3869 from Zzzoom/find_freq_bogomips
opal: Get x86 TSC frequency from bogomips
2017-08-01 13:23:21 -07:00
Howard Pritchard
12a5aacdfd btl/ugni: swat compiler warning
Signed-off-by: Howard Pritchard <hppritcha@gmail.com>
2017-08-01 12:21:57 -06:00
Ralph Castain
8f34fa4a56 Move the detection of OPAL_PREFIX and subsequent posting of PMIX_PREFIX to the internal integration code for PMIx so we only do this when running with the embedded PMIx
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-08-01 08:24:27 -06:00
Sylvain Jeaugey
eee494fc8a common/cuda: Fix near-hang when remote side has exited
Ignore errors caused by remote side having exited when closing CUDA IPC mappings.
openmpi/ompi#3244

Signed-off-by: Sylvain Jeaugey <sjeaugey@nvidia.com>
2017-07-31 10:34:45 -07:00
Boris Karasev
e20b581529 pmix: fixed immediate request
This commit fixes a hang when using external PMIx v1 module

Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2017-07-28 15:53:48 +06:00
Gilles Gouaillardet
825116044e hwloc/base: fix info message for opal_hwloc_base_binding_policy
if np > 2, the default binding is now "numa"

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-28 11:17:15 +09:00
Jeff Squyres
d954167ecf Merge pull request #3881 from bharatpotnuri/master
master: btl/openib: Handle EOPNOTSUPP
2017-07-26 11:32:40 -04:00
Ralph Castain
6ebaed8c01 Restore support for user-provided cpulist
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-25 23:51:21 -07:00
Ralph Castain
7a83fdb9bb Update to hwloc 2.0.0a with shmem support.
Update to support passing of HWLOC shmem topology to client procs
Update use of distance API per @bgoglin
Have the openib component lookup its object in the distance matrix
Bring usnic up-to-date
Restore binding for hwloc2

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-25 20:26:22 -07:00
Ralph Castain
6fe5b36b50 Merge pull request #3963 from rhc54/topic/hwfix
Restore binding support
2017-07-25 22:09:04 -05:00
Ralph Castain
96f07aebfa Restore binding support
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-25 18:44:44 -07:00
Ralph Castain
0042c758f1 Update the tools support so it allows tools to access PMIx
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-25 17:10:08 -07:00
Ralph Castain
058e802b11 Add missing export directives
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-25 07:19:08 -07:00
George Bosilca
1ea8fab095 Make external symbols visible.
All symbols that need to be accessed from a MCA component must be marked
explicitly as visible using PMIX_EXPORT. This patch allows current trunk
to almost work on OS X. More on the devel mailing list.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-07-25 01:14:22 -04:00
Ralph Castain
af85e48dd7 Silence Coverity warning, silence pmix_error_log of success
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-21 15:33:16 -07:00
Ralph Castain
492f98f8a5 Update to latest PMIx v2.1.0a
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-21 12:58:09 -07:00
Ralph Castain
f7e8780a42 Remove fortran support from platform file
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-20 21:02:30 -07:00
Ralph Castain
b225366012 Bring the ofi/rml component online by completing the wireup protocol for the daemons. Cleanup the current confusion over how connection info gets created and passed to make it all flow thru the opal/pmix "put/get" operations. Update the PMIx code to latest master to pickup some required behaviors.

Remove the no-longer-required get_contact_info and set_contact_info from the RML layer.

Add an MCA param to allow the ofi/rml component to route messages if desired. This is mainly for experimentation at this point as we aren't sure if routing will be beneficial at large scales. Leave it "off" by default.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-20 21:01:57 -07:00
Ralph Castain
0e4e3af1db Remove problem installation of hwloc 2.0
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-20 18:18:08 -07:00
Ralph Castain
7d8d877837 Remove build product and update .gitignore to avoid picking it up again
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-20 11:49:48 -07:00
Ralph Castain
8c30958879 Update to PMIx v2.1.0alpha
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-20 11:12:06 -07:00
Gilles Gouaillardet
593e4ce63f hwloc: add hwloc2x
internal hwloc 2x is used with --with-hwloc=future

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-20 17:39:51 +09:00
Gilles Gouaillardet
60aa9cfcb6 hwloc: add support for hwloc v2 API
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-20 17:39:44 +09:00
Gilles Gouaillardet
9f29f3bff4 hwloc: since WHOLE_SYSTEM is no longer used, remove useless
checks related to offline and disallowed elements

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-20 17:39:21 +09:00
Gilles Gouaillardet
1a34224948 hwloc: do not set the HWLOC_TOPOLOGY_FLAG_WHOLE_SYSTEM flag
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-20 17:39:16 +09:00
Ralph Castain
fca68b070b Merge pull request #3934 from rhc54/topic/singleton
Fix the isolated pmix component. Cleanup the ess/singleton component …
2017-07-19 16:02:37 -05:00
Ralph Castain
543c16b28d Fix the isolated pmix component. Cleanup the ess/singleton component - we shouldn't be automatically discovering the local topology as that is now done on-demand.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-19 12:14:29 -07:00
Howard Pritchard
2fa0c4c6ec pmix/s1: fix problems with ref counting in s1
s1 pmix component wasn't doing proper ref counting

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-07-18 15:59:28 -06:00
Josh Hursey
8688219091 Merge pull request #3775 from jjhursey/fix/mca_base_verbose-file
opal/mca: Fix mca_base_verbose file suffix processing
2017-07-18 10:14:42 -05:00
Howard Pritchard
771f51af12 Merge pull request #3917 from hppritcha/topic/remove_cr_config_master
configure: remove CR/FT related options
2017-07-17 16:12:07 -06:00
Nathan Hjelm
2060fcf8bb mca/base: use the project name when registering pvars
References #3918. Close when applied to v2.0.x, v2.x, and v3.0.x.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-07-17 15:57:50 -05:00
Howard Pritchard
45e2771162 configure: remove CR/FT related options
As part of the process for addressing removal of CR/FT related
code from master (and hence from the 3.0.0 release), it was agreed
at the OMPI devel F2F on 7/13/17 that we'd break this in to two
pieces:

1) remove the configure arguments (fewer changes)
2) remove all the CR/FT code, etc. in a subsequent bigger commit
    that may not make it in to 3.0.0 in time.

By doing 1), the available configure options would not change
in a subsequent 3.0.x release if we end up not being able to do 2)
before 3.0.0 is released.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-07-17 13:48:59 -06:00
Nathan Hjelm
e5343c16c0 btl/vader: remove debug code that should not be in a release
References #3902. Close when in master, v3.0.x, and v2.x.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-07-17 11:58:47 -05:00
Gilles Gouaillardet
6e35cfc19a btl/sm: fix misc memory leak
as reported by Coverity with CID 1415105

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-16 13:02:55 +09:00
Jeff Squyres
5cf64e6555 btl/sm: effectively delete the SM BTL
If a user explicitly asks for the "sm" BTL, print a show_help message
saying that the SM BTL is dead, and the user should be using "vader".

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-07-15 09:33:08 -07:00
Artem Polyakov
0929c32cd8 Merge pull request #3893 from karasevb/yoda_spml_remove
Remove Yoda SPML
2017-07-15 08:47:31 -07:00
Gilles Gouaillardet
9124afbeae pmix: do not invoke PMIX_INFO_CREATE() with a zero size
Thanks Lisandro Dalcin for the report

Fixes open-mpi/ompi#3854

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-14 15:00:05 +09:00
Boris Karasev
77c50efb95 Yoda SPML is removed
Signed-off-by: Boris Karasev <karasev.b@gmail.com>
2017-07-14 08:47:16 +03:00
Artem Polyakov
4d3e22e815 Merge pull request #3870 from hppritcha/topic/repair_s2_launch
pmix/s2: fix srun native launch for pmi2
2017-07-13 12:45:22 -05:00
Potnuri Bharat Teja
9154ade8b1 btl/openib: Handle EOPNOTSUPP
Updated openib BTL to handle EOPNOTSUPP as per
https://www.open-mpi.org/community/lists/devel/2016/04/18839.php

Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
2017-07-13 21:05:32 +05:30
Howard Pritchard
eeb91bc82b pmix/s2: fix srun native launch for pmi2
Recent changes that broke native launch on Cray
using srun or aprun also broke native launch
using pmi2.

This commit fixes this problem.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-07-12 17:45:52 -06:00
Jeff Squyres
ccf17808b6 Merge pull request #3258 from markalle/pr/symbol_name_pollution
symbol name pollution
2017-07-12 16:19:25 -05:00
Carlos Bederián
b5883a358b Get x86 TSC frequency from bogomips
Signed-off-by: Carlos Bederián <bc@famaf.unc.edu.ar>
2017-07-12 17:31:25 -03:00
Gilles Gouaillardet
32606ad476 btl/tcp: fix heterogeneous support for put / large messages
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-07-12 10:27:45 +09:00
Nathan Hjelm
c18007d095 btl/vader: work around ob1 pending fragment bug
This commit ensures that the pml callback is always made when
sending fragments. This is needed to avoid #3845. Once that is
fixed the #if 0'd code can be restored.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-07-11 15:59:56 -06:00
Howard Pritchard
550e8c4afe Merge pull request #3842 from hppritcha/topic/fix_cray_pmix_problem
pmix/cray: add a bit of debug output
2017-07-11 08:29:56 -06:00
Howard Pritchard
26a8142c97 pmix/cray: add a bit of debug output
add a bit of debug output to help with pmix finalize issues

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-07-11 05:45:49 -05:00
Mark Allen
552216f9ba scripted symbol name change (ompi_ prefix)
Passed the below set of symbols into a script that added ompi_ to them all.

Note that if processing a symbol named "foo" the script turns
    foo  into  ompi_foo
but doesn't turn
    foobar  into  ompi_foobar

But beyond that the script is blind to C syntax, so it hits strings and
comments etc as well as vars/functions.

    coll_base_comm_get_reqs
    comm_allgather_pml
    comm_allreduce_pml
    comm_bcast_pml
    fcoll_base_coll_allgather_array
    fcoll_base_coll_allgatherv_array
    fcoll_base_coll_bcast_array
    fcoll_base_coll_gather_array
    fcoll_base_coll_gatherv_array
    fcoll_base_coll_scatterv_array
    fcoll_base_sort_iovec
    mpit_big_lock
    mpit_init_count
    mpit_lock
    mpit_unlock
    netpatterns_base_err
    netpatterns_base_verbose
    netpatterns_cleanup_narray_knomial_tree
    netpatterns_cleanup_recursive_doubling_tree_node
    netpatterns_cleanup_recursive_knomial_allgather_tree_node
    netpatterns_cleanup_recursive_knomial_tree_node
    netpatterns_init
    netpatterns_register_mca_params
    netpatterns_setup_multinomial_tree
    netpatterns_setup_narray_knomial_tree
    netpatterns_setup_narray_tree
    netpatterns_setup_narray_tree_contigous_ranks
    netpatterns_setup_recursive_doubling_n_tree_node
    netpatterns_setup_recursive_doubling_tree_node
    netpatterns_setup_recursive_knomial_allgather_tree_node
    netpatterns_setup_recursive_knomial_tree_node
    pml_v_output_close
    pml_v_output_open
    intercept_extra_state_t
    odls_base_default_wait_local_proc
    _event_debug_mode_on
    _evthread_cond_fns
    _evthread_id_fn
    _evthread_lock_debugging_enabled
    _evthread_lock_fns
    cmd_line_option_t
    cmd_line_param_t
    crs_base_self_checkpoint_fn
    crs_base_self_continue_fn
    crs_base_self_restart_fn
    event_enable_debug_output
    event_global_current_base_
    event_module_include
    eventops
    sync_wait_mt
    trigger_user_inc_callback
    var_type_names
    var_type_sizes

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-07-11 02:13:23 -04:00
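The script used for the rename above is not part of this log. A minimal Python sketch of the behaviour the commit message describes might look like this; `add_prefix` is a hypothetical name, and whole-word matching via `\b` is an assumption about how "foo" is distinguished from "foobar".

```python
import re


def add_prefix(text, symbols, prefix="ompi_"):
    """Prefix every whole-word occurrence of the given symbols.

    Like the script described above, this is blind to C syntax:
    occurrences inside strings and comments are renamed too.
    """
    if not symbols:
        return text
    # Longest names first so a shorter symbol never shadows a longer one.
    ordered = sorted(symbols, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(s) for s in ordered) + r")\b")
    return pattern.sub(lambda m: prefix + m.group(1), text)
```

Because `_` counts as a word character, an already prefixed `ompi_foo` is not matched again, so running the sketch twice leaves the text unchanged.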
Mark Allen
efc25168cd symbol name pollution: making some vars static
As part of addressing symbol name pollution, I'm switching a few
vars/functions to static.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-07-11 02:13:22 -04:00
Ralph Castain
a190b4b89f Prefix the MB macro in one more place
Fixes #3830

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-07 06:07:47 -07:00
Ralph Castain
2a580fa71e Merge pull request #3801 from rhc54/topic/hetero
Detect that we have a mix of BE/LE in the system
2017-07-06 15:29:06 -07:00
Ralph Castain
ed43492867 Not really necessary, but technically correct
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-06 06:00:03 -07:00
Ralph Castain
31130a4bee Replace syntax with something less strictly C99
Fixes #3809

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-05 16:54:36 -07:00
Ralph Castain
2753f53e6d Detect that we have a mix of BE/LE in the system, provide a warning that OMPI doesn't currently support this environment, and error out
Fixes #2817

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-07-03 15:47:05 -07:00
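The mixed-endianness check from the commit above can be illustrated with a short Python sketch. How Open MPI actually collects each node's byte order is not shown here; the function name and the list-of-strings input are assumptions for the example.

```python
import sys


def check_homogeneous(byte_orders):
    """Return the common byte order, or raise if nodes disagree.

    byte_orders: iterable of "little" / "big", one entry per node
    (hypothetical input format).
    """
    kinds = set(byte_orders)
    if len(kinds) > 1:
        raise RuntimeError(
            "mixed big-endian/little-endian systems are not supported"
        )
    # With no remote reports, fall back to the local byte order.
    return kinds.pop() if kinds else sys.byteorder
```

The sketch errors out on a mix rather than attempting conversion, matching the warn-and-abort behaviour the commit describes.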
Howard Pritchard
1f2f3db553 pmix/cray: fix handling of multiple finis
The fini code for cray pmix wasn't correct.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-07-03 14:30:34 -05:00
Ralph Castain
9178219e6b Deregister event handlers only on final call to finalize. Ensure we pass PMIx mca params
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-28 15:00:43 -07:00
Ralph Castain
d619de4f4c Fix a threadlock when notifying clients of failures
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-28 08:58:41 -07:00
Joshua Hursey
3b780ac137 opal/mca: Fix mca_base_verbose file suffix processing
* `-mca mca_base_verbose file:foo` should create an output file with
    the suffix `foo`. But since we free the pointer at the end of this
    function then by the time we use it it is pointing to invalid memory.
 * This commit fixes that corruption
 * This commit also fixes the behavior of `file:` with no suffix.
   Makes it the same as `file` without the colon.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-06-27 16:52:56 -05:00
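The suffix handling described in the commit above can be sketched in Python. The function below is hypothetical and only mirrors the stated behaviour: `file:foo` yields the suffix `foo`, while `file:` and `file` are treated the same. Returning a separate string is the analogue of copying the suffix before the parsed buffer is freed.

```python
def file_suffix(value):
    """Return the suffix of a "file[:suffix]" verbose target, or None.

    Hypothetical helper: raises ValueError for anything that is not
    a file target.
    """
    kind, _, suffix = value.partition(":")
    if kind != "file":
        raise ValueError("not a file target: %r" % value)
    # An empty suffix ("file:") behaves exactly like plain "file".
    return suffix if suffix else None
```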
Ralph Castain
e6c2a8d346 Track PMIx v2.0.1
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-26 09:34:57 -07:00
bosilca
d55b666834 Topic/monitoring (#3109)
Add a monitoring PML, OSC and IO. They track all data exchanges between processes,
with the capability to include or exclude collective traffic. The monitoring infrastructure is
driven using MPI_T, and can be turned off and on at any time on any communicators/files/windows.
Documentation and examples have been added, as well as a shared library that can be
used with LD_PRELOAD and that allows the monitoring of any application.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>


* Add the ability to query pml monitoring results with the MPI Tools interface
using performance variables "pml_monitoring_messages_count" and
"pml_monitoring_messages_size"

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>

* Fix a conversion problem and add a comment about the lack of component
retain in the new component infrastructure.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>

* Allow the pvar to be written by invoking the associated callback.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>

* Various fixes for the monitoring.
Allocate all counting arrays in a single allocation
Don't delay the initialization (do it at the first add_proc as we
know the number of processes in MPI_COMM_WORLD)

Add a choice: with or without MPI_T (default).

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>

* Cleanup for the monitoring module.
Fixed a few bugs, and reshaped the operations to prepare for
global or communicator-based monitoring. Start integrating
support for MPI_T as well as MCA monitoring.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>

* Adding documentation about how to use pml_monitoring component.

The document presents the use with and without MPI_T.
It may not reflect exactly how it works right now, but should reflect
how it should work in the end.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Change rank into MPI_COMM_WORLD and size(MPI_COMM_WORLD) to global variables in pml_monitoring.c.
Change mca_pml_monitoring_flush() signature so we don't need the size and rank parameters.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>

* Improve monitoring support (including integration with MPI_T)

Use mca_pml_monitoring_enable to check status state. Set mca_pml_monitoring_current_filename iff parameter is set
Allow 3 modes for pml_monitoring_enable_output: - 1 : stdout; - 2 : stderr; - 3 : filename
Fix test : 1 for differentiated messages, >1 for not differentiated. Fix output.
Add documentation for pml_monitoring_enable_output parameter. Remove useless parameter in example
Set filename only if using mpi tools
Adding missing parameters for fprintf in monitoring_flush (for output in std's cases)
Fix expected output/results for example header
Fix example when using MPI_Tools : a null-pointer can't be passed directly. It needs to be a pointer to a null-pointer
Base whether to output or not on message count, in order to print something if only empty messages are exchanged
Add a new example on how to access performance variables from within the code
Allocate arrays regarding value returned by binding

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add overhead benchmark, with script to use data and create graphs out of the results
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix segfault error at end when not loading pml
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Start create common monitoring module. Factorise version numbering
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix microbenchmarks script
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Improve readability of code

NULL can't be passed as a PVAR parameter value. It must be a pointer to NULL or an empty string.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add osc monitoring component

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add error checking if running out of memory in osc_monitoring

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Resolve brutal segfault when double freeing filename
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Moving to ompi/mca/common the proper parts of the monitoring system
Using common functions instead of pml specific one. Removing pml ones.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add calls to record monitored data from osc. Use common function to translate ranks.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix test_overhead benchmark script distribution

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix linking library with mca/common

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add passive operations in monitoring_test

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix from rank calculation. Add more detailed error messages

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix alignments. Fix common_monitoring_get_world_rank function. Remove useless trailing new lines

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix osc_monitoring mget_message_count function call

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Change common_monitoring function names to respect the naming convention. Move to common_finalize the common parts of finalization. Add some comments.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add monitoring common output system

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add error message when trying to flush to a file, and open fails. Remove erroneous info message when flushing whereas the monitoring is already disabled.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Consistent output file name (with and without MPI_T).

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Always output to a file when flushing at pvar_stop(flush).

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Update the monitoring documentation.
Complete information from HowTo. Fix a few mistakes and typos.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Use the world_rank for printf's.
Fix name generation for output files when using MPI_T. Minor changes in benchmarks starting script

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Clean potential previous runs, but keep the results at the end in order to potentially reprocess the data. Add comments.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add security check for unique initialization for osc monitoring

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Clean the amount of symbols available outside mca/common/monitoring

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Remove use of __sync_* built-ins. Use opal_atomic_* instead.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Allocate the hashtable on common/monitoring component initialization. Define symbols to set the values for error/warning/info verbose output. Use opal_atomic instead of built-in function in osc/monitoring template initialization.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Deleting now useless file : moved to common/monitoring

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add histogram distribution of message sizes

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add histogram array of 2-based log of message sizes. Use simple call to reset/allocate arrays in common_monitoring.c

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add information to the dump file. Separate monitored data per category (pt2pt/osc/coll (to come))

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add coll component for collectives communications monitoring

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix warning messages : use c_name as the magic id is not always defined. Moreover, there was a % missing. Add call to release underlying modules. Add debug info messages. Add warning which may lead to further analysis.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix log10_2 constant initialization. Fix index calculation for histogram array.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add debug info messages to follow more easily initialization steps.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Group all the var/pvar definitions to common_monitoring. Separate initial filename from the current one, to ease its lifetime management. Add verifications to ensure common is initialized once only. Move state variable management to common_monitoring.
monitoring_filter only indicates if filtering is activated.
Fix out of range access in histogram.
List is not used with the struct mca_monitoring_coll_data_t, so inherit only from opal_object_t.
Remove useless dead code.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix invalid memory allocation. Initialize initial_filename to empty string to avoid invalid read in mca_base_var_register.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Don't install the test scripts.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix missing procs in hashtable. Cache coll monitoring data.
    * Add MCA_PML_BASE_FLAG_REQUIRE_WORLD flag to the PML layer.
    * Cache monitoring data relative to collectives operations on creation.
    * Remove double caching.
    * Use same proc name definition for hash table when inserting and
      when retrieving.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Use intermediate variable to avoid invalid write while retrieving ranks in hashtable.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add missing release of the last element in flush_all. Add release of the hashtable in finalize.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Use a linked list instead of a hashtable to keep track of communicator data. Add release of the structure at finalize time.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Set world_rank from hashtable only if found

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Use predefined symbol from opal system to print int

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Move collective monitoring data to a hashtable. Add pvar to access the monitoring_coll_data. Move functions header to a private file only to be used in ompi/mca/common/monitoring

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix pvar registration. Use OMPI_ERROR instead of -1 as returned error value. Fix releasing of coll_data_t objects. Assign the value only if data is found in the hashtable.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add automated check (with MPI_Tools) of monitoring.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix procs list caching in common_monitoring_coll_data_t

    * Fix monitoring_coll_data type definition.
    * Use size(COMM_WORLD)-1 to determine max number of digits.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add linking to Fortran applications for LD_PRELOAD usage of monitoring_prof

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add PVAR's handles. Clean up code (visibility, add comments...). Start updating the documentation

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix coll operations monitoring. Update check_monitoring accordingly to the added pvar. Fix monitoring array allocation.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Documentation update.
Update and then move the latex and README documentation to a more logical place

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Aggregate monitoring COLL data to the generated matrix. Update documentation accordingly.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix monitoring_prof (bad variable.vector used, and wrong array in PMPI_Gather).

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add reduce_scatter and reduce_scatter_block monitoring. Reduce memory footprint of monitoring_prof. Unify OSC related outputs.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add the use of a machine file for overhead benchmark

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Check for out-of-bound write in histogram

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Fix common_monitoring_cache object init for MPI_COMM_WORLD

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add RDMA benchmarks to test_overhead
Add error file output. Add MPI_Put and MPI_Get results analysis. Add overhead computation for complete sending (pingpong / 2).

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add computation of average and median of overheads. Add comments and copyrights to the test_overhead script

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add technical documentation

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Adapt to the new definition of communicators

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Update expected output in test/monitoring/monitoring_test.c

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add dumping histogram in edge case

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Adding a reduce(pml_monitoring_messages_count, MPI_MAX) example

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add consistency in header inclusion.
Include ompi/mpi/fortran/mpif-h/bindings.h only if needed.
Add sanity check before emptying hashtable.
Fix typos in documentation.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* misc monitoring fixes

* test/monitoring: fix test when weak symbols are not available
* monitoring: fix a typo and add a missing file in Makefile.am
and have monitoring_common.h and monitoring_common_coll.h included in the distro
* test/monitoring: cleanup all tests and make distclean a happy panda
* test/monitoring: use gettimeofday() if clock_gettime() is unavailable
* monitoring: silence misc warnings (#3)

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>

* Cleanups.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>

* Changing int64_t to size_t.
Keep the size_t used across all monitoring components.
Adapt the documentation.
Remove useless MPI_Request and MPI_Status from monitoring_test.c.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add parameter for RMA test case

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Clean the maximum bound computation for proc list dump.
Use ptrdiff_t instead of OPAL_PTRDIFF_TYPE to reflect the changes from commit fa5cd0dbe5.

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add communicator-specific monitored collective data reset

Signed-off-by: Clement Foyer <clement.foyer@inria.fr>

* Add monitoring scripts to the 'make dist'
Also install them in the build and the install directories.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
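
The base-2 logarithmic histogram of message sizes introduced in this commit, together with the later out-of-range fixes, can be illustrated with a small Python model. This is only a sketch: the bucket count `MAX_SIZE_HISTOGRAM`, the treatment of zero-byte messages, and the function names are assumptions made for illustration and are not taken from the C code in common/monitoring.

```python
# Hypothetical model of a log2 message-size histogram; not Open MPI code.
MAX_SIZE_HISTOGRAM = 66  # assumed bucket count, chosen for illustration


def bucket_index(size: int) -> int:
    """Return the histogram bucket for a message of `size` bytes.

    Bucket 0 holds empty messages; bucket k (k >= 1) holds sizes in
    [2**(k-1), 2**k). The result is clamped so that it can never index
    past the end of the histogram array.
    """
    if size < 0:
        raise ValueError("message size must be non-negative")
    if size == 0:
        return 0
    index = size.bit_length()  # equals floor(log2(size)) + 1
    return min(index, MAX_SIZE_HISTOGRAM - 1)


def record(histogram: list, size: int) -> None:
    """Count one message of `size` bytes."""
    histogram[bucket_index(size)] += 1


histogram = [0] * MAX_SIZE_HISTOGRAM
for message_size in (0, 1, 3, 4, 1024, 1 << 70):
    record(histogram, message_size)
```

Using integer bit length avoids floating-point logarithms entirely, and the clamp guards against the kind of out-of-bound write that several of the fixes above address.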
2017-06-26 18:21:39 +02:00
Ralph Castain
79fd359848 Merge pull request #3713 from rhc54/topic/ofi
Enable use of OFI fabrics for launch and other collective operations.…
2017-06-25 11:47:40 -07:00
Ralph Castain
ed85512a7c Update to track PMIx v2.0.1
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-25 07:29:32 -07:00
Ralph Castain
ef56c7d47a Correctly transfer size_t data fields
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-24 20:11:54 -07:00
Ralph Castain
f4411c4393 Enable use of OFI fabrics for launch and other collective operations. Update the PMIx repo to the latest master to get the required support for the server to "push" modex info, and to retrieve all its own "modex" values for sending back to mpirun. Have mpirun cache them in its local modex hash as OFI goes point-to-point direct and doesn't route - so the remote daemons don't need a copy of this connection info.
Remove the opal_ignore from the RML/OFI component, but disable that component unless the user specifically requests it via the "rml_ofi_desired=1" MCA param. This will let us test compile in various environments without interfering with operations while we continue to debug

Fix an error when computing the number of infos during server init

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-23 19:57:21 -07:00
Ralph Castain
8263efff65 Fix uninitialized variables
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-23 11:12:26 -07:00
George Bosilca
bd5650d680
Fix the TCP performance impact when not used
Based on an idea from Brian, move the libevent trigger update to a later
stage instead of the generic add/del procs. So, we are doing the
increment/decrement when we register the recv handler for an endpoint,
so basically when we create and connect a socket to a peer. The benefit
is that as long as TCP is not used, there should be no impact on the
performance of other BTLs. The drawback is that the first TCP connection
will be slightly slower, but then once we have a peer connected over
TCP things go back to normal.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-06-23 11:15:45 +02:00
Ralph Castain
6ec2ad5288 Fix the pmix_query API when it asks for something that returns an array of pmix_info_t. Protect the PMIX_INFO_FREE macro from NULL arrays. Update the mpi_memprobe scaling test
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-22 20:11:36 -07:00
Nathan Hjelm
4252258338 Merge pull request #3721 from hjelmn/list_cleanup
opal: use opal_list_t convenience macros
2017-06-22 09:12:23 -06:00
Ralph Castain
3e78f84093 Silence Coverity warnings
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-21 13:19:51 -07:00
Ralph Castain
cba127bc43 Update the ext2x component to match the internal one
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-20 11:42:14 -07:00
Nathan Hjelm
ffd8ee2dfd opal: use opal_list_t convenience macros
This commit cleans up code in opal to use OPAL_LIST_FOREACH(_SAFE),
OPAL_LIST_DESTRUCT, and OPAL_LIST_RELEASE.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-06-20 12:37:12 -06:00
Ralph Castain
952726c121 Update to latest PMIx master - equivalent to 2.0rc2. Update the thread support in the opal/pmix framework to protect the framework-level structures.
This now passes the loop test, and so we believe it resolves the random hangs in finalize.

Changes in PMIx master that are included here:

* Fixed a bug in the PMIx_Get logic
* Fixed self-notification procedure
* Made pmix_output functions thread safe
* Fixed a number of thread safety issues
* Updated configury to use 'uname -n' when hostname is unavailable

Work on cleaning up the event handler thread safety problem
Rarely used functions, but protect them anyway
Fix the last part of the intercomm problem
Ensure we don't cover any PMIx calls with the framework-level lock.
Protect against NULL argv comm_spawn

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-20 09:02:15 -07:00
Ralph Castain
8f09929469 Fix rank-file mapper launch by correctly setting up the remote map from the provided data
Put a simple protection for the case where procs fail while we are trying to deregister handlers

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-15 08:33:29 -07:00
KAWASHIMA Takahiro
b5b6b22848 Merge pull request #3678 from kawashima-fj/pr/signal-abort-delay
Apply `opal_abort_delay` to the OPAL signal handler
2017-06-12 10:35:11 +09:00
Ralph Castain
548cd24e4e Forward-port changes proposed for v3.0 to master from PR #3677
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-09 07:51:21 -07:00
KAWASHIMA Takahiro
362445d486 Use same prefix format for [host:pid]
Hostname and PID are output as a message prefix in many places in
our code. Their printf-formats were either `[%s:%d]` or `[%s:%05d]`.
This commit changes `[%s:%d]` to `[%s:%05d]`. The latter was more
widely used in our code (including OPAL output system and the signal
handler).
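
The effect of the unified format can be seen in a short sketch. Python's `%` operator follows the same printf-style rules for `%s` and `%05d` as C, so it is used here only as a convenient model; the hostname `node01` and the helper name are hypothetical.

```python
# Illustration only: printf-style formatting of the "[host:pid]" prefix.
def message_prefix(hostname: str, pid: int) -> str:
    """Build the prefix with the PID zero-padded to at least five digits."""
    return "[%s:%05d]" % (hostname, pid)


print(message_prefix("node01", 42))      # [node01:00042]
print(message_prefix("node01", 123456))  # [node01:123456]
```

The width in `%05d` is a minimum, so PIDs with more than five digits are printed in full rather than truncated.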

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-06-08 19:35:03 +09:00
Ralph Castain
2d65908184 Correct the external pmix configury
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-07 00:33:29 -07:00
Ralph Castain
bd1793ad17 Get the pmix/ext2x component to work. Fix a minor problem in the libevent external component.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-06 20:06:28 -07:00
Ralph Castain
c3e6dc2022 Update to pmix v2.0.0rc1, including thread safety fixes
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-06 15:16:34 -07:00
Ralph Castain
93cf3c7203 Update OPAL and ORTE for thread safety
(I swear, if I look this over one more time, I'll puke)

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-06 12:30:57 -07:00
Ralph Castain
2f85d10600 Update to PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-06 08:19:25 -07:00
Ralph Castain
8f526968c2 Do not hang if we cannot relay messages. Eliminate extra error log message
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-06-05 06:35:19 -07:00
Ralph Castain
9d6b929894 Fix uninitialized variable. Set exit codes for failed launch so we get pretty error messages
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-05-31 07:38:37 -07:00
Ralph Castain
26d96061aa Roll in latest PMIx updates
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-05-30 21:35:35 -07:00
Ralph Castain
9f1f9d6606 Update to PMIx v2.0.0rc1
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-05-28 10:30:58 -07:00
Ralph Castain
9f60cd0fe7 Update the connect/accept support so we check to see if we have the proper infrastructure and RTE support, including whether we have ompi-server available if the connect/accept spans multiple applications. Print pretty help messages in all cases where we do not have support
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-05-27 10:47:08 -07:00
Nathan Hjelm
33d59886e1 Merge pull request #3587 from hjelmn/event_abstraction
pmix/pmix2x: fix errors in event abstraction
2017-05-26 10:44:18 -06:00
Nathan Hjelm
a512b8962d pmix/pmix2x: fix errors in event abstraction
Parts of the pmix2x component called the event_* functions directly
instead of the opal_event_* wrappers. This is fine as long as we are
using libevent but becomes a problem with other event libraries.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-05-26 09:49:11 -06:00
Ralph Castain
2f721a3366 Merge pull request #3585 from rhc54/topic/pmix20
Update to pmix v2.0beta
2017-05-26 06:05:44 -07:00
Ralph Castain
e1e264711a Update to pmix v2.0beta
Fix atomics - again
Fix initialization of notification ring buffer
Fix wait_sync definitions

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-05-26 03:33:18 -07:00
Ralph Castain
657e701c65 Add debug verbosity to the orte data server and pmix pub/lookup functions
Start updating the various mappers to the new procedure. Remove the stale lama component as it is now very out-of-date. Bring round_robin and PPR online, and modify the mindist component (but cannot test/debug it).

Remove unneeded test

Fix memory corruption by re-initializing variable to NULL in loop

Resolve the race condition identified by @ggouaillardet by resetting the
mapped flag within the same event where it was set. There is no need to
retain the flag beyond that point as it isn't used again.

Add a new job attribute ORTE_JOB_FULLY_DESCRIBED to indicate that all the job information (including locations and binding) is included in the launch message. Thus, the backend daemons do not need to do any map computation for the job. Use this for the seq, rankfile, and mindist mappers until someone decides to update them.

Note that this will maintain functionality, but means that users of those three mappers will see large launch messages and less performant scaling than those using the other mappers.

Have the mindist module add procs to the job's proc array as it is a fully described module

Protect the hnp-not-in-allocation case

Per path suggested by Gilles - protect the HNP node when it gets added in the absence of any other allocation or hostfile

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-05-25 18:41:27 -07:00
Thananon Patinyasakdikul
bf7534d32c btl/usnic: changed fi_ep_bind flags for AV from NULL to 0 due to
compiler warning.

This commit fixed compiler warning generated from earlier commit :
ddbe1726c5

Signed-off-by: Thananon Patinyasakdikul <apatinya@cisco.com>
2017-05-22 10:09:43 -07:00
Geoff Paulsen
50f9287c03 Merge pull request #2941 from markalle/pr/mpi-info-update2
Finally Merging this in.  MPI_*_get_info/set_info().
Targeting v3.1 release.  @hjelmn were you interested in switching some internal pieces to begin using this?  Should we target v3.1 (or whatever we call the Oct 15th release?)
2017-05-22 09:22:04 -05:00
Mark Allen
482d84b6e5 fixes for Dave's get/set info code
The expected sequence of events for processing info during object creation
is that if there's an incoming info arg, it is opal_info_dup()ed into the obj
at obj->s_info first. Then interested components register callbacks for
keys they want to know about using opal_infosubscribe_infosubscribe().

Inside info_subscribe_subscribe() the specified callback() is called with
whatever matching k/v is in the object's info, or with the default. The
return string from the callback goes into the new k/v stored in info, and
the input k/v is saved as __IN_<key>/<val>. It's saved the same way
whether the input came from info or whether it was a default. A null return
from the callback indicates an ignored key/val, and no k/v is stored for
it, but an __IN_<key>/<val> is still kept so we still have access to the
original.

At MPI_*_set_info() time, opal_infosubscribe_change_info() is used. That
function calls the registered callbacks for each item in the provided info.
If the callback returns non-null, the info is updated with that k/v, or if
the callback returns null, that key is deleted from info. An __IN_<key>/<val>
is saved either way, and overwrites any previously saved value.

When MPI_*_get_info() is called, opal_info_dup_mpistandard() is used, which
allows relatively easy changes in interpretation of the standard, by looking
at both the <key>/<val> and __IN_<key>/<val> in info. Right now it does
  1. includes system extras, e.g. k/v defaults not explicitly set by the user
  2. omits ignored keys
  3. shows input values, not callback modifications, e.g. not the internal values
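
The key/value flow described above can be modeled in a few lines of Python. This is a sketch under stated assumptions, not the C implementation: the class and method names only echo the prose, the callback behaviors are invented for the example, and skipping keys that have no registered callback in `change_info` is an assumption, since the text does not say what happens to them.

```python
# Hypothetical Python model of the info-subscriber key/value flow.
# Names echo the prose above; they are not the real C signatures.
IN_PREFIX = "__IN_"


class InfoObject:
    def __init__(self, info=None):
        self.info = dict(info or {})  # plays the role of obj->s_info
        self.callbacks = {}           # key -> callback(key, value)

    def _apply(self, key, value):
        """Run the callback, then store its result and the original input."""
        result = self.callbacks[key](key, value)
        if result is None:
            self.info.pop(key, None)  # ignored key: no k/v is kept
        else:
            self.info[key] = result
        self.info[IN_PREFIX + key] = value  # the input is saved either way

    def subscribe(self, key, default, callback):
        """Register a callback; it fires with the existing value or the default."""
        self.callbacks[key] = callback
        self._apply(key, self.info.get(key, default))

    def change_info(self, new_info):
        """Model of set_info: call the registered callback for each item."""
        for key, value in new_info.items():
            if key in self.callbacks:  # assumption: unsubscribed keys are skipped
                self._apply(key, value)

    def dup_mpistandard(self):
        """Model of get_info: omit ignored keys and report the input values."""
        return {
            key: self.info.get(IN_PREFIX + key, value)
            for key, value in self.info.items()
            if not key.startswith(IN_PREFIX)
        }


obj = InfoObject({"no_locks": "yes"})
obj.subscribe("no_locks", "false",
              lambda k, v: "true" if v in ("yes", "true") else "false")
obj.subscribe("blocking_fence", "false", lambda k, v: None)  # ignored key

print(obj.info["no_locks"])   # true
print(obj.dup_mpistandard())  # {'no_locks': 'yes'}
```

In this model the internal value (`true`) and the user-visible value (`yes`) are kept side by side, which is what lets the get_info path change its interpretation of the standard without touching the callbacks.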

Currently the callbacks are doing things like
    return some_condition ? "true" : "false"
that is, returning static strings that are not to be freed. If the return
strings start becoming more dynamic in the future I don't see how unallocated
strings could support that, so I'd propose a change for the future that
the callback()s registered with info_subscribe_subscribe() do a strdup on
their return, and we change the callers of callback() to free the strings
it returns (there are only two callers).

Rough outline of the smaller changes spread over the less central files:
  comm.c
    initialize comm->super.s_info to NULL
    copy into comm->super.s_info in comm creation calls that provide info
    OBJ_RELEASE comm->super.s_info at free time
  comm_init.c
    initialize comm->super.s_info to NULL
  file.c
    copy into file->super.s_info if file creation provides info
    OBJ_RELEASE file->super.s_info at free time
  win.c
    copy into win->super.s_info if win creation provides info
    OBJ_RELEASE win->super.s_info at free time

  comm_get_info.c
  file_get_info.c
  win_get_info.c
    change_info() if there's no info attached (shouldn't happen if callbacks
      are registered)
    copy the info for the user

The other category of change is generally addressing compiler warnings where
ompi_info_t and opal_info_t were being used a little too interchangeably. An
ompi_info_t* contains an opal_info_t*, at &(ompi_info->super)

Also this commit updates the copyrights.

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-05-17 01:12:49 -04:00
Thananon Patinyasakdikul
a705f2cf7b usNIC: fix fi_ep_bind flag. FI_RECV should not be associated with av.
Signed-off-by: Thananon Patinyasakdikul <tpatinya@utk.edu>
2017-05-16 18:22:28 -04:00
Jeff Squyres
23325c31d3 Merge pull request #3338 from jjhursey/topic/ompi_info_show_failed
`ompi_info --show-failed` feature
2017-05-16 17:08:43 -04:00
David Solt
50aa143ab6 Major structural changes to data types: .super infosubscriber
ompi_communicator_t, ompi_win_t, ompi_file_t all have a super class of type opal_infosubscriber_t instead of a base/super type of opal_object_t (in previous code comm used c_base, but file used super).  It may be a bit bold to say that being a subscriber of MPI_Info is the foundational piece that ties these three things together, but if you object, then I would prefer to turn infosubscriber into a more general name that encompasses other common features rather than create a different super class.  The key here is that we want to be able to pass comm, win and file objects as if they were opal_infosubscriber_t, so that one routine can handle all 3 types of objects being passed to it.

MPI_INFO_NULL is still an ompi_predefined_info_t type since an MPI_Info is part of ompi but the internal details of the underlying information concept is part of opal.

An ompi_info_t type still exists for exposure to the user, but it is simply a wrapper for the opal object.

Routines such as ompi_info_dup, etc have all been moved to opal_info_dup and related to the opal directory.

Fortran to C translation tables are only used for MPI_Info that is exposed to the application and are therefore part of the ompi_info_t and not the opal_info_t

The data structure changes are primarily in the following files:

    communicator/communicator.h
    ompi/info/info.h
    ompi/win/win.h
    ompi/file/file.h

The following new files were created:

    opal/util/info.h
    opal/util/info.c
    opal/util/info_subscriber.h
    opal/util/info_subscriber.c

This infosubscriber concept is that communicators, files and windows can have subscribers that subscribe to any changes in the info associated with the comm/file/window.  When xxx_set_info is called, the new info is presented to each subscriber who can modify the info in any way they want.  The new value is presented to the next subscriber and so on until all subscribers have had a chance to modify the value.  Therefore, the order of subscribers can make a difference but we hope that there is generally only one subscriber that cares or modifies any given key/value pair.  The final info is then stored and returned by a call to xxx_get_info.
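
A minimal sketch of this chaining, with the caveat that it is a Python model and not the C API: the function `set_info` and the two subscribers are hypothetical, and only the key name `no_locks` comes from this commit.

```python
# Hypothetical model of the subscriber chain; not the Open MPI C API.
def set_info(subscribers, key, value):
    """Pass `value` through each subscriber in order; return the final value."""
    for subscriber in subscribers:
        value = subscriber(key, value)
    return value


def normalize(key, value):
    """Example subscriber: trim whitespace and lowercase the value."""
    return value.strip().lower()


def to_bool_string(key, value):
    """Example subscriber: map accepted spellings to "true"/"false"."""
    return "true" if value in ("1", "yes", "true") else "false"


print(set_info([normalize, to_bool_string], "no_locks", " YES "))  # true
print(set_info([to_bool_string, normalize], "no_locks", " YES "))  # false
```

The two calls differ only in subscriber order and produce different stored values, which is the ordering sensitivity the paragraph above warns about.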

The new model can be seen in the following files:

    ompi/mpi/c/comm_get_info.c
    ompi/mpi/c/comm_set_info.c
    ompi/mpi/c/file_get_info.c
    ompi/mpi/c/file_set_info.c
    ompi/mpi/c/win_get_info.c
    ompi/mpi/c/win_set_info.c

The current subscribers were changed as follows:

    mca/io/ompio/io_ompio_file_open.c
    mca/io/ompio/io_ompio_module.c
    mca/osc/rdma/osc_rdma_component.c (This one actually subscribes to "no_locks")
    mca/osc/sm/osc_sm_component.c (This one actually subscribes to "blocking_fence" and "alloc_shared_contig")

Signed-off-by: Mark Allen <markalle@us.ibm.com>

Conflicts:
	AUTHORS
	ompi/communicator/comm.c
	ompi/debuggers/ompi_mpihandles_dll.c
	ompi/file/file.c
	ompi/file/file.h
	ompi/info/info.c
	ompi/mca/io/ompio/io_ompio.h
	ompi/mca/io/ompio/io_ompio_file_open.c
	ompi/mca/io/ompio/io_ompio_file_set_view.c
	ompi/mca/osc/pt2pt/osc_pt2pt.h
	ompi/mca/sharedfp/addproc/sharedfp_addproc.h
	ompi/mca/sharedfp/addproc/sharedfp_addproc_file_open.c
	ompi/mca/topo/treematch/topo_treematch_dist_graph_create.c
	ompi/mpi/c/lookup_name.c
	ompi/mpi/c/publish_name.c
	ompi/mpi/c/unpublish_name.c
	opal/mca/mpool/base/mpool_base_alloc.c
	opal/util/Makefile.am
2017-05-12 14:41:05 -04:00
Gilles Gouaillardet
026f3dd2dd pmix2x: plug a misc memory leak
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-05-10 14:57:44 +09:00
Ralph Castain
0afcb1a448 Update to support server self-notifications
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-05-08 10:04:50 -07:00
Ralph Castain
ef0e0171c9 Implement the changes required to support cross-library coordination. Update PMIx to support intra-process notifications and ensure that we always notify ourselves for events. Add a new ompi/interlib directory where cross-lib coordination code can go, and put the code to declare ourselves there (called from ompi_mpi_init.c).
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-05-08 10:04:50 -07:00
Ralph Castain
3bca715780 Fix pmix configury so that libpmix is still emitted when --with-devel-headers is given, even under static builds
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-05-05 11:15:32 -07:00
Jeff Squyres
eb03679d7f Merge pull request #3444 from jsquyres/pr/fix-pmix-static-devel-header-builds
pmix/configure.m4: always use embedded mode
2017-05-04 14:25:28 -04:00
Jeff Squyres
af336ac0e8 pmix/configure.m4: always use embedded mode
Looks like embedded mode was mistakenly disabled when
--with-devel-headers was specified.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-05-04 10:01:41 -07:00
Ralph Castain
a737d0f963 Merge pull request #3430 from bosilca/topic/tcp_hostname
Use the OPAL function to get the hostname.
2017-05-03 06:42:02 -07:00
Brian Barrett
3b991498be btl tcp: Don't set socket buffer size by default
Set the default send and receive socket buffer size to 0,
which means Open MPI will not try to set a buffer size during
startup.

The default behavior since near day one of the TCP BTL has been
to set the send and receive socket buffer sizes to 128 KiB.  A
number that works great on 1 GbE, but not so great on 10 GbE
fabrics of any real size.  Modern TCP stacks, particularly on
Linux, have gotten much smarter about buffer sizes and are much
less efficient if a buffer size is set (even if set to something
large).
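
The "0 means leave it alone" convention can be sketched as follows. This is an illustration using Python's standard `socket` module, not the TCP BTL code; the function name and its parameters are hypothetical.

```python
import socket


def configure_buffers(sock, sndbuf=0, rcvbuf=0):
    """Set socket buffer sizes only when a positive size is requested.

    A size of 0 leaves the kernel's own buffer sizing in effect.
    Returns the names of the options that were actually set.
    """
    applied = []
    if sndbuf > 0:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, sndbuf)
        applied.append("SO_SNDBUF")
    if rcvbuf > 0:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, rcvbuf)
        applied.append("SO_RCVBUF")
    return applied


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    print(configure_buffers(sock))                     # []
    print(configure_buffers(sock, sndbuf=128 * 1024))  # ['SO_SNDBUF']
```

With the new defaults the first call applies, so no `setsockopt` call is made and the TCP stack is free to size the buffers itself.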

Signed-off-by: Brian Barrett <bbarrett@amazon.com>
2017-04-28 14:14:49 -07:00
George Bosilca
2d8943d920
Use the OPAL function to get the hostname.
2017-04-28 02:48:15 -04:00
Nathan Hjelm
387467c358 btl/ugni: remove erroneous mca_btl_ugni_frag_return call
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-04-27 09:14:51 -06:00
Ralph Castain
8b1f01dfe6 Set the default modex parameters back to full blocking modex while we continue to test and debug the slow modex - it seems to be having issues on the Cray
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-04-22 15:19:46 -07:00
Howard Pritchard
f2a27cc991 Merge pull request #3396 from hppritcha/topic/swat_compiler_warning
btl/sm: swat a compiler warning
2017-04-22 14:31:21 -06:00
Ralph Castain
f2ed293ecd Merge pull request #3398 from rhc54/topic/modex
Implement a background fence that collects all data during modex operation
2017-04-21 15:15:49 -07:00
Ralph Castain
9fc3079ac2 Implement a background fence that collects all data during modex operation
The direct modex operation is slow, especially at scale for even modestly-connected applications. Likewise, blocking in MPI_Init while we wait for a full modex to complete takes too long. However, as George pointed out, there is a middle ground here. We could kick off the modex operation in the background, and then trap any modex_recv's until the modex completes and the data is delivered. For most non-benchmark apps, this may prove to be the best of the available options as they are likely to perform other (non-communicating) setup operations after MPI_Init, and so there is a reasonable chance that the modex will actually be done before the first modex_recv gets called.

Once we get instant-on-enabled hardware, this won't be necessary. Clearly, zero time will always out-perform the time spent doing a modex. However, this provides a decent compromise in the interim.

This PR changes the default settings of a few relevant params to make "background modex" the default behavior:

* pmix_base_async_modex -> defaults to true

* pmix_base_collect_data -> continues to default to true (no change)

* async_mpi_init - defaults to true. Note that the prior code attempted to base the default setting of this value on the setting of pmix_base_async_modex. Unfortunately, the pmix value isn't set prior to setting async_mpi_init, and so that attempt failed to accomplish anything.

The logic in MPI_Init is:

* if async_modex AND collect_data are set, AND we have a non-blocking fence available, then we execute the background modex operation

* if async_modex is set, but collect_data is false, then we simply skip the modex entirely - no fence is performed

* if async_modex is not set, then we block until the fence completes (regardless of collecting data or not)

* if we do NOT have a non-blocking fence (e.g., we are not using PMIx), then we always perform the full blocking modex operation.

* if we do perform the background modex, and the user requested the barrier be performed at the end of MPI_Init, then we check to see if the modex has completed when we reach that point. If it has, then we execute the barrier. However, if the modex has NOT completed, then we block until the modex does complete and skip the extra barrier. So we never perform two barriers in that case.
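
The decision logic listed above can be summarized in a small Python model. It is a sketch, not the code in MPI_Init: the enum and function names are hypothetical, and checking for the non-blocking fence first is an assumption about precedence, based on the statement that the blocking modex is "always" used when no such fence exists.

```python
from enum import Enum


class ModexMode(Enum):
    BACKGROUND = "background modex"
    SKIP = "no modex, no fence"
    BLOCKING = "full blocking modex"


def choose_modex(async_modex, collect_data, have_nonblocking_fence):
    """Pick the modex behavior from the three conditions in the list above."""
    if not have_nonblocking_fence:
        return ModexMode.BLOCKING  # assumption: this case takes precedence
    if not async_modex:
        return ModexMode.BLOCKING  # block until the fence completes
    if collect_data:
        return ModexMode.BACKGROUND
    return ModexMode.SKIP


def final_sync(modex_done):
    """End-of-init step when a background modex ran and a barrier was requested.

    Exactly one synchronization happens: the barrier if the modex has already
    finished, otherwise the wait for the modex itself.
    """
    return "barrier" if modex_done else "wait_for_modex"


print(choose_modex(True, True, True).value)    # background modex
print(choose_modex(True, False, True).value)   # no modex, no fence
print(choose_modex(False, True, True).value)   # full blocking modex
print(choose_modex(True, True, False).value)   # full blocking modex
```

With the defaults described above (async modex and data collection both true), a PMIx-based run lands on the background modex path.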

HTH
Ralph

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-04-21 10:29:23 -07:00
Jeff Squyres
1d5e08f44a usnic: more iov_limit fixes
Follow on to 7bd2de9960: move setting
the iov_limit to 1 earlier in the startup sequence.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-04-21 09:14:28 -07:00
Howard Pritchard
782f1bb9af btl/sm: swat a compiler warning
gnu 6.3.1 complaining about uninitialized variable

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-04-21 10:02:56 -05:00
Jeff Squyres
e9e89e502b Merge pull request #3245 from hjelmn/auto_bool
mca/base: accept y and n for bool and auto bool enumerator
2017-04-21 10:41:10 -04:00
Howard Pritchard
462342d148 Merge pull request #3311 from hppritcha/topic/libfabric_moves_to_ofi
common/libfabric: move libfabric to ofi
2017-04-21 07:50:38 -06:00
Jeff Squyres
7bd2de9960 usnic: ensure to set the iov_limit to 1
The usNIC BTL does not use more than 1 iov, so be sure to set it to 1
so that we don't allocate cq/rq/sq entries based on a default (i.e.,
>1) number of iovs per entry.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-04-20 13:28:15 -07:00
Howard Pritchard
841192645b common/libfabric: move libfabric to ofi
This PR renames the common library for OFI libfabric from
libfabric to ofi.  There are a number of reasons this
is good to do:

1) it's shorter and replaces nine characters with three for
   function names for what may eventually be a fairly extensive interface
2) OFI is the term used for MTL and RML components that use
   the OFI libfabric interface
3) A planned OSC component will also use the OFI term.
4) Other HPC libraries that can use OFI libfabric tend to use
   the term "ofi" internally and also in their configure options
   relevant to OFI libfabric (i.e. MPICH/CH4, Intel MPI, Sandia SHMEM)

There seem to be comments in places in the Open MPI source
code that indicate that this common library will be going away.
Far from it as we will want to be able to share things like
AV objects between OMPI and possibly OSHMEM components that
use the OFI libfabric interface.

This PR also adds a synonym to the --with-libfabric(-libdir)
configury options: --with-ofi and --with-ofi-libdir.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-04-20 13:07:16 -06:00
Nathaniel Graham
34b4aeb17f Merge pull request #3339 from nrgraham23/mpirun_help_improvements
Additional mpirun --help changes
2017-04-19 14:05:07 -06:00
Nathaniel Graham
01312b2f90 Additional mpirun --help changes
This commit recategorizes several mpirun arguments,
and moves the information for mpirun --help arguments
to the bottom of the general help message.  I also
added the OPAL_CMD_LINE_OTYPE field to two commands
that were missed initially because they were not
in the same area as the others.

Signed-off-by: Nathaniel Graham <ngraham@lanl.gov>
2017-04-19 11:43:45 -06:00
Jeff Squyres
a0543616ee dl/dlopen: add libs to wrapper LIBS
Without this, libs (e.g., "-ldl") are not added to the wrapper LIBS
flags.  This may work on some platforms, but on at least RHEL 7.3, it
does not (i.e., compiling MPI applications fails because it can't find
dlopen).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-04-15 09:30:18 -07:00
Ralph Castain
ffbfd22d84 Fix event registration - need to increment the event index and record the number of codes in the event handler
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-04-13 17:35:10 -07:00
Joshua Hursey
3ad3d4e3e7 opal_info: Add ability to report load failures
* Add a path for failed component load information to be reported up.
 * This allows ompi_info to display this information inline to make it
   easier for folks to see if the component is present but failed for
   some reason. Most likely a missing library, but could be a libnl
   conflict.
 * Add MCA parameter to enable this feature:
   - `mca_base_component_track_load_errors` takes a boolean
   - Default: `false`

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-04-12 16:06:21 -05:00
Ralph Castain
9f73974fe1 Update to latest PMIx master, including disabling the pmi-1 and pmi-2 backward compatibility as these interfere with the s1,s2 components
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-04-12 12:34:27 -07:00
Ralph Castain
95ae0d1df3 Cleanup timing macros for portability across compilers. Rename the --enable-timing configure option to be --enable-pmix-timing so it doesn't pick up external timing requests. Remove a stale function reference in PMIx so it can compile with timing enabled.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-04-10 12:56:38 +06:00
Mark Allen
655a06f559 IB fork
The key change was in btl_openib_connect_udcm.c where a buffer was
being pinned with size 65664 (whether openib was being used or not).
The start of the buffer was page aligned, but because of the size
the end wasn't. That makes it too easy for a forked child to accidentally
touch pinned memory on the same page as the end of that buffer.

So this change increases the size of the allocated buffer to use the
rest of the page.

I inspected the rest of the ibv_reg_mr() calls and changed one other
place to page align its buffer too, although I think the above is
the one that really matters.
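
A minimal sketch of the rounding described above, assuming a power-of-two page size. The helper names are hypothetical and are not taken from btl_openib_connect_udcm.c; the point is only that a 65664-byte buffer ends partway into a page unless its length is rounded up to a page multiple.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Round len up to the next multiple of page (page must be a power of two). */
static size_t round_up_to_page(size_t len, size_t page)
{
    assert(page != 0 && (page & (page - 1)) == 0);
    return (len + page - 1) & ~(page - 1);
}

/* Same rounding, using the page size reported by the running system.
 * The 4096 fallback is an assumption for the case where sysconf fails. */
static size_t round_up_to_system_page(size_t len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) {
        page = 4096;
    }
    return round_up_to_page(len, (size_t)page);
}
```

With 4096-byte pages, 65664 bytes rounds up to 69632 (17 pages), so the registered region ends on a page boundary.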

Signed-off-by: Mark Allen <markalle@us.ibm.com>
2017-04-05 17:35:52 -04:00
Gilles Gouaillardet
10ea991d0a hwloc: add CUDA include dir to CPPFLAGS
so hwloc configury can find nvml.h when CUDA support is built

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-04-05 11:46:22 +09:00
Gilles Gouaillardet
8d7541f766 hwloc: disable nvml if CUDA support is not built in Open MPI
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-04-05 11:07:34 +09:00
Ralph Castain
92c996487c Update how we pass the node regex so we pass _all_ nodes, even those without daemons. This allows the backend daemons to form a complete picture of the allocation. Include info on which nodes have daemons on them, and populate that info on the backend as well.
Set the daemons' state to "running" and mark them as "alive" by default when constructing the nidmap

Get the DVM running again

Fix direct modex by eliminating race condition caused by releasing data while sending it

Up the size limit before compressing

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-04-03 19:25:15 -07:00
Ralph Castain
2cc5fea8be Update to PMIx v2.0alpha
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-04-03 10:02:29 -07:00
Gilles Gouaillardet
81062b7cd2 hwloc: update hwloc to 1.11.6
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-03-31 13:35:16 +09:00
Ralph Castain
7dd34d0c9a Use the correct callback data - the callback function was expecting a bool*, not a pmix_ptl_sr_t*.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-28 17:21:47 -07:00
Nathan Hjelm
676cfe2a35 mca/base: accept y and n for bool and auto bool enumerator
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-03-28 09:20:14 -06:00
Ralph Castain
b398d721d5 Merge pull request #3236 from rhc54/topic/craycleanups
Silence a flood of warnings when compiling with gcc on Cray
2017-03-24 13:33:46 -07:00
Ralph Castain
ecc8000136 Silence a flood of warnings when compiling with gcc on Cray
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-24 13:37:11 -06:00
Ralph Castain
470452cba0 Correctly check the sa_family and cast the data correctly before passing it to inet_ntop, and don't be quite as fancy with the pointer arithmetic as the combination was causing us to segfault every time this debug message was called.
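
For illustration only, the pattern of checking sa_family before casting can be sketched with the standard POSIX inet_ntop call as follows. The function name addr_to_string is hypothetical and this is not the code touched by the commit.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Format a generic socket address as text. The address family decides
 * which concrete struct the pointer is cast to; an unknown family
 * yields NULL instead of a guess. */
static const char *addr_to_string(const struct sockaddr *sa,
                                  char *buf, socklen_t len)
{
    assert(sa != NULL && buf != NULL);
    switch (sa->sa_family) {
    case AF_INET:
        return inet_ntop(AF_INET,
                         &((const struct sockaddr_in *)sa)->sin_addr,
                         buf, len);
    case AF_INET6:
        return inet_ntop(AF_INET6,
                         &((const struct sockaddr_in6 *)sa)->sin6_addr,
                         buf, len);
    default:
        return NULL;
    }
}
```

A buffer of INET6_ADDRSTRLEN bytes is large enough for either family.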
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-24 11:42:57 -07:00
Ralph Castain
35f817911e Fix coverity issues
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-24 08:09:46 -07:00
Ralph Castain
c0bcd11bcf Fix permissions - no CI required
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-23 08:05:52 -07:00
Ralph Castain
55e4fba5f5 If we lose connection to the server after initiating a send/recv in PMIx (e.g., in PMIx_Abort), then we need to "resolve" all pending recvs to avoid hanging.
Fixes #3225

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-23 02:53:21 -07:00
Ralph Castain
d645557fa0 Update to include the PMIx 2.0 APIs for monitoring and job control. Include required integration, but leave the monitors off for now. Move the sensor framework out of ORTE as it is being absorbed into PMIx
Fix typo and silence warnings

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-21 17:47:08 -07:00
Ralph Castain
4b6d220a83 You cannot include both pmi.h and pmi2.h as they have conflicting defines in them.
Thanks to Kilian Cavalotti for pointing it out

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-19 11:53:54 -07:00
Jeff Squyres
ce0e1cd32c Merge pull request #3201 from hppritcha/jjhursey-topic/timer-gettimeofday
Jjhursey topic/timer gettimeofday
2017-03-18 20:12:36 -04:00
Jeff Squyres
b8dfd49e97 hwloc: re-enable use of autogen.pl in a tarball
Commit fec519a793 broke the ability to
run autogen.pl in a distribution tarball.  This commit restores that
ability by also distributing opal/mca/hwloc/autogen.options in the
tarball.

Skipping CI because CI does not test this functionality:

[skip ci]
bot:notest

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-17 11:41:17 -07:00
Jeff Squyres
b51c4e2797 memory/patcher: fix a compiler warning
Don't define the madvise intercept functions since we're not currently
intercepting madvise.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-16 05:43:51 -07:00
Jeff Squyres
616f20c52c timer/linux: rename component-specific functions
Several component-specific functions were named with a prefix of
"opal_timer_base", which was quite confusing.  Rename them to have a
prefix "opal_timer_linux" to make it clear that they are here in this
component (and different than *actual* opal_timer_base symbols).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-15 21:03:13 -05:00
Jeff Squyres
290d4598df timer/linux: remove global variable
This variable is only used in one file, so make it static.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-15 21:03:06 -05:00
Nathan Hjelm
6b210fa2c4 btl/ugni: do not return a frag from sendi if an endpoint is waitlisted
This fixes a hang that can occur when running bandwidth tests.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-03-14 10:14:13 -06:00
Nathan Hjelm
2e42b0afbd btl/ugni: move connection check into sync event
This commit makes datagram checks time based and reduces their
frequency when only the wildcard datagram is posted. This change
improves latency on knl systems.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-03-14 10:10:05 -06:00
Nathan Hjelm
d5aaeb74b6 btl/ugni: return a descriptor from sendi
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-03-13 14:56:54 -06:00
Nathan Hjelm
a19e7023d1 btl/ugni: always check local SMSG CQ
This commit removes the local operation count check from the local SMSG
completion queue. This check was leading to hangs due to an undocumented
feature of the ugni library. The local SMSG CQ is used to send credit
return messages back to the sender. The ugni library never checks for
the completion itself but relies on the SMSG user to periodically
check the CQ.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-03-13 14:56:54 -06:00
Nathan Hjelm
d5cdeb81d0 btl/ugni: improve multi-threaded performance
This commit updates the ugni btl to make use of multiple device
contexts to improve the multi-threaded RMA performance. This commit
contains the following:

 - Clean up the endpoint structure by removing unnecessary fields. The
   structure now also contains all the fields originally handled by the
   common/ugni endpoint.

 - Clean up the fragment allocation code to remove the need to
   initialize the my_list member of the fragment structure. This
   member is not initialized by the free list initializer function.

 - Remove the (now unused) common/ugni component. btl/ugni no longer
   needs the component. common/ugni was originally split out of
   btl/ugni to support bcol/ugni. As that component no longer exists
   there is no reason to keep this component.

 - Create wrappers for the ugni functionality required by
   btl/ugni. This was done to ease supporting multiple device
   contexts. The wrappers are thread safe and currently use a spin
   lock instead of a mutex. This produces better performance when
   using multiple threads spread over multiple cores. In the future
   this lock may be replaced by another serialization mechanism. The
   wrappers are located in a new file: btl_ugni_device.h.

 - Remove unnecessary device locking from serial parts of the ugni
   btl. This includes the first add-procs and module finalize.

 - Clean up fragment wait list code by moving enqueue into common
   function.

 - Expose the communication domain flags as an MCA variable. The
   defaults have been updated to reflect the recommended setting for
   knl and haswell.

 - Avoid allocating fragments for communication with already
   overloaded peers.

 - Allocate RDMA endpoints dynamically. This is needed to support
   spreading RMA operations across multiple contexts.

 - Add support for spreading RMA communication over multiple ugni
   device contexts. This should greatly improve the threading
   performance when communicating with multiple peers. By default the
   number of virtual devices depends on 1) whether
   opal_using_threads() is set, 2) how many local processes are in the
   job, and 3) how many bits are available in the pid. The last is
   used to ensure that each CDM is created with a unique id.
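
The thread-safe wrappers mentioned in the list above guard each device with a spin lock rather than a mutex. A rough sketch of that idea with C11 atomics follows; the type and function names are hypothetical and do not reflect the contents of btl_ugni_device.h.

```c
#include <stdatomic.h>

/* A minimal spin lock built on atomic_flag. */
typedef struct {
    atomic_flag busy;
} spin_lock_t;

static void spin_lock_init(spin_lock_t *l)
{
    atomic_flag_clear(&l->busy);
}

static void spin_lock_acquire(spin_lock_t *l)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&l->busy, memory_order_acquire)) {
        /* spin */
    }
}

static void spin_lock_release(spin_lock_t *l)
{
    atomic_flag_clear_explicit(&l->busy, memory_order_release);
}

/* Hypothetical per-device state; each device context has its own lock,
 * so threads using different devices do not contend. */
typedef struct {
    spin_lock_t lock;
    unsigned long ops_posted;
} device_t;

/* Wrapper: every operation on a device is serialized by that device's lock. */
static unsigned long device_post(device_t *dev)
{
    spin_lock_acquire(&dev->lock);
    unsigned long n = ++dev->ops_posted;
    spin_lock_release(&dev->lock);
    return n;
}
```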

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-03-13 14:46:06 -06:00
Nathan Hjelm
12bf38a25c btl/ugni: add MPI_T performance variables for ugni counters
This commit exposes ugni statistics for use with MPI_T. There is
no overhead to providing these counters.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-03-13 14:42:58 -06:00
Ralph Castain
c6bc3ccb76 Sync to latest PMIx master and PMIx reference server
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-11 12:50:38 -08:00
Nathan Hjelm
3caeda21dc memory/patcher: do not hook madvise
It is not possible to hook madvise at this time due to a deadlock when
using glibc.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-03-07 16:26:53 -07:00
Joshua Ladd
e2ba60b778 Merge pull request #3111 from jladd-mlnx/topic/cx5-device-param
Adding latest ConnectX-5 adapter vendor part id to OpenIB device params.
2017-03-07 13:55:46 -05:00
Nathan Hjelm
15ea9c5524 Merge pull request #3013 from hjelmn/rcache_lifo
rcache/base: do not free memory with the vma lock held
2017-03-07 09:11:04 -07:00
Jeff Squyres
c2adf359cf Merge pull request #3083 from ggouaillardet/topic/hwloc_v15
hwloc: add support for hwloc v1.5
2017-03-07 10:01:24 -05:00
Joshua Ladd
b28647857f Adding latest ConnectX-5 adapter vendor part id to OpenIB device params.
Signed-off-by: Joshua Ladd <jladd.mlnx@gmail.com>
2017-03-07 00:19:54 +02:00
Ralph Castain
aca7091114 Fix some minor compatibility issues by ensuring job-level data gets stored against wildcard rank in the cray, s1, and s2 components, and that the ext1 component translates all wildcard rank requests into the peer's rank since v1.x of PMIx doesn't understand wildcard ranks
Closes #3101

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-05 10:30:59 -08:00
Ralph Castain
1de72ff023 Silence an unnecessary error log
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-02 17:18:34 -08:00
Gilles Gouaillardet
7e01be60d9 hwloc: add support for hwloc v1.5
hwloc v1.5 does not support HWLOC_OBJ_OSDEV_COPROC
nor hwloc_topology_dup(), so for this version :
- do not search for coprocessors
- do not try hwloc_topology_dup(), note this is not
  used anywhere in the code base

Thanks Jeff for helping with the wording

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-03-03 09:39:24 +09:00
Ralph Castain
83199979ba Remove the stale opal/sec framework
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-03-02 15:41:56 -08:00
Jeff Squyres
5b484c91f4 btl/tcp: use show_help to print the dropped-TCP warning
Make the message more friendly / more detailed, and de-duplicate it
(just in case it happens a lot).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-03-01 16:31:29 -08:00
George Bosilca
ec4a235e6a Allow a TCP proc release during the create.
This is mostly for error cases, where we need to release the
newly created proc. Currently the code deadlocks because the endpoint
lock is held at the release and the lock is not recursive.

Also added some code to print the IP addresses that don't match during
the TCP connection step.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-03-01 13:17:54 -05:00
Jeff Squyres
d5266aba90 Merge pull request #2955 from jsquyres/pr/hwloc-external-fixes
Fix --with-hwloc=external
2017-02-28 14:57:07 -05:00
Josh Hursey
0006f0d7c5 Merge pull request #2773 from jjhursey/topic/hook-fwk
Add a 'hook' framework
2017-02-28 12:29:50 -06:00
Jeff Squyres
fec519a793 hwloc: rename opal/mca/hwloc/hwloc.h -> hwloc-internal.h
Per a prior commit, the presence of "hwloc.h" can cause ambiguity when
using --with-hwloc=external (i.e., whether to include
opal/mca/hwloc/hwloc.h or whether to include the system-installed
hwloc.h).

This commit:

1. Renames opal/mca/hwloc/hwloc.h to hwloc-internal.h.
2. Adds opal/mca/hwloc/autogen.options to tell autogen.pl to expect to
   find hwloc-internal.h (instead of hwloc.h) in opal/mca/hwloc.
3. s@opal/mca/hwloc/hwloc.h@opal/mca/hwloc/hwloc-internal.h@g in the
   rest of the code base.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-28 07:48:42 -08:00
Joshua Hursey
c10bbfded6 ompi/hook: Add the hook/license framework
* Include a 'demo' component that shows some of the features.
 * Currently has hooks for:
   - MPI_Initialized
     - top, bottom
   - MPI_Init_thread
     - top, bottom
   - MPI_Finalized
     - top, bottom
   - MPI_Init
     - top (pre-opal_init), top (post-opal_init), error, bottom
   - MPI_Finalize
     - top, bottom
 * Other places in ompi can 'register' to hook into any one of these places
   by passing back a component structure filled with function pointers.
 * Add a `MCA_BASE_COMPONENT_FLAG_REQUIRED` flag to the MCA structure that
   is checked by the `hook` framework. If a required, static component has
   been excluded then the `hook` framework will fail to initialize.
   - See note in `opal/mca/mca.h` as to why this is checked in the `hook`
     framework and not in `opal/mca/base/mca_base_component_find.c`
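
As a rough illustration of "a component structure filled with function pointers", the registration idea can be sketched as below. All names here (hook_component_t, hook_register, hook_run_init_top, the demo component) are hypothetical and are not the framework's actual interface.

```c
#include <stddef.h>

typedef void (*hook_fn_t)(void);

/* One optional callback per hook point; NULL means "not interested". */
typedef struct {
    const char *name;
    hook_fn_t init_top;
    hook_fn_t init_bottom;
    hook_fn_t finalize_top;
    hook_fn_t finalize_bottom;
} hook_component_t;

#define MAX_HOOKS 8
static const hook_component_t *registered[MAX_HOOKS];
static size_t num_registered;

/* Record a component; returns 0 on success, -1 if the table is full. */
static int hook_register(const hook_component_t *c)
{
    if (c == NULL || num_registered == MAX_HOOKS) {
        return -1;
    }
    registered[num_registered++] = c;
    return 0;
}

/* Called at the top of init: run every registered init_top callback. */
static void hook_run_init_top(void)
{
    for (size_t i = 0; i < num_registered; ++i) {
        if (registered[i]->init_top != NULL) {
            registered[i]->init_top();
        }
    }
}

/* A tiny demo component that only counts how often it was invoked. */
static int demo_calls;
static void demo_init_top(void) { ++demo_calls; }
static const hook_component_t demo_component = {
    .name = "demo",
    .init_top = demo_init_top,
};
```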

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-02-27 12:05:53 -05:00
Gilles Gouaillardet
af0b5cffb4 asm: rename the AMD64 into X86_64
in this context, AMD64 really means amd64 or em64t, so let's
rename this into X86_64 in order to avoid any confusion

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-27 15:10:50 +09:00
Ralph Castain
e86a0dbf39 Update to PMIx master to include dlopen fixes and addition of libltdl support
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-02-22 11:54:33 -08:00
Nathan Hjelm
60ad9d1817 rcache/base: do not free memory with the vma lock held
This commit makes the vma tree garbage collection list a lifo. This
way we can avoid having to hold any lock when releasing vmas. In
theory this should finally fix the hold-and-wait deadlock detailed
in #1654.
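
One way a LIFO can remove the need for a lock on the release path is sketched below with C11 atomics: items are pushed with a compare-and-swap, and the whole list is detached in a single atomic exchange so it can be freed without any lock held. This is an assumption about the general technique, not the rcache code itself, and the names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct gc_item {
    struct gc_item *next;
    int id;
} gc_item_t;

/* Head of the garbage list; NULL when the list is empty. */
static _Atomic(gc_item_t *) gc_head;

/* Lock-free push: retry until the head is swapped to the new item. */
static void gc_push(gc_item_t *item)
{
    assert(item != NULL);
    gc_item_t *old = atomic_load_explicit(&gc_head, memory_order_relaxed);
    do {
        item->next = old;
    } while (!atomic_compare_exchange_weak_explicit(&gc_head, &old, item,
                                                    memory_order_release,
                                                    memory_order_relaxed));
}

/* Detach the entire list at once; the caller then owns every item and
 * can release them one by one without holding a lock. */
static gc_item_t *gc_take_all(void)
{
    return atomic_exchange_explicit(&gc_head, (gc_item_t *)NULL,
                                    memory_order_acquire);
}
```

Detaching the whole list in one exchange sidesteps the ABA hazard that a per-item lock-free pop would have to handle.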

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-02-21 21:04:46 -07:00
Ralph Castain
8cffdcf127 Ensure that the pmix headers and lib get installed when --with-devel-headers is given so that PMIx applications can be built and executed against the "embedded" PMIx version
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-02-21 13:46:46 -08:00
Gilles Gouaillardet
4184c01be5 Merge pull request #2393 from bosilca/topic/no_predefined_ddt_refcount
Don't refcount the predefined datatypes.
2017-02-21 09:38:11 +09:00
Gilles Gouaillardet
bb2481a84b pmix2x: synchronize to the latest PMIx master
pmix/master@f57d9b2953

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-20 10:45:17 +09:00
Ralph Castain
f49118eaab Fix some pmix configuration code
Remove stale file reference that caused a check to always fail. Update psm2 function check to new libs

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-02-16 10:54:47 -08:00
Howard Pritchard
b272f87926 Merge pull request #2968 from hjelmn/pmix_cray
pmix/cray: performance improvements and cleanup
2017-02-16 11:41:59 -07:00
Ralph Castain
201f8571ca Ensure we retain the peer object until we are done with it, then detect that the socket has closed due to a lost connection and cleanly release the message event
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-02-15 18:30:55 -08:00
Ralph Castain
223495325d Fix binding policy bug and support pe=1 modifier
Allow someone to specify the "pe=N" modifier to a mapping policy when N=1. This equates to just "bind-to core", but helps people who use a script to set the PE policy. Fix a bug where setting the binding policy left a lingering "if-supported" flag that shouldn't be there.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-02-15 14:55:17 -08:00
Ralph Castain
9cd7349d7c Instead of completely free'ing the event base, pause the PMIx progress thread before tearing down the infrastructure, and then release the event base at the end of the procedure. This allows any infrastructure objects holding events to delete them prior to free'ing the event base.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-02-15 05:02:43 -08:00
Ralph Castain
f7fe2f7189 Merge pull request #2977 from rhc54/topic/spawn
Fix comm_spawn by registering nspace info only when needed
2017-02-15 04:31:54 -08:00
Ralph Castain
68b53e2179 Fix comm_spawn by registering nspace info only when needed - either when we have local procs, or when job-level info is required by connecting jobs
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-02-14 19:47:56 -08:00
Ralph Castain
404fe327be Merge pull request #2973 from rhc54/topic/cleanups
Update to newest PMIx master (includes configuration cleanups). Silence trivial Coverity warning in hwloc base.
2017-02-14 17:38:18 -08:00
Ralph Castain
0c8609ca16 Update to newest PMIx master (includes configuration cleanups). Silence trivial Coverity warning in hwloc base.
Cleanup a race condition segfault during finalize by ensuring the PMIx progress thread is stopped prior to starting to tear down the messaging components

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-02-14 15:14:00 -08:00
Nathan Hjelm
8562b87ad3 Merge pull request #2967 from hjelmn/auto_bool
mca/base: add new base enumerator (auto_bool)
2017-02-14 12:25:56 -07:00
Nathan Hjelm
5683e7836f Merge pull request #2965 from hjelmn/deprecated_fix
mca/base: fix deprecated variable help message
2017-02-14 12:22:11 -07:00
Nathan Hjelm
3b912ea2a7 pmix/cray: performance improvements and cleanup
Do not use opal_output_verbose inside O(n) loops. This was causing us
to make O(n) calls to snprintf which was greatly slowing launch at
scale.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-02-14 11:13:10 -07:00
Nathan Hjelm
9e692ce264 mca/base: add new base enumerator (auto_bool)
This commit adds a new base enumerator type for variables that take one of
the values -1, 0, and 1. These values are mapped to the strings auto,
false, true. This commit updates the mpi_leave_pinned MCA variable to
use the new enumerator.
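
The mapping described above (-1, 0, 1 to auto, false, true) can be sketched as a pair of conversion helpers. These functions are hypothetical and only illustrate the mapping; they are not the MCA enumerator implementation, which may accept additional spellings.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value to string: returns NULL for anything outside {-1, 0, 1}. */
static const char *auto_bool_to_string(int value)
{
    switch (value) {
    case -1: return "auto";
    case 0:  return "false";
    case 1:  return "true";
    default: return NULL;
    }
}

/* String to value: returns 0 on success, -1 for an unknown string. */
static int auto_bool_from_string(const char *s, int *value)
{
    assert(s != NULL && value != NULL);
    if (strcmp(s, "auto") == 0) {
        *value = -1;
    } else if (strcmp(s, "false") == 0) {
        *value = 0;
    } else if (strcmp(s, "true") == 0) {
        *value = 1;
    } else {
        return -1;
    }
    return 0;
}
```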

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-02-14 10:21:45 -07:00
Nathan Hjelm
33676c9960 mca/base: fix deprecated variable help message
Actually print out the original variable name.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-02-14 09:55:43 -07:00
Ralph Castain
35578b4009 Update to latest PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-02-13 23:19:26 -08:00
Jeff Squyres
a8247a76c9 Merge pull request #2948 from jsquyres/pr/update-warn-component-unused
help btl base: tell how to disable the warning
2017-02-09 21:10:01 -05:00
Jeff Squyres
e272250531 help btl base: tell how to disable the warning
As reported in
https://www.mail-archive.com/users@lists.open-mpi.org/msg30607.html,
give instructions in the show_help message how to disable the
warning.  Thanks to Susan Schwarz for reporting the issue.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-09 15:51:30 -08:00
Gilles Gouaillardet
be26152839 Merge pull request #2939 from ggouaillardet/topic/pmix2x_6ed27be839e3f17a2b93885321e15fb26d802e93
pmix2x: Update to latest PMIx master
2017-02-08 16:40:57 +09:00
Gilles Gouaillardet
3d0541f2bf mpool/memkind: add a missing include file
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-08 16:06:22 +09:00
Gilles Gouaillardet
7acef4833e pmix2x: Update to latest PMIx master
pmix/master@6ed27be839

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-08 13:23:27 +09:00
KAWASHIMA Takahiro
4b2eba34a6 Merge pull request #2933 from kawashima-fj/pr/dstore-config-desc
pmix/pmix2x: Correct configure option description
2017-02-08 13:03:27 +09:00
George Bosilca
bc2890ed11 Upon a new connection go over all available ifaces.
Add a verbose to show all the failed attempts to match the
remote interfaces based on the modex info.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-02-07 19:15:49 -05:00
Jeff Squyres
100b112d3c pmix: fix zlib protection macro usage
It's possible that we can have zlib.h but still not have zlib support.
Use the correct macro to protect the usage of calling zlib functions.

This fixes 32-bit MTT builds at Cisco (e.g.,
https://mtt.open-mpi.org/index.php?do_redir=2389).

Submitted upstream to PMIX: https://github.com/pmix/master/pull/290

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-02-07 05:52:32 -08:00
KAWASHIMA Takahiro
750406f67b pmix/pmix2x: Correct configure option description
`--enable-pmix-dstore` option was enabled by default in f4a5511.

Signed-off-by: KAWASHIMA Takahiro <t-kawashima@jp.fujitsu.com>
2017-02-07 11:52:56 +09:00
Gilles Gouaillardet
c62498ab3d btl/tcp: remove reference to just removed tcp_local
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-07 09:32:09 +09:00
Jeff Squyres
368ab4d9a5 Merge pull request #2684 from bosilca/topic/tcp_fixes
Remove the tcp_local field from the TCP component.
2017-02-06 16:32:06 -05:00
bosilca
c331e6794c Allow all tuned MCA parameters to be modified programmatically. (#2829)
Fix a comment in the MCA header.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-01-31 21:47:36 -05:00
Ralph Castain
edcfdf2365 Update to latest PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-31 08:01:37 -08:00
Gilles Gouaillardet
b078e57e73 pmix/ext1x: fix misc memory leaks in namespace registration
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-30 10:52:42 +09:00
Gilles Gouaillardet
f51fc293a2 ext1x/pmix1x_client: plug misc memory leaks
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-30 10:52:42 +09:00
Gilles Gouaillardet
022cca79ea pmix/ext1x: plug a memory leak in opal_lkupcbfunc()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-30 10:52:36 +09:00
Gilles Gouaillardet
f485d12a82 pmix: rename the ext11 component into ext1x
also use the same naming scheme as pmix/ext2x

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-30 10:52:35 +09:00
Gilles Gouaillardet
dccb1899e6 pmix/ext11: correctly use PMIx_server_register_nspace()
PMIx_server_register_nspace() is an asynchronous operation, so
the pmix glue waits for it to complete before returning.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-30 09:23:19 +09:00
Gilles Gouaillardet
6955e1e25c pmix/ext11: fix compilation
the argc field from the opal_pmix_app_t struct was removed,
so adjust the pmix/ext11 glue accordingly.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-30 09:23:18 +09:00
Howard Pritchard
fca45a2742 mca help: fix typo found by user
Fix typo found by @pozdneev

Fixes #2821

bot:notest

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-01-28 09:37:43 -07:00
Ralph Castain
3302864a7d Cleanup a typo that can cause a segfault - use a local variable name different than the one passed into the function
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-27 16:49:25 -08:00
Josh Hursey
2e64bf42fb Merge pull request #2810 from jjhursey/fix/ibm/stdiag-to-stdout
Extend options for stddiag routing
2017-01-26 14:29:16 -06:00
Josh Hursey
770c41f493 Merge pull request #2807 from jjhursey/fix/ibm/event-external
libevent/external: Add opal_event_include to this component
2017-01-26 14:26:50 -06:00
Jeff Squyres
2c277a66fd Merge pull request #2772 from jjhursey/topic/stacktrace-improv
master: opal/stacktrace improvements
2017-01-26 10:48:41 -08:00
Joshua Hursey
6d98559be9 stacktrace: Add flexibility in stacktrace output
- New MCA option: opal_stacktrace_output
   - Specifies where the stack trace output stream goes.
   - Accepts: none, stdout, stderr, file[:filename]
   - Default filename 'stacktrace'
     - Filename will be `stacktrace.PID`, or if VPID is available,
       then the filename will be `stacktrace.VPID.PID`
 - Update util/stacktrace to allow for different output avenues
   including files. Previously this was hardcoded to 'stderr'.
 - Since opal_backtrace_print needs to be signal safe, passing it a
   FILE object that actually represents a file stream is difficult. This
   is because we cannot open the file in the signal handler using
   `fopen` (not safe), but have to use `open` (safe). Additionally, we
   cannot use `fdopen` to convert the `int fd` to a `FILE *fh` since it
   is also not signal safe.
   - I did not want to break the backtrace.h API so I introduced a new
     rule (documented in `backtrace.c`) that if the `FILE *file`
     argument is `NULL` then look for the `opal_stacktrace_output_fileno`
     variable to tell you which file descriptor to use for output.
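
The signal-safety constraint in the list above can be illustrated with a small sketch: the file name is built ahead of time (snprintf is not async-signal-safe), and the code meant to run from the handler uses only open, write, and close, which POSIX lists as async-signal-safe. The names trace_path, prepare_trace_path, and write_trace_line are hypothetical; this is not the code in util/stacktrace.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Filled in during normal start-up, long before any signal arrives. */
static char trace_path[64];

/* Not signal safe (uses snprintf): call this at init time only. */
static void prepare_trace_path(void)
{
    snprintf(trace_path, sizeof(trace_path), "stacktrace.%d", (int)getpid());
}

/* Intended to be callable from a signal handler: it touches only a raw
 * file descriptor and never a FILE stream. Returns 0 on success. */
static int write_trace_line(const char *msg, size_t len)
{
    int fd = open(trace_path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0) {
        return -1;
    }
    ssize_t written = write(fd, msg, len);
    close(fd);
    return (written == (ssize_t)len) ? 0 : -1;
}
```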

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-01-26 11:55:32 -06:00
Nathan Hjelm
fe1c6bd881 Merge pull request #2840 from hjelmn/event_fix
verbs: remove extra event user increment/decrement operation
2017-01-26 07:30:24 -08:00
Gilles Gouaillardet
896434b1bd pmix/ext2x: plug a memory leak in opal_lkupcbfunc()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-26 14:07:15 +09:00
Gilles Gouaillardet
6b8e1c217c pmix/ext2x: plug misc memory leaks
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-26 14:06:58 +09:00
Nathan Hjelm
9f28c0af39 verbs: remove extra event user increment/decrement operation
Since the oob and connections systems do not work the same way they
did in older versions of Open MPI these operations are no longer
necessary. At best they do nothing and at worst they hurt performance
by making us enter the event library more often in opal_progress().

Fixes #2839

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-01-25 18:37:06 -07:00
Gilles Gouaillardet
142b95df87 pmix/ext2x: plug misc memory leaks regarding opal_pmix2x_event_chain_t handling
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-25 16:17:10 +09:00
Gilles Gouaillardet
7a3d39f079 pmix/ext2x: plug a memory leak in _reg_nspace()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-25 16:17:01 +09:00
Joshua Hursey
dcd9801f7c orte/iof: Add orte_map_stddiag_to_stdout option
* Similar to `orte_map_stddiag_to_stderr` except it redirects `stddiag`
   to `stdout` instead of `stderr`.
 * Add protection so that the user cannot supply both:
   - `orte_map_stddiag_to_stderr`
   - `orte_map_stddiag_to_stdout`

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-01-24 16:22:59 -06:00
Joshua Hursey
d6b306d716 libevent/external: Add opal_event_include to this component
* Adds a parameter to adjust the method used by libevent.
   - Matches that of the libevent2022 component.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-01-24 16:03:09 -06:00
Ralph Castain
ef86707fbe Deprecate the --slot-list parameter in favor of --cpu-list. Remove the --cpu-set param (mark it as deprecated) and use --cpu-list instead as it was confusing having the two params. The --cpu-list param defines the cpus to be used by procs of this job, and the binding policy will be overlaid on top of it.
Note: since the discovered cpus are filtered against this list, #slots will be set to the #cpus in the list if no slot values are given in a -host or -hostname specification.

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-24 13:33:22 -08:00
Josh Hursey
c6595c2289 Merge pull request #2792 from jjhursey/topic/libevent-conf2
libevent2022: Fix broken configure AC_LANG_PROGRAM
2017-01-24 08:31:46 -06:00
Gilles Gouaillardet
682f5116aa Merge pull request #2781 from ggouaillardet/topic/misc_fixes_and_plugs
fix misc bugs and plug misc memory leaks
2017-01-24 14:41:45 +09:00
Joshua Hursey
72ac812039 libevent2022: Fix broken configure AC_LANG_PROGRAM
* Similar to commit 029964a748
   This removes an extra `int main` during configure.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-01-23 21:47:59 -06:00
Gilles Gouaillardet
189da7fdab pmix2x: plug a memory leak in _event_hdlr()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:13:30 +09:00
Gilles Gouaillardet
acbc32d3b2 pmix2x: plug a memory leak in opal_lkupcbfunc()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:13:29 +09:00
Gilles Gouaillardet
b5b21043c4 pmix2x: plug a memory leak in _reg_nspace()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:13:29 +09:00
Gilles Gouaillardet
0f47310a75 pmix2x/pmix2x_client: plug misc memory leaks
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-24 09:13:29 +09:00
Joshua Hursey
029964a748 libevent2022: Fix broken configure AC_LANG_PROGRAM
* The AC_LANG_PROGRAM macro adds the `main()` so it is erroneous
   to add it to the test program.
 * This was detected with the XL compilers which will fail to
   build the program in this situation. The GNU compiler does not
   error out or warn, but successfully compiles the program.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
2017-01-23 13:44:12 -06:00
Ralph Castain
8c960bae8d Update to latest PMIx master
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-23 07:07:40 -08:00
George Bosilca
999d4973a9 Fix an issue with extremely large data identified by tjb900.
Due to the conversion from ssize_t to int we were losing bytes, and
ended up writing outside the receiver buffer. Similarly on the send,
due to the conversion to a narrower type, we could misinterpret the
end of the fragment.
2017-01-18 10:33:12 -05:00
Nathan Hjelm
91c34c8df6 Merge pull request #2703 from hjelmn/rcache_fix
rcache/base: do not release vma structures in vma_tree_delete
2017-01-12 09:53:34 -07:00
Jeff Squyres
938ab01ad6 Merge pull request #2714 from hjelmn/timer_rollover
timer/linux: prevent 64-bit overflow
2017-01-12 06:40:52 -05:00
Nathan Hjelm
45c05880aa timer/linux: prevent 64-bit overflow
The linux timer code was multiplying the result of the x86 time stamp
counter by 1000000 before dividing by the cpu frequency. This can
cause us to overflow 64 bits if the time stamp counter grows larger
than ~ 1.8e13 (about 8400 seconds after boot). To fix the issue the
units of opal_timer_linux_freq have been changed to MHz.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-01-11 20:03:10 -07:00
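The overflow described in 45c05880aa can be reproduced with plain 64-bit arithmetic. The sketch below is illustrative only: the function names are hypothetical and are not taken from the Open MPI source. It contrasts multiplying the counter by 1000000 before dividing by a frequency in Hz with dividing directly by a frequency in MHz.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pre-fix conversion: cycles to microseconds with the
 * frequency in Hz. The product wraps modulo 2^64 once tsc exceeds
 * UINT64_MAX / 1000000 (about 1.8e13). */
static uint64_t usec_from_tsc_hz(uint64_t tsc, uint64_t freq_hz)
{
    assert(freq_hz != 0);
    return tsc * UINT64_C(1000000) / freq_hz;
}

/* Hypothetical post-fix conversion: the frequency is kept in MHz
 * (cycles per microsecond), so no multiplication is needed. */
static uint64_t usec_from_tsc_mhz(uint64_t tsc, uint64_t freq_mhz)
{
    assert(freq_mhz != 0);
    return tsc / freq_mhz;
}
```

Below the threshold the two functions agree; above it only the MHz variant keeps increasing with the counter. At an assumed 2.2 GHz clock the threshold is reached roughly 8400 seconds after the counter starts.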
Gilles Gouaillardet
aeee48357a btl/sm: correctly handle nodes with zero NUMA hwloc objects
With hwloc < v2, the hwloc topology might not contain a NUMA object
if the node is not NUMA, so force the NUMA object count to one
in order to correctly allocate mca_btl_sm_component.sm_mpools.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-12 11:45:29 +09:00
George Bosilca
c2cd717f82 Don't refcount the predefined datatypes.
Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-01-11 16:48:59 -05:00
Ralph Castain
31a8476223 Merge pull request #2702 from rhc54/topic/cov
Silence Coverity CID 1398541
2017-01-10 17:50:23 -08:00
Nathan Hjelm
79cabc92fd rcache/base: do not release vma structures in vma_tree_delete
This commit fixes a deadlock that can occur when the libc version
holds a lock when calling munmap. In this case we could end up calling
free() from vma_tree_delete which would in turn try to obtain the lock
in libc. To avoid the issue put any deleted vma's in a new list on the
vma module and release them on the next call to vma_tree_insert. This
should be safe as this function is not called from the memory hooks.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-01-10 16:58:07 -07:00
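The deferred-release idea in 79cabc92fd can be sketched as follows. This is a minimal illustration, not the rcache code: the types and function names are hypothetical stand-ins, and locking is omitted for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a vma item and the module that owns it. */
typedef struct vma_item { struct vma_item *next; } vma_item_t;
typedef struct { vma_item_t *deferred; } vma_module_t;

/* Delete path: may run from a memory hook while libc holds its lock,
 * so it must not call free(). The item is only queued for later. */
static void vma_defer_release(vma_module_t *m, vma_item_t *item)
{
    assert(m != NULL && item != NULL);
    item->next = m->deferred;
    m->deferred = item;
}

/* Insert path: not reached from the memory hooks, so it is a safe place
 * to free everything queued earlier. Returns the number of items freed. */
static size_t vma_release_deferred(vma_module_t *m)
{
    size_t n = 0;
    assert(m != NULL);
    while (m->deferred != NULL) {
        vma_item_t *item = m->deferred;
        m->deferred = item->next;
        free(item);
        n++;
    }
    return n;
}
```

The trade-off is that deleted items stay allocated until the next insert, in exchange for never re-entering the allocator from the delete path.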
Ralph Castain
e568b211e4 Silence Coverity CID 1398541
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2017-01-10 15:30:50 -08:00
Jeff Squyres
b980e334dc usnic: add completion stats
This should probably not go to the v2.x branch, since it changes the
output format of the usnic stats.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:54 -08:00
Jeff Squyres
706f53bb01 usnic: ensure that stats string is always truncated
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:54 -08:00
Jeff Squyres
1fdd0fe228 usnic: add missing params to show_help() call
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:54 -08:00
Jeff Squyres
7048adec04 usnic: add some assert()s
Add some run-time assert checks for debug builds.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:32 -08:00
Jeff Squyres
2d28ccb5fd usnic: add verbose output of queue lengths
Show the actual RX/TX and CQ length returned by libfabric in verbose
output.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:32 -08:00
Jeff Squyres
bd5b8ed754 usnic: ensure that queues are long enough
Double check the queue lengths that we get back from libfabric to
ensure that they are at least as long as we need.  They *should* never
be shorter than we need, but let's just check to be sure.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:32 -08:00
Jeff Squyres
53dc75a89c usnic: ensure to reset flags on returned frags
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:31 -08:00
Jeff Squyres
c4d7876ca0 usnic: check send credits on data channel for data frags
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:31 -08:00
Jeff Squyres
879d25e5df usnic: ensure to check send credits for ACKs
Don't just blindly send ACKs; ensure that we have send credits before
doing so.  If we don't have any send credits, just don't send the ACK
(it'll come again soon enough; it's not a tragedy if we don't send it
now).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:31 -08:00
Jeff Squyres
7787dad4db usnic: ensure CQs are long enough
The libfabric usnic provider may give you back TX/RX queues that are
longer than you asked for.  So just use the TX/RQ/CQ lengths that we
asked for, regardless of what length comes back.

Additionally, keep the length of the priority channel CQ separate from
the length of the data CQ.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:03:53 -08:00
Jeff Squyres
b02d8c48f5 usnic: make the releasing safer
Since the usnic BTL is single-threaded in this area, there really is
no danger, but don't use one of the pointers hanging off the frag
after we return it to the freelist.  Instead, save the endpoint
pointer before returning the frag.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:03:53 -08:00
Jeff Squyres
e25b860627 usnic: clarify types
The types are technically typedef equivalent, but it's less confusing
to use the types that agree with the name of the constructor.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:03:53 -08:00
Jeff Squyres
40fe575132 usnic: trivial updates (no code/logic changes)
- Add more explanatory comments
- Trivial whitespace / style updates
- Rename opal_btl_usnic_force_retrans() -> opal_btl_usnic_fast_retrans()

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 10:40:02 -08:00
Gilles Gouaillardet
6d59b476de Merge pull request #2686 from ggouaillardet/topic/pmix2x_ptl_base_sendrecv
pmix2x: ptl/base: send header and message data together via writev()
2017-01-10 16:26:10 +09:00
Gilles Gouaillardet
44c1ff60f1 Merge pull request #2672 from ggouaillardet/topic/misc_memory_leaks
Plug misc memory leaks
2017-01-10 13:16:04 +09:00
Gilles Gouaillardet
a01960bee5 pmix2x: ptl/base: send header and message data together via writev()
On Linux, sending the header and then the message data separately severely
degrades the performance of ptl/tcp: on the receiver, reading the data can
often result in a PMIX_ERR_RESOURCE_BUSY or PMIX_ERR_WOULD_BLOCK.
This commit sends both header and message data at the same time via writev()
and makes ptl/tcp virtually as efficient as ptl/usock.

A short writev() generally occurs when the kernel buffer is full, so there is
no point in retrying in this case.

FWIW, no such degradation was observed on OS X.

Refs open-mpi/ompi#2657

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-10 13:07:39 +09:00
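The technique named in a01960bee5, sending header and payload with one writev() call, can be sketched as below. This is a minimal POSIX example under stated assumptions: the helper names are hypothetical, the demo uses a pipe instead of a TCP socket, and it is not the PMIx ptl/base code.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: hand header and payload to the kernel in a single
 * writev() call instead of two separate writes. Returns what writev()
 * returns; a short count is reported to the caller rather than retried. */
static ssize_t send_hdr_and_data(int fd, const void *hdr, size_t hdr_len,
                                 const void *data, size_t data_len)
{
    struct iovec iov[2];
    assert(hdr != NULL && data != NULL);
    iov[0].iov_base = (void *)hdr;    /* writev() only reads the buffers */
    iov[0].iov_len  = hdr_len;
    iov[1].iov_base = (void *)data;
    iov[1].iov_len  = data_len;
    return writev(fd, iov, 2);
}

/* Small demo over a pipe: returns the number of bytes the reader sees
 * in one read() after a single gathered write, or -1 on error. */
static ssize_t demo_roundtrip(void)
{
    int fds[2];
    char buf[32];
    if (pipe(fds) != 0) {
        return -1;
    }
    ssize_t sent = send_hdr_and_data(fds[1], "HDR", 4, "payload", 8);
    ssize_t got = (sent > 0) ? read(fds[0], buf, sizeof buf) : -1;
    close(fds[0]);
    close(fds[1]);
    return got;
}
```

With one gathered write, the receiver is less likely to find the header available without the data that follows it, which is the situation the commit message associates with the busy or would-block results.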
Nathan Hjelm
d6bd69dc93 mca/base: account for NULL string_value in verbose set
The MCA variable code calls the string-from-value function with a NULL
string pointer to verify values. The verbosity enumerator was not correctly
checking for a non-NULL pointer before trying to set the string.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-01-09 11:52:31 -07:00
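The fix in d6bd69dc93 follows a common pattern for callbacks whose output argument is optional. The sketch below is hypothetical: the function name, return codes, and level names are illustrative assumptions and do not reproduce the MCA enumerator interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical enumerator callback: returns 0 for a valid level and -1
 * otherwise. string_value may be NULL when the caller only wants to
 * validate the value, so it is checked before being written through. */
static int level_string_from_value(int value, const char **string_value)
{
    /* Illustrative level names, not the real verbosity levels. */
    static const char *names[] = { "none", "error", "warn", "info" };
    const int count = (int)(sizeof names / sizeof names[0]);

    if (value < 0 || value >= count) {
        return -1;
    }
    assert(names[value] != NULL);
    if (string_value != NULL) {
        *string_value = names[value];
    }
    return 0;
}
```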