Commit graph

887 commits

Author SHA1 Message Date
George Bosilca
bc2890ed11 Upon a new connection go over all available ifaces.
Add a verbose to show all the failed attempts to match the
remote interfaces based on the modex info.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-02-07 19:15:49 -05:00
Gilles Gouaillardet
c62498ab3d btl/tcp: remove reference to just removed tcp_local
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-02-07 09:32:09 +09:00
Jeff Squyres
368ab4d9a5 Merge pull request #2684 from bosilca/topic/tcp_fixes
Remove the tcp_local field from the TCP component.
2017-02-06 16:32:06 -05:00
Nathan Hjelm
9f28c0af39 verbs: remove extra event user increment/decrement operation
Since the oob and connections systems do not work the same way they
did in older versions of Open MPI these operations are no longer
necessary. At best they do nothing and at worst they hurt performance
by making us enter the event library more often in opal_progress().

Fixes #2839

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2017-01-25 18:37:06 -07:00
George Bosilca
999d4973a9 Fix an issue with extremely large data identified by tjb900.
Due to the conversion from ssize_t to int we were losing bytes, and
ended up writing outside the receiver buffer. Similarly on the send,
due to the conversion to a lesser type, we could misinterpret the
end of the fragment.
2017-01-18 10:33:12 -05:00
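The truncation described in 999d4973a9 can be illustrated with a minimal sketch. The helper names below are hypothetical and are not taken from the TCP BTL; the sketch assumes a platform where `ssize_t` is wider than `int`, and only shows why a byte count should stay in a wide type from end to end.

```c
#include <limits.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical helper: returns 1 if a byte count survives a conversion
 * to int, 0 if narrowing it would lose information. */
static int count_fits_in_int(ssize_t nbytes)
{
    return nbytes >= INT_MIN && nbytes <= INT_MAX;
}

/* Hypothetical helper: advance a buffer offset using wide types only,
 * so a very large (non-negative) count cannot be truncated on the way. */
static size_t advance_offset(size_t offset, ssize_t nbytes)
{
    return offset + (size_t) nbytes;
}
```

A count of `INT_MAX + 1` bytes is rejected by `count_fits_in_int()`, which is the case where an `int` variable would have silently held the wrong value.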
Gilles Gouaillardet
aeee48357a btl/sm: correctly handle nodes with zero NUMA hwloc object
the hwloc topology might not contain a NUMA object with hwloc < v2
if the node is not NUMA, so force the NUMA object count to one
in order to correctly allocate mca_btl_sm_component.sm_mpools.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-12 11:45:29 +09:00
Jeff Squyres
b980e334dc usnic: add completion stats
This should probably not go to the v2.x branch, since it changes the
output format of the usnic stats.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:54 -08:00
Jeff Squyres
706f53bb01 usnic: ensure that stats string is always truncated
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:54 -08:00
Jeff Squyres
1fdd0fe228 usnic: add missing params to show_help() call
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:54 -08:00
Jeff Squyres
7048adec04 usnic: add some assert()s
Add some run-time assert checks for debug builds.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:32 -08:00
Jeff Squyres
2d28ccb5fd usnic: add verbose output of queue lengths
Show the actual RX/TX and CQ length returned by libfabric in verbose
output.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:32 -08:00
Jeff Squyres
bd5b8ed754 usnic: ensure that queues are long enough
Double check the queue lengths that we get back from libfabric to
ensure that they are at least as long as we need.  They *should* never
be shorter than we need, but let's just check to be sure.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:32 -08:00
Jeff Squyres
53dc75a89c usnic: ensure to reset flags on returned frags
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:31 -08:00
Jeff Squyres
c4d7876ca0 usnic: check send credits on data channel for data frags
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:31 -08:00
Jeff Squyres
879d25e5df usnic: ensure to check send credits for ACKs
Don't just blindly send ACKs; ensure that we have send credits before
doing so.  If we don't have any send credits, just don't send the ACK
(it'll come again soon enough; it's not a tragedy if we don't send it
now).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:06:31 -08:00
Jeff Squyres
7787dad4db usnic: ensure CQs are long enough
The libfabric usnic provider may give you back TX/RX queues that are
longer than you asked for.  So just use the TX/RQ/CQ lengths that we
asked for, regardless of what length comes back.

Additionally, keep the length of the priority channel CQ separate from
the length of the data CQ.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:03:53 -08:00
Jeff Squyres
b02d8c48f5 usnic: make the releasing safer
Since the usnic BTL is single-threaded in this area, there really is
no danger, but don't use one of the pointers hanging off the frag
after we return it to the freelist.  Instead, save the endpoint
pointer before returning the frag.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:03:53 -08:00
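The pattern in b02d8c48f5 (save the endpoint pointer, then return the frag) can be sketched as follows. The structures and function names are simplified, hypothetical stand-ins; the real usnic BTL types differ.

```c
#include <stddef.h>

/* Hypothetical, simplified types for illustration only. */
struct endpoint { int pending_sends; };
struct frag { struct endpoint *endpoint; struct frag *next; };

static struct frag *freelist = NULL;

/* Returning a frag hands ownership back to the free list; its fields
 * must be treated as invalid from this point on. */
static void frag_return(struct frag *f)
{
    f->endpoint = NULL;
    f->next = freelist;
    freelist = f;
}

static void send_complete(struct frag *f)
{
    struct endpoint *ep = f->endpoint;  /* save before releasing the frag */
    frag_return(f);
    ep->pending_sends--;                /* safe: uses the saved copy */
}
```

Reading `f->endpoint` after `frag_return()` would dereference a field the free list is entitled to overwrite, which is what the commit avoids.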
Jeff Squyres
e25b860627 usnic: clarify types
The types are technically typedef equivalent, but it's less confusing
to use the types that agree with the name of the constructor.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 12:03:53 -08:00
Jeff Squyres
40fe575132 usnic: trivial updates (no code/logic changes)
- Add more explanatory comments
- Trivial whitespace / style updates
- Rename opal_btl_usnic_force_retrans() -> opal_btl_usnic_fast_retrans()

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2017-01-10 10:40:02 -08:00
George Bosilca
cfeeecd381 Remove the tcp_local field from the TCP component.
Instead use the OPAL process name to get the name
of the local process.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2017-01-07 13:24:18 -05:00
Gilles Gouaillardet
c2ddb1e2fc mca/base: plug a memory leak
register mca_base_var_enum_value_flag_t so they can be free'd
upon finalize

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-06 13:46:36 +09:00
Gilles Gouaillardet
7e5da7382e btl/tcp: plug leaks when closing component
remove tcp_local from the tcp_procs table, and release it

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-06 11:35:59 +09:00
Ralph Castain
fe68f23099 Only instantiate the HWLOC topology in an MPI process if it actually will be used.
There are only five places in the non-daemon code paths where opal_hwloc_topology is currently referenced:

* shared memory BTLs (sm, smcuda). I have added a code path to those components that uses the location string
  instead of the topology itself, if available, thus avoiding instantiating the topology

* openib BTL. This uses the distance matrix. At present, I haven't developed a method
  for replacing that reference. Thus, this component will instantiate the topology

* usnic BTL. Uses the distance matrix.

* treematch TOPO component. Does some complex tree-based algorithm, so it will instantiate
  the topology

* ess base functions. If a process is direct launched and not bound at launch, this
  code attempts to bind it. Thus, procs in this scenario will instantiate the
  topology

Note that instantiating the topology on complex chips such as KNL can consume
megabytes of memory.

Fix pernode binding policy

Properly handle the unbound case

Correct pointer usage

Do not free static error messages!

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-29 10:33:29 -08:00
Ralph Castain
3a2d6a5ab6 Begin to reduce reliance of application procs on the topology tree itself by having the daemon provide more detailed info. In this case, provide the topology description string so that procs can readily determine the number of types of objects on the node, and a "locality" string that describes which objects this process is executing upon. The latter allows a process to compute the objects of overlap between itself and another proc without consulting the topology tree.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-28 09:14:26 -08:00
Gilles Gouaillardet
54c84196a6 btl/vader: plug a memory leak
as reported by Coverity with CID 1362691

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-22 16:04:36 +09:00
Gilles Gouaillardet
3a76a78bff btl/openib: plug a memory leak in btl_openib_register_mca_params()
Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-12-01 14:24:30 +09:00
Gilles Gouaillardet
1a279c4ee9 btl/self: fix fragment segment length in mca_btl_self_prepare_src()
opal_convertor_pack() might pack fewer bytes than requested,
so always set frag->segments[0].seg_len.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-11-24 10:44:56 +09:00
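The fix in 1a279c4ee9 comes down to recording the number of bytes actually packed rather than the number requested. A minimal sketch, using a hypothetical `pack_bytes()` in place of opal_convertor_pack() and a simplified segment type:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical packer: on entry *packed is the requested size; on return
 * it is the number of bytes actually copied (at most `avail`). */
static void pack_bytes(void *dst, const void *src, size_t avail, size_t *packed)
{
    if (*packed > avail) {
        *packed = avail;
    }
    memcpy(dst, src, *packed);
}

/* Simplified stand-in for a BTL segment descriptor. */
struct segment { void *addr; size_t seg_len; };

static void prepare_segment(struct segment *seg, void *buf,
                            const void *src, size_t src_len, size_t requested)
{
    size_t packed = requested;
    pack_bytes(buf, src, src_len, &packed);
    seg->addr = buf;
    seg->seg_len = packed;  /* always the packed size, never `requested` */
}
```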
Howard Pritchard
09f47fcf8e btl/ugni:vader swat some compiler warnings
Swat some compiler warnings.

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2016-11-21 14:58:34 -06:00
Joshua Ladd
4907085c6f Add the ConnectX-5 device ID to openib BTL.
Signed-off-by: Joshua Ladd <jladd.mlnx@gmail.com>
2016-11-16 21:42:37 +02:00
George Bosilca
d0dddef53d Protect the tcp_endpoints list from concurrent accesses.
Thanks Gilles for your help.

Signed-off-by: George Bosilca <bosilca@icl.utk.edu>
2016-11-11 00:06:03 -05:00
Gilles Gouaillardet
a49422fe84 btl/tcp: get rid of the MCA_BTL_TCP_SUPPORT_PROGRESS_THREAD macro
since pthreads are now mandatory, the MCA_BTL_TCP_SUPPORT_PROGRESS_THREAD
is always true and hence can be safely removed

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2016-11-08 14:00:05 +09:00
Jeff Squyres
a4ffa590c8 Merge pull request #2308 from hjelmn/vader_mem
btl/vader: reduce memory footprint when using xpmem
2016-11-02 10:28:26 -04:00
Steve Wise
7050969d47 openib btl: remove BTL_OPENIB_FAILOVER_ENABLED code
Remove BTL_OPENIB_FAILOVER_ENABLED code in the openib btl source.

Remove the failover-specific files from the openib btl.

Update the openib/Makefile.am accordingly.

Remove the -enable-openib-failover config logic.

Signed-off-by: Steve Wise <swise@opengridcomputing.com>
2016-11-01 14:45:36 -07:00
Jeff Squyres
149b660666 btl/usnic: fix compiler warning
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-10-28 07:36:20 -07:00
Nathan Hjelm
9d92075e60 btl/self: rewrite to decrease memory usage (#2307)
This commit rewrites much of the btl/self component to fix a long
standing memory usage bug. Before this commit the prepare_src path
would always allocate a max send fragment (256kB). This caused the
rank to allocate 32 * 256k useless buffers from one send. This commit
makes the following changes:

 - Add the MCA_BTL_FLAGS_GET flag by default. No reason not to set it.

 - Reduce the eager limit, max send size, buffers per allocation, and
   maximum buffer count per fragment size. These changes should have
   no noticeable effect on performance but should greatly reduce the
   memory usage of the component.

 - Implement the sendi function. This should reduce self send latency
   somewhat.

 - Rewrite prepare_src to never allocate an eager or max send fragment
   for contiguous data.

 - add_procs needs to return something in the peer array for the proc
   self not just set the reachability bit. Now stores (void *) 1.

 - Various cleanups. Removed an unused file.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-10-27 12:34:54 -04:00
Nathan Hjelm
a652a193ea btl/vader: reduce memory footprint when using xpmem
The vader btl kept a per-peer registration cache to keep track of
attachments. This is not really a problem with small numbers of local
ranks but can be a problem with large SMP machines. To reduce the
footprint there is now one registration cache for all xpmem
attachments. This will probably increase the lookup time for large
transfers but is a worthwhile trade-off.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-10-27 10:09:43 -06:00
Potnuri Bharat Teja
29f1aa836f btl/openib: remove unwanted ompi header inclusion in opal code.
OMPI header cannot be included in OPAL source code, hence removed it.
Fixes: (740b636db) btl/openib: Disqualify rdmacm CPC if
MPI_THREAD_MULTIPLE.

Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
2016-10-13 16:21:36 +05:30
Jeff Squyres
bcbf0bc4f9 usnic: s/OMPI/OPAL/
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-10-11 16:43:35 -07:00
Gilles Gouaillardet
c92e9a5406 use the new OPAL_HASH_TABLE_FOREACH convenience macro
2016-10-08 16:58:20 +09:00
Jeff Squyres
67684be7c9 usnic: fix one last stray fabric_attr->name --> linux_device_name
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-10-04 18:17:38 -07:00
Jeff Squyres
8b77359cac usnic: remove some legacy libfabric 1.0/1.1 code
We only support running with libfabric v1.3 or greater.  So it's safe
to remove the legacy/adaptive cq_readerr() behavior.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-10-03 11:59:41 -07:00
Jeff Squyres
345c07a252 usnic: require libfabric >= v1.3 at run time
There are critical usnic libfabric AV insert bugs before v1.3, so
don't allow any version prior to v1.3 at run time (still allow
*compiling* with earlier versions, though, since the ABI guarantees
allow us to compile with an earlier libfabric and run with a later
libfabric).

Switch to using fi_version() to check the version (instead of calling
fi_getinfo()) as a potentially lighter-weight / simpler solution.
This allows us to only call fi_getinfo() once.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-10-03 11:59:41 -07:00
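The run-time check described in 345c07a252 amounts to comparing a packed major/minor version against a minimum. The sketch below assumes the version is packed with the major number in the high 16 bits and the minor number in the low 16 bits; the macros and the function are local, hypothetical stand-ins so that the example compiles without libfabric. In real code the `runtime` argument would come from fi_version().

```c
#include <stdint.h>

/* Local stand-ins for the version packing (assumption: major in the
 * high 16 bits, minor in the low 16 bits). Names are hypothetical. */
#define SKETCH_VERSION(major, minor) \
    (((uint32_t) (major) << 16) | (uint32_t) (minor))
#define SKETCH_MAJOR(v) ((v) >> 16)
#define SKETCH_MINOR(v) ((v) & 0xFFFFu)

/* Returns 1 if `runtime` is at least min_major.min_minor, else 0. */
static int version_at_least(uint32_t runtime,
                            uint32_t min_major, uint32_t min_minor)
{
    uint32_t major = SKETCH_MAJOR(runtime);
    uint32_t minor = SKETCH_MINOR(runtime);
    return major > min_major || (major == min_major && minor >= min_minor);
}
```

Because the check only needs the packed version number, it can run before any call that enumerates providers.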
Jeff Squyres
b13813810f usnic: print a helpful message invoke PML error callback
The previous message was unhelpful / confusing.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-10-03 11:59:41 -07:00
Jeff Squyres
545d8f2e66 usnic cagent: correctly compute the "large" ping message size
The (effective) "+42" computation was, in fact, the incorrect answer
in this case (gasp!).

We should just take the max_msg_size from the command (which came from
the libfabric endpoint max_msg_size attribute in the client) and
subtract off the max header size: 68 (which is explained in the
comment).  This will result in a "large" message size which is likely
slightly smaller than the MTU, but still right up near the MTU, and
therefore good enough.

Note: the old computation (i.e., -(68-42)) worked fine when we asked
for Libfabric API v1.1 because the usnic provider would return a
max_msg_size that was already less than the MTU due to FI_PREFIX
behavior shenanigans.  Once we started asking for Libfabric API v1.4,
the usnic Libfabric provider started returning (MTU + prefix_size),
and the -(68-42) computation started giving a value that was over the
MTU.  This caused sendto() on the connectivity checker UDP socket
to fail.

This commit also removes an old/misleading comment.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-09-30 17:01:05 -07:00
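The arithmetic in 545d8f2e66 can be checked with a small sketch. The constants 68 and 42 come from the commit message; the function names are hypothetical, and the MTU of 9000 and prefix size of 42 used in the note below are assumed values chosen only to make the numbers concrete.

```c
#include <stdint.h>

#define MAX_HEADER_SIZE 68u  /* max header size named in the commit message */

/* Old computation from the commit message: max_msg_size - (68 - 42). */
static uint32_t old_large_ping_size(uint32_t max_msg_size)
{
    return max_msg_size - (68u - 42u);
}

/* New computation: subtract the full max header size. */
static uint32_t new_large_ping_size(uint32_t max_msg_size)
{
    return max_msg_size - MAX_HEADER_SIZE;
}
```

With the assumed values, a provider that reports `max_msg_size = 9000 + 42 = 9042` yields 9016 from the old formula, which exceeds the MTU, and 8974 from the new one, which stays just under it.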
Joshua Hursey
f6f24a4f67 build: Custom libmpi(_FOO) name option in configure
* Add a configure time option to rename libmpi(_FOO).*
   - `--with-libmpi-name=STRING`
 * This commit only impacts the installed libraries.
   Internal, temporary libraries have not been renamed to limit the
   scope of the patch to only what is needed.

For example:
```shell
shell$ ./configure --with-libmpi-name=wookie
...
shell$ find . -name "libmpi*"
shell$ find . -name "libwookie*"
./lib/libwookie.so.0.0.0
./lib/libwookie.so.0
./lib/libwookie.so
./lib/libwookie.la
./lib/libwookie_mpifh.so.0.0.0
./lib/libwookie_mpifh.so.0
./lib/libwookie_mpifh.so
./lib/libwookie_mpifh.la
./lib/libwookie_usempi.so.0.0.0
./lib/libwookie_usempi.so.0
./lib/libwookie_usempi.so
./lib/libwookie_usempi.la
shell$
```
2016-09-29 21:47:24 -05:00
Jeff Squyres
1a5a5fb400 Merge pull request #1861 from bharatpotnuri/master
btl/openib: Disqualify rdmacm CPC if MPI_THREAD_MULTIPLE
2016-09-27 13:03:35 -04:00
Potnuri Bharat Teja
740b636dbe btl/openib: Disqualify rdmacm CPC if MPI_THREAD_MULTIPLE
The rdmacm CPC in the openib BTL is not thread safe. The rdmacm CPC
should disqualify itself (instead of failing in random ways) if
MPI_THREAD_MULTIPLE is the thread level.

Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
2016-09-27 14:20:59 +05:30
George Bosilca
93fa94f96f Re-enable support for local addresses.
This patch is based on the "RFC: Reenabling the TCP BTL over local
interfaces (when specifically requested)". It removes the hardcoded
exception for the local devices that has been enforced by the
TCP BTL. Instead, we exclude the local interface only via the
exclude MCA (both IPv4 and IPv6 local addresses are already in the
default if_exclude), which is also the behavior currently described in
our README file.
2016-09-23 13:04:33 -04:00
Nathan Hjelm
a681837ba8 btl/tcp: fix double list remove
This commit fixes an abort during finalize because pending events were
removed from the list twice.

References #2030

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-09-13 09:23:12 -06:00
Jeff Squyres
527efec4fb Merge pull request #2050 from jsquyres/pr/btl-tcp-help-messages
Add a show_help message to TCP BTL when peer unexpectedly disconnects
2016-09-06 09:40:31 -04:00
Jeff Squyres
1953e3406f btl/tcp: add show_help message when peer hangs up
We commonly see messages on the users list where a peer has hung up
because it has crashed.  Instead of having just a BTL_ERROR message,
make this a real opal_show_help() message that tells the user that the
peer unexpectedly hung up, and they should look into *why* that peer
hung up.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-09-06 09:40:03 -04:00
Gilles Gouaillardet
4b208e4463 btl/tcp: make mca_btl_tcp_proc_insert re-entrant
otherwise bad things happen with
 --mca btl_tcp_progress_thread 1 (non default)
and
 --mca mpi_add_procs_cutoff 0 (default)
2016-09-05 15:57:34 +09:00
Jeff Squyres
95c6f6cfc0 btl/tcp: fix help message
It looks like one help message was accidentally pasted in the middle
of another.  Disentangle the two messages from each other, and
slightly tweak the one message to say that the job may also crash (in
addition to hanging).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-09-02 17:14:22 -04:00
Nathan Hjelm
f93c1f2106 btl/ugni: fix erroneous warning message
This commit prevents the connection code from trying to connect an
endpoint if the directed datagram has been posted but not received.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-09-02 09:17:44 -06:00
Jeff Squyres
87a5ccc060 usnic: show the local UDP ports
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-26 12:25:18 -07:00
Jeff Squyres
9ae51a09f2 Merge pull request #1989 from jsquyres/pr/update-usnic-to-libfabric-v1.4
Update usnic BTL to libfabric v1.4
2016-08-26 09:53:07 -04:00
Jeff Squyres
f56b16f079 usnic: remove unused variable
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-25 03:53:18 -07:00
Jeff Squyres
9717bcb7e6 btl/usnic: remove stale comment
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-25 03:53:18 -07:00
Jeff Squyres
6f5e377fe0 btl/usnic: update for libfabric v1.4
With libfabric v1.4, the usnic provider changed the values of its
fabric and domain name strings (compared to libfabric <v1.4).  Update
the Open MPI usNIC BTL to handle both pre-v1.4 and v1.4 fabric/domain
names.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-08-25 03:53:17 -07:00
George Bosilca
3adff9d323 Fixes #1793.
Reshape the tearing down process (connection close) to prevent race
conditions between the main thread and the progress thread.

Minor cleanups.
2016-08-24 22:45:19 -04:00
Nathan Hjelm
83062db7cb btl/ugni: actually make the endpoint lock recursive
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-08-24 10:36:08 -06:00
Potnuri Bharat Teja
9b7f9ece20 Add Chelsio T6 adapter device parameters.
Signed-off-by: Potnuri Bharat Teja <bharat@chelsio.com>
2016-08-23 10:38:13 +05:30
Nathan Hjelm
adb668209b btl/ugni: fix another connection race
This commit fixes a race that can occur when two threads are in the
ugni progress function at the same time. This race occurs when one
thread calls GNI_PostDataProbeById then goes to sleep then another
thread calls GNI_PostDataProbeById then GNI_EpPostDataWaitById before
the other thread wakes up. If this happens the first thread will print
a warning on GNI_EpPostDataWaitById about no matching post.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-08-08 15:38:11 -06:00
Todd Kordenbrock
b90da992c8 Merge pull request #1895 from PDeveze/Patchs-on-btl-portals4
btl/portals4: Take into account the limitation of portals4 (max_msg_s…
2016-08-08 15:12:50 -05:00
Nathan Hjelm
14b36d4503 btl/ugni: protect against re-entry and races in connections
This commit fixes two issues that can occur during a connection:

 - Re-entry to connection progress from modex lookup. Added an
   additional endpoint state that will keep the code from re-entering
   the common endpoint create.

 - Fixed a race between a process posting a directed datagram through
   a send and a connection being progressed through opal_progress().
   The progress code was not obtaining the endpoint lock before
   attempting to update the endpoint. To limit the amount of code
   changed for 2.0.1 this commit makes the endpoint lock recursive. In
   a future update this may be changed.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-08-04 16:08:01 -06:00
Nathan Hjelm
5e13e1ab7d btl/openib: set send flags only after endpoint is connected
The max inline send size on a queue pair is not available until after
the endpoint is connected. Before this commit the send flags
(including the inline flag) were set before this value was
initialized. This commit moves setting the send_flags down to
mca_btl_openib_put_internal which is only called after the endpoint is
connected. This fixes a bug when using osc/rdma.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-07-26 16:01:11 -06:00
Gilles Gouaillardet
91ccec342c btl/openib: remove some dead code
remove useless call to opal_mem_hooks_support_level() and the value local variable.
2016-07-22 09:26:33 +09:00
Gilles Gouaillardet
1b3be0ac8c configury + btl/openib: fix a typo
test for existence of struct ibv_exp_device_attr.exp_atomic_cap.
That was previously mistyped struct ibv_exp_device_attr.ext_atomic_cap
2016-07-22 09:26:33 +09:00
Pascal Deveze
6d6ec66705 btl/portals4: Take into account the limitation of portals4 (max_msg_size)
2016-07-19 15:19:29 +02:00
Nathan Hjelm
01d6da31af btl/openib: fix rdmacm locking bug
This commit fixes a long standing bug in rdmacm. It is required that
the thread that calls mca_btl_openib_endpoint_cpc_complete holds the
endpoint lock. This was not the case for rdmacm. This causes debug
builds to abort. This change also required changing
mca_btl_openib_endpoint_send_cts to require the endpoint lock to be
held when calling.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-06-30 15:50:07 -06:00
Nathan Hjelm
960fcd292c btl/openib: fix rdma hang
This commit is an attempt to fix a hang in finalize of rdmacm. This fixes
a path where no rdmacm client is found for an endpoint.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-06-29 20:31:26 -06:00
Jeff Squyres
f18d6606da Merge pull request #1824 from hjelmn/rdmacm_fix
btl/openib: fix segmentation fault
2016-06-28 18:10:35 -04:00
Nathan Hjelm
8128c8eb29 btl/openib: fix segmentation fault
This commit fixes a segmentation fault that occurs if a device can be
initialized but not used. In this case the devices_count is not equal
to the number of usable devices in the devices pointer array.

Thanks to @artpol84 for tracking this down.

Fixes open-mpi/ompi#1823

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-06-28 10:31:32 -06:00
Nathan Hjelm
dac9201f3b Merge pull request #1770 from hjelmn/rdma_wth
btl/openib: fix rdmacm
2016-06-24 22:46:53 -06:00
Thananon Patinyasakdikul
afe07cd5d5 Fixed common symbol in btl/usnic
- This commit fixes the accidental common symbol btl_usnic_lock
- It also moves the btl_usnic_lock declaration to btl_usnic.h
2016-06-20 10:05:44 -07:00
Jeff Squyres
7a8d7fb948 openib: fix compiler warnings
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-06-18 07:15:11 -07:00
Thananon Patinyasakdikul
7bd18214a7 Fix btl/usnic deadlock when the connectivity check is turned off.
2016-06-15 07:42:55 -07:00
Thananon Patinyasakdikul
ee85204c12 Added MPI_THREAD_MULTIPLE support for btl/usnic.
2016-06-13 13:47:06 -07:00
Nathan Hjelm
17ae1aceeb btl/openib: fix rdmacm
The rdma_disconnect function specifies that both the server and client
should call rdma_disconnect. The code was not calling rdma_disconnect
on an endpoint if the event came before the endpoint finalization.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-06-07 17:53:58 -06:00
Nathan Hjelm
dd519c55b1 btl/openib: fix cq resize calculation
Before dynamic add_procs the openib_btl_size_queues was called exactly
once for non-dynamic jobs. Now the function is called on each new
connection so the calculation was wrong. Re-wrote the function to
correctly calculate the CQ size and only attempt to adjust the CQ if
the requested size has changed. This fixes a bug when using the openib
btl on psm2 hardware that is caused by the time needed to resize a
CQ. The overhead was causing udcm to timeout and fail.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-06-07 16:05:56 -06:00
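The "only adjust the CQ if the requested size has changed" rule from dd519c55b1 can be sketched as below. The state structure and function names are hypothetical; the real openib BTL computes the requested size from its queue pair configuration and resizes through the verbs library.

```c
/* Hypothetical state: current CQ size and a counter standing in for
 * the expensive resize operation. */
struct cq_state { unsigned int current_size; unsigned int resize_calls; };

static void cq_resize(struct cq_state *cq, unsigned int new_size)
{
    cq->current_size = new_size;
    cq->resize_calls++;
}

/* Called on every new connection; resizes only when the size changes. */
static void size_queues(struct cq_state *cq, unsigned int requested)
{
    if (requested != cq->current_size) {
        cq_resize(cq, requested);
    }
}
```

Repeated calls with an unchanged size then cost a single comparison, which matters once the function runs per connection rather than once per job.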
Nathan Hjelm
6169d03ea3 btl: adjust values of new atomic flags
Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-06-02 19:21:34 -06:00
Nathan Hjelm
9f43b23725 Merge pull request #1710 from hjelmn/ugni_atomics
Additional ugni atomics
2016-06-02 18:25:49 -06:00
Nathan Hjelm
ceb2912838 Merge pull request #1736 from hjelmn/ugni_fixes
ugni BTL fixes
2016-06-01 14:59:55 -06:00
Gilles Gouaillardet
57978a75d0 Merge pull request #1717 from ggouaillardet/topic/lex_cleanup
configury: clean the flex generated .c files
2016-06-01 13:06:21 +09:00
Nathan Hjelm
5d4bcce042 Merge pull request #1700 from shamisp/topic/cma_config
CMA: Fixing logic for CMA system call detection
2016-05-31 20:33:48 -06:00
Nathan Hjelm
340152a635 Merge pull request #1720 from shamisp/topic/vader/max_addr
VADER: Adjusting VADER_MAX_ADDRESS for non x86 platforms.
2016-05-31 20:33:28 -06:00
Gilles Gouaillardet
5f565dfec3 configury: clean the flex generated .c files
2016-06-01 11:13:31 +09:00
Nathan Hjelm
bf10d79914 btl/ugni: remove erroneous unlock
The endpoint lock was being released twice in mca_btl_ugni_get_ep.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-31 16:52:53 -06:00
Nathan Hjelm
cc96097873 btl/ugni: fix bug when attempting unaligned get on aries
This commit fixes a programming error when using an aries nic. The
documentation of ugni shows that only the local alignment restriction
for get was lifted on aries. There is still a remote address alignment
restriction.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-31 16:52:09 -06:00
George Bosilca
d2abff583e Fix race condition during BTL TCP tear-down.
bot:label:bug
bot:assign:@hjelmn
2016-05-30 10:47:14 -05:00
Nathan Hjelm
28dfa36a3f btl/ugni: fix bug when attempting unaligned get on aries
This commit fixes a programming error when using an aries nic. The
documentation of ugni shows that only the local alignment restriction
for get was lifted on aries. There is still a remote address alignment
restriction.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-27 08:22:13 -06:00
Nathan Hjelm
c19426ac1b btl/ugni: add support for additional atomic operations
This commit adds support for Cray Aries atomic operations. This
includes 32-bit and floating point support.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-27 08:22:13 -06:00
Nathan Hjelm
23fe19a956 btl: add support for more atomics
This commit adds support for more atomic operations and types. The
operations added are logical and, logical or, logical xor, swap, min,
and max. New types are 32-bit int by using the
MCA_BTL_ATOMIC_FLAG_32BIT flag, 64-bit float by using the
MCA_BTL_ATOMIC_FLAG_FLOAT flag, and 32-bit float by using both
flags. Floating point numbers are supported by packing the number in
as an int64_t or int32_t. We will update the btl interface in the
future to make this less confusing.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-27 08:22:13 -06:00
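Commit 23fe19a956 says floating point operands are passed by packing them into an int64_t or int32_t. One way to do such packing without violating aliasing rules is a `memcpy` of the bit pattern, sketched below. The helper names are hypothetical and the BTL interface itself is not shown.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(double) == sizeof(int64_t), "double must be 64 bits");
_Static_assert(sizeof(float) == sizeof(int32_t), "float must be 32 bits");

/* Copy the bit pattern of a double into an int64_t and back. */
static int64_t pack_double(double value)
{
    int64_t bits;
    memcpy(&bits, &value, sizeof(bits));
    return bits;
}

static double unpack_double(int64_t bits)
{
    double value;
    memcpy(&value, &bits, sizeof(value));
    return value;
}

/* Same idea for a 32-bit float. */
static int32_t pack_float(float value)
{
    int32_t bits;
    memcpy(&bits, &value, sizeof(bits));
    return bits;
}

static float unpack_float(int32_t bits)
{
    float value;
    memcpy(&value, &bits, sizeof(value));
    return value;
}
```

The packed integer carries the bit pattern only; it is not a numeric conversion, so `pack_double(1.5)` is not 1.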
Nathan Hjelm
8c9292d5d1 Merge pull request #1721 from hjelmn/xrc_fix
btl/openib: fix XRC WQE calculation
2016-05-26 17:00:31 -06:00
Nathan Hjelm
56bdcd0888 btl/openib: fix XRC WQE calculation
Before dynamic add_procs support was committed to master we called
add_procs with every proc in the job. The XRC code in the openib btl
was taking advantage of this and setting the number of work queue
entries (WQE) based on all the procs on a remote node. Since that is
no longer the case we can not simply increment the sd_wqe field on the
queue pair. To fix the issue a new field has been added to the xrc
queue pair structure to keep track of how many wqes there are total on
the queue pair. If a new endpoint is added that increases the number
of wqes and the xrc queue pair is already connected the code will
attempt to modify the number of wqes on the queue pair. A failure is
ignored because all that will happen is the number of active send work
requests on an XRC queue pair will be more limited.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-26 15:58:31 -06:00
Aurelien Bouteiller
49bd28d0ac Merge pull request #1714 from hjelmn/scif_exclusivity
btl/scif: reduce default exclusivity
2016-05-26 17:53:11 -04:00
Pavel Shamis (Pasha)
60fd25f3fb VADER: Adjusting VADER_MAX_ADDRESS for non x86 platforms.
The original VADER_MAX_ADDRESS was tuned for x86_64 platforms only.
For non x86_64 platforms we can use XPMEM_MAXADDR_SIZE.

Signed-off-by: Pavel Shamis (Pasha) <pasharesearch@gmail.com>
2016-05-26 16:38:04 -05:00
Nathan Hjelm
99627319f0 btl/ugni: reduce overhead of progress function
This commit reduces the overhead of calling the ugni progress
function. It does the following:

 - Check for new connections once every eight calls.

 - Do not call remote smsg progress unless we are connected to at
   least one remote peer.

 - Do not call rdma progress unless at least one rdma fragment is
   outstanding.

 - Check endpoint wait list size before obtaining a lock.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-25 14:27:34 -06:00
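The first item in 99627319f0, checking for new connections once every eight calls, can be sketched with a call counter. The counters and function names are hypothetical placeholders for the real ugni progress code.

```c
/* Hypothetical counters standing in for real work. */
static unsigned int progress_calls = 0;
static unsigned int connection_checks = 0;

static void check_new_connections(void)
{
    connection_checks++;
}

static void progress(void)
{
    /* Poll for new connections on every eighth call only. */
    if (0 == (progress_calls++ & 0x7)) {
        check_new_connections();
    }
    /* ... cheaper per-call work would go here ... */
}
```

Masking with `0x7` works because eight is a power of two; a different interval would need a modulo instead.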
Nathan Hjelm
5caf12cd9b btl/scif: reduce default exclusivity
This commit reduces the default exclusivity so that btl/scif is not
used for send/recv over other shared memory transports.

Fixes open-mpi/ompi#1712

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-25 14:25:07 -06:00
Pavel Shamis (Pasha)
d984b4b3f9 CMA: Fixing logic for CMA system call detection
The OPAL_CMA_NEED_SYSCALL_DEFS is always defined/set to 0 or 1.  Therefore
instead of checking if the macro is defined, we have to look at the value
itself.

Signed-off-by: Pavel Shamis (Pasha) <pasharesearch@gmail.com>
2016-05-24 14:53:25 -05:00
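The difference that d984b4b3f9 relies on, between testing whether a macro is defined and testing its value, can be shown with a macro that is always defined to 0 or 1. `NEED_SYSCALL_DEFS` below is a stand-in name for illustration, not the real OPAL macro.

```c
/* Assumed for illustration: the macro is always defined, to 0 or 1. */
#define NEED_SYSCALL_DEFS 0

static int checked_with_ifdef(void)
{
#ifdef NEED_SYSCALL_DEFS
    return 1;   /* taken even though the value is 0 */
#else
    return 0;
#endif
}

static int checked_with_if(void)
{
#if NEED_SYSCALL_DEFS
    return 1;
#else
    return 0;   /* taken: the value, not the definedness, is tested */
#endif
}
```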
Gilles Gouaillardet
d5a2ac6f2f btl/openib: fix #if vs #ifdef
2016-05-23 14:27:33 +09:00
Gilles Gouaillardet
5a8cbe5a8f btl/openib: remove obsolete reference to MEMORY_LINUX_MALLOC_ALIGN_ENABLED macro
2016-05-23 14:12:21 +09:00
Jeff Squyres
66f53ec29a Merge pull request #1628 from kmroz/wip-btl-tcp-ethtool-speed
btl/tcp: autodetect bandwidth and latency if unset by the user
2016-05-18 12:12:55 -04:00
Karol Mroz
ca6ddf3270 btl/tcp: autodetect bandwidth and latency if unset
Fixes open-mpi/ompi#120

Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-05-18 16:25:52 +02:00
Karol Mroz
b9c6c43c6b btl/tcp: add default defines for bandwidth and latency
Signed-off-by: Karol Mroz <mroz.karol@gmail.com>
2016-05-18 16:25:52 +02:00
Nathan Hjelm
ab8ed177f5 rcache: fix deadlock in multi-threaded environments
This commit fixes several bugs in the registration cache code:

 - Fix a programming error in the grdma invalidation function that can
   cause an infinite loop if more than 100 registrations are
   associated with a munmapped region. This happens because the
   mca_rcache_base_vma_find_all function returns the same 100
   registrations on each call. This has been fixed by adding an
   iterate function to the vma tree interface.

 - Always obtain the vma lock when needed. This is required because
   there may be other threads in the system even if
   opal_using_threads() is false. Additionally, since it is safe to do
   so (the vma lock is recursive) the vma interface has been made
   thread safe.

 - Avoid calling free() while holding a lock. This avoids race
   conditions with locks held outside the Open MPI code.

Fixes open-mpi/ompi#1654.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-17 09:02:40 -06:00
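The first fix in ab8ed177f5 replaces a fixed-size "find all" query, which kept returning the same 100 registrations, with an iterate function. A minimal sketch of the iterate-with-callback idea follows; it walks a plain array rather than a vma tree, and all names are hypothetical.

```c
#include <stddef.h>

/* Hypothetical stand-in for a registration. */
struct reg { int invalid; };

typedef int (*reg_visit_fn)(struct reg *reg, void *ctx);

/* Visit every registration exactly once; stop early if the callback
 * returns nonzero. */
static int reg_iterate(struct reg *regs, size_t count,
                       reg_visit_fn fn, void *ctx)
{
    for (size_t i = 0; i < count; i++) {
        int rc = fn(&regs[i], ctx);
        if (0 != rc) {
            return rc;
        }
    }
    return 0;
}

/* Example callback: mark the registration invalid and count the visit. */
static int invalidate_one(struct reg *reg, void *ctx)
{
    size_t *visited = ctx;
    reg->invalid = 1;
    (*visited)++;
    return 0;
}
```

Because the iterator owns the traversal, the number of registrations in a region no longer has to fit in a caller-supplied buffer.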
Gilles Gouaillardet
456b73da69 btl/openib: fix error path in init_one_device()
do not explicitly release ib verbs components since they will
be released in the object destructor

Thanks Durga for the report
2016-05-13 09:03:48 +09:00
Ralph Castain
08022d7af1 Some minor cleanups of warnings from gcc 6.0.0. Update s1/s2 pmix to get max_procs as required. 2016-05-05 15:28:13 -07:00
Nathan Hjelm
0f54a95408 Merge pull request #1626 from hjelmn/vader_32
btl/vader: fix compilation on 32-bit systems
2016-05-03 16:39:46 -06:00
Nathan Hjelm
e7ccbdee27 btl/vader: fix compilation on 32-bit systems
This commit fixes a compile/link issue caused by vader. The vader btl
was using OPAL_THREAD_ADD64 to increment a counter which may not be
available on 32-bit systems. Changed to use OPAL_THREAD_ADD_SIZE_T
which will be 64-bit or 32-bit depending on the system.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-03 10:14:44 -06:00
Nathan Hjelm
a65af6d079 btl/openib: fix check for exp verbs struct members
This commit fixes a compilation issue with some versions of exp
verbs. In some cases struct ibv_exp_device_attr does not have either
the exp_atom or exp_atomic_cap fields. It is fine to drop one check
and fall back to the non-exp attribute check on the other.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-05-02 17:13:33 -06:00
George Bosilca
3445577f4c Avoid race conditions during BTL TCP handshake.
In some rare cases when a process receives the connect ack while
locally updating the peer endpoint structure, we could drop the
incoming connect ack due to the fact that the send handler is
protected with a try lock (on the endpoint) and our initial send
event was not persistent. Making the send event persistent solves
all issues.
2016-05-01 14:19:29 -04:00
George Bosilca
702f80ad7e Remove "signed vs. unsigned" warnings. 2016-05-01 11:45:48 -04:00
Nathan Hjelm
03f4a854cb btl/tcp: fix add_procs race condition
This commit fixes a race between a thread calling the tcp btl's
add_procs and a thread processing an incoming connection. The race
occurred because the add_procs thread adds a newly created proc object
to the hash table *before* the object is fully initialized. The
connection thread then attempts to use the object before the endpoints
array on the object has been allocated. The fix is to only add the
proc to the hash table after it has been completely initialized.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-27 10:24:39 -06:00
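The ordering described in this fix (initialize completely, then publish) might look roughly like the sketch below. The table, the `demo_proc_t` type, and every function name are hypothetical and much simpler than the real tcp btl structures; the toy table also overwrites on slot collisions.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_TABLE_SIZE 16

/* Hypothetical proc object; not the real tcp btl proc structure. */
typedef struct {
    int id;
    int *endpoints;        /* must exist before other threads see the proc */
    size_t endpoint_count;
} demo_proc_t;

static demo_proc_t *demo_table[DEMO_TABLE_SIZE];
static pthread_mutex_t demo_table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Look up a proc; returns NULL if it has not been published yet. */
demo_proc_t *demo_proc_lookup(int id)
{
    pthread_mutex_lock(&demo_table_lock);
    demo_proc_t *proc = demo_table[(unsigned) id % DEMO_TABLE_SIZE];
    pthread_mutex_unlock(&demo_table_lock);
    return (proc != NULL && proc->id == id) ? proc : NULL;
}

/* Create a proc: finish every initialization step first, publish last. */
demo_proc_t *demo_proc_create(int id, size_t endpoint_count)
{
    demo_proc_t *proc = calloc(1, sizeof(*proc));
    if (proc == NULL) {
        return NULL;
    }
    proc->id = id;
    proc->endpoints = calloc(endpoint_count, sizeof(int));
    if (proc->endpoints == NULL) {
        free(proc);
        return NULL;
    }
    proc->endpoint_count = endpoint_count;

    /* Only now make the object visible to the connection thread. */
    pthread_mutex_lock(&demo_table_lock);
    demo_table[(unsigned) id % DEMO_TABLE_SIZE] = proc;
    pthread_mutex_unlock(&demo_table_lock);
    return proc;
}
```

A thread calling `demo_proc_lookup()` either finds nothing or finds a proc whose `endpoints` array is already allocated; it never sees the half-built state.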
Jeff Squyres
dc18c32437 usnic: fix resource check
The math for checking the number of QPs and CQs per usNIC/VF was
incorrect, allowing you to run MPI processes even when usNICs (i.e.,
VIC VFs) had fewer QPs and CQs than were necessary.  This led to a
confusing error later when fi_enable(3) failed (because we lazily
create QPs).  Fixing the math here ensures that we actually print a
helpful error message telling the user specifically what is wrong.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-04-22 15:58:27 -07:00
Jeff Squyres
6800ef9ec0 m4: rename OMPI_SUMMARY_* macros to OPAL_SUMMARY_*
These macros should really be named OPAL_SUMMARY_*; they're used in
all projects, and therefore should be in the lowest layer project (OPAL).

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-04-20 08:40:00 -07:00
Nathan Hjelm
1e6b4f2f55 Merge pull request #1495 from hjelmn/new_hooks
Add new patcher memory hooks
2016-04-13 18:19:23 -06:00
Nathan Hjelm
c2b6fbb124 opal/memory: move initialization to first rcache creation
Because of the removal of the linux memory component it is no longer
necessary to initialize the memory component in opal_init(). This
commit moves the initialization to the creation of the first rcache
component.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-13 17:21:46 -06:00
Gilles Gouaillardet
c72688e8cf Merge pull request #1362 from ggouaillardet/topic/openib_warn_default_gid_prefix
btl/openib: correctly issue a warning when two btls or more are in th…
2016-04-11 13:22:48 +09:00
Thananon Patinyasakdikul
92290b94e0 Fixed Coverity reports 1358014-1358018 (DEADCODE and CHECK_RETURN) 2016-04-07 12:52:17 -04:00
Nathan Hjelm
9efd465539 Merge pull request #1517 from hjelmn/ugni_fixes
Gemini/Aries bug fixes
2016-04-05 07:23:18 -06:00
George Bosilca
26fc8533f8 Remove compiler warnings. 2016-04-04 16:34:23 -04:00
Nathan Hjelm
d7874920aa btl/ugni: set the frag reference count in the eager get path
This commit adds code that sets the fragment reg_cnt to 1 when sending
the completion message for an eager get. Without this the btl will
either hang or abort.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-04-02 12:10:22 -06:00
Ralph Castain
7eca2f9650 Add missing include 2016-03-30 01:34:01 -07:00
Gilles Gouaillardet
e852d85cc1 btl/tcp: add missing mca_btl_tcp_dump() subroutine 2016-03-30 16:10:15 +09:00
Ralph Castain
f70c5c495b Tsk...tsk...replace references to ompi values with opal 2016-03-29 13:35:43 -07:00
George Bosilca
d0165818b3 Initialize all common symbols. 2016-03-29 16:08:27 -04:00
Jeff Squyres
91c54d7a07 Merge pull request #1491 from ICLDisco/progress_thread
BTL TCP async progress
2016-03-29 06:26:10 -04:00
George Bosilca
f69eba1bc4 Update the copyright and cleanup the code.
Per @jsquyres suggestion remove all trailing spaces.
Credit to `sed -i.bak 's/ *$//' */[ch]`.
2016-03-28 14:41:01 -04:00
Thananon Patinyasakdikul
92062492b9 Enable Threading in the BTL TCP
Added mca parameter to turn progress thread on/off
Add a flag to check if we have btl progress thread.
Added macro for ob1 matching lock.
Update the AUTHORS file.
2016-03-28 14:41:01 -04:00
George Bosilca
32277db6ab Add support for async progress in the BTL TCP.
All BTL-only operations (basically all data movements
with the exception of the matching operation) can now
be handled for the TCP BTL by a progress thread.
2016-03-28 14:40:50 -04:00
Jeff Squyres
4a3c986a80 usnic: remove need for hwloc verbs helper
Haven't needed this for a while, but it got left in by accident.

Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-03-28 09:10:12 -07:00
Jeff Squyres
05e2423756 usnic: specify the cache name
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-03-28 09:02:52 -07:00
Gilles Gouaillardet
4a76f23f40 btl/openib: do not issue an error message if modex cannot retrieve openib info 2016-03-28 10:42:16 +09:00
Gilles Gouaillardet
861564df94 btl/openib: correctly issue a warning when two btls or more are in the default subnet gid
Thanks Matias Cabral for reporting this issue.

Fixes #1352
2016-03-28 09:17:29 +09:00
Nathan Hjelm
d6e90f24b1 Merge pull request #1483 from hjelmn/flag_enum_2
RFC: Add support for flag enumerators for MCA variables
2016-03-26 11:43:33 -06:00
Jeff Squyres
017f242b1b opal: remove some unused variables / compiler warnings
Signed-off-by: Jeff Squyres <jsquyres@cisco.com>
2016-03-26 03:50:57 -07:00
rhc54
ba8c8700aa Merge pull request #1493 from rhc54/topic/sing
Update singularity support to track changes in upstream Singularity code
2016-03-24 15:16:38 -07:00
Ralph Castain
8c14df2328 Revert "Modify singularity support per patch from Greg Kurtzer"
This reverts commit open-mpi/ompi@f7257a8310.

Ensure that we properly cleanup the session directory tree. Prior code had issues with symlinks, especially if the file that the link points to was already removed as we traverse the tree. Also found that the dirent checks for directory type weren't fully portable, and so fall back to the stat-based approach which is known to be portable.

Fix singularity singletons by detecting we are in a container and properly setting the pmix selection to pick the isolated component. Remove a stale restriction blocking use of the sm btl
2016-03-24 11:27:18 -07:00
Joshua Ladd
c2813a48e6 Merge pull request #1488 from alinask/topic/openib_enhance_verbose
btl/openib: enhance the verbosity level when using rdmacm without a first PP QP
2016-03-24 10:17:21 -04:00
George Bosilca
ab8008e9fe Nathan missed one reference to mpool. 2016-03-24 00:52:59 -04:00
Nathan Hjelm
478feaf622 btl/scif: update for mpool/rcache rewrite
This commit brings the scif btl up to date with changes made on master
to rework the mpool and rcache frameworks.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-23 16:30:10 -06:00
Alina Sklarevich
2e5a1372dd btl/openib: enhance the verbosity level when using rdmacm without a first PP QP. 2016-03-23 18:49:12 +02:00
Nathan Hjelm
7572c8b74f btl: use flag enumerator for btl_*_flags and btl_*_atomic_flags
This commit uses the new flag "enumerator" to support comma-delimited
lists of flags for both the btl and btl atomic flags. After this
commit it is valid to specify something like -mca btl_foo_flags
self,put,get,in-place. All non-deprecated flags are supported by the
enumerator.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-21 15:23:37 -06:00
Nathan Hjelm
b15a45088c mca: add support for flag enumerators
This commit adds a new type of enumerator meant to support flag
values. The enumerator parses comma-delimited strings and matches
each string or value to a list of valid flags. Additionally, the
enumerator does some basic checks to see if 1) a flag is valid in the
enumerator, and 2) if any conflicting flags are specified.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-21 15:20:56 -06:00
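A comma-delimited flag parser of the kind described here could be sketched as below. The flag names, bit values, and `demo_parse_flags()` are assumptions made for illustration; the real enumerator lives in the MCA variable system and also checks for conflicting flags, which this sketch leaves out.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical flag names and bit values, for illustration only. */
enum {
    DEMO_FLAG_SEND = 1 << 0,
    DEMO_FLAG_PUT  = 1 << 1,
    DEMO_FLAG_GET  = 1 << 2
};

typedef struct {
    const char *name;
    unsigned value;
} demo_flag_t;

static const demo_flag_t demo_flags[] = {
    {"send", DEMO_FLAG_SEND},
    {"put",  DEMO_FLAG_PUT},
    {"get",  DEMO_FLAG_GET},
};

/* Parse a list such as "put,get" into a bitmask.
 * Returns 0 and stores the mask in *out on success, or -1 if any item
 * (including an empty one) is not a known flag name. */
int demo_parse_flags(const char *list, unsigned *out)
{
    const size_t count = sizeof(demo_flags) / sizeof(demo_flags[0]);
    const char *cursor = list;
    unsigned result = 0;

    for (;;) {
        size_t len = strcspn(cursor, ",");   /* length of the current item */
        size_t i;

        for (i = 0; i < count; i++) {
            if (strlen(demo_flags[i].name) == len &&
                strncmp(demo_flags[i].name, cursor, len) == 0) {
                result |= demo_flags[i].value;
                break;
            }
        }
        if (i == count) {
            return -1;                       /* unknown flag name */
        }
        if (cursor[len] == '\0') {
            break;                           /* last item consumed */
        }
        cursor += len + 1;                   /* skip past the comma */
    }

    *out = result;
    return 0;
}
```

For example, `demo_parse_flags("put,get", &mask)` succeeds and sets both bits, while a list containing an unrecognized name is rejected as a whole.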
Nathan Hjelm
645bd9d9dd btl/openib: update rdmacm for dynamic add_procs
This commit adds the data necessary for supporting dynamic add_procs
to the rdma message (opal_process_name_t). The endpoint lookup
function has been updated to match the code in udcm.

Closes open-mpi/ompi#1468.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-21 10:00:41 -06:00
Nathan Hjelm
4d4fa28f75 opal: fix coverity issues
Fix CID 1345825 (1 of 1): Dereference before null check (REVERSE_INULL):

ib_proc should not be NULL in this case. Removed the check and added a
check for NULL after OBJ_NEW.

CID 1269821 (1 of 1): Dereference null return value (NULL_RETURNS):

I labeled this one as a false positive (which it is) but the code in
question could stand to be cleaned up.

Fix CID 1356424 (1 of 1): Argument cannot be negative (NEGATIVE_RETURNS):

While trying to silence another Coverity issue another was
flagged. Protect the close of fd with if (fd >= 0).

CID 70772 (1 of 1): Dereference null return value (NULL_RETURNS):
CID 70773 (1 of 1): Dereference null return value (NULL_RETURNS):
CID 70774 (1 of 1): Dereference null return value (NULL_RETURNS):

None of these are errors and are intentional but now that we have a
list release function use that to make these go away. The cleanup is
similar to CID 1269821.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-18 15:56:08 -06:00
Mike Dubman
8f1838df4b Merge pull request #1459 from alinask/topic/openib_diff_subnets
btl/openib: enable connecting processes from different subnets.
2016-03-17 08:40:08 +02:00
Jeff Squyres
d44804f0c9 usnic: use version 1 of the API, not the current version 2016-03-16 16:03:51 -07:00
Jeff Squyres
e7ef711455 usnic: allow mpool_hints to be empty
Follow on to open-mpi/ompi@eac0b11
2016-03-16 15:04:39 -07:00
Alina Sklarevich
bbcbe3cacd btl/openib: enable connecting processes from different subnets.
+ Added an mca parameter to allow connecting processes from different
subnets. Its current default value is 'false' - don't allow, to keep the
current flow the way it is now.

+ rdmacm: when calling ibv_query_gid, use the gid index from
btl_openib_gid_index.
2016-03-16 10:52:06 +02:00
Nathan Hjelm
eac0b110b8 btl/usnic: update for mpool/rcache rewrite
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-14 10:50:41 -06:00
Nathan Hjelm
d4afb16f5a opal: rework mpool and rcache frameworks
This commit rewrites both the mpool and rcache frameworks. Summary of
changes:

 - Before this change a significant portion of the rcache
   functionality lived in mpool components. This meant that it was
   impossible to add a new memory pool to use with rdma networks
   (ugni, openib, etc) without duplicating the functionality of an
   existing mpool component. All the registration functionality has
   been removed from the mpool and placed in the rcache framework.

 - All registration cache mpools components (udreg, grdma, gpusm,
   rgpusm) have been changed to rcache components. rcaches are
   allocated and released in the same way mpool components were.

 - It is now valid to pass NULL as the resources argument when
   creating an rcache. At this time the gpusm and rgpusm components
   support this. All other rcache components require non-NULL
   resources.

 - A new mpool component has been added: hugepage. This component
   supports huge page allocations on linux.

 - Memory pools are now allocated using "hints". Each mpool component
   is queried with the hints and returns a priority. The current hints
   supported are NULL (uses posix_memalign/malloc), page_size=x (huge
   page mpool), and mpool=x.

 - The sm mpool has been moved to common/sm. This reflects that the sm
   mpool is specialized and not meant for any general
   allocations. This mpool may be moved back into the mpool framework
   if there is any objection.

 - The opal_free_list_init arguments have been updated. The unused0
   argument is now used to pass in the registration cache module. The
   mpool registration flags are now rcache registration flags.

 - All components have been updated to make use of the new framework
   interfaces.

As this commit makes significant changes to both the mpool and rcache
frameworks both versions have been bumped to 3.0.0.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-14 10:50:41 -06:00
Jeff Squyres
48c650c47a configury: minor updates to config summary output 2016-03-10 13:02:52 -08:00
Jeff Squyres
97714716ec usnic: add some cagent verification checks
Add primitive magic number and version checking in the connectivity
checker protocol.  These checks don't *guarantee* that we won't get
false PINGs and ACKs, but they do significantly reduce the possibility
of interpreting random incoming fragments as PINGs or ACKs.
2016-03-09 13:25:00 -08:00
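A magic-number and version check on an incoming message might be sketched as follows. The header layout, the constant values, and `demo_header_ok()` are all hypothetical, and byte-order conversion is deliberately omitted; the actual usnic connectivity-checker wire format is not shown in this log.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants; the real protocol values are not shown here. */
#define DEMO_MAGIC   0x434e4e43u
#define DEMO_VERSION 1u

/* Hypothetical message header carried at the start of every PING/ACK. */
typedef struct {
    uint32_t magic;
    uint32_t version;
    uint32_t type;
} demo_header_t;

/* Returns 1 if the buffer starts with a plausible header, else 0.
 * A random fragment would have to match both fields to be accepted. */
int demo_header_ok(const void *buf, size_t len)
{
    demo_header_t hdr;

    if (len < sizeof(hdr)) {
        return 0;                      /* too short to be one of ours */
    }
    memcpy(&hdr, buf, sizeof(hdr));    /* copy to avoid unaligned access */
    return hdr.magic == DEMO_MAGIC && hdr.version == DEMO_VERSION;
}
```

As the commit message notes, a check like this lowers the odds of a false match rather than eliminating them.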
Nathan Hjelm
f8469de832 Merge pull request #1415 from hjelmn/configure_summary
configure: add a summary section at the end of configure output
2016-03-09 12:25:39 -07:00
Nathan Hjelm
fdebebc4c0 Merge pull request #1439 from hjelmn/btl_openib_send_size
btl/openib: fix inconsistency in the default settings
2016-03-09 09:31:18 -07:00
Jeff Squyres
584b80147d usnic: set MCA_BTL_FLAGS_SINGLE_ADD_PROCS
The btl_recv.h:lookup_sender() function uses the hashed ORTE proc name
to determine the sender of the packet.  With add_procs_cutoff>0, the
usnic BTL may not have knowledge of all the senders.

Until the usNIC BTL can be adjusted to do something like the
openib/ugni BTLs (i.e., use opal_proc_for_name() to lookup unknown
sender proc names), set MCA_BTL_FLAGS_SINGLE_ADD_PROCS, which means
that ob1 will only call add_procs() once -- with all the procs in it.

Also in this commit, adapt the connectivity checker to not rely on
knowing all the senders (which is a bit easier than adapting the main
BTL send path): the receiving connectivity agent will simply echo the
same PING message (which contains the sender's IP address+UDP port)
back to the sender without checking that it knows who the sender
is.  If the sender receives the echoed PING back on the expected
interface, it will find a match in the pending pings list.  If the
sender receives the echoed PING back on an unexpected interface, a match
will not be found, and the incoming PING message will be dropped.

Fixes open-mpi/ompi#1440
2016-03-08 17:41:42 -08:00
Jeff Squyres
4975fdcd5c usnic: allow connect(2) to fail temporarily
When connecting the connectivity checker client to its agent fails
with ECONNREFUSED, just delay a little and try again a few more times.
2016-03-08 15:35:34 -08:00
Nathan Hjelm
2ef2763f72 btl/openib: fix inconsistency in the default settings
This commit fixes an inconsistency between btl_openib_receive_queues,
btl_openib_max_send_size and btl_openib_eager_limit. Before this
commit if the ini file specified a set of default receive queues that
happen to not contain one large enough for the default max_send_size
or eager_limit, users would see an error like:

   WARNING: The largest queue pair buffer size specified in the
   btl_openib_receive_queues MCA parameter is smaller than the maximum
   send size (i.e., the btl_openib_max_send_size MCA parameter), meaning
   that no queue is large enough to receive the largest possible incoming
   message fragment.  The OpenFabrics (openib) BTL will therefore be
   deactivated for this run.

     Local host: somehost
     Largest buffer size: 65536
     Maximum send fragment size: 131072

This commit adds code that detects the source of the max_send_size and
eager_limit values and sets either or both of them to the size
supported by the largest queue pair if both 1) the value is larger
than the largest queue pair size, and 2) the value was not set by the
user or a MCA configuration file.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-08 16:17:33 -07:00
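The two-part rule in this message (clamp only when the value is too large and was not chosen explicitly) can be sketched as below. The `demo_source_t` enum and `demo_clamp_to_largest_queue()` are hypothetical; the real code queries the MCA variable system for the source of each value.

```c
#include <stddef.h>

/* Hypothetical record of where a parameter value came from. */
typedef enum {
    DEMO_SOURCE_DEFAULT,   /* built-in default */
    DEMO_SOURCE_FILE,      /* MCA configuration file */
    DEMO_SOURCE_USER       /* set explicitly by the user */
} demo_source_t;

typedef struct {
    size_t value;
    demo_source_t source;
} demo_param_t;

/* Lower the parameter to the largest queue pair buffer size only if
 * 1) it exceeds that size and 2) nobody set it explicitly.
 * Returns 1 if the value was adjusted, 0 otherwise. */
int demo_clamp_to_largest_queue(demo_param_t *param, size_t largest_queue)
{
    if (param->value > largest_queue &&
        param->source == DEMO_SOURCE_DEFAULT) {
        param->value = largest_queue;
        return 1;
    }
    return 0;
}
```

Using the numbers from the warning text, a default send size of 131072 with a largest buffer of 65536 would be lowered to 65536, whereas the same value set by a user would be left alone so the existing warning still applies.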
Nathan Hjelm
d2f5fca82a configure: add a summary section at the end of configure output
This commit adds two m4 macros: OPAL_SUMMARY_ADD, OPAL_SUMMARY_PRINT.
OPAL_SUMMARY_ADD adds an item to a section in the summary. For example
OPAL_SUMMARY_ADD([[Transports]],[[Foo]],...,[yes]) will add the
following to the summary:

Transports
-----------------------
Foo: yes

With this commit two sections are added: Transports, Resource Managers.

The OPAL_SUMMARY_PRINT macro is called after AC_OUTPUT and prints out
some information about the build (version, projects, etc) and then
the summary sections. It will additionally print a warning if
internal debugging is enabled.

Example output:

Open MPI configuration:
-----------------------
Version: 3.0.0 a1
Build Open Platform Abstration project: yes
Build Open Runtime project: yes
Build Open MPI project: yes
Build Open SHMEM project: no
MPI C++ bindings (deprecated): no
MPI Fortran bindings: mpif.h, use mpi, use mpi_f08
Debug build: yes

Transports
-----------------------
Cray uGNI (Gemini/Aries): no
Intel Omnipath (PSM2): no
KNEM Shared Memory: no
Linux CMA IPC: no
Mellanox MXM: no
Open UCX: no
OpenFabrics libfabric: no
OpenFabrics Verbs: no
portals4: no
QLogic Infinipath (PSM): no
tcp: yes
XPMEM Shared Memory: no

Resource Managers
-----------------------
Cray Alps: no
Grid Engine: no
LSF: no
Slurm: yes
Torque: yes

INTERNAL DEBUGGING IS ENABLED. DO NOT USE THIS BUILD FOR PERFORMANCE MEASUREMENTS!

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-03-08 10:04:15 -07:00
Nathan Hjelm
2a0b3a5700 btl/vader: various threading fixes
This commit fixes several threading bugs:

 - Add an additional lock to the btl_base_endpoint_t structure to lock
   the list of pending frags. This allows the progress function to
   attempt to send pending frags without needing to drop/reacquire the
   lock. This should provide a small improvement in performance and
   fixes a potential race between adding and removing items from the
   pending list.

 - Ensure fast boxes are only set up once by updating the send count
   using atomics when needed and do not set the fast box buffer
   pointer until the fast box is set up.

Closes open-mpi/ompi#1408

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-02 10:50:59 -07:00
Gilles Gouaillardet
477991b5aa btl/openib: fix abstraction violation and use opal_memory->memoryc_set_alignment 2016-02-24 09:50:13 +09:00
Nathan Hjelm
2031bb6f01 btl/openib: XRC save SRQ#s on the loopback endpoint
This commit fixes a bug that can occur when communicating via XRC to
peers on the same node. UDCM was not saving the SRQ numbers on the
loopback endpoint (which shares its ib_addr info with all local peers)
so any messages to local peers use an invalid SRQ number.

Fixes open-mpi/ompi#1383

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2016-02-18 20:59:11 -07:00
Nathan Hjelm
371df45bf8 btl/openib: fix locking bugs with XRC ib_addr lock
This commit fixes two issues with the ib_addr lock:

 - The ib_addr lock must always be obtained regardless of
   opal_using_threads() as the CPC is run in a separate thread.

 - The ib_addr lock is held in mca_btl_openib_endpoint_connected when
   calling back into the CPC start_connect on any pending
   connections. This will attempt to obtain the ib_addr lock
   again. Since this is not a performance-critical part of the code
   the lock has been changed to be recursive.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-02-18 15:55:34 -07:00
Nathan Hjelm
4dc73d7765 btl/openib: XRC fix bug that could cause an invalid SRQ# to be used
This commit fixes a bug that occurs when attempting a get or put
operation on an endpoint that is not already connected. In this case
the remote_srqn may be set to an invalid value as the rem_srqs array
on the endpoint is not populated. This commit moves the usage of the
rem_srqs array to the internal put/get functions where it is
guaranteed this array is populated.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-02-18 15:44:29 -07:00
Nathan Hjelm
4f4ea96940 btl/openib/udcm: fix local XRC connections
This commit ensures ib_addr->remote_xrc_rcv_qp_num value is set when
creating the loopback queue pair. This is needed when communicating
with any other local peer.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-02-17 14:54:19 -07:00
Nathan Hjelm
bf8360388f btl/openib/udcm: fix XRC support
This commit fixes two bugs in XRC support

 - When dynamic add_procs support was added to master the remote
   process name was added to the non-XRC request structure. The same
   value was not added to the XRC xconnect structure. This error was
   not caught because the send/recv code was incorrectly using the
   wrong structure member. This commit adds the member and ensures the
   xconnect code uses the correct structure.

 - XRC loopback QP support has been fixed. It was 1) not setting the
   correct fields on the endpoint structure, 2) calling
   udcm_xrc_recv_qp_connect, and 3) was not initializing the endpoint
   data.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-02-16 16:49:04 -07:00
Nathan Hjelm
201c280e6c btl/openib: fix error in param check in mca_btl_openib_put
mca_btl_openib_put incorrectly checks the qp inline max before
allowing an inline put. This check will always fail for an endpoint
that has not been connected. The commit changes the check to use the
btl_put_local_registration_threshold instead.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-02-16 16:46:32 -07:00
Nathan Hjelm
123a39ac3c btl/openib: fix regression in XRC support
Commit open-mpi/ompi@400af6c52d
introduced a regression in XRC support. The commit reversed the
ordering of shared receive queue (SRQ) and completion queue (CQ)
creation. CQ creation must always precede SRQ creation when using
XRC as the CQs are needed to create the SRQs. This commit fixes the
ordering so that CQs are always created before SRQs.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-02-16 16:46:20 -07:00
igor-ivanov
d9eefefa74 Merge pull request #1351 from igor-ivanov/pr/issue-1336
opal/memory: Move Memory Allocation Hooks usage from openib
2016-02-15 14:07:36 +04:00
Ralph Castain
aa9e5a1a27 Add support for Singularity containers, including a .m4 file for checking if Singularity is available and an orte/schizo component for setting the proper support if a container was given as the executable
Cleanup the configury so we properly check for Singularity under the various typical use-cases

Bring the Singularity support online. We have to turn "off" the sm BTL as it segfaults from inside the container - root cause remains unclear. Also turned "off" the various OPAL shmem components in case they are involved and someone else tries to use them. Happily, the vader BTL works just fine!
2016-02-13 04:40:22 -08:00
Igor Ivanov
8b05f308f9 opal/memory: Move Memory Allocation Hooks usage from openib
These changes fix issue https://github.com/open-mpi/ompi/issues/1336

- improve abstractions: the opal/memory/linux component should be the single place that operates with
Memory Allocation Hooks.
- avoid collisions in case of dynamic component open/close: it is safe because it is linked statically.
- does not change the original behaviour.
2016-02-11 14:46:35 +02:00
Jeff Squyres
8d0a592563 usnic: update a few verbose reachability messages 2016-02-06 03:28:48 -08:00
Jeff Squyres
87dbe6ce01 usnic: add high-verbose reachability messages 2016-02-06 03:28:47 -08:00
Jeff Squyres
dac2fe1589 usnic: ensure to use ntohl() for network-order values 2016-02-06 03:28:47 -08:00
Jeff Squyres
51240394a7 usnic: ensure to init module->av_eq_num 2016-02-06 03:28:47 -08:00
Jeff Squyres
89eea51075 usnic: fix calculation for number of blocks 2016-02-02 16:56:34 -08:00
Nathan Hjelm
cd11fc3081 btl/ugni: fix race condition that causes completions to be dropped
The send code in the ugni btl has an optimization that enables it to
return 1 (fragment gone) in some cases. This optimization involved
removing the btl ownership and callback flags to ensure the fragment
stuck around long enough for its completion flag to be checked. This
works fine for the single-threaded case but not in the multi-threaded
case. It is possible that a fragment will be completed by another
thread while a thread is in mca_btl_ugni_send. This competition can
lead to a leaked fragment, missed callback, or both. To fix the issue
without removing the optimization a reference count has been added to
the fragment. Callbacks and fragment release will not be made until
the fragment reference count has reached 0. The count is incremented
before sending the frag and decremented after the completion flag has
been checked. The fix has been verified to work using a multi-threaded
RMA benchmark with the osc/pt2pt component.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-02-02 12:14:31 -07:00
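The reference-count idea can be sketched with C11 atomics as below. The `demo_frag_t` type and its functions are hypothetical and far simpler than the ugni fragment; the sketch only shows that the completion work runs exactly once, when the last holder lets go.

```c
#include <stdatomic.h>

/* Hypothetical fragment; the real ugni fragment has many more fields. */
typedef struct {
    atomic_int ref_cnt;
    int callbacks_run;     /* stands in for "callback made and frag released" */
} demo_frag_t;

/* A new fragment starts with one reference, owned by the completion path. */
void demo_frag_init(demo_frag_t *frag)
{
    atomic_init(&frag->ref_cnt, 1);
    frag->callbacks_run = 0;
}

/* The sender takes an extra reference before handing the frag to the network. */
void demo_frag_retain(demo_frag_t *frag)
{
    atomic_fetch_add(&frag->ref_cnt, 1);
}

/* Drop one reference. atomic_fetch_sub returns the previous value, so only
 * the caller that takes the count from 1 to 0 runs the completion work. */
void demo_frag_release(demo_frag_t *frag)
{
    if (atomic_fetch_sub(&frag->ref_cnt, 1) == 1) {
        frag->callbacks_run++;
    }
}
```

Whether the completing thread or the sending thread calls `demo_frag_release()` last, the completion work runs once and is never skipped, which is the property the commit is after.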
Nathan Hjelm
14704201e2 btl/ugni: fix race condition when adding endpoint to wait list
This commit fixes a race condition that can cause an endpoint to be
added to the wait list multiple times. To fix the issue an additional
check has been added to ensure the endpoint is not on the wait list
after the wait list lock is held. The wait list processing code has
also been updated to keep the wait list lock until all wait listed
endpoints have been handled. This reduces the chance that an endpoint
that is being processed by the wait list code is re-added to the
list by a competing send.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-02-02 12:13:49 -07:00
Jeff Squyres
9f3ed00125 usnic: minor updates from code review
Three minor updates from the code review of
https://github.com/open-mpi/ompi-release/pull/933:

* Remove an extra blank line from a show_help message
* We no longer allow -1 for the MCA param btl_usnic_av_eq_num, so
  change the flag to REGINT_GE_ONE
* Change "num_blocks" definition to be in terms of block_len (not
  eq_size)
2016-02-01 11:14:30 -08:00
Jeff Squyres
c2615a4732 usnic: change retrans timeout to 5ms
A bunch of empirical testing has shown that increasing the retransmit
timeout from 1ms to 5ms doesn't adversely affect performance, yet
decreases the number of gratuitous retransmissions.
2016-01-30 10:49:14 -08:00
Jeff Squyres
797d5026c8 usnic: better av_eq_num default value handling 2016-01-30 10:46:14 -08:00
Jeff Squyres
db825abc00 usnic: don't overrun the fi_av_insert() EQ
Add endpoints in a blocked manner so that we don't overrun the
fi_av_insert() event queue.  Also make the AV EQ length an MCA param,
and report it in mca_btl_base_verbose >=5 output.
2016-01-30 08:33:48 -08:00
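Inserting in blocks might be sketched as follows. The callback type, `demo_insert_in_blocks()`, and the recording callback are hypothetical; a real implementation would call `fi_av_insert()` per block and drain the event queue before issuing the next one, which is only indicated by a comment here.

```c
#include <stddef.h>

/* Hypothetical callback that inserts `count` endpoints starting at `first`. */
typedef int (*demo_insert_fn)(size_t first, size_t count, void *ctx);

/* Number of blocks needed, defined in terms of the block length. */
size_t demo_num_blocks(size_t total, size_t block_len)
{
    return (total + block_len - 1) / block_len;
}

/* Insert `total` endpoints at most `block_len` at a time.
 * Returns 0 on success, -1 for a zero block length, or the first
 * non-zero value returned by the callback. */
int demo_insert_in_blocks(size_t total, size_t block_len,
                          demo_insert_fn insert, void *ctx)
{
    if (block_len == 0) {
        return -1;
    }
    for (size_t first = 0; first < total; first += block_len) {
        size_t remaining = total - first;
        size_t count = remaining < block_len ? remaining : block_len;

        /* Real code would wait here for the previous block's events. */
        int rc = insert(first, count, ctx);
        if (rc != 0) {
            return rc;
        }
    }
    return 0;
}

/* Example callback that only records what it was asked to do. */
typedef struct {
    size_t blocks;
    size_t largest;
    size_t inserted;
} demo_stats_t;

int demo_record_block(size_t first, size_t count, void *ctx)
{
    demo_stats_t *stats = ctx;
    (void) first;
    stats->blocks++;
    stats->inserted += count;
    if (count > stats->largest) {
        stats->largest = count;
    }
    return 0;
}
```

Because no single call asks for more than `block_len` insertions, the number of outstanding events is bounded by the block length rather than by the total number of endpoints.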
Jeff Squyres
d624e0d60f usnic: fix wraparound sequence number issue
Sequence numbers will wrap around; it is not sufficient to check for
(seq-1) -- must use the SEQ_DIFF macro to properly handle the
wraparound.

This bug wasn't serious; it just meant we might retransmit one or two
extra times when retransmits were triggered and the sequence numbers
wrapped around their sliding windows.
2016-01-30 08:32:13 -08:00
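Why a plain `seq - 1` comparison breaks at the wrap point can be sketched as below. The 16-bit `demo_seq_t` width and `demo_seq_diff()` are assumptions for illustration; the actual `SEQ_DIFF` macro and sequence type are not shown in this log.

```c
#include <stdint.h>

/* Assumed 16-bit sequence number; the real usnic type may differ. */
typedef uint16_t demo_seq_t;

/* Signed distance from b to a, modulo 2^16. The subtraction is reduced to
 * 16 bits and then reinterpreted as signed (two's complement is assumed,
 * which holds on common compilers). */
int16_t demo_seq_diff(demo_seq_t a, demo_seq_t b)
{
    return (int16_t) (demo_seq_t) (a - b);
}

/* Naive check: `seq - 1` is computed as an int, so for seq == 0 it is -1
 * and can never equal the previous sequence number 65535. */
int demo_naive_is_successor(demo_seq_t seq, demo_seq_t prev)
{
    return prev == seq - 1;
}

/* Wraparound-safe check built on the signed distance. */
int demo_is_successor(demo_seq_t seq, demo_seq_t prev)
{
    return demo_seq_diff(seq, prev) == 1;
}
```

At the wrap point (previous number 65535, next number 0) the naive check reports a mismatch while the distance-based check still recognizes the successor.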
Jeff Squyres
4de4a263f5 usnic: ensure all messages are sent on the data channel
Messages should go on the data channel, even if they're short.  Only
ACKs go on the priority channel.
2016-01-30 08:31:21 -08:00
Jeff Squyres
348ac507c2 usnic: explain why we still have OPAL_HAVE_HWLOC
Put in a comment explaining why btl_usnic_compat.h still defines
OPAL_HAVE_HWLOC, even though master/v2.x no longer does.
2016-01-16 04:11:05 -08:00
Jeff Squyres
0f5fcf9029 usnic: fix common symbol 2016-01-16 03:55:27 -08:00
Jeff Squyres
60ffe713b8 common syms: whitelist bison-generated common symbols
Bison generates some common symbols that we can't do anything about,
so whitelist them.
2016-01-16 03:53:14 -08:00
Artem Polyakov
84e4fb308b Fix race condition in UDCM where service thread sees that
`cm_message_event_active == 1` but main thread has already stopped
processing messages and thus we will have the situation where one
message was left unhandled leading to a hang.
2016-01-08 23:56:21 +06:00
Jeff Squyres
6d073a8da4 btl_sm: add a comment explaining why we rename(2)
Per open-mpi/ompi#1230, add a comment explaining why we write to a
temporary file and then rename(2) the file, just so that future code
maintainers don't wonder why we do this seemingly-useless step.
2016-01-04 14:51:52 -05:00
Artem Polyakov
2abb2972ac Fix Mellanox copyrights with respect to the following PRs:
* https://github.com/open-mpi/ompi/pull/1184
* https://github.com/open-mpi/ompi/pull/1188
* https://github.com/open-mpi/ompi/pull/1197
* https://github.com/open-mpi/ompi/pull/1202
* https://github.com/open-mpi/ompi/pull/1210
* https://github.com/open-mpi/ompi/pull/1216
* https://github.com/open-mpi/ompi/pull/1236
* https://github.com/open-mpi/ompi/pull/1237
* https://github.com/open-mpi/ompi/pull/1248
* https://github.com/open-mpi/ompi/pull/1260
* https://github.com/open-mpi/ompi/pull/1264
2015-12-30 00:12:19 +06:00
Gilles Gouaillardet
fec973efda configury: test portability
replace test ... -o ... with test ... || test ...
and test ... -a ... with test ... && test ...
2015-12-28 13:58:45 +09:00
Nathan Hjelm
700a21022a Merge pull request #1260 from artpol84/openib_proc_account_fix
Openib proc accounting fix
2015-12-27 15:19:52 -07:00
Artem Polyakov
a20826e6b4 Fix vader resource leak.
This nasty bug was nicely masked. It was causing `mca_btl_vader_component.vader_frags_user`
overflow and, as a result, rare hangs of ompi-test-suite.
2015-12-28 00:41:45 +06:00
Gilles Gouaillardet
2d9aa38e6a btl/openib: fix heterogeneous support 2015-12-25 16:31:35 +09:00
Artem Polyakov
3031affdb7 Fix openib process accounting if procs were dynamically added. 2015-12-24 17:56:35 +06:00
Artem Polyakov
400af6c52d openib addproc improvements:
1. finer grained locks;
2. separate srq creation from cq adjustments.
2015-12-24 17:56:35 +06:00
Artem Polyakov
41c325f15a Shift common code for calculating a port count and btl_rank in openib
into the static function
2015-12-24 17:56:35 +06:00
Gilles Gouaillardet
5fa63f086a btl/tcp: add missing #include <unistd.h>
Thanks Marco Atzeri for contributing the original patch
2015-12-24 14:41:46 +09:00
Gilles Gouaillardet
15ed7ad9f5 btl/sm: add missing #include <unistd.h>
Thanks Marco Atzeri for contributing the original patch
2015-12-24 14:41:41 +09:00
Gilles Gouaillardet
42313acd58 btl/usnic: add missing #include <alloca.h> 2015-12-24 14:33:58 +09:00
Nathan Hjelm
84d890b7e7 Merge pull request #1248 from artpol84/openib_proc_init_race
Openib dynamic add proc race conditions
2015-12-22 21:48:05 -07:00
Artem Polyakov
08ad8357a8 Fix local process accounting in openib when dynamic add_proc is on. 2015-12-22 22:44:46 +06:00
Artem Polyakov
3c2f6d5560 Protect openib_btl->device data with explicit opal_mutex locks. 2015-12-22 18:33:26 +06:00
Gilles Gouaillardet
607d7c7545 btl/sm: rename file after file descriptor has been closed.
Thanks George for spotting this.
2015-12-22 13:56:53 +09:00
Artem Polyakov
e06bffe213 Fix ib_proc locking 2015-12-21 18:52:31 +06:00
Artem Polyakov
3eb4756a17 Force locking regardless of the opal_using_threads() setting. 2015-12-21 18:52:31 +06:00
Artem Polyakov
11b72d9add Make important fields of ib_proc volatile. 2015-12-21 18:52:31 +06:00
Artem Polyakov
86c0c3ec52 Provide additional information: whether ib_proc was newly created or
already existed.
2015-12-21 18:52:31 +06:00
Artem Polyakov
9325bd3d69 Protect device initialization 2015-12-21 18:52:31 +06:00
Artem Polyakov
0f77bc7ea7 Perform endpoint initialization atomically. 2015-12-21 18:52:31 +06:00
Artem Polyakov
afaf9c9ea6 Move ib_proc initialization into a separate function. 2015-12-21 18:52:31 +06:00
Artem Polyakov
3c9fd567b6 Fix openib race condition when direct modex is used.
The problem was in mca_btl_openib_proc_create. This function may be called
from several places simultaneously:
* from the main thread when somebody wants to do `MPI_Send()` (for example) for
the first time;
* from udcm if the counterpart peer is trying to connect and `mca_btl_openib_get_ep()`
is called.

In this case one of the threads may add an uninitialized proc structure
to `mca_btl_openib_component.ib_procs`, and the other will read it and
treat it as initialized.

This commit turns ib_proc initialization into a single atomic operation.
2015-12-21 18:52:30 +06:00
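The race described in 3c9fd567b6 (two threads reaching proc creation at once, one of them seeing a half-built entry) can be illustrated with a find-or-create routine that performs lookup, initialization, and publication under one lock. This is only a minimal sketch of the idea, not the openib code: `demo_proc_t`, `demo_procs`, `demo_lock`, and `demo_proc_get()` are hypothetical names, and a plain pthread mutex stands in for whatever locking the BTL actually uses. The `is_new` output mirrors the "newly created or already existing" information mentioned in 86c0c3ec52.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a per-peer proc structure. */
typedef struct demo_proc {
    int peer_id;
    int initialized;            /* written only while demo_lock is held */
    struct demo_proc *next;
} demo_proc_t;

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static demo_proc_t *demo_procs = NULL;   /* shared list of known procs */

/* Return the proc for peer_id, creating it if needed.
 * The entry is fully initialized before it becomes visible on the list,
 * so no other thread can observe a partially built structure. */
demo_proc_t *demo_proc_get(int peer_id, int *is_new)
{
    assert(is_new != NULL);
    *is_new = 0;

    pthread_mutex_lock(&demo_lock);

    demo_proc_t *p;
    for (p = demo_procs; p != NULL; p = p->next) {
        if (p->peer_id == peer_id) {
            break;              /* already existing entry */
        }
    }

    if (p == NULL) {
        p = calloc(1, sizeof(*p));
        if (p != NULL) {
            p->peer_id = peer_id;
            p->initialized = 1; /* finish setup first ... */
            p->next = demo_procs;
            demo_procs = p;     /* ... then publish */
            *is_new = 1;
        }
    }

    pthread_mutex_unlock(&demo_lock);
    return p;
}
```

A caller that gets `is_new == 0` knows another thread created the entry and can skip any one-time setup.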
Gilles Gouaillardet
db4f483653 btl/sm: fix race condition
write to file and then rename, so when the file is open for read, its content is known to have been written.

Fixes open-mpi/ompi#1230
2015-12-21 16:37:51 +09:00
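The write-then-rename pattern from db4f483653 (and the close-before-rename ordering from 607d7c7545) can be sketched with standard C I/O. This is an illustration under assumptions rather than the btl/sm code: `publish_file()` and the `.tmp` suffix are hypothetical. A reader that opens the final path either finds no file or finds one whose content has already been written.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write data to "<final_path>.tmp", close it, then rename it into place.
 * Returns 0 on success, -1 on failure. */
int publish_file(const char *final_path, const char *data)
{
    assert(final_path != NULL && data != NULL);

    char tmp_path[4096];
    int n = snprintf(tmp_path, sizeof(tmp_path), "%s.tmp", final_path);
    if (n < 0 || (size_t)n >= sizeof(tmp_path)) {
        return -1;                      /* path too long */
    }

    FILE *fp = fopen(tmp_path, "w");
    if (fp == NULL) {
        return -1;
    }

    size_t len = strlen(data);
    if (fwrite(data, 1, len, fp) != len) {
        fclose(fp);
        remove(tmp_path);
        return -1;
    }

    /* Close before renaming so buffered data is flushed to the file first. */
    if (fclose(fp) != 0) {
        remove(tmp_path);
        return -1;
    }

    /* The file only appears under its final name once it is complete. */
    if (rename(tmp_path, final_path) != 0) {
        remove(tmp_path);
        return -1;
    }
    return 0;
}
```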
Nathan Hjelm
e77199fd4f Merge pull request #1235 from ggouaillardet/topic/ibv_exp_fixes
btl/openib: do not mix exp and non exp verbs
2015-12-17 08:36:09 -07:00
Gilles Gouaillardet
994a627f82 btl/openib: do not mix exp and non exp verbs 2015-12-17 16:45:43 +09:00
Artem Polyakov
0951a34e95 Fix openib memory registration limit calculation if cutoff = 0. 2015-12-17 13:45:19 +06:00
Jeff Squyres
2b9341a38a usnic: fix embarrassing typo 2015-12-15 19:01:19 -08:00
Jeff Squyres
944d5061a6 usnic: sendto() can return EPERM if we send too fast
If we send too fast, sendto() can run out of resources and return
EPERM.  So delay a little and try again.
2015-12-15 15:31:29 -08:00
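The "delay a little and try again" approach from 944d5061a6 can be sketched as a bounded retry loop. This is an assumption-laden illustration, not the usnic code: `send_with_retry()`, `send_fn_t`, and the demo `flaky_send()` are hypothetical, and the real send call is abstracted behind a function pointer so the sketch runs without a socket.

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical send callback: returns bytes sent, or -1 with errno set. */
typedef ssize_t (*send_fn_t)(void *ctx);

/* Call send_fn; if it fails with EPERM, sleep delay_ns and try again,
 * up to max_retries extra attempts. Any other result is returned as-is. */
ssize_t send_with_retry(send_fn_t send_fn, void *ctx,
                        int max_retries, long delay_ns)
{
    assert(send_fn != NULL && max_retries >= 0);
    assert(delay_ns >= 0 && delay_ns < 1000000000L);

    for (int attempt = 0; ; ++attempt) {
        ssize_t rc = send_fn(ctx);
        if (rc >= 0 || errno != EPERM || attempt >= max_retries) {
            return rc;
        }
        struct timespec ts = { 0, delay_ns };
        nanosleep(&ts, NULL);           /* back off briefly before retrying */
    }
}

/* Demo sender: fails with EPERM until *ctx reaches zero, then "sends" 4 bytes. */
ssize_t flaky_send(void *ctx)
{
    int *failures_left = ctx;
    if (*failures_left > 0) {
        --*failures_left;
        errno = EPERM;
        return -1;
    }
    return 4;
}
```

Bounding the retries keeps a persistent EPERM (for example, a real permission problem) from turning into an endless loop.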
Jeff Squyres
ab1bbca5b9 usnic: improve error message
When sendto() fails, it would be helpful to see the errno value.
2015-12-15 15:04:25 -08:00
Jeff Squyres
c1a6beac8d usnic: fix error message
There were too many "%s" instances.  Re-order the output so that we
show file, line, and then the error message.
2015-12-15 14:48:38 -08:00
Nathan Hjelm
c98086f028 Merge pull request #1223 from hjelmn/ib_use_srq
btl/openib: use only SRQ on ib by default
2015-12-15 14:04:19 -08:00
Nathan Hjelm
00da520fd5 Merge pull request #1222 from hjelmn/vader_fix
btl/vader: do not attempt to munmap opal/shmem pointer
2015-12-15 09:06:50 -08:00
Nathan Hjelm
b24b3a4ae4 btl/openib: use only SRQ on ib by default
It was decided some time ago that there is no benefit to using any
per-peer receive queues on infiniband. At the time we decided not to
change the default but that objection has been dropped. This commit
changes the 128 message queue to use SRQ instead of PP. This has no
impact on iWarp which sets the default in a different way.

Closes open-mpi/ompi#1156

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-12-15 09:48:03 -07:00
Nathan Hjelm
60591ae753 btl/vader: do not attempt to munmap opal/shmem pointer
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-12-15 08:48:04 -07:00
Todd Kordenbrock
7b97963669 btl-portals4: remove unnecessary PtlMDBind result check
When PtlMDBind was removed, the result check was left in which
causes intermittent failures depending on the junk value found in
the 'ret' variable.  The commit removes the result check.
2015-12-14 12:09:01 -06:00
Nathan Hjelm
f692576f1e btl/openib: add check for IBV_EXP_QP_INIT_ATTR_ATOMICS_ARG
Mofed 2.2 does not have the IBV_EXP_QP_INIT_ATTR_ATOMICS_ARG attribute
flag. Add a check to fix compilation for mofed 2.2. This commit only
fixes compilation with the older mofed. It will not allow an Open MPI
compiled with mofed 2.3 or newer to work on a machine with mofed 2.2.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-12-09 17:02:36 -07:00
Todd Kordenbrock
2b7e983989 btl-portals4: set endpoint rank even if endpoint already exists
If btl-portals4 is configured to use logical mapping of ranks to
physical nodes, then the endpoint must have the rank field set.
This commit fixes a bug that caused the endpoint to have the
nid/pid instead of the rank if the endpoint already exists.
2015-12-08 12:29:00 -06:00
Nathan Hjelm
c9382f23e9 mlx5: need to set comp_mask to get experimental verbs attributes
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-12-08 10:34:16 -07:00
Nathan Hjelm
191aebb9c8 btl/openib: fix compile problems when using experimental verbs
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-11-30 22:21:26 -07:00
Nathan Hjelm
bb8e347371 btl/openib: update experimental verbs support
This update adds an additional check (if supported) to see if 8-byte
atomics are supported by the hardware. If 8-byte atomics are not
supported the atomics support is disabled.

This commit also includes some cleanup.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-11-30 12:32:04 -07:00
Nathan Hjelm
02a6c6856d btl/openib: add support for mlx5 atomic operations
This commit adds support for fetch-and-add and compare-and-swap when
using the mlx5 driver. The support is only enabled if the expanded
verbs interface is detected. This is required because mlx5 HCAs return
the atomic result in network byte order. This support may need to be
tweaked if Mellanox commits their changes into upstream verbs.

Closes open-mpi/ompi#1077

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-11-23 16:07:12 -07:00
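Commit 02a6c6856d notes that mlx5 HCAs return the atomic result in network byte order. A portable way to bring such a value into host order is to assemble it byte by byte, most significant byte first. The helper below is a sketch only; `atomic_result_to_host()` is a hypothetical name and does not reflect how the openib BTL performs the conversion.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Interpret the 8 bytes at result_buf as a big-endian (network order)
 * 64-bit value and return it in host byte order. Works on any host
 * endianness because it never reinterprets the bytes as an integer. */
uint64_t atomic_result_to_host(const void *result_buf)
{
    assert(result_buf != NULL);

    unsigned char bytes[8];
    memcpy(bytes, result_buf, sizeof(bytes));

    uint64_t value = 0;
    for (int i = 0; i < 8; ++i) {
        value = (value << 8) | bytes[i];    /* most significant byte first */
    }
    return value;
}
```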
Francois WELLENREITER
251009e0aa BTL portals4: remove useless PtlMDBind/PtlMDRelease calls for RDMA 2015-11-19 14:51:00 +01:00
Matias A Cabral
254a05dbbb Default values for Intel HFI1 (OmniPath gen1 device) in openib btl 2015-11-11 12:35:35 -08:00
Jeff Squyres
b35b708979 tcp BTL: fix inconsistent whitespace problems
No code/logic changes.
2015-11-06 12:41:13 -08:00
Jeff Squyres
300cff2b89 usnic: fix/update the usnic stats
1. Fix: old v1.6-era code reset the stats-emitting event to fire twice
   for each time period.
2. Add the usNIC device name to the output for differentiating the
   output in multi-rail scenarios.
2015-11-06 12:05:34 -08:00
Nathan Hjelm
4ddbdad772 btl/openib: fix access flags
Per the spec for ibv_reg_mr, if remote write or remote atomic access is
requested, local write must also be specified.

Signed-off-by: Nathan Hjelm <hjelmn@me.com>
2015-11-04 15:23:11 -07:00
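The rule behind 4ddbdad772 can be shown as a small flag-normalizing helper. This is a sketch with hypothetical names: the `DEMO_*` constants and `demo_reg_access()` are placeholders so the example compiles without libibverbs, where the corresponding flags are `IBV_ACCESS_LOCAL_WRITE`, `IBV_ACCESS_REMOTE_WRITE`, `IBV_ACCESS_REMOTE_READ`, and `IBV_ACCESS_REMOTE_ATOMIC`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical placeholder flags; real code would use the IBV_ACCESS_* values. */
enum {
    DEMO_LOCAL_WRITE   = 1,
    DEMO_REMOTE_WRITE  = 2,
    DEMO_REMOTE_READ   = 4,
    DEMO_REMOTE_ATOMIC = 8
};

/* Return the access flags to pass at registration time: whenever remote
 * write or remote atomic access is requested, local write is added too. */
uint32_t demo_reg_access(uint32_t requested)
{
    uint32_t flags = requested;
    if (flags & (DEMO_REMOTE_WRITE | DEMO_REMOTE_ATOMIC)) {
        flags |= DEMO_LOCAL_WRITE;
    }
    /* Post-condition: remote write/atomic never appears without local write. */
    assert(!(flags & (DEMO_REMOTE_WRITE | DEMO_REMOTE_ATOMIC)) ||
           (flags & DEMO_LOCAL_WRITE));
    return flags;
}
```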
Rolf vandeVaart
2e2e175f13 Fix a few more places that utilized CUDA 4.1 checks 2015-10-30 09:43:24 -04:00
Nathan Hjelm
e10afcd354 udcm: fix bugs
This commit fixes the following bugs:

 - On send failure release newly allocated message.

 - In the destructor for udcm_message_sent_t always remove the send
   timeout event from the event base. Failure to do this can lead to
   memory corruption since the destructor may be called from an event
   callback.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-21 12:54:14 -06:00
Nathan Hjelm
55d24ee7a3 btl/openib: fix argument type for internal atomic function
This was fixed on my btl 3.0 branch but the changeset got lost in a
rebase. Fixes issues with lock ups when using osc/rdma.

References open-mpi/ompi#1010

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-20 13:47:28 -06:00
Howard Pritchard
eaba98ce5d btl/ugni: fix very poor aries bw problem
The handling of RDMA get alignment in ugni BTL for Aries
(cray xc) was wrong, resulting in very poor bandwidth
for ugni BTL on aries.

Verified using osu_bw now gives sensible bandwidth on
Aries.

Fixes #1005

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2015-10-13 16:01:17 -05:00
Nathan Hjelm
90db00e37f Merge pull request #996 from hjelmn/openib_progress_thread
btl/openib: remove extra threads
2015-10-08 07:31:27 -06:00
Nathan Hjelm
b8af310efa btl/openib: remove extra threads
This commit removes the service and async event threads from the
openib btl. Both threads are replaced by opal progress thread
support. The run_in_main function is now supported by allocating an
event and adding it to the sync event base. This ensures that the
requested function is called as part of opal_progress.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-07 12:30:41 -06:00
Nathan Hjelm
59aa93e1b6 opal/mpool: add support for passing access flags to register
This commit adds an access_flags argument to the mpool registration
function. This flag indicates what kind of access is being requested:
local write, remote read, remote write, and remote atomic. The values
of the registration access flags in the btl are tied to the new flags
in the mpool. All mpools have been updated to include the new argument
but only the grdma and udreg mpools have been updated to make use of
the access flags. In both mpools existing registrations are checked
for sufficient access before being returned. If a registration does
not contain sufficient access it is marked as invalid and a new
registration is generated.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-10-05 13:53:55 -06:00
Nathan Hjelm
3c33a8e94b btl/ugni: adjust exclusivity below sm and vader
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-29 15:40:35 -06:00
Nathan Hjelm
12bd300c40 Merge pull request #929 from hjelmn/add_procs
Update add_procs support
2015-09-28 17:29:13 -06:00
Todd Kordenbrock
3e63a3458c portals4: add support for dynamic add_procs() to all Portals4 components
In the default mode of operation, the Portals4 components support
dynamic add_procs().

The Portals4 components have two alternate modes (flow control and
logical-to-physical) that require knowledge of all procs at startup.
In these modes, mtl-portals4 sets the MCA_MTL_BASE_FLAG_REQUIRE_WORLD
flag and btl-portals4 sets the MCA_BTL_FLAGS_SINGLE_ADD_PROCS flag
to tell the PML that we need all the procs in one add_procs() call.
2015-09-24 22:12:57 -05:00
Todd Kordenbrock
3afac9e37d btl-portals4: fix PMIx integration problem
After PMIx integration, the third parameter to OPAL_MODEX_RECV() is
opal_process_name_t instead of opal_proc_t.  This commit replaces
proc with &proc->proc_name.
2015-09-24 21:53:20 -05:00
Nathan Hjelm
60f3dbd160 btl/openib: fix udcm coverity errors
Fix CID 1312120: Uninitialized scalar variable

The response type will always be set unless a message of another type is passed to this function. To make sure that error is caught I am adding an assert.

Fix CID 1312116: Dereference after null check

This is a potential bug. If there is no endpoint data for an incoming connection a rejection should be sent. In this case we would just SEGV.

Fix CID 1312115: Dereference after null check

Clear error in the error message. Use the queue pair number that was passed in.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-09-23 17:08:27 -06:00