1
1

27 Коммитов

Автор SHA1 Сообщение Дата
Howard Pritchard
782f1bb9af btl/sm: swat a compiler warning
gnu 6.3.1 complaining about uninitialized variable

Signed-off-by: Howard Pritchard <howardp@lanl.gov>
2017-04-21 10:02:56 -05:00
Gilles Gouaillardet
aeee48357a btl/sm: correctly handle nodes with zero NUMA hwloc object
the hwloc topology might not contain a NUMA object with hwloc < v2
if the node is not NUMA, so force the NUMA object count to one
in order to correctly allocate mca_btl_sm_component.sm_mpools.

Signed-off-by: Gilles Gouaillardet <gilles@rist.or.jp>
2017-01-12 11:45:29 +09:00
Ralph Castain
fe68f23099 Only instantiate the HWLOC topology in an MPI process if it actually will be used.
There are only five places in the non-daemon code paths where opal_hwloc_topology is currently referenced:

* shared memory BTLs (sm, smcuda). I have added a code path to those components that uses the location string
  instead of the topology itself, if available, thus avoiding instantiating the topology

* openib BTL. This uses the distance matrix. At present, I haven't developed a method
  for replacing that reference. Thus, this component will instantiate the topology

* usnic BTL. Uses the distance matrix.

* treematch TOPO component. Does some complex tree-based algorithm, so it will instantiate
  the topology

* ess base functions. If a process is direct launched and not bound at launch, this
  code attempts to bind it. Thus, procs in this scenario will instantiate the
  topology

Note that instantiating the topology on complex chips such as KNL can consume
megabytes of memory.

Fix pernode binding policy

Properly handle the unbound case

Correct pointer usage

Do not free static error messages!

Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-29 10:33:29 -08:00
Ralph Castain
3a2d6a5ab6 Begin to reduce reliance of application procs on the topology tree itself by having the daemon provide more detailed info. In this case, provide the topology description string so that procs can readily determine the number of types of objects on the node, and a "locality" string that describes which objects this process is executing upon. The latter allows a process to compute the objects of overlap between itself and another proc without consulting the topology tree.
Signed-off-by: Ralph Castain <rhc@open-mpi.org>
2016-12-28 09:14:26 -08:00
Pavel Shamis (Pasha)
d984b4b3f9 CMA: Fixing logic for CMA system call detection
The OPAL_CMA_NEED_SYSCALL_DEFS is always defined/set to 0 or 1.  Therefore
instead of checking if the macro is defined, we have to look at the value
itself.

Signed-off-by: Pavel Shamis (Pasha) <pasharesearch@gmail.com>
2016-05-24 14:53:25 -05:00
Nathan Hjelm
d4afb16f5a opal: rework mpool and rcache frameworks
This commit rewrites both the mpool and rcache frameworks. Summary of
changes:

 - Before this change a significant portion of the rcache
   functionality lived in mpool components. This meant that it was
   impossible to add a new memory pool to use with rdma networks
   (ugni, openib, etc) without duplicating the functionality of an
   existing mpool component. All the registration functionality has
   been removed from the mpool and placed in the rcache framework.

 - All registration cache mpools components (udreg, grdma, gpusm,
   rgpusm) have been changed to rcache components. rcaches are
   allocated and released in the same way mpool components were.

 - It is now valid to pass NULL as the resources argument when
   creating an rcache. At this time the gpusm and rgpusm components
   support this. All other rcache components require non-NULL
   resources.

 - A new mpool component has been added: hugepage. This component
   supports huge page allocations on linux.

 - Memory pools are now allocated using "hints". Each mpool component
   is queried with the hints and returns a priority. The current hints
   supported are NULL (uses posix_memalign/malloc), page_size=x (huge
   page mpool), and mpool=x.

 - The sm mpool has been moved to common/sm. This reflects that the sm
   mpool is specialized and not meant for any general
   allocations. This mpool may be moved back into the mpool framework
   if there is any objection.

 - The opal_free_list_init arguments have been updated. The unused0
   argument is not used to pass in the registration cache module. The
   mpool registration flags are now rcache registration flags.

 - All components have been updated to make use of the new framework
   interfaces.

As this commit makes significant changes to both the mpool and rcache
frameworks both versions have been bumped to 3.0.0.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2016-03-14 10:50:41 -06:00
Gilles Gouaillardet
15ed7ad9f5 btl/sm: add missing #include <unistd.h>
Thanks Marco Atzeri for contributing the original patch
2015-12-24 14:41:41 +09:00
Gilles Gouaillardet
d5af5d106c btl/sm: mca_btl_sm_sendi: do not set *descriptor when descriptor is NULL 2015-09-14 14:04:40 +09:00
Ralph Castain
d97bc29102 Remove OPAL_HAVE_HWLOC qualifier and error out if --without-hwloc is given 2015-09-04 16:54:40 -07:00
Ralph Castain
869041f770 Purge whitespace from the repo 2015-06-23 20:59:57 -07:00
Adrian Reber
f45dd069bd FT: fix compilation using --with-ft (1/5)
Enabling the FT code breaks compilation (again). This series
tries to fix the compiler errors. This is again only fixing
the compiler errors without any warranty that the result
might actually support FT again.

This first patch moves orte_cr_continue_like_restart from ORTE
to opal_cr_continue_like_restart in OPAL. This only leaves three
calls from OPAL to ORTE in the FT code. As it is not yet 100%
clear how to handle these calls the code orte_sstore.set_attr()
has been #ifdef'd out for now.
2015-03-11 14:23:33 +01:00
Nathan Hjelm
5f1254d710 Update code base to use the new opal_free_list_t
Use of the old ompi_free_list_t and ompi_free_list_item_t is
deprecated. These classes will be removed in a future commit.

This commit updates the entire code base to use opal_free_list_t and
opal_free_list_item_t.

Notes:

OMPI_FREE_LIST_*_MT -> opal_free_list_* (uses opal_using_threads ())

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-24 10:05:45 -07:00
Nathan Hjelm
ed78553512 Update opal_free_list_t usage to reflect new class interface.
Please verify your components have been updated correctly. Keep in
mind that in terms of threading:

OPAL_FREE_LIST_GET -> opal_free_list_get_st
OPAL_FREE_LIST_RETURN -> opal_free_list_return_st

I used the opal_using_threads() variant anytime it appeared multiple
threads could be operating on the free list. If this is not the case
update to _st. If multiple threads are always in use change to _mt.
2015-02-24 10:05:44 -07:00
Gilles Gouaillardet
28714b60cb btl/sm: fix misc errors
as reported by Coverity as CIDs 711636 and 1269847
2015-02-18 17:05:19 +09:00
George Bosilca
a7a4d6335e Various cleanups. 2015-02-15 11:39:09 -05:00
Nathan Hjelm
25176cad27 btl/sm: update for BTL 3.0 interface
Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:36 -07:00
Nathan Hjelm
fc7397949c btl: require that btls handle descriptor = NULL in the btl_sendi function
The send inline optimization uses the btl_sendi function to achieve lower
latency and higher message rates. Before this commit BTLs were allowed to
assume the descriptor was non-NULL and were expected to return a valid
descriptor if the send could not be completed using btl_sendi. This
behavior was fine until the usage of btl_sendi was changed in ob1. This
commit allows the caller to specify NULL for the descriptor. The affected
btls have been updated to handle this case.

Signed-off-by: Nathan Hjelm <hjelmn@lanl.gov>
2015-02-13 11:46:36 -07:00
Nathan Hjelm
1b564f62bd Revert "Merge pull request #275 from hjelmn/btlmod"
This reverts commit ccaecf0fd6c862877e6a1e2643f95fa956c87769, reversing
changes made to 6a19bf85dde5306f559f09952cf3919d97f52502.
2014-11-19 23:22:43 -07:00
Nathan Hjelm
4ccb20b097 btl: fix warning about enum type and modify btl_sendi to allow the
value NULL for the descriptor

The send inline optimization uses the btl_sendi function to achieve
lower latency and higher message rates. The problem is the btl_sendi
function was allowed to return a descriptor to the caller. This is fine
for some paths but not ok for the send inline optimization. To fix
this the btl now must be able to handle descriptor = NULL.
2014-11-19 11:33:03 -07:00
Nathan Hjelm
249e5e009f Fix knem support in both sm and vader 2014-11-19 11:33:02 -07:00
Nathan Hjelm
e03956e099 Update the scif and openib btls for the new btl interface
Other changes:
 - Remove the registration argument from prepare_src since it no
   longer is meant for RDMA buffers.

 - Additional cleanup and bugfixes.
2014-11-19 11:33:02 -07:00
Nathan Hjelm
ec33374339 btl: remove des_remote/des_remote_count from the mca_btl_base_descriptor_t
structure

This structure member was originally used to specify the remote segment
for an RDMA operation. Since the new btl interface no longer uses
desriptors for RDMA this member no longer has a purpose. In addition
to removing these members the local segment information has been
renamed to des_segments/des_segment_count.
2014-11-19 11:33:02 -07:00
Ralph Castain
780c93ee57 Per the PR and discussion on today's telecon, extend the process name definition as a two-field struct of uint32_t's down to the OPAL layer. This resolves issues created by prior commits that impacted both heterogeneous and SPARC support. This also simplifies the OMPI code base by removing the need for frequent memcpy's when transitioning between the OMPI/ORTE layers and OPAL.
We recognize that this means other users of OPAL will need to "wrap" the opal_process_name_t if they desire to abstract it in some fashion. This is regrettable, and we are looking at possible alternatives that might mitigate that requirement. Meantime, however, we have to put the needs of the OMPI community first, and are taking this step to restore hetero and SPARC support.
2014-11-11 17:00:42 -08:00
Aurélien Bouteiller
e3be1fb9a5 Quick pass over the sm-knem code, indent fixes 2014-10-17 10:38:35 -04:00
George Bosilca
7541c03b4c Mark all instances where atomic operations are used but their return value is unnecessary 2014-10-15 21:47:32 -04:00
George Bosilca
f217661ee0 Use opal_process_info whenever possible. Some other minor cleanups.
This commit was SVN r32325.
2014-07-26 21:48:23 +00:00
Ralph Castain
552c9ca5a0 George did the work and deserves all the credit for it. Ralph did the merge, and deserves whatever blame results from errors in it :-)
WHAT:    Open our low-level communication infrastructure by moving all necessary components (btl/rcache/allocator/mpool) down in OPAL

All the components required for inter-process communications are currently deeply integrated in the OMPI layer. Several groups/institutions have express interest in having a more generic communication infrastructure, without all the OMPI layer dependencies.  This communication layer should be made available at a different software level, available to all layers in the Open MPI software stack. As an example, our ORTE layer could replace the current OOB and instead use the BTL directly, gaining access to more reactive network interfaces than TCP.  Similarly, external software libraries could take advantage of our highly optimized AM (active message) communication layer for their own purpose.  UTK with support from Sandia, developped a version of Open MPI where the entire communication infrastucture has been moved down to OPAL (btl/rcache/allocator/mpool). Most of the moved components have been updated to match the new schema, with few exceptions (mainly BTLs where I have no way of compiling/testing them). Thus, the completion of this RFC is tied to being able to completing this move for all BTLs. For this we need help from the rest of the Open MPI community, especially those supporting some of the BTLs.  A non-exhaustive list of BTLs that qualify here is: mx, portals4, scif, udapl, ugni, usnic.

This commit was SVN r32317.
2014-07-26 00:47:28 +00:00