1
1
openmpi/ompi/mca/btl
Jeff Squyres fb2e543a57 Refs trac:3275.
We ran into a case where the OMPI SVN trunk grew a new acceptable MCA
parameter value, but this new value was not accepted on the v1.6
branch (hwloc_base_mem_bind_failure_action -- on the trunk it accepts
the value "silent", but on the older v1.6 branch, it doesn't).  If you
set "hwloc_base_mem_bind_failure_action=silent" in the default MCA
params file and then accidentally ran with the v1.6 branch, every OMPI
executable (including ompi_info) just failed because hwloc_base_open()
would say "hey, 'silent' is not a valid value for
hwloc_base_mem_bind_failure_action!".  Kaboom.

The only problem is that it didn't give you any indication of where
this value was being set.  Quite maddening, from a user perspective.

So we changed the ompi_info handles this case.  If any framework open
function return OMPI_ERR_BAD_PARAM (either because its base MCA params
got a bad value or because one of its component register/open
functions return OMPI_ERR_BAD_PARAM), ompi_info will stop, print out
a warning that it received and error, and then dump out the parameters
that it has received so far in the framework that had a problem.

At a minimum, this will show the user the MCA param that had an error
(it's usually the last one), and ''where it was set from'' (so that
they can go fix it).  

We updated ompi_info to check for O???_ERR_BAD_PARAM from each from
the framework opens.  Also updated the doxygen docs in mca.h for this
O???_BAD_PARAM behavior.  And we noticed that mca.h had MCA_SUCCESS
and MCA_ERR_??? codes.  Why?  I think we used them in exactly one
place in the code base (mca_base_components_open.c).  So we deleted
those and just used the normal OPAL_* codes instead.

While we were doing this, we also cleaned up a little memory
management during ompi_info/orte-info/opal-info finalization.
Valgrind still reports a truckload of memory still in use at ompi_info
termination, but they mostly look to be components not freeing
memory/resources properly (and outside the scope of this fix).

This commit was SVN r27306.

The following Trac tickets were found above:
  Ticket 3275 --> https://svn.open-mpi.org/trac/ompi/ticket/3275
2012-09-11 20:47:24 +00:00
..
base Backout the ORCA commit. :( 2012-06-27 01:28:28 +00:00
mx Make it compile! 2012-07-08 00:06:13 +00:00
ofud 1. Adding 2 new components: 2012-07-02 15:20:12 +00:00
openib Refs trac:3275. 2012-09-11 20:47:24 +00:00
portals Timeout! Per RFC update the BTL interface to hide segment keys. All BTLs (with the exception of wv), all relevant PMLs, and osc/rdma have been updated for the new interface. 2012-06-21 17:09:12 +00:00
sctp Remove use of ompi_ptr_ltop in BTLs. This fixes a crash seen on big-endian 32-bit platforms with MPI one-sided. 2012-07-10 16:18:53 +00:00
self Fix a bug on 32-bit systems introduced by r26626. This fix ensures that all supported btls (with exception of wv-- shiqing will need to help bring that one up to date with r26626) set the lval in prepare_src/dst when preparing a put or get segment. This fix also ensures a consistent use of lval in put and get for both local and remote segments. 2012-07-13 21:19:16 +00:00
sm Getting out of bed this morning was a bad idea... Reverting the sm update once more because it breaks direct launch. Will address this issue and commit the update once it has all been tested. Sorry everyone! 2012-08-10 22:20:38 +00:00
smcuda Fix a bug on 32-bit systems introduced by r26626. This fix ensures that all supported btls (with exception of wv-- shiqing will need to help bring that one up to date with r26626) set the lval in prepare_src/dst when preparing a put or get segment. This fix also ensures a consistent use of lval in put and get for both local and remote segments. 2012-07-13 21:19:16 +00:00
tcp Windows doesn't need to exclude any interface by default. This will avoid tcp warnings. 2012-09-11 15:39:37 +00:00
template Add a new missing field to the template BTL module that was causing a 2012-07-24 12:55:12 +00:00
udapl Backout the ORCA commit. :( 2012-06-27 01:28:28 +00:00
ugni Fix a bug on 32-bit systems introduced by r26626. This fix ensures that all supported btls (with exception of wv-- shiqing will need to help bring that one up to date with r26626) set the lval in prepare_src/dst when preparing a put or get segment. This fix also ensures a consistent use of lval in put and get for both local and remote segments. 2012-07-13 21:19:16 +00:00
vader Fix a bug on 32-bit systems introduced by r26626. This fix ensures that all supported btls (with exception of wv-- shiqing will need to help bring that one up to date with r26626) set the lval in prepare_src/dst when preparing a put or get segment. This fix also ensures a consistent use of lval in put and get for both local and remote segments. 2012-07-13 21:19:16 +00:00
wv update the wv btl component. 2012-07-26 15:35:01 +00:00
btl.h Timeout! Per RFC update the BTL interface to hide segment keys. All BTLs (with the exception of wv), all relevant PMLs, and osc/rdma have been updated for the new interface. 2012-06-21 17:09:12 +00:00
Makefile.am Fix mistake that came in via the ompi-agen tree in r23764. The mistake wasn't part of the core autogen upgrade; it was an additional 'bonus' cleanup. Oops. The mistake will always create a set of directories under installdir, even if you do not --with-devel-headers. The set of directories will be empty, but still -- they should not be there at all. This commit fixes that -- the directories are not created at all if you do not --with-devel-headers 2010-09-24 22:53:28 +00:00